Stephen McAleer
OpenAI
Verified email at openai.com
Title
Cited by
Year
Highly accurate machine fault diagnosis using deep transfer learning
S Shao, S McAleer, R Yan, P Baldi
IEEE Transactions on Industrial Informatics 15 (4), 2446-2455, 2018
Cited by: 1270 | Year: 2018
Language Models can Solve Computer Tasks
G Kim, P Baldi, S McAleer
Neural Information Processing Systems (NeurIPS), 2023
Cited by: 270 | Year: 2023
Solving the Rubik’s cube with deep reinforcement learning and search
F Agostinelli*, S McAleer*, A Shmakov*, P Baldi
Nature Machine Intelligence 1 (8), 356-363, 2019
Cited by: 239 | Year: 2019
Llemma: An Open Language Model for Mathematics
Z Azerbayev, H Schoelkopf, K Paster, M Dos Santos, S McAleer, AQ Jiang, ...
International Conference on Learning Representations (ICLR), 2023
Cited by: 220 | Year: 2023
Mastering the game of Stratego with model-free multiagent reinforcement learning
J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub, V de Boer, ...
Science 378 (6623), 990-996, 2022
Cited by: 220 | Year: 2022
AI Alignment: A Comprehensive Survey
J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang, Y Duan, Z He, J Zhou, ...
arXiv preprint arXiv:2310.19852, 2023
Cited by: 183 | Year: 2023
Solving the Rubik's Cube with Approximate Policy Iteration
S McAleer*, F Agostinelli*, A Shmakov*, P Baldi
International Conference on Learning Representations (ICLR), 2018
Cited by: 101* | Year: 2018
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Y Chen, Y Yang, T Wu, S Wang, X Feng, J Jiang, SM McAleer, H Dong, ...
36th Conference on Neural Information Processing Systems (NeurIPS 2022 …, 2022
Cited by: 99 | Year: 2022
Pipeline PSRO: A scalable approach for finding approximate nash equilibria in large games
S McAleer*, J Lanier*, R Fox, P Baldi
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
Cited by: 82 | Year: 2020
Evolutionary reinforcement learning for sample-efficient multiagent coordination
S Majumdar, S Khadka, S Miret, S McAleer, K Tumer
International Conference on Machine Learning (ICML), 2020
Cited by: 72 | Year: 2020
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Cited by: 60 | Year: 2024
Independent Natural Policy Gradient Always Converges in Markov Potential Games
R Fox, S McAleer, W Overman, I Panageas
AISTATS 2022, 2021
Cited by: 58 | Year: 2021
XDO: A double oracle algorithm for extensive-form games
S McAleer, J Lanier, P Baldi, R Fox
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 55 | Year: 2021
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 49* | Year: 2021
Online Double Oracle
LC Dinh, Y Yang, S McAleer, NP Nieves, O Slumbers, Z Tian, DH Mguni, ...
Transactions on Machine Learning Research, 2021
Cited by: 33 | Year: 2021
Deep-learning-based reconstruction of the neutrino direction and energy for in-ice radio detectors
C Glaser, S McAleer, S Stjärnholm, P Baldi, SW Barwick
Astroparticle Physics 145, 102781, 2023
Cited by: 31* | Year: 2023
White Paper: ARIANNA-200 high energy neutrino telescope
A Anker, P Baldi, SW Barwick, D Bergman, H Bernhoff, DZ Besson, ...
arXiv preprint arXiv:2004.09841, 2020
Cited by: 30 | Year: 2020
Confronting Reward Model Overoptimization with Constrained RLHF
T Moskovitz, AK Singh, DJ Strouse, T Sandholm, R Salakhutdinov, ...
International Conference on Learning Representations (ICLR) spotlight, 2023
Cited by: 28 | Year: 2023
Reducing variance in temporal-difference value estimation via ensemble of deep networks
L Liang, Y Xu, S McAleer, D Hu, A Ihler, P Abbeel, R Fox
International Conference on Machine Learning (ICML), 2022
Cited by: 26* | Year: 2022
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games
S McAleer, JB Lanier, K Wang, P Baldi, R Fox, T Sandholm
International Conference on Learning Representations (ICLR), 2022
Cited by: 25* | Year: 2022