Odalric-Ambrym Maillard
Inria Lille - Nord Europe
Verified email at inria.fr
Title | Cited by | Year
Kullback–Leibler upper confidence bounds for optimal sequential allocation
O Cappé, A Garivier, OA Maillard, R Munos, G Stoltz
The Annals of Statistics 41 (3), 1516-1541, 2013
Cited by: 209 | Year: 2013
Compressed least-squares regression
OA Maillard, R Munos
Cited by: 101 | Year: 2009
A finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences
OA Maillard, R Munos, G Stoltz
Proceedings of the 24th annual Conference On Learning Theory, 497-514, 2011
Cited by: 93 | Year: 2011
Concentration inequalities for sampling without replacement
R Bardenet, OA Maillard
Bernoulli 21 (3), 1361-1385, 2015
Cited by: 76 | Year: 2015
LSTD with random projections
M Ghavamzadeh, A Lazaric, OA Maillard, R Munos
Cited by: 57 | Year: 2010
Latent Bandits
OA Maillard, S Mannor
International Conference on Machine Learning, 136-144, 2014
Cited by: 43 | Year: 2014
Linear regression with random projections
OA Maillard, R Munos
Journal of Machine Learning Research 13 (Sep), 2735-2772, 2012
Cited by: 40 | Year: 2012
Finite-sample analysis of Bellman residual minimization
OA Maillard, R Munos, A Lazaric, M Ghavamzadeh
Proceedings of 2nd Asian Conference on Machine Learning, 299-314, 2010
Cited by: 33 | Year: 2010
Sub-sampling for multi-armed bandits
A Baransi, OA Maillard, S Mannor
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2014
Cited by: 30 | Year: 2014
Selecting the state-representation in reinforcement learning
OA Maillard, D Ryabko, R Munos
Advances in Neural Information Processing Systems, 2627-2635, 2011
Cited by: 26 | Year: 2011
Robust risk-averse stochastic multi-armed bandits
OA Maillard
International Conference on Algorithmic Learning Theory, 218-233, 2013
Cited by: 24 | Year: 2013
Adaptive Bandits: Towards the best history-dependent strategy
OA Maillard, R Munos
Cited by: 24 | Year: 2011
Online learning in adversarial Lipschitz environments
OA Maillard, R Munos
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2010
Cited by: 21 | Year: 2010
"How hard is my MDP?" The distribution-norm to the rescue
OA Maillard, TA Mann, S Mannor
Advances in Neural Information Processing Systems, 1835-1843, 2014
Cited by: 20 | Year: 2014
Hybrid collaborative filtering with autoencoders
F Strub, J Mary, R Gaudel
arXiv preprint arXiv:1603.00806, 2016
Cited by: 18 | Year: 2016
Optimal regret bounds for selecting the state representation in reinforcement learning
OA Maillard, P Nguyen, R Ortner, D Ryabko
International Conference on Machine Learning, 543-551, 2013
Cited by: 18 | Year: 2013
Selecting near-optimal approximate state representations in reinforcement learning
R Ortner, OA Maillard, D Ryabko
International Conference on Algorithmic Learning Theory, 140-154, 2014
Cited by: 17 | Year: 2014
The non-stationary stochastic multi-armed bandit problem
R Allesiardo, R Féraud, OA Maillard
International Journal of Data Science and Analytics 3 (4), 267-283, 2017
Cited by: 16 | Year: 2017
Variance-aware regret bounds for undiscounted reinforcement learning in MDPs
MS Talebi, OA Maillard
arXiv preprint arXiv:1803.01626, 2018
Cited by: 14 | Year: 2018
Competing with an infinite set of models in reinforcement learning
P Nguyen, OA Maillard, D Ryabko, R Ortner
Artificial Intelligence and Statistics, 463-471, 2013
Cited by: 12 | Year: 2013