Title
Exploiting Best-Match Equations for Efficient Reinforcement Learning
Author
van Seijen, H.H.
Whiteson, S.
van Hasselt, H.
Wiering, M.
Publication year
2011
Abstract
This article presents and evaluates best-match learning, a new approach to reinforcement learning that trades off the sample efficiency of model-based methods against the space efficiency of model-free methods. Best-match learning works by approximating the solution to a set of best-match equations, which combine a sparse model with a model-free Q-value function constructed from samples not used by the model. We prove that, unlike regular sparse model-based methods, best-match learning is guaranteed to converge to the optimal Q-values in the tabular case. Empirical results demonstrate that best-match learning can substantially outperform regular sparse model-based methods, as well as several model-free methods that strive to improve the sample efficiency of temporal-difference methods. In addition, we demonstrate that best-match learning can be successfully combined with function approximation.
Subject
Reinforcement learning
On-line learning
Temporal-difference methods
Function approximation
Data reuse
Best match
Physics & Electronics
DSS - Distributed Sensor Systems
TS - Technical Sciences
To reference this document use:
http://resolver.tudelft.nl/uuid:2b3dd333-2e55-4957-9e22-e2eeae75ee11
TNO identifier
431020
Source
Journal of Machine Learning Research, 12 (12), 2045–2094
Document type
article