Title: Efficient abstraction selection in reinforcement learning
Authors: van Seijen, H.H.; Whiteson, S.; Kester, L.J.H.M.
Publication year: 2014
Abstract: This article addresses reinforcement learning problems based on factored Markov decision processes (MDPs) in which the agent must choose among a set of candidate abstractions, each built up from a different combination of state components. We present and evaluate a new approach that can perform effective abstraction selection that is more resource-efficient and/or more general than existing approaches. The core of the approach is to make selection of an abstraction part of the learning agent's decision-making process by augmenting the agent's action space with internal actions that select the abstraction it uses. We prove that under certain conditions this approach results in a derived MDP whose solution yields both the optimal abstraction for the original MDP and the optimal policy under that abstraction. We examine our approach in three domains of increasing complexity: contextual bandit problems, episodic MDPs, and general MDPs with context-specific structure.
Subjects: Physics & Electronics; DSS - Distributed Sensor Systems; TS - Technical Sciences; Infostructures; Information Society; Abstraction selection; Model-free learning; Reinforcement learning; Structure learning; Abstracting; Markov processes; Context-specific structures; Contextual bandits; Decision-making process; Resource-efficient
To reference this document use: http://resolver.tudelft.nl/uuid:7482d37d-699b-4191-8971-b156313a4c86
DOI: https://doi.org/10.1111/coin.12016
TNO identifier: 520172
ISSN: 0824-7935
Source: Computational Intelligence, 30 (4), 657-699
Document type: article
Files: To receive the publication files, please send an e-mail request to TNO Library.
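The abstract's core idea, augmenting the agent's action space with internal actions that select which abstraction to use, can be illustrated with a minimal sketch. This is an assumption-laden illustration of the general mechanism (epsilon-greedy value learning in the contextual-bandit setting mentioned in the abstract), not the paper's exact algorithm; all class and function names here are hypothetical.

```python
import random

class AbstractionSelectingAgent:
    """Hedged sketch: an agent whose action space is augmented with internal
    actions that choose an abstraction, as described in the abstract.
    Bandit-style updates only; this is not the paper's exact method."""

    def __init__(self, abstractions, n_actions, epsilon=0.1, alpha=0.5):
        # abstractions: list of functions mapping a full (factored) state
        # to an abstract state built from a subset of state components.
        self.abstractions = abstractions
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.alpha = alpha
        # One value table per abstraction, keyed by abstract state.
        self.q = [dict() for _ in abstractions]
        # Values for the internal abstraction-selection actions.
        self.q_internal = [0.0] * len(abstractions)

    def select_abstraction(self):
        # Internal action: pick which abstraction to act under this step.
        if random.random() < self.epsilon:
            return random.randrange(len(self.abstractions))
        return max(range(len(self.abstractions)),
                   key=lambda i: self.q_internal[i])

    def select_action(self, i, state):
        # External action under abstraction i's view of the state.
        s = self.abstractions[i](state)
        qs = self.q[i].setdefault(s, [0.0] * self.n_actions)
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: qs[a])

    def update(self, i, state, action, reward):
        # Contextual-bandit update: no bootstrapping, immediate reward only.
        s = self.abstractions[i](state)
        qs = self.q[i].setdefault(s, [0.0] * self.n_actions)
        qs[action] += self.alpha * (reward - qs[action])
        # The internal selection action is credited with the same reward,
        # so abstractions that support better policies accrue higher value.
        self.q_internal[i] += self.alpha * (reward - self.q_internal[i])
```

In use, each abstraction would project the factored state onto one component (e.g. `lambda s: s[0]`); over repeated interactions the internal values come to favor the abstraction under which the agent earns more reward, making abstraction selection part of the agent's own decision-making process.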