Please use this identifier to cite or link to this item: http://hdl.handle.net/1842/3644

Files in This Item:

File: Using Dimensionality Reduction to Exploit Constraints in Reinforcement Learning.pdf
Size: 1.89 MB
Format: Adobe PDF
Title: Using Dimensionality Reduction to Exploit Constraints in Reinforcement Learning
Authors: Bitzer, Sebastian; Howard, Matthew; Vijayakumar, Sethu
Issue Date: 2010
Journal Title: Proc. IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS 2010), Taiwan (2010).
Abstract: Reinforcement learning in the high-dimensional, continuous spaces typical in robotics remains a challenging problem. To overcome this challenge, a popular approach has been to use demonstrations to find an appropriate initialisation of the policy, in an attempt to reduce the number of iterations needed to find a solution. Here, we present an alternative way to incorporate prior knowledge from demonstrations of individual postures into learning: extracting the inherent problem structure to find an efficient state representation. In particular, we use probabilistic, nonlinear dimensionality reduction to capture latent constraints present in the data. By learning policies in the learnt latent space, we are able to solve the planning problem in a reduced space that automatically satisfies task constraints. As shown in our experiments, this reduces the exploration needed and greatly accelerates learning. We demonstrate our approach by learning a bimanual reaching task on the 19-DOF KHR-1HV humanoid.
Keywords: Informatics; Computer Science; Robotics
URI: http://hdl.handle.net/1842/3644
Appears in Collections: Informatics Publications

Items in ERA are protected by copyright, with all rights reserved, unless otherwise indicated.

 
