Using perceptual classes to dream policies in open-ended learning robotics
Integrated Computer-Aided Engineering, 2023•content.iospress.com
Abstract
Achieving Lifelong Open-ended Learning Autonomy (LOLA) is a key challenge for robotics if the field is to advance to a new level of intelligent response. Robots should be capable of discovering goals and learning skills in specific domains that permit achieving the general objectives the designer establishes for them. In addition, robots should reuse previously learnt knowledge in different domains to facilitate learning and adaptation in new ones. To this end, cognitive architectures have arisen that encompass different components to support LOLA. A key requirement of these architectures is a proper balance between deliberative and reactive processes that allows for efficient real-time operation and knowledge acquisition, but achieving this balance is still an open issue. First, objectives must be defined in a domain-independent representation that allows for the autonomous determination of domain-dependent goals. Second, as no explicit reward function is available, a method to determine expected utility must also be developed. Finally, policy learning may happen on an internal deliberative scale (dreaming), so it is necessary to provide an efficient way to infer relevant and reliable data for dreaming to be meaningful. The first two aspects have already been addressed in the realm of the e-MDB cognitive architecture. For the third one, this work proposes Perceptual Classes (P-nodes) as a metacognitive structure that permits generating relevant “dreamt” data points, which in turn allow “imagined” trajectories to be created for deliberative policy learning in a very efficient way. The proposed structure has been tested in an experiment with a real robot in LOLA settings, which showed that policy dreaming is possible in such a challenging realm.
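The idea of drawing "dreamt" data points from a perceptual class can be illustrated with a toy sketch. The abstract does not describe how P-nodes are represented in e-MDB, so everything here is an assumption made for illustration: the class `PerceptualClass`, the `radius` membership threshold, the point-based storage and the Gaussian sampling are all hypothetical. Turning the sampled points into "imagined" trajectories would additionally need a world model, which is not shown.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class PerceptualClass:
    """Toy stand-in for a P-node (hypothetical, not the e-MDB implementation).

    The class is a region of perceptual space, stored as example
    perceptions plus a distance threshold for membership.
    """
    radius: float                                  # assumed membership threshold
    points: list = field(default_factory=list)     # perceptions observed so far

    def add(self, perception):
        """Record a real perception as belonging to this class."""
        self.points.append(tuple(perception))

    def contains(self, perception):
        """True if the perception lies within `radius` of any stored point."""
        return any(math.dist(perception, p) <= self.radius for p in self.points)

    def dream(self, n, rng):
        """Sample n synthetic perceptions that fall inside the class."""
        if not self.points:
            raise ValueError("cannot dream from an empty perceptual class")
        dreamt = []
        while len(dreamt) < n:
            base = rng.choice(self.points)
            # Perturb a stored point; keep the candidate only if it is a member.
            candidate = tuple(x + rng.gauss(0.0, self.radius / 2) for x in base)
            if self.contains(candidate):
                dreamt.append(candidate)
        return dreamt


if __name__ == "__main__":
    rng = random.Random(0)
    p_node = PerceptualClass(radius=0.1)
    p_node.add((0.2, 0.5))
    p_node.add((0.8, 0.5))
    for point in p_node.dream(3, rng):
        print(point)
```

The rejection step guarantees that every dreamt point is a member of the class, which is one simple way to keep synthetic data "relevant" in the sense the abstract uses; other representations and sampling schemes would serve equally well for this illustration.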