Anticipating Rewards in Continuous Time and Space: A Case Study in Developmental Robotics

Title: Anticipating Rewards in Continuous Time and Space: A Case Study in Developmental Robotics
Publication Type: Book Chapter
Year of Publication: 2007
Authors: Blanchard, AJ, Cañamero, L
Editor: Butz, MV, Sigaud, O, Pezzulo, G, Baldassarre, G
Book Title: Anticipatory Behavior in Adaptive Learning Systems: From Brains to Individual and Social Behavior
Series Title: Lecture Notes in Artificial Intelligence
City: Berlin, Heidelberg
ISBN Number: 978-3-540-74261-6

This paper presents the first basic principles, implementation, and experimental results of what could be regarded as a new approach to reinforcement learning, in which agents (physical robots interacting with objects and other agents in the real world) can learn to anticipate rewards using their sensory inputs. Our approach does not need discretization, a notion of events, or classification. Instead of learning rewards for every possible action of an agent in every situation, we propose that agents learn only the main situations worth avoiding or reaching. The main focus of our work, however, is not reinforcement learning as such but modeling cognitive development on a small autonomous robot interacting with an “adult” caretaker, typically a human, in the real world; the control architecture follows a Perception-Action approach incorporating a basic homeostatic principle. This interaction occurs in very close proximity, uses very coarse and limited sensory-motor capabilities, and affects the “well-being” and affective state of the robot. The type of anticipatory behavior we are concerned with in this context relates to both sensory and reward anticipation. We have applied and tested our model on a real robot.