The Fifth International Workshop on
Intrinsically Motivated Open-ended Learning
4th-6th of April 2022
Max Planck Institute for Intelligent Systems,
IMOL 2022 is the fifth international workshop on Intrinsically Motivated Open-ended Learning. Following the four previous editions, it aims to further explore the promise of intrinsically motivated open-ended lifelong learning in robots and artificial systems.
One of our goals is to bring together researchers from the different fields related to open-ended learning and autonomous development. The workshop aims to be a highly interactive event, with high-profile keynote presentations and active audience participation. We hope to foster close interaction among the participants through discussions, poster sessions, and collective round tables.
Abstract submission: Feb 14th 2022 (extended deadline)
Conference dates: April 4-6th 2022
My research focuses on how children and adults actively search for information when making decisions, drawing causal inferences, and solving categorization tasks. Search strategies, like any other kind of strategy, are not always effective, because their usefulness and performance depend on the characteristics of the problem at hand. In this sense, I am interested in how adaptive children's and adults' information-search strategies are, and how sensitive and responsive they are to the structure of the tasks. I am especially interested in how actively searching for information (being able to generate the information we are interested in and to focus on what we consider most relevant) can impact our learning, understanding, and explanations.
This talk will introduce the Ecological Learning framework, which focuses on children's ability to adapt and tailor their active learning strategies to the particular structure and characteristics of a learning environment. In particular, I will present the results of several seminal studies indicating that efficient, adaptive search strategies emerge around 3 years of age, much earlier than previously assumed. This work highlights the importance of developing age-appropriate paradigms that capture children's early competence, in order to gain a more comprehensive and fair picture of their active learning abilities. It also offers a process-oriented theoretical framework that can accommodate and reconcile a sparse but growing body of work documenting children's active and adaptive learning.
Daniel Polani is Professor of Artificial Intelligence, Director of the Centre for Computer Science and Informatics Research (CCSIR), and Head of the Adaptive Systems Research Group at the University of Hertfordshire. His research interests centre on principles of cognition and intelligent decision-making, expressed in the language of information theory. In the past decades, he developed a fully information-theoretic account of the perception-action loop. Among other contributions, he developed the concept of relevant information, which permits an information-theoretic approach to decision-making under bounded informational resources and to understanding cognitive constraints on task execution. He pioneered the concept of empowerment, the external action-perception channel capacity of an agent, which has been shown to be a general and widely applicable model for generic taskless behaviour generation in ill-specified domains, including the assistance of third parties. The empowerment model is now also used by companies such as DeepMind, Volkswagen, and others. The information-theoretic framework has also been used more widely to model multi-agent scenarios. He has been PI in, e.g., the Horizon 2020 projects socSMCs (social sensorimotor contingencies) and WiMUST (mission control of robotic underwater vehicles), the FP7 project CORBYS (cognitive control framework for robotic systems), and Co-PI in FP7 RoboSkin (artificial skin technology and its detection) and FEELIX GROWING (robots detecting and responding to emotional cues). He was President of the RoboCup Federation from 2017 to 2019.
Intrinsic motivations have increasingly become established as a way to look at how agents find desirable things to do. Do we understand where this comes from? Hand-designed rewards are traditional AI's way of doing so, but in reality they are generally sparse, sometimes extremely so, and often need a lot of additional tweaking to achieve the desired results. Not only is this clearly not what biological organisms do; in fact, a lot of prior knowledge is infused into any explicitly undertaken reward shaping. Intrinsic motivations, on the other hand, seem to get much of this for free from the very structure of the environment. In my talk, I am going to review a few such relevant intrinsic motivations and what might make them work or fail. It is clear that the past of an agent and its actual (or potential) futures are crucial for this to work. We will discuss the manner in which this happens and, if there is time, also what this may tell us about the nature of decision-making in embodied agents.
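As a concrete illustration of the empowerment idea mentioned above (a minimal sketch, not taken from the talk): in a deterministic world, the one-step action-perception channel capacity reduces to the logarithm of the number of distinct states the agent's actions can reach. The toy grid world, the action set, and all function names below are illustrative assumptions.

```python
import math

# Hypothetical toy grid world: agents move north/south/east/west or stay;
# moves beyond the boundary are clipped at the walls.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]

def step(state, action, size=5):
    """Deterministic transition: apply the move, clipping at walls."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), size - 1), min(max(y + dy, 0), size - 1))

def empowerment(state, size=5):
    """One-step empowerment in bits. For deterministic dynamics, the
    capacity of the action -> successor-state channel is simply log2 of
    the number of distinct reachable successor states."""
    successors = {step(state, a, size) for a in ACTIONS}
    return math.log2(len(successors))

# In the centre, all five actions lead to distinct successors...
print(empowerment((2, 2)))  # log2(5) ≈ 2.32 bits
# ...while a corner blocks two moves, leaving only 3 distinct successors.
print(empowerment((0, 0)))  # log2(3) ≈ 1.58 bits
```

An empowerment-maximising agent would thus prefer the open centre over the corner, with no task-specific reward ever specified; stochastic dynamics require the full Blahut-Arimoto capacity computation instead of this counting shortcut.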
Deepak Pathak is a faculty member in the School of Computer Science at Carnegie Mellon University. He received his Ph.D. from UC Berkeley, and his research spans computer vision, machine learning, and robotics. He is a recipient of faculty awards from Google, Sony, GoodAI, and Samsung, and of graduate fellowship awards from Facebook, NVIDIA, and Snapchat. His research has been featured in popular press outlets including The Economist, The Wall Street Journal, Quanta Magazine, The Washington Post, CNET, Wired, and MIT Technology Review, among others. Deepak received his Bachelor's from IIT Kanpur with a Gold Medal in Computer Science. He co-founded VisageMap Inc., later acquired by FaceFirst Inc. For details: https://www.cs.cmu.edu/~dpathak/
How can we train a robot that can generalize to perform thousands of tasks in thousands of environments? This question underscores the holy grail of robot learning research, a field dominated by learning from demonstrations or reward-based learning. However, both of these paradigms fall short because it is difficult to supervise an agent for all possible situations it may encounter in the future. We posit that such an ability is only possible if the robot can learn continually and adapt rapidly to new situations. Unsupervised exploration provides a means to autonomously and continually discover new tasks and acquire intelligent behavior without relying on any experts. However, just discovering new skills is not enough; the agent needs to adapt them to each new environment in an online manner. In this talk, I will first describe our early efforts in this direction, decoupling this general goal into two sub-problems: 1) continually discovering new tasks in the same environment, and 2) generalizing to new environments for the same task. I will discuss how these sub-problems can be combined to build a framework for general-purpose embodied intelligence. Throughout the talk, I will present several results from case studies of real-world robot learning, including legged robots walking on diverse unseen terrains, a robotic arm performing a range of diverse unseen manipulation tasks in a zero-shot manner, and robots able to write on a whiteboard from visual input.
Jochen Triesch received his Diploma and Ph.D. degrees in Physics from the University of Bochum, Germany, in 1994 and 1999, respectively. After two years as a post-doctoral fellow at the Computer Science Department of the University of Rochester, NY, USA, he joined the faculty of the Cognitive Science Department at UC San Diego, USA, as an Assistant Professor in 2001. In 2005 he became a Fellow of the Frankfurt Institute for Advanced Studies (FIAS) in Frankfurt am Main, Germany. In 2006 he received a Marie Curie Excellence Center Award of the European Union. Since 2007 he has been the Johanna Quandt Research Professor for Theoretical Life Sciences at FIAS. He also holds professorships at the Department of Physics and the Department of Computer Science and Mathematics at the Goethe University in Frankfurt am Main, Germany. In 2019 he obtained a visiting professorship at the Université Clermont Auvergne, France. His research interests span Computational Neuroscience, Machine Learning, and Developmental AI.
Jun Tani received the D.Eng. degree from Sophia University, Tokyo, in 1995. He started his research career at Sony Computer Science Lab. in 1993. He became a Team Leader of the Laboratory for Behavior and Dynamic Cognition, RIKEN Brain Science Institute, Saitama, Japan, in 2001, and a Full Professor in the Electrical Engineering Department, Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 2012. He is currently a Full Professor at the Okinawa Institute of Science and Technology, Okinawa, Japan, and a visiting professor at the Technical University of Munich. His current research interests include cognitive neuroscience, developmental psychology, phenomenology, complex adaptive systems, and robotics. He is the author of "Exploring Robotic Minds: Actions, Symbols, and Consciousness as Self-Organizing Dynamic Phenomena," published by Oxford University Press in 2016.
The focus of my research has been to investigate how cognitive agents can develop structural representation and functions via iterative interaction with the world, exercising agency and learning from resultant perceptual experience. For this purpose, my team has investigated various models analogous to predictive coding and active inference frameworks using the free energy principle (FEP). In the current talk, I will introduce two recent studies. One is about online dynamic goal-directed action plan generation and its execution using a physical humanoid robot which was conducted under a supervised learning setting. The other is an extension of the aforementioned study by incorporating an exploratory reinforcement learning scheme using extrinsic rewards. I will discuss the future perspective of how these studies can be further extended by introducing intrinsic motivations such as curiosity under and beyond the FEP.
Kathryn Kasmarik is a Professor of Computer Science at the University of New South Wales, Australian Defence Force Academy (UNSW Canberra). Kathryn completed a Bachelor of Computer Science and Technology at the University of Sydney, including a study exchange at the University of California, Los Angeles (UCLA). She graduated with First Class Honours and the University Medal in 2002. She completed a PhD in Computer Science through National ICT Australia and the University of Sydney in 2007, and moved to UNSW Canberra in 2008. Kathryn's research interests lie in the area of autonomous mental development for computers and robots. Her speciality in this area is computational models of motivation, such as curiosity, interest, achievement, affiliation, and power motivation. She has published over 130 articles on these topics in peer-reviewed conferences and journals, as well as two books. Her research has been funded by the Australian Research Council and the Defence Science and Technology Group, among other sources. She was Deputy Head of School (Teaching) for the School of Engineering and IT at UNSW Canberra from 2018 to 2021 and is currently Secretary of the IEEE Australian Capital Territory Section.
Collective behaviours such as swarm formations of autonomous agents offer the advantages of efficient movement, redundancy, and the potential for human guidance of a single swarm organism. However, with the explosion in hardware platforms for autonomous vehicles, swarm robot programming requires significant manual input for each new platform. This talk introduces two developmental approaches to evolving collective behaviours in which the developmental process is guided by a task-non-specific value system. Two value systems will be considered: the first based on a survey of human perception of swarming, and the second based on a computational model of curiosity. Unlike traditional approaches, these value systems do not need to know in advance the precise characteristics of the intended swarming behaviours. Rather, they reward the emergence of structured collective motion, permitting multiple collective behaviours to emerge, including aggregation and navigation behaviours. This talk will examine the performance of these value systems in a series of controlled experiments on point-mass 'boids' and simulated robots. We will see how the value systems can recognise multiple "interesting" structured collective behaviours and distinguish them from random movement patterns. We will also see how the value systems can be used to tune random motions into structured collective motions.
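The idea of recognising structured collective motion without specifying it in advance can be illustrated with a simple order parameter. The sketch below (a hypothetical illustration, not the talk's actual value system) scores a swarm by the polarization of the agents' headings, which is near 1 for aligned flocking and near 0 for random motion; all names and thresholds are assumptions.

```python
import math
import random

def polarization(headings):
    """Order parameter in [0, 1]: magnitude of the mean unit heading
    vector. Aligned swarms score near 1, random motion near 0."""
    n = len(headings)
    mean_x = sum(math.cos(h) for h in headings) / n
    mean_y = sum(math.sin(h) for h in headings) / n
    return math.hypot(mean_x, mean_y)

random.seed(0)
# 100 boids sharing a near-common heading vs. 100 with random headings.
aligned = [0.1 * random.gauss(0.0, 1.0) for _ in range(100)]
scattered = [random.uniform(-math.pi, math.pi) for _ in range(100)]

print(round(polarization(aligned), 3))    # high: structured motion
print(round(polarization(scattered), 3))  # low: unstructured motion
```

A task-non-specific value system in this spirit would reward any rise in such structure measures rather than matching a prescribed target behaviour, so aggregation, alignment, or navigation patterns can all register as "interesting".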
Kaushik Subramanian is a Senior Research Scientist at Sony AI working on the flagship Gaming project. Before joining Sony, Kaushik was at Cogitai, focusing on continual learning approaches for industrial applications and building a scalable, easy-to-use SaaS platform. He received his Ph.D. in 2017 from Georgia Institute of Technology, USA working on human-in-the-loop systems and exploration for reinforcement learning. His research interests broadly span algorithms for intelligent decision making and interactive learning.
Prof. Dr. Martin V. Butz has been a professor at the Department of Computer Science and the Department of Psychology, Faculty of Science, University of Tübingen, Germany, since 2011. His main background lies in computer science and machine learning (Diploma from Würzburg University and PhD from the University of Illinois at Urbana-Champaign, IL, USA, both in Computer Science). Throughout his research career he has collaborated with researchers from various other disciplines, including cognitive and developmental psychologists, computational neuroscientists, roboticists, and linguists. His research focuses on neuro-computational cognitive modeling and cognitive science more generally. A current main focus lies in uncovering conceptual, compositional, causal structures from sensorimotor experiences in humans and artificial systems. Important recent publications include his co-authored monograph "How the Mind Comes Into Being: Introducing Cognitive Science from a Functional and Computational Perspective" (Oxford University Press, 2017), a special issue on "Event-Predictive Cognition" (2021, Topics in Cognitive Science), and a computational perspective on human and artificial intelligence ("Towards Strong AI", 2021, Künstliche Intelligenz).
Hierarchical, compositionally recombinable models and behavioral primitives constitute crucial components for interacting with our world in a versatile, flexible, adaptive manner. Event-predictive cognition offers a theoretical framework for how such models may develop and be invoked in a self-motivated manner. In the presentation, I selectively introduce some of our recent recurrent artificial neural network models, sketching out a pathway towards developing event-predictive Gestalten and showing how their anticipatory, self-motivated activation can model human-like behavior. First, I introduce an RNN that infers Gestalten from dynamic motion stimuli, modeling the bi-stable perception of the Silhouette illusion. Next, I briefly show how sensory information about our world may be structured in an affordance-oriented manner. Finally, on a more abstract level, I show how event-predictive latent codes can develop and can be used not only to solve state-of-the-art reinforcement learning benchmarks but also to mimic the development of anticipatory behavior in infants.
Martin Riedmiller is a research scientist and team lead at DeepMind, London. Before joining DeepMind full-time in spring 2015, he held several professorships in machine learning and neuro-informatics from 2002 to 2015 at the Universities of Dortmund, Osnabrück, and Freiburg. From 1998 to 2009 he led the robot soccer team 'Brainstormers', which participated in the internationally renowned RoboCup competitions. As an early proof of the power of neural reinforcement learning techniques, the Brainstormers won the world championship five times across the simulation and real-robot leagues. He has contributed for over 20 years to the fields of reinforcement learning, neural networks, and learning control systems. He is author and co-author of early, foundational work on efficient and robust supervised and reinforcement learning algorithms, including work on one of the first deep reinforcement learning systems.
Being able to autonomously learn control with minimal prior knowledge is a key ability of intelligent systems. A particular challenge in real-world control scenarios is devising methods that are at the same time highly data-efficient and robust, since data collection on real systems is time-intensive and often expensive. I will discuss the collect & infer paradigm for reinforcement learning, which takes a fresh look at data collection and exploitation in data-efficient agents. I will give examples of learning agent designs that can learn increasingly complex tasks from scratch, in simulation and in reality.
Richard J. Duro is a Full Professor of Computer Science and Artificial Intelligence at the University of Coruña in Spain and has been the Coordinator of the Integrated Group for Engineering Research at this university since 1999. He received a PhD in Physics from the University of Santiago de Compostela, Spain, in 1992, for work related to novel instrumentation systems in collaboration with San Diego State University and the University of California San Diego, where he did post-doctoral work. His teaching and research interests are in intelligent systems and autonomous robotics, and his current work concentrates on motivational systems and developmental cognitive robotic architectures. Dr. Duro has published more than 250 articles in refereed journals, edited books, and refereed conference proceedings, and has co-authored 8 books. He is also the holder of 6 patents and 35 software registrations. He is a Senior Member of the IEEE and is currently a member of the Board of Governors of the International Neural Network Society (INNS) and its Vice President for Conferences and Technical Activities. He has been the Principal Investigator of more than 20 publicly funded research projects and over 100 contracts with industry.
Sao Mai Nguyen specializes in cognitive developmental learning, reinforcement learning, imitation learning, and automatic curriculum learning for robots: she develops algorithms for robots to learn multi-task control by designing their own curricula and by active imitation learning. She received her PhD in computer science from Inria in 2013, for machine learning algorithms combining reinforcement learning and active imitation learning for interactive and multi-task learning. She holds an Engineer degree from Ecole Polytechnique, France, and a master's degree in adaptive machine systems from Osaka University, Japan. She coordinated the project KERAAL, enabling a robot to coach physical rehabilitation, and has participated in the project AMUSAAL, analysing human activities of daily living, and in CPER VITAAL, developing assistive technologies for the elderly and disabled. She is currently an assistant professor at Ensta IP Paris, France, and was previously with IMT Atlantique. She also serves as an associate editor of the journal IEEE TCDS and as co-chair of the task force "Action and Perception" of the IEEE Technical Committee on Cognitive and Developmental Systems. For more information visit her webpage: https://nguyensmai.free.fr.
Stephane Doncieux is Professor of Computer Science at ISIR (Institute of Intelligent Systems and Robotics), Sorbonne University, CNRS, in Paris, France. He is deputy director of ISIR, a multidisciplinary robotics laboratory with researchers in mechatronics, signal processing, computer science, and neuroscience. Before that, he was in charge of the AMAC multidisciplinary research team (Architectures and Models of Adaptation and Cognition). He was coordinator of the DREAM FET H2020 project from 2015 to 2018 (http://robotsthatdream.eu/). His research is in cognitive robotics, with a focus on learning and adaptation with a developmental approach.
Georg Martius leads a research group on Autonomous Learning at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. Before joining the MPI in Tübingen, he was a postdoctoral fellow at IST Austria in the groups of Christoph Lampert and Gašper Tkačik, after a postdoc at the Max Planck Institute for Mathematics in the Sciences in Leipzig. He pursues research in autonomous learning, that is, how an embodied agent can determine what to learn, how to learn, and how to judge its learning success. His research focuses on machine learning for robotics, including internal model learning, reinforcement learning, representation learning, differentiable reasoning, and haptics.
Vieri Giuliano Santucci is a researcher at the Institute of Cognitive Sciences and Technologies (CNR, Rome). He holds a Ph.D. in computer science from the University of Plymouth (UK), an M.S. degree in theories and techniques of knowledge from the Faculty of Philosophy of the University of Rome "La Sapienza", and a B.Sc. degree in Philosophy from the University of Pisa. His interests range from autonomous open-ended learning processes and motivations in both biological and artificial agents to the impact of new technologies on society and cognition. His current work focuses on developing robotic architectures that allow artificial agents to autonomously improve their competences on the basis of the biologically-inspired construct of intrinsic motivations. He has published in peer-reviewed journals, attended many international conferences, and actively contributed to the European projects 'IM-CLeVeR – Intrinsically-Motivated Cumulative-Learning Versatile Robots' and 'GOAL-Robots', within which he began to develop the GRAIL architecture.
Autonomously learning multiple tasks in potentially unknown and unstructured environments is a paramount challenge for the development of versatile and adaptive artificial agents to be deployed in real-world scenarios. Even more interesting are scenarios in which interdependencies exist between the different goals, forcing the robot not only to acquire the low-level skills necessary to reach the different desired states, but also to learn the sequences of tasks that establish the preconditions for achieving the goals themselves. In this presentation we will address these issues from an architectural perspective, presenting the latest developments of the GRAIL architecture and showing how dividing the different learning processes into a hierarchy of mechanisms helps the robot autonomously learn different interdependent tasks, even in scenarios where the interdependencies between goals may change over time.
Rania Rayyes received a PhD degree in Computer Science (specialization: AI & Robotics) from Technische Universität Braunschweig, Germany, in 2020, and is currently a research scientist at the Institute for Robotics and Process Control at Technische Universität Braunschweig. She did a five-month research internship at Sony Computer Science Laboratories, Inc., Tokyo, Japan (October 2019 – March 2020), and joined the Google Get Ahead summer program in 2019. She has received several awards during her academic studies, e.g., the "Robotik Talent" award for her Ph.D. dissertation, the Superior Graduate award for her Mechatronics Bachelor degree, a German Academic Exchange Service (DAAD) Scholarship for her PhD study, and an ANITA B.ORG Scholarship for the Grace Hopper Celebration 2020. Her research interests include sensorimotor learning, efficient online learning, deep learning, and developmental robotics.
Christian Gumbsch studied cognitive science (B.Sc. and M.Sc.) at the University of Tübingen. He is currently pursuing the Ph.D. degree at the Max Planck Institute for Intelligent Systems and the University of Tübingen. His research interests include the learning of sensorimotor abstractions, hierarchical goal-directed planning, and event cognition.
IMOL 2022 will also host the REAL 2021 competition, with a hands-on micro-workshop on how to participate. The "Robot open-Ended Autonomous Learning" (REAL) competition focuses on systems that acquire the sensorimotor competence to interact with their physical environment in a fully autonomous way.
The REAL 2021 competition is currently open for submissions. For more information on how to participate visit the competition website.