What do you really want? The robot challenge of recognizing human intention.

Imagine that your friends are coming over for dinner today, and you need to reorganize the space to make room for everyone. However, you are alone and the furniture is too heavy. Luckily, you have just purchased a robot whose strength you can use to move tables, sofas, and so on, and you can see that it knows how to perform those tasks. You command the robot to push the sofa into the corner against the wall, but the trajectory it executes is not the one you wanted, and neither is the end position. You therefore decide to help carry the sofa and guide the robot through the task. “This robot does not take me into account”, you think, as it keeps moving the sofa without following your guidance. “Wouldn't it be great if the robot could not only execute tasks but also understand what I want?”

The example above is a typical Human-Robot Collaboration (HRC) situation that failed to meet the user's expectations. Even though the robot knew the task and how to execute its steps, it lacked an understanding of what the human really wanted. To enhance HRC, robots need to recognize and predict human intention so that they can adapt accordingly and assist better in the task at hand.

Currently, researchers tend to address human intention by answering one of the following questions: (i) ‘What are we doing?’, (ii) ‘Where are we?’, (iii) ‘What are we interested in?’, (iv) ‘What are we interacting with?’ or (v) ‘What are we saying?’. None of these questions is trivial to answer, and each one alone is a long-standing challenge in the research community. However, we claim that recognizing human intention is not about dealing with each topic separately, but about understanding the underlying dependencies that coexist among them.

Our work therefore consists not only of answering these questions from a human-like, intuitive perspective, but also of understanding the relationships between the questions and their answers, in order to construct a higher-level system that can recognize and predict human intention.

With such a system, the robot will be able to understand what you really want and match its behavior to your intention. You will be able to move the sofa into the corner by guiding the robot with your gaze, your movement, and your language, and your friends will finally be able to come for dinner!
