Robots have long carried humanity’s great dreams. With the rapid development of sensors, voice interaction, machine recognition, SLAM, and other technologies, robots have begun to step out of science fiction and into people’s lives.

The global robot market is expected to reach $29.82 billion in 2018, with an average growth rate of about 15.1% from 2013 to 2018, according to the China Robot Industry Development Report (2018) released at the World Robot Conference on Aug. 16. The robot market continues to heat up, and the era of robots is coming.


Baidu Artificial Intelligence Interactive Design Institute has conducted a series of studies on the human-robot interaction experience, with robots as the research subject. In this article we share our research and thinking with the industry under the theme “Natural interaction between humans and service robots in public places”.

 

Takeaways

The goal of human-robot interaction is to be natural and close to people’s cognitive habits

Research methods for human-robot interaction: natural observation, participatory design, and experimental methods

Design recommendations for human-robot interaction: progressive interaction


Preface


In public places, when people interact with machines (such as ATMs), they tend to regard the machines as tools: they need to operate them but do not expect to communicate with them. When faced with robots, however, and especially humanoid robots, people tend to regard them as human-like, expect to communicate with them, and long for natural interaction.



At present, research on human-robot interaction focuses mainly on the “dialogue stage”, covering topics such as speech recognition, semantic understanding, and fulfilling user requests, while the “pre-dialogue stage” is rarely studied. The pre-dialogue stage is also crucial: it shapes the user’s “first impression” of the robot and is the premise and basis for opening the dialogue smoothly.


So, as a person approaches a robot from far to near:

▪︎ Does the robot need to interact with the person? Should it wait, take the initiative, or do something to get the person’s attention?

▪︎ How should it interact? Should it smile or wink, wave or say hello, or use sensors and AI capabilities to find the right moment to interact?

 

These questions are the focus of this article.

The goal of human-robot interaction

Natural/close to people’s cognitive habits


Bill Gates once said that “the cognitive habits and forms of natural communication that human beings have naturally developed must be the direction in which human-computer interaction develops”, and we will eventually communicate with robots in a natural way.

 

At the beginning of the study, we reviewed relevant research in psychology and sociology, as well as established practice in the service industry. On this basis, we traced how people and service personnel interact in public places and distilled the basic rules of interpersonal interaction.

 

Before we get to the rules, think back to your own experiences with service workers in public places. What made you feel good, and what made you feel uncomfortable or even annoyed? According to our survey, there are two main types of bad service experience:

 

▪︎ Type 1: Overzealous, excessive service

Constantly watching / following / introducing products / …


▪︎ Type 2: Cold and indifferent

Barely looking at me / giving me a cold look / ignoring my questions / …

The service experiences we generally rate as good also share common traits: proactive, enthusiastic, and measured.

Nodding and smiling on seeing me / taking the initiative to say hello / stepping forward when needed / not disturbing me when not needed / …


How can service staff make people feel their initiative and enthusiasm while keeping an appropriate sense of proportion? Public service staff should follow at least the following two rules:


■ It’s important to maintain the right amount of space

Psychologists have found that everyone has an instinct to protect their personal space, and it’s especially strong in public. Once personal space is invaded, people will feel uncomfortable and even angry.


In his classic book The Silent Language (1959), Edward Hall divided the spatial distance between people in daily life into four categories: intimate distance, personal distance, social distance, and public distance. Each distance has a “near end” and a “far end”.




Intimate distance is the smallest distance between people. It is usually reserved for private settings, such as the home, and is used only between people with a close emotional bond. The near end, in particular, is usually open only to partners or children.


Personal distance is the appropriate distance between friends. A stranger entering the near end usually feels like an intrusion; the far end is open to both acquaintances and strangers. However, acquaintances tend to stay toward the inner edge of the far end (75 cm), while strangers tend to stay toward its outer edge (1.2 m).


Social distance, also known as courtesy distance, reflects a social or formal, ceremonial relationship. Ordinary social activities generally take place at the near end of social distance; the far end suits more formal scenarios such as interviews and negotiations.


Public distance is the distance between a speaker and the audience in public speaking, and it is not suitable for interpersonal communication. The space beyond 7.5 meters is open to almost anyone; it is a space where people can completely ignore one another.
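The four distances and their near and far ends can be sketched as a simple lookup. This is only an illustration: the article states the 75 cm, 1.2 m, 3.6 m, and 7.5 m figures, while the remaining boundaries (0.15 m, 0.45 m, 2.1 m) are assumed here from commonly cited metric conversions of Hall's zones, and the name `hall_zone` is hypothetical.

```python
from typing import Tuple

# Upper bound of each phase in meters, ordered from closest to farthest.
# 0.75, 1.20, 3.60 and 7.50 appear in the article; 0.15, 0.45 and 2.10
# are ASSUMED from commonly cited conversions of Hall's zones.
HALL_PHASES = [
    (0.15, "intimate", "near"),
    (0.45, "intimate", "far"),
    (0.75, "personal", "near"),
    (1.20, "personal", "far"),
    (2.10, "social", "near"),
    (3.60, "social", "far"),
    (7.50, "public", "near"),
]


def hall_zone(distance_m: float) -> Tuple[str, str]:
    """Return (zone, end) for a distance between two people, in meters."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    for upper_bound, zone, end in HALL_PHASES:
        if distance_m < upper_bound:
            return zone, end
    # Anything at or beyond 7.5 m is the far end of public distance.
    return "public", "far"


print(hall_zone(1.0))  # far end of personal distance
print(hall_zone(4.0))  # near end of public distance
```

How a value exactly on a boundary is assigned (here, to the farther zone) is a design choice of this sketch, not something the article specifies.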


■ Etiquette is very important, and facial expressions are its essence

Confucius said, “Without learning etiquette, one cannot establish oneself.” Of all industries, the service industry pays particular attention to etiquette. The etiquette that service personnel follow is divided into four modules: appearance, facial expression, language, and common etiquette.



Within service etiquette, facial expressions play a vital role. A smile is the most welcome, attractive, and valuable expression in social interaction. A sincere, warm, and natural smile can effectively shorten the distance between people. The eyes are the most vivid facial feature and the best at conveying emotion. Different durations of eye contact, different places where the gaze rests, and different changes in gaze all create different impressions.

 

Common etiquette sets out the customary forms and specific requirements for showing respect in different situations (such as meeting, greeting, and introducing people), which makes it highly actionable.

 

There are many successful examples in the service industry of defining interactions based on distance and etiquette, such as:


▪︎ Walmart’s “three-meter smile rule”: Whenever a customer comes within three meters, smile, look the customer in the eye, greet them, and ask what you can do for them.

▪︎ Marriott International’s “15/5 rule”: Smile and greet guests when you are 15 steps away from them; when you are five steps away from a guest, try to stop, stand aside, nod, and say hello.


So, do the rules of distance and etiquette in interpersonal interaction also apply to human-robot interaction? And how can these rules be extracted and translated into forms that a robot can express? These questions deserve careful thought, and we conducted research to examine and verify them.


Research methods for human-robot interaction

Natural observation, participatory design, and experimental methods


We used natural observation, participatory design, in-depth interviews, and experiments to explore what users really expect of the robot as they walk toward it in a public place.

 

This study uses the Xiaodu robot as its research vehicle. Built on Baidu’s artificial intelligence, the Xiaodu robot integrates natural language processing, dialogue systems, speech, vision, and other technologies, and can communicate fluently with users about information, services, emotions, and more. In addition, as an “official employee” of Baidu, the Xiaodu robot plays an important role in welcoming guests in Baidu’s lobby.




First, in a real public place (Baidu Science Park K2), we observed how different users behaved and interacted with the robot without any intervention from us, and then extracted, coded, and analyzed these behaviors.

 

Subsequently, we invited several users to take part in in-depth interviews and participatory design. As we guided users toward the Xiaodu robot, we asked them to report their needs and expectations of it, and we discussed with them how Xiaodu could ideally express itself.


Finally, we combined users’ needs and expectations with the rules of interpersonal communication, translated them into Xiaodu’s “behavior language”, and verified them experimentally. We set up a variety of experimental scenarios in which users experienced different forms of human-robot interaction as they walked toward the Xiaodu robot.



During the experiment, we tracked users’ facial expressions, body movements, and speech. After the experiment, users were asked to evaluate each type of interaction in terms of emotional experience, cognitive evaluation, subjective satisfaction, and so on. The results of this study are based on these data.


Design suggestions for human-robot interaction

Progressive interaction


We found that when interacting with the Xiaodu robot, users expect it to actively send interaction signals, and that these signals should build up gradually and continuously, a process we call “progressive interaction”.


Notably, “progressive” refers not just to the change in physical distance from far to near, but to a gradual change in the user’s “psychological field”. This change falls into three stages, which we name the far field, the midfield, and the near field according to the order in which they appear in the user’s mind.

 

Far field stage: The robot needs to attract the user’s attention and make the user clearly aware that “it sees me”. This is a crucial step. Without the user’s attention, subsequent human-robot communication will feel abrupt or may not happen at all.


Midfield stage: The robot needs to go further and “initiate an interaction request”, so that the user clearly realizes that “it has eyes only for me” and that it wants to interact with me further. This also subconsciously draws the user closer to the robot.


Near field stage: When the robot “starts the conversation”, the user strongly feels its initiative and friendliness (“it struck up a conversation with me”), and the dialogue between human and robot begins naturally.



■ Representation of the psychological field in the physical world: a progression in distance



We also found that the user’s psychological field is reflected in the physical world as a progression in distance. The far field corresponds to a distance of about 2.7-4.2 meters; within this range, users expect the Xiaodu robot to send signals that attract attention.


The midfield corresponds to a distance of about 1.2-2.7 meters. Within this range, it is best for the robot to make the user aware that it wants to interact with them further.


The near field corresponds to a distance of about 1.2 meters. At this point, the user has reached a suitable distance for starting a conversation with Xiaodu.


The distance requirements of human-robot interaction differ slightly from those of interpersonal communication. The far field (2.7-4.2 m) extends beyond the far end of social distance (3.6 m) into the near end of public distance. We speculate that this is related to the characteristics of the robot itself. For example, the robot’s body is 1.1 m wide, much wider than an ordinary person.
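The three distance bands above can be sketched as a small classifier that a robot might run on its distance estimate. This is a minimal illustration, not the study's implementation: the function name `psychological_field` is hypothetical, treating the near field as "up to 1.2 m" is an assumption (the article only says "about 1.2 meters"), and so is the assignment of exact boundary values.

```python
from typing import Optional

# Band limits in meters, taken from the article's reported ranges.
NEAR_FIELD_MAX_M = 1.2  # ASSUMPTION: near field read as "up to about 1.2 m"
MID_FIELD_MAX_M = 2.7
FAR_FIELD_MAX_M = 4.2


def psychological_field(distance_m: float) -> Optional[str]:
    """Map a user-to-robot distance to 'near', 'mid', or 'far'.

    Returns None when the user is beyond the far field, where the
    article reports no expected interaction signal.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= NEAR_FIELD_MAX_M:
        return "near"
    if distance_m <= MID_FIELD_MAX_M:
        return "mid"
    if distance_m <= FAR_FIELD_MAX_M:
        return "far"
    return None


# A user walking toward the robot passes through the fields in order.
approach = [5.0, 3.5, 2.0, 1.0]
print([psychological_field(d) for d in approach])
```

In practice the bands would likely need hysteresis or smoothing so that a noisy distance reading near 2.7 m or 1.2 m does not make the robot flip between behaviors; that refinement is outside what the article describes.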

 

■ Representation of the psychological field in the physical world: the expected forms of interaction reflect a desire for etiquette

In different psychological fields, users expect the Xiaodu robot to interact in different ways, and these forms of interaction clearly follow the requirements of etiquette.



In the far field, users prefer that Xiaodu attract their attention through “facial expressions” and “body movements”, such as smiling and friendly eye contact, or waving, tilting its head, and nodding.


In the midfield, users expect Xiaodu to send interaction signals in a variety of forms, so that they clearly realize that they are the one Xiaodu wants to interact with. For example, Xiaodu can greet them in words (“good morning”, “how do you do”), and users expect facial expressions and gestures such as smiling and waving to continue.


In the near field, the role of language is even more pronounced. Users expect Xiaodu to “start a conversation”, for example by introducing itself or asking whether they need help. They also expect a more enthusiastic smile and physical gestures (such as a handshake or a hug).
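The escalation across the three fields can be sketched as a lookup table plus a function that emits signals whenever the user enters a new field. The signal lists paraphrase the examples given above; the names `SIGNALS_BY_FIELD` and `signals_on_approach` and the data layout are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Expected signals per field, paraphrased from the article's examples.
SIGNALS_BY_FIELD: Dict[str, Dict[str, List[str]]] = {
    "far": {
        "expression": ["smile", "friendly eye contact"],
        "gesture": ["wave", "tilt head", "nod"],
        "speech": [],  # no speech reported for the far field
    },
    "mid": {
        "expression": ["smile"],
        "gesture": ["wave"],
        "speech": ["say hello"],
    },
    "near": {
        "expression": ["more enthusiastic smile"],
        "gesture": ["handshake", "hug"],
        "speech": ["introduce itself", "ask whether help is needed"],
    },
}


def signals_on_approach(
    fields: Sequence[Optional[str]],
) -> List[Tuple[str, Dict[str, List[str]]]]:
    """Return (field, signals) each time the user enters a new field.

    `fields` is the sequence of fields observed over time; None means
    the user is out of range. Repeated readings of the same field do
    not re-trigger its signals.
    """
    plan = []
    previous = None
    for field in fields:
        if field is not None and field != previous:
            plan.append((field, SIGNALS_BY_FIELD[field]))
        previous = field
    return plan


for field, signals in signals_on_approach([None, "far", "far", "mid", "near"]):
    print(field, signals["speech"])
```

Keeping the table separate from the trigger logic means the expected signals can be revised after further experiments without touching the code that decides when to send them.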


We experimentally tested different interaction schemes, addressing questions including but not limited to the following:

▪︎ Which facial expression works best at each distance, and what general principles apply to expression design;

▪︎ In which psychological field speech is best introduced, and how the speech channel affects user experience;

▪︎ Whether applying various sensors and AI capabilities (such as face recognition) improves the user experience.


Baidu AI Interactive Design Institute will share more of the research results and interaction recommendations on these questions in future articles or on other appropriate occasions.

Summary

 

In this study, we focused on public places and used the Xiaodu robot as the research vehicle to explore natural modes of interaction before human-robot dialogue begins, and we proposed a new interaction concept: progressive interaction. We believe that the core of this concept lies in interpreting the user’s psychological field and exploring how that field is represented in the physical world.


Key points of this article

1. The goal of human-robot interaction is to be natural and close to human cognitive habits. In public places, people place a high value on maintaining appropriate personal space and observing etiquette.

2. When interacting with a robot, users expect “progressive interaction”: the robot actively sends interaction signals, and those signals build up gradually and continuously.

3. The core of progressive interaction is to follow the change in the user’s “psychological field”, which can be divided into the following stages according to the order in which they appear in the user’s mind:

▪︎ Far field: The robot needs to “attract attention”. The physical distance between the robot and the human is about 2.7-4.2 m, where facial expressions and body movements are more suitable.

▪︎ Midfield: The robot needs to “initiate an interaction request”. The physical distance between the robot and the human is about 1.2-2.7 m, where a combination of body movements, facial expressions, and language is more suitable.

▪︎ Near field: The robot needs to “start the conversation”. The physical distance between the robot and the human is about 1.2 m, where the role of language becomes more important.


Human-robot interaction is a deep research topic whose scope keeps expanding. This study focused mainly on public places, and the appearance and product form of the Xiaodu robot will also have had some influence on users. In the future, we will extend this research, for example by exploring human-robot interaction in the home and the influence of different robot forms on user experience.

 

Paolo Dario, a professor at the Sant’Anna School of Advanced Studies in Pisa, Italy, said at a forum of the World Robot Conference 2018 that “the next era is not the Internet, but robots”, and that robotics holds great potential and room for development. We will continue to explore the field of human-robot interaction and carefully refine every detail of it.