Modeling Conversations

The long-term focus of this project is to gain a computational understanding of human social behavior, including verbal, vocal, and nonverbal cues. The current work, however, focuses on nonverbal cues in face-to-face communication, such as listener backchannel feedback. An example of backchannel feedback is a head nod by the listener to show attention and communicate engagement with the speaker.
The data-driven approach followed in this project applies machine learning algorithms to a small set of informative features that are extracted automatically from multimodal recordings of human social interaction. The end goal is to simulate such behavior in socially interactive interfaces such as robots.
The visualization on the right shows a typical setup of the learning task. The listener's nods are the prediction target, and other behavioral features, such as the speaker's speech time and the pitch of their speech, are used as input features.
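As a rough illustration of how such a learning task might be framed, the sketch below turns a frame-by-frame recording into (features, label) pairs. It is not the project's actual pipeline: the `Frame` record, the `make_examples` helper, the fixed look-back window, and the choice of mean pitch as a summary are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """One time step of a recorded interaction (hypothetical layout)."""
    speaker_talking: bool    # was the speaker talking in this frame?
    speaker_pitch_hz: float  # speaker pitch; 0.0 when unvoiced (assumed convention)
    listener_nod: bool       # did the listener nod in this frame?


def make_examples(frames: List[Frame], window: int = 3) -> List[Tuple[List[float], int]]:
    """Build one example per frame that has `window` frames of history.

    Features: number of preceding frames with speaker speech, and the mean
    pitch over the voiced preceding frames. Label: 1 if the listener nods
    in the current frame, else 0.
    """
    examples = []
    for t in range(window, len(frames)):
        past = frames[t - window:t]
        speech_time = sum(1 for f in past if f.speaker_talking)
        voiced = [f.speaker_pitch_hz for f in past if f.speaker_pitch_hz > 0]
        mean_pitch = sum(voiced) / len(voiced) if voiced else 0.0
        label = int(frames[t].listener_nod)
        examples.append(([float(speech_time), mean_pitch], label))
    return examples
```

The resulting pairs could then be passed to any standard binary classifier. The window length and the feature summaries are the kind of design choices that would need to be tuned against real recordings.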

Related Publications

Khan, Faisal, Mutlu, Bilge, and Zhu, Xiaojin (2010). Modeling Social Behavior: Efficient Features for Predicting Listener Nods. In Proceedings of the NIPS Workshop on Modeling Human Communication Dynamics. [PDF] [Presentation] [Poster (7 MB)]