Volume 6, Issue 1
(Special issue editors: Yukiko Nakano, Roman Bednarik, Hung-Hsuan Huang, and Kristiina Jokinen)
By Yukiko Nakano, Roman Bednarik, Hung-Hsuan Huang, and Kristiina Jokinen
Eye gaze has been used broadly in interactive intelligent systems. The research area has grown in recent years to cover emerging topics that go beyond the traditional focus on interaction between a single user and an interactive system. This special issue presents five articles that explore new directions of gaze-based interactive intelligent systems, ranging from communication robots in dyadic and multiparty conversations to a driving simulator that uses eye gaze evidence to critique learners’ behavior.
By Tian (Linger) Xu, Hui Zhang, and Chen Yu
We focus on a fundamental looking behavior in human-robot interactions: gazing at each other’s face. Eye contact and mutual gaze between two social partners are critical in smooth human-human interactions. Therefore, investigating at what moments and in what ways a robot should look at a human user’s face as a response to the human’s gaze behavior is an important topic. Toward this goal, we developed a gaze-contingent human-robot interaction system, which relied on momentary gaze behaviors from a human user to control an interacting robot in real time. Using this system, we conducted an experiment in which human participants interacted with the robot in a joint attention task. In the experiment, we systematically manipulated the robot’s gaze toward the human partner’s face in real time and then analyzed the human’s gaze behavior as a response to the robot’s gaze behavior. We found that more face looks from the robot led to more look-backs (to the robot’s face) from human participants and consequently created more mutual gaze and eye contact between the two. Moreover, participants demonstrated more coordinated and synchronized multimodal behaviors between speech and gaze when more eye contact was successfully established and maintained.
By Joshua Wade, Lian Zhang, Dayi Bian, Jing Fan, Amy Swanson, Amy Weitlauf, Medha Sarkar, Zachary Warren, and Nilanjan Sarkar
In addition to social and behavioral deficits, individuals with Autism Spectrum Disorder (ASD) often struggle to develop the adaptive skills necessary to achieve independence. Driving intervention in individuals with ASD is a growing area of study, but it is still widely under-researched. We present the development and preliminary assessment of a gaze-contingent adaptive virtual reality driving simulator that uses real-time gaze information to adapt the driving environment with the aim of providing a more individualized method of driving intervention. We conducted a small pilot study of 20 adolescents with ASD using our system: 10 with the adaptive gaze-contingent version of the system and 10 in a purely performance-based version. Preliminary results suggest that the novel intervention system may be beneficial in teaching driving skills to individuals with ASD.
By Ryo Ishii, Kazuhiro Otsuka, Shiro Kumano, and Junji Yamato
In multi-party meetings, participants need to predict the end of the speaker’s utterance and who will start speaking next, and to consider a strategy for timing their own turn well. Gaze behavior plays an important role in smooth turn-changing. This article proposes a prediction model that features three processing steps to predict (I) whether turn-changing or turn-keeping will occur, (II) who will be the next speaker in turn-changing, and (III) the timing of the start of the next speaker’s utterance. For the feature values of the model, we focused on gaze transition patterns and the timing structure of eye contact between a speaker and a listener near the end of the speaker’s utterance. Gaze transition patterns provide information about the order in which gaze behavior changes. The timing structure of eye contact is defined as who looks at whom and who looks away first, the speaker or listener, when eye contact between the speaker and a listener occurs. We collected corpus data of multi-party meetings and used the data to demonstrate relationships between the gaze transition patterns and timing structure and the situations of (I), (II), and (III). The results of our analyses indicate that the gaze transition pattern of the speaker and listener and the timing structure of eye contact have a strong association with turn-changing, the next speaker in turn-changing, and the start time of the next utterance. On the basis of the results, we constructed prediction models using the gaze transition patterns and timing structure. The gaze transition patterns were found to be useful in predicting turn-changing, the next speaker in turn-changing, and the start time of the next utterance. Contrary to expectations, we did not find that the timing structure is useful for predicting the next speaker and the start time. This study opens up new possibilities for predicting the next speaker and the timing of the next utterance using the gaze transition patterns in multi-party meetings.
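As a rough illustration of what a gaze transition pattern might look like as data, the sketch below collapses a sampled sequence of gaze targets into the ordered list of distinct targets, so that only the order of gaze changes remains. The labels (`S` for the speaker, `L2` for another listener, `X` for a non-person target) and the function name are assumptions made for this example; they are not the coding scheme used in the article.

```python
from itertools import groupby


def gaze_transition_pattern(samples):
    """Collapse consecutive repeats so only the order of gaze changes remains."""
    return [target for target, _ in groupby(samples)]


# Hypothetical gaze targets of one listener, sampled near the end of an utterance.
listener_gaze = ["S", "S", "X", "X", "X", "S", "L2", "L2"]
pattern = gaze_transition_pattern(listener_gaze)
print("-".join(pattern))
```

A pattern string of this kind could then serve as a categorical feature for a turn-changing classifier, which is one plausible reading of the feature design described in the abstract.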
By Floriane Dardard, Giorgio Gnecco, and Donald Glowinski
The aim of the present work is to automatically analyze the leading interactions between the musicians of a string quartet, using machine learning techniques applied to nonverbal features of the musicians’ behavior, which are detected with the help of a motion capture system. We represent these interactions by a graph of influence of the musicians, which displays the relations “is following” and “is not following” with weighted directed arcs. The goal of the machine learning problem investigated is to assign weights to these arcs in an optimal way. Since only a subset of the available training examples is labeled, a semisupervised support vector machine is used, which is based on a linear kernel to limit its model complexity. Specific potential applications within the field of human-computer interaction are also discussed, such as e-learning, networked music performance, and social active listening.
By Stefano Piana, Alessandra Staglianò, Francesca Odone, and Antonio Camurri
We present a computational model and a system for the automated recognition of emotions starting from full-body movement. Three-dimensional motion data of full-body movements are obtained either from professional optical motion capture systems (Qualisys) or from low-cost RGB-D sensors (Kinect and Kinect2). A number of features are then automatically extracted at different levels, from kinematics of a single joint to more global expressive features inspired by psychology and humanistic theories (e.g., contraction index, fluidity, and impulsiveness). An abstraction layer based on dictionary learning further processes these movement features to increase the model generality and to deal with intraclass variability, noise, and incomplete information characterizing emotion expression in human movement. The resulting feature vector is the input for a classifier performing real-time automatic emotion recognition based on linear support vector machines. The recognition performance of the proposed model is presented and discussed, including the trade-off between precision of the tracking measures (we compare the Kinect RGB-D sensor and the Qualisys motion capture system) vs. dimension of the training dataset. The resulting model and system have been successfully applied in the development of serious games for helping autistic children to learn to recognize and express emotions by means of their full-body movement.
(Special issue editors: Giuseppe Carenini and Shimei Pan)
By Enamul Hoque and Giuseppe Carenini
In the last decade, there has been an exponential growth of asynchronous online conversations, thanks to the rise of social media. Analyzing and gaining insights from such conversations can be quite challenging for a user, especially when the discussion becomes very long. A promising solution to this problem is topic modeling, since it may help the user to understand quickly what was discussed in a long conversation and to explore the comments of interest. However, the results of topic modeling can be noisy, and they may not match the user’s current information needs. To address this problem, we propose a novel topic modeling system for asynchronous conversations that revises the model on the fly on the basis of users’ feedback. We then integrate this system with interactive visualization techniques to support the user in exploring long conversations, as well as in revising the topic model when the current results are not adequate to fulfill the user’s information needs. Finally, we report on an evaluation with real users that compared the resulting system with both a traditional interface and an interactive visual interface that does not support human-in-the-loop topic modeling. Both the quantitative results and the subjective feedback from the participants illustrate the potential benefits of our interactive topic modeling approach for exploring conversations, relative to its counterparts.
By Dietmar Jannach, Michael Jugovac, and Lukas Lerche
Machine learning and data analytics tasks in practice require several consecutive processing steps. RapidMiner is a widely used software tool for the development and execution of such analytics workflows. Unlike many other algorithm toolkits, it comprises a visual editor that allows the user to design processes on a conceptual level. This conceptual and visual approach helps the user to abstract from the technical details during the development phase and to retain a focus on the core modeling task. The large set of preimplemented data analysis and machine learning operations available in the tool, as well as their logical dependencies, can, however, be overwhelming, in particular for novice users. In this work, we present an add-on to the RapidMiner framework that supports the user during the modeling phase by recommending additional operations to insert into the currently developed machine learning workflow. First, we propose different recommendation techniques and evaluate them in an offline setting using a pool of several thousand existing workflows. Second, we present the results of a laboratory study, which show that our tool helps users to significantly increase the efficiency of the modeling process. Finally, we report on analyses using data that were collected during the real-world deployment of the plug-in component and compare the results of the live deployment of the tool with the results obtained through an offline analysis and a replay simulation.
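The abstract does not say which recommendation techniques were evaluated. To make the general idea concrete, the sketch below shows one simple baseline: recommending operators that frequently co-occur with those already in a partial workflow. The operator names, the toy workflow pool, and both function names are hypothetical. This is not the add-on's actual algorithm and does not use any RapidMiner API.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(workflows):
    """Count how often each ordered pair of distinct operators shares a workflow."""
    counts = Counter()
    for ops in workflows:
        for a, b in permutations(set(ops), 2):
            counts[(a, b)] += 1
    return counts


def recommend(partial, counts, k=3):
    """Rank operators not yet in the partial workflow by summed co-occurrence."""
    present = set(partial)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in present and b not in present:
            scores[b] += n
    return [op for op, _ in scores.most_common(k)]


# Hypothetical pool of existing workflows (placeholder operator names).
pool = [
    ["read_data", "set_label", "decision_tree", "apply_model"],
    ["read_data", "set_label", "cross_validation"],
    ["read_data", "normalize", "k_means"],
]
counts = build_cooccurrence(pool)
print(recommend(["read_data"], counts, k=1))
```

A pure co-occurrence score ignores the order and data-flow dependencies between operators, which the abstract identifies as part of what makes the operator set overwhelming. A practical recommender would likely need to take these into account.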
By Sana Malik, Fan Du, Catherine Plaisant, and Ben Shneiderman
Cohort comparison studies have traditionally been hypothesis-driven and conducted in carefully controlled environments (such as clinical trials). Given two groups of event sequence data, researchers test a single hypothesis (e.g., does the group taking Medication A exhibit more deaths than the group taking Medication B?). Recently, however, researchers have been moving toward more exploratory methods of retrospective analysis with existing data. In this article, we begin by showing that the task of cohort comparison is specific enough to support automatic computation against a bounded set of potential questions and objectives, a method that we refer to as High-Volume Hypothesis Testing (HVHT). From this starting point, we demonstrate that the diversity of these objectives, both across and within different domains, as well as the inherent complexities of real-world datasets, still require human involvement to determine meaningful insights. We explore how visualization and interaction better support the task of exploratory data analysis and the understanding of HVHT results (how significant they are, why they are meaningful, and whether the entire dataset has been exhaustively explored). Through interviews and case studies with domain experts, we iteratively design and implement visualization and interaction techniques in a visual analytics tool, CoCo. As a result of our evaluation, we propose six design guidelines for enabling users to explore large result sets of HVHT systematically and flexibly in order to glean meaningful insights more quickly. Finally, we illustrate the utility of this method with three case studies in the medical domain.
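To give a feel for testing a bounded set of questions automatically, the sketch below runs one simplified test per event type: does the share of records containing that event differ between two cohorts? The pooled two-proportion z-test, the Bonferroni correction, the function names, and the toy cohorts are all assumptions chosen for illustration. The abstract does not specify which metrics or statistical tests HVHT or CoCo actually use.

```python
import math


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (hits_a / n_a - hits_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def screen_event_types(cohort_a, cohort_b, alpha=0.05):
    """Test every event type once (non-empty cohorts assumed), Bonferroni-corrected."""
    event_types = sorted({e for seq in cohort_a + cohort_b for e in seq})
    if not event_types:
        return []
    threshold = alpha / len(event_types)
    results = []
    for event in event_types:
        hits_a = sum(event in seq for seq in cohort_a)
        hits_b = sum(event in seq for seq in cohort_b)
        p = two_proportion_p_value(hits_a, len(cohort_a), hits_b, len(cohort_b))
        results.append((event, p, p < threshold))
    return sorted(results, key=lambda r: r[1])  # smallest p-values first


# Hypothetical cohorts: each record is one patient's event sequence.
cohort_a = [["admit", "icu", "death"]] * 30 + [["admit", "discharge"]] * 10
cohort_b = [["admit", "icu", "death"]] * 5 + [["admit", "discharge"]] * 35
for event, p, significant in screen_event_types(cohort_a, cohort_b):
    print(f"{event:10s} p={p:.2e} significant={significant}")
```

Even in this toy form, the output is a ranked list of many results rather than a single answer. This is the situation in which the abstract argues that visualization and human judgment are still needed.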