Today, intelligent machines interact and collaborate with humans in a way that demands a greater level of trust between human and machine. A first step towards building intelligent machines that are capable of building and maintaining trust with humans is the design of a sensor that will enable machines to estimate human trust level in real time. In this paper, two approaches for developing classifier-based empirical trust sensor models are presented that specifically use electroencephalography (EEG) and galvanic skin response (GSR) measurements. Human subject data collected from 45 participants is used for feature extraction, feature selection, classifier training, and model validation. The first approach considers a general set of psychophysiological features across all participants as the input variables and trains a classifier-based model for each participant, resulting in a trust sensor model based on the general feature set (i.e., a "general trust sensor model"). The second approach considers a customized feature set for each individual and trains a classifier-based model using that feature set, resulting in improved mean accuracy but at the expense of an increase in training time. This work represents the first use of real-time psychophysiological measurements for the development of a human trust sensor. Implications of the work, in the context of trust management algorithm design for intelligent machines, are also discussed.
Audience response is an important indicator of the quality of performing arts. Psychophysiological measurements enable researchers to perceive and understand audience response by collecting bio-signals during a live performance. However, measuring how the audience responds, and adapting the performance to those responses, are key elements that remain hard to implement. To address this issue, we designed a brain-computer interactive system called Brain-Adaptive Digital Performance (BADP) for the measurement and analysis of audience engagement through an interactive three-dimensional virtual theatre. The BADP system monitors audience engagement in real time using electroencephalography (EEG) measurements and attempts to improve it by applying content-related performing cues when the engagement level decreases. In this article, we compute an EEG-based engagement level and establish thresholds to determine moments of disengagement and re-engagement. In our experiment, we simulated two types of theatre performance to provide participants with a high-fidelity virtual environment through the BADP system, and we created content-related performing cues for each performance in three different modes. The results of these evaluations show that our algorithm accurately detects engagement status and that the performing cues have a positive impact on regaining audience engagement across different performance types. Our findings open new perspectives in audience-based theatre performance design.
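The thresholding step described above can be sketched in a few lines. This is a minimal illustration, assuming the widely used beta / (alpha + theta) EEG engagement index; the actual index, window size, threshold rule, and band-power values used by BADP are not given in the abstract and are assumptions here.

```python
# Sketch of threshold-based engagement monitoring. The engagement index
# beta / (alpha + theta) is a common choice in the EEG literature, but
# the specific index and threshold used by BADP are assumptions.

def engagement_index(alpha, beta, theta):
    """EEG engagement index: beta band power over alpha + theta power."""
    return beta / (alpha + theta)

def detect_disengagement(indices, threshold):
    """Return the window positions where the engagement index drops
    below the threshold (candidate moments to trigger a performing cue)."""
    return [i for i, e in enumerate(indices) if e < threshold]

# Hypothetical (alpha, beta, theta) band powers per analysis window.
windows = [(4.0, 6.0, 3.0), (5.0, 3.0, 5.0), (4.0, 7.0, 2.0)]
indices = [engagement_index(a, b, t) for a, b, t in windows]
low = detect_disengagement(indices, threshold=0.5)
```

In this toy trace, only the middle window falls below the threshold, so a content-related cue would be applied there.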
Cognitive computing systems require human-labeled data for evaluation, and often for training. The standard practice used in gathering this data minimizes disagreement between annotators, and we have found that this results in data that fails to account for the ambiguity inherent in language. We have proposed the CrowdTruth method for collecting ground truth through crowdsourcing, which reconsiders the role of people in machine learning based on the observation that disagreement between annotators provides a useful signal for phenomena such as ambiguity in the text. We report on using this method to build an annotated data set for medical relation extraction for the "cause" and "treat" relations, and on how this data performed in a supervised training experiment. We demonstrate that by modeling ambiguity, labeled data gathered from crowd workers can (1) reach the quality level of domain experts for this task while reducing the cost, and (2) provide better training data at scale than distant supervision. We further propose and validate new weighted measures for precision, recall, and F-measure that account for ambiguity in both human and machine performance on this task.
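The idea of ambiguity-weighted evaluation measures can be sketched as follows. This is an illustrative assumption, not the exact CrowdTruth formulation: each example carries a weight in [0, 1] (e.g., a crowd agreement score), so clear examples count more toward precision and recall than ambiguous ones.

```python
# Sketch of ambiguity-weighted precision/recall/F-measure. The weighting
# scheme (per-example weights from crowd agreement) is an assumption for
# illustration; the paper's exact measures may differ.

def weighted_prf(examples):
    """examples: list of (predicted, gold, weight) with boolean labels
    and a clarity weight in [0, 1]. Returns (precision, recall, F1)."""
    tp = sum(w for p, g, w in examples if p and g)
    fp = sum(w for p, g, w in examples if p and not g)
    fn = sum(w for p, g, w in examples if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# A false positive on an ambiguous example (weight 0.2) hurts precision
# far less than one on a clear example would.
data = [(True, True, 1.0), (True, False, 0.2), (False, True, 0.9)]
p, r, f = weighted_prf(data)
```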
When searching on the web, results are often returned as lists of hundreds to thousands of items, making it difficult for users to understand or navigate the space of results. Research has demonstrated that using clustering to partition search results into coherent, topical clusters can aid in both exploration and discovery. Yet clusters generated by an algorithm for this purpose are often of poor quality and do not satisfy users. As a result, experts must manually evaluate and refine the clustered results for each search query, a process that does not scale to large numbers of search queries. In this work, we investigate using crowd-based human evaluation to inspect, evaluate, and improve clusters to create high-quality clustered search results at scale. We introduce a workflow that begins by using a collection of well-known clustering algorithms to produce a set of clustered search results for a given query. Then, we use crowd workers to holistically assess the quality of each clustered search result in order to find the best one. Finally, the workflow has the crowd spot and fix problems in the best result in order to produce a final output. We evaluate this workflow on 120 top search queries from the Google Play Store, some of which have clustered search results produced through evaluation and refinement by experts. Our evaluations demonstrate that the workflow is effective at reproducing the evaluation of expert judges and also improves clusters in a way that agrees with experts and crowds alike.
The advent of mobile health (mHealth) technologies challenges the capabilities of current visualizations, interactive tools, and algorithms. We present Chronodes, an interactive system that unifies data mining and human-centric visualization techniques to support explorative analysis of longitudinal mHealth data. Chronodes extracts and visualizes frequent event sequences that reveal chronological patterns across multiple participant timelines of mHealth data. It then combines novel interaction and visualization techniques to enable multi-focus event sequence analysis, which allows health researchers to interactively define, explore, and compare groups of participant behaviors using event sequence combinations. Through summarizing insights gained from a pilot study with 20 behavioral and biomedical health experts, we discuss Chronodes's efficacy and potential impact in the mHealth domain. Ultimately, we outline important open challenges in mHealth, and offer recommendations and design guidelines for future research. For a video demonstration of Chronodes, please refer to the provided video figure.
This paper reports on the design and evaluation of a co-creative drawing partner called the Drawing Apprentice, which was designed to improvise and collaborate on abstract sketches with users in real time. The system qualifies as a new genre of creative technologies termed casual creators that are meant to creatively engage users and provide enjoyable creative experiences rather than necessarily helping users make a higher quality creative product. We introduce the conceptual framework of participatory sense-making and describe how it can help model and understand open-ended collaboration. We report the results of user studies evaluating different prototypes of the system during an iterative design process. Based on insights from the user studies, we present design recommendations for co-creative agents.
This paper presents a novel smart eyewear that recognizes a wearer's facial expression in daily scenes. We evaluated our device and demonstrated its robustness to noise from changes in the wearer's facial direction, to repeated removal and replacement of the glasses, and to their positional drift. Our device uses embedded photo reflective sensors and machine learning to recognize a wearer's facial expressions. We leverage the skin deformation that occurs when a wearer changes facial expression. With small photo reflective sensors, we measure the proximity between the skin surface of the face and the eyewear frame, into which 17 sensors are integrated. A Support Vector Machine (SVM) algorithm was applied to the sensor data. The sensors can cover various facial muscle movements and can be integrated into everyday glasses. There are various possible scenarios for our device, such as a care system for older adults and mental health management. The main contributions of our work are as follows. (1) We evaluated the recognition accuracy in daily scenes. By training on the corresponding data, our device achieved 92.8% accuracy regardless of facial direction and of taking the glasses on and off, 78.1% accuracy for repeatability, and 87.7% accuracy under positional drift. (2) The device was designed and implemented with social acceptability in mind: it looks like normal eyewear, so users can wear it anytime, anywhere. (3) Initial field trials in daily life were undertaken. Our work is one of the first attempts to recognize and evaluate a variety of facial expressions in the form of an unobtrusive wearable.
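The classification pipeline above (17 proximity readings in, expression label out) can be sketched compactly. The paper uses an SVM; for a dependency-free illustration this sketch substitutes a nearest-centroid classifier as a stand-in, and all sensor values are hypothetical.

```python
# Sketch of expression classification from 17 photo reflective sensor
# readings. Nearest-centroid is a deliberate stand-in for the paper's
# SVM; the training values below are invented for illustration.
import math

def train_centroids(samples):
    """samples: dict mapping expression label -> list of 17-dim
    proximity vectors. Returns the per-label mean vector."""
    return {label: [sum(col) / len(col) for col in zip(*vecs)]
            for label, vecs in samples.items()}

def classify(centroids, vec):
    """Assign the expression whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda lbl: math.dist(centroids[lbl], vec))

# Hypothetical readings: smiling deforms the skin toward the frame,
# raising the proximity values relative to a neutral face.
training = {
    "smile":   [[0.8] * 17, [0.9] * 17],
    "neutral": [[0.1] * 17, [0.2] * 17],
}
centroids = train_centroids(training)
label = classify(centroids, [0.7] * 17)
```

In practice an SVM (e.g., with an RBF kernel) would replace `classify`, but the sensing-to-label data flow is the same.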
Full-body human movement is characterized by fine-grained expressive qualities that humans can easily exhibit and recognize in others' movement. In sports (e.g., martial arts) as well as in performing arts (e.g., dance), the same sequence of movements can be performed in a wide range of ways characterized by different qualities, often in terms of subtle (spatial and temporal) perturbations of the movement. Even a non-expert observer can distinguish between a top-level and an average performance by a dancer or martial artist. The difference lies not in the performed movements, which are the same in both cases, but in the quality of their performance. In this paper, we present a computational framework for the automated approximate measurement of movement quality in full-body physical activities. Starting from motion capture data, the framework computes low-level (e.g., limb velocity) and high-level (e.g., synchronization between different limbs) movement features. This vector of features is then integrated into a single value intended to provide a quantitative assessment of movement quality, approximating the evaluation an external expert observer would give of the same sequence of movements. Next, a system representing a concrete implementation of the framework is proposed, with karate adopted as a testbed. We selected two katas (i.e., detailed choreographies of movements in karate) characterized by different overall attitudes and expressions (aggressiveness, meditation), and asked seven athletes of various ages and levels of experience to perform them. Motion capture data were collected from the performances and analyzed with the system. The results of the automated analysis were compared with the scores given by fourteen karate experts who rated the same performances.
Results show that the movement quality scores computed by the system and the ratings given by the human observers are highly correlated (Pearson's correlation: r = 0.84, p = 0.001 and r = 0.75, p = 0.005).
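The Pearson correlation used to compare system scores with expert ratings is straightforward to compute. The scores below are illustrative only, not the paper's data.

```python
# Pearson product-moment correlation between two score lists, as used
# to compare automated movement-quality scores with expert ratings.
import math

def pearson_r(xs, ys):
    """Return Pearson's r for paired samples xs, ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical system scores vs. mean expert ratings for five performances.
system = [6.1, 7.4, 5.2, 8.0, 6.8]
experts = [6.0, 7.0, 5.5, 8.5, 6.5]
r = pearson_r(system, experts)
```

An r near 1 means the automated measure ranks performances much as the experts do; values like the paper's 0.84 and 0.75 indicate strong agreement.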
Research impact plays a critical role in evaluating the quality and influence of a scholar, a journal, or a conference. Many researchers have attempted to quantify research impact by introducing different types of metrics based on citation data, such as the h-index, citation count, and impact factor. These metrics are widely used in the academic community. However, quantitative metrics are in most cases highly aggregated, and sometimes biased, which can obscure impact details that are important for a comprehensive understanding of research impact. For example, in which research areas does a researcher have the greatest impact? How does research impact change over time? How do collaborators affect the research impact of an individual? Simple quantitative metrics can hardly answer such questions, since they require more detailed exploration of the citation data. Previous work on visualizing citation data usually shows only limited aspects of research impact and may suffer from other problems, including visual clutter and scalability issues. To fill this gap, we propose ImpactVis, an interactive visualization tool for better exploration of research impact through citation data. Case studies and in-depth expert interviews are conducted to demonstrate the effectiveness of ImpactVis.
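To make the aggregation concern above concrete, consider the h-index mentioned in the abstract: a scholar has index h if h of their papers have at least h citations each. A minimal reference implementation shows how much detail the single number discards.

```python
# h-index: the largest h such that h papers each have >= h citations.
# Two very different citation profiles can collapse to the same h.

def h_index(citations):
    """Compute the h-index from a list of per-paper citation counts."""
    cites = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(cites, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

a = h_index([10, 8, 5, 4, 3])   # balanced profile
b = h_index([100, 90, 3, 3, 3]) # two blockbuster papers
```

Here the second scholar's two highly cited papers barely move the metric, which is exactly the kind of lost detail an interactive exploration tool can surface.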
User grouping in asynchronous online forums is a common phenomenon nowadays. People with similar backgrounds or shared interests like to get together in group discussions. As tens of thousands of archived conversational posts accumulate, challenges emerge for forum administrators and analysts in effectively exploring user groups in large-volume threads and gaining meaningful insight into hierarchical discussions. Simply identifying and comparing groups in a single thread is a nontrivial task, as the number of users and posts increases over time and noise hampers the detection of user groups. Researchers in the data mining field have proposed a large body of algorithms for exploring user grouping; however, their results are not revealing to lay users. To address these problems, we present VisForum, a visual analytics system that allows people to interactively explore user groups in a forum. We worked closely with two educators who have released courses on MOOC platforms and compiled a list of design goals to guide our design. Following a set of design rationales and tasks, we designed and implemented a multi-coordinated interface as well as several novel glyphs at different granularities, i.e., a group glyph, a user glyph, and a set glyph. Accordingly, we propose a Group Detecting & Sorting Algorithm to reduce noise in a collection of posts, and employ the concept of a 'Forum-index' to help end users identify high-impact forum members. Two case studies using different real-world datasets demonstrate the usefulness of the system and the effectiveness of the novel glyph designs. Furthermore, we conducted an in-lab user study to assess the usability of VisForum.
In 2015, the top ten largest amusement park corporations saw a combined annual attendance of over 400 million visitors. Daily attendance at some of the most popular theme parks in the world can average 44,000 visitors per day. These visitors ride attractions, shop for souvenirs and dine at local establishments; however, a critical component of their visit is the overall park experience. This experience depends on the wait time for rides, the crowd flow in the park and various other factors linked to crowd dynamics and human behavior. As such, better insight into visitor behavior can help theme parks devise competitive strategies for improved customer experience. The use of attractions, facilities and exhibits can be studied, and as behavior profiles emerge, park operators can also identify anomalous visitor behaviors, which can improve safety and operations. In this paper, we present a visual analytics framework for analyzing crowd dynamics in theme parks. Our proposed framework is designed to support behavioral analysis by summarizing patterns and detecting anomalies. We provide methodologies to link visitor movement data, communication data, and park infrastructure data. This combination of data sources enables a semantic who, what, when, and where analysis, allowing analysts to explore visitor-visitor interactions and visitor-infrastructure interactions. Analysts can explore behaviors at the macro level through semantic trajectory clustering views for group behavior dynamics, as well as at the micro level using trajectory traces and a novel visitor network analysis view. We demonstrate the efficacy of our framework through two case studies of simulated theme park visitors.
Exploring high-dimensional data is challenging. Dimension reduction algorithms, such as weighted multidimensional scaling, support data exploration by projecting datasets to two dimensions for visualization. These projections can be explored through parametric interaction (tweaking the underlying parameterizations) and observation-level interaction (directly manipulating the points within the projection). In this paper, we present the results of a controlled usability study determining the differences, advantages, and drawbacks among parametric interaction, observation-level interaction, and their combination. The study assesses both interaction techniques' effects on domain-specific high-dimensional data analyses performed by non-experts in statistical algorithms. The study was performed using Andromeda, a tool that enables both parametric and observation-level interaction to provide in-depth data exploration. The results indicate that the two forms of interaction serve different, but complementary, purposes in gaining insight through steerable dimension reduction algorithms.
This paper presents a conceptual framework for human-robot trust which uses game theory to represent a definition of trust derived from social psychology. This conceptual framework generates several testable hypotheses related to human-robot trust. The paper examines these hypotheses through a series of experiments we have conducted, which both provide support for and conflict with our framework for trust. We also discuss the methodological challenges associated with investigating trust. The paper concludes with a description of important areas for future research on the topic of human-robot trust.