
    Machine Understanding of Human Behavior

    A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next-generation computing, which we will call human computing, should be about anticipatory user interfaces that are human-centered, built for humans and based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions, including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, how far we are from enabling computers to understand human behavior.

    A time series feature of variability to detect two types of boredom from motion capture of the head and shoulders

    Boredom and disengagement metrics are crucial to the correctly timed delivery of adaptive interventions in interactive systems. Psychological research suggests that boredom (which other HCI teams have been able to partially quantify with pressure-sensing chair mats) is actually a composite of two states: lethargy and restlessness. Here we present a novel approach to measuring and recognizing these two kinds of boredom, based on motion capture and video analysis of changes in head and shoulder positions. Discrete, three-minute, computer-presented stimuli (games, quizzes, films and music) covering a spectrum from engaging to boring/disengaging were used to elicit changes in cognitive/emotional states in seated, healthy volunteers. Interaction with the stimuli occurred through a handheld trackball rather than a mouse, so movements were assumed to be non-instrumental. Our results include a feature (the standard deviation of windowed ranges) that may be more specific to boredom than the mean speed of head movement and that could be implemented in computer vision algorithms for disengagement detection.
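    As an illustration of the kind of feature this abstract describes, the following is a minimal Python sketch of a standard-deviation-of-windowed-ranges computation over a head-position signal. The window length, sampling rate, and simulated signal are assumptions made for the example, not parameters taken from the study.

        import numpy as np

        def std_of_windowed_ranges(positions, window_len):
            """Split a 1-D position signal into non-overlapping windows,
            take the range (max - min) of each window, and return the
            standard deviation of those ranges."""
            n_windows = len(positions) // window_len
            trimmed = positions[: n_windows * window_len]
            windows = trimmed.reshape(n_windows, window_len)
            ranges = windows.max(axis=1) - windows.min(axis=1)
            return ranges.std()

        # Example: 3 minutes of vertical head position sampled at 60 Hz
        # (hypothetical values; the study's capture rate may differ).
        rng = np.random.default_rng(0)
        head_y = np.cumsum(rng.normal(scale=0.1, size=3 * 60 * 60))
        feature = std_of_windowed_ranges(head_y, window_len=60)  # 1-second windows
        print(f"SD of windowed ranges: {feature:.3f}")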

    Analyzing Input and Output Representations for Speech-Driven Gesture Generation

    This paper presents a novel framework for automatic speech-driven gesture generation, applicable to human-agent interaction including both virtual agents and robots. Specifically, we extend recent deep-learning-based, data-driven methods for speech-driven gesture generation by incorporating representation learning. Our model takes speech as input and produces gestures as output, in the form of a sequence of 3D coordinates. Our approach consists of two steps. First, we learn a lower-dimensional representation of human motion using a denoising autoencoder neural network, consisting of a motion encoder MotionE and a motion decoder MotionD. The learned representation preserves the most important aspects of the human pose variation while removing less relevant variation. Second, we train a novel encoder network SpeechE to map from speech to a corresponding motion representation with reduced dimensionality. At test time, the speech encoder and the motion decoder networks are combined: SpeechE predicts motion representations based on a given speech signal and MotionD then decodes these representations to produce motion sequences. We evaluate different representation sizes in order to find the most effective dimensionality for the representation. We also evaluate the effects of using different speech features as input to the model. We find that mel-frequency cepstral coefficients (MFCCs), alone or combined with prosodic features, perform the best. The results of a subsequent user study confirm the benefits of the representation learning.
    Comment: Accepted at IVA '19. Shorter version published at AAMAS '19. The code is available at https://github.com/GestureGeneration/Speech_driven_gesture_generation_with_autoencode
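    A schematic sketch of how the two training stages and the test-time chaining described above might be wired up, assuming TensorFlow/Keras is available. The layer types and dimensions are invented for illustration; the authors' actual implementation is in the repository linked above.

        import numpy as np
        from tensorflow import keras
        from tensorflow.keras import layers

        # Hypothetical dimensions; the paper evaluates several representation sizes.
        POSE_DIM, SPEECH_DIM, REPR_DIM, SEQ_LEN = 45, 26, 32, 60

        # Step 1: denoising autoencoder over human motion.
        pose_in = keras.Input(shape=(POSE_DIM,), name="noisy_pose")
        code = layers.Dense(128, activation="relu")(pose_in)
        code = layers.Dense(REPR_DIM, name="motion_code")(code)
        motion_e = keras.Model(pose_in, code, name="MotionE")

        code_in = keras.Input(shape=(REPR_DIM,))
        recon = layers.Dense(128, activation="relu")(code_in)
        recon = layers.Dense(POSE_DIM)(recon)
        motion_d = keras.Model(code_in, recon, name="MotionD")

        autoencoder = keras.Model(pose_in, motion_d(motion_e(pose_in)))
        autoencoder.compile(optimizer="adam", loss="mse")
        # autoencoder.fit(noisy_poses, clean_poses, ...)  # learn to reconstruct poses

        # Step 2: SpeechE maps a window of speech features (e.g. MFCCs)
        # to the motion representation learned in step 1.
        speech_in = keras.Input(shape=(SEQ_LEN, SPEECH_DIM))
        h = layers.GRU(128)(speech_in)
        speech_code = layers.Dense(REPR_DIM)(h)
        speech_e = keras.Model(speech_in, speech_code, name="SpeechE")
        speech_e.compile(optimizer="adam", loss="mse")
        # speech_e.fit(speech_windows, motion_e.predict(poses), ...)

        # Test time: chain SpeechE and MotionD to go from speech to a pose frame.
        dummy_speech = np.zeros((1, SEQ_LEN, SPEECH_DIM), dtype="float32")
        predicted_pose = motion_d(speech_e(dummy_speech))
        print(predicted_pose.shape)  # (1, POSE_DIM)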

    Exploring the movement dynamics of deception

    Both the science and the everyday practice of detecting a lie rest on the same assumption: hidden cognitive states that the liar would like to remain hidden nevertheless influence observable behavior. There is good evidence for this assumption. The insights of professional interrogators, anecdotal evidence, and body language textbooks have together built up a sizeable catalog of non-verbal cues claimed to distinguish deceptive from truthful behavior. Typically, these cues are discrete, individual behaviors, such as a hand touching the mouth or the rise of a brow, that distinguish lies from truths solely in terms of their frequency or duration. Research to date has failed to establish any of these non-verbal cues as a reliable marker of deception. Here we argue that this may be because simple tallies of behavior miss the rich but subtle organization of behavior as it unfolds over time. Research in cognitive science from a dynamical systems perspective has shown that behavior is structured across multiple timescales, with varying degrees of regularity and structure. Using tools that are sensitive to these dynamics, we analyzed body motion data from an experiment that put participants in a realistic situation of choosing, or not, to lie to an experimenter. Our analyses indicate that when participants are being deceptive, continuous fluctuations of movement in the upper face, and to a lesser extent in the arms, are characterized by less stable but more complex dynamics. For the upper face, these distinctions are present despite no apparent difference in the overall amount of movement between deception and truth. We suggest that these distinctive dynamical signatures of motion reflect both the cognitive demands inherent to deception and the need to respond adaptively in a social context.
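    The abstract does not name the specific dynamical measures used. As one hedged illustration of a complexity measure that could be applied to such movement signals, here is a sample-entropy sketch in Python; the parameters m and r_factor are conventional defaults, not values from the paper.

        import numpy as np

        def sample_entropy(signal, m=2, r_factor=0.2):
            """Sample entropy of a 1-D signal: lower values indicate more regular
            (predictable) dynamics, higher values indicate more complex dynamics."""
            x = np.asarray(signal, dtype=float)
            r = r_factor * x.std()

            def count_matches(dim):
                # Embed the signal in `dim` dimensions and count pairs of embedded
                # vectors whose Chebyshev distance is below the tolerance r.
                emb = np.array([x[i:i + dim] for i in range(len(x) - dim)])
                dists = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
                return ((dists < r).sum() - len(emb)) / 2  # exclude self-matches

            b = count_matches(m)
            a = count_matches(m + 1)
            return -np.log(a / b) if a > 0 and b > 0 else float("inf")

        # Example: a noisier signal typically yields higher sample entropy.
        rng = np.random.default_rng(1)
        t = np.linspace(0, 10 * np.pi, 500)
        print(sample_entropy(np.sin(t)))                               # regular, lower
        print(sample_entropy(np.sin(t) + rng.normal(0, 0.5, t.size)))  # complex, higher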