156,769 research outputs found

    Large language models in textual analysis for gesture selection

    Gestures perform a variety of communicative functions that powerfully influence human face-to-face interaction. How this communicative function is achieved varies greatly between individuals and depends on the role of the speaker and the context of the interaction. Approaches to automatic gesture generation vary not only in the degree to which they rely on data-driven techniques but also in the degree to which they can produce context- and speaker-specific gestures. However, these approaches face two major challenges. The first is obtaining sufficient training data that is appropriate for the context and the goal of the application. The second is giving designers enough control to realize their specific intent for the application. Here, we approach these challenges by using large language models (LLMs) to show that these powerful models of large amounts of data can be adapted for gesture analysis and generation. Specifically, we used ChatGPT as a tool for suggesting context-specific gestures that can realize designer intent based on minimal prompts. We also find that ChatGPT can suggest novel yet appropriate gestures not present in the minimal training data. The use of LLMs is a promising avenue for gesture generation that reduces the need for laborious annotations and has the potential to flexibly and quickly adapt to different designer intents.

    Reclaiming human machine nature

    Extending and modifying their domain of life by producing artifacts is one of the main characteristics of humankind. From the first hominid, who used a wooden stick or a stone to extend his upper limbs and augment the strength of his gestures, to current systems engineers, who use technologies to augment human cognition, perception and action, extending the capabilities of the human body remains a major issue. For more than fifty years, cybernetics, computer science and cognitive science have imposed a single reductionist model of human-machine systems: cognitive systems. Inspired by philosophy, behaviorist psychology and the information-processing metaphor, the cognitive system paradigm requires a functional view and a functional analysis in the human systems design process. Under that design approach, humans have been reduced to their metaphysical and functional properties in a new dualism. Human body requirements have been left to physical ergonomics or "physiology". With multidisciplinary convergence, the issues of "human-machine" systems and "human artifacts" evolve. The loss of biological and social boundaries between human organisms and interactive, informational physical artifacts calls into question the current engineering methods and ergonomic design of cognitive systems. New developments in human-machine systems for intensive care, human space activities or bio-engineering systems require grounding human systems design in a renewed epistemological framework for future human systems models and evidence-based "bio-engineering". In that context, reclaiming human factors, the augmented human and human-machine nature is a necessity.
    Comment: Published in HCI International 2014, Heraklion, Greece (2014)

    Toward a model of computational attention based on expressive behavior: applications to cultural heritage scenarios

    Our project goals were to develop an attention-based analysis of human expressive behavior and to implement real-time algorithms in EyesWeb XMI in order to improve the naturalness of human-computer interaction and the context-based monitoring of human behavior. To this end, a perceptual model that mimics human attentional processes was developed for expressivity analysis and modeled by entropy. Museum scenarios were selected as an ecological test-bed for three experiments that focus on visitor profiling and visitor flow regulation.

    Classifying types of gesture and inferring intent

    In order to infer intent from gesture, a rudimentary classification of types of gestures into five main classes is introduced. The classification is intended as a basis for incorporating the understanding of gesture into human-robot interaction (HRI). Some requirements for the operational classification of gesture by a robot interacting with humans are also suggested.

    Down-Sampling coupled to Elastic Kernel Machines for Efficient Recognition of Isolated Gestures

    In the field of gestural action recognition, many studies have focused on dimensionality reduction along the spatial axis, to reduce both the variability of gestural sequences expressed in the reduced space and the computational complexity of their processing. Notably, very few of these methods have explicitly addressed dimensionality reduction along the time axis. This is, however, a major issue for elastic distances, which have quadratic complexity. To partially fill this gap, we present in this paper an approach based on temporal down-sampling combined with elastic kernel machine learning. We show experimentally, on two data sets that are widely referenced in the domain of human gesture recognition and that differ greatly in motion capture quality, that it is possible to significantly reduce the number of skeleton frames while maintaining a good recognition rate. The method gives satisfactory results, at a level currently reached by state-of-the-art methods on these data sets. The reduction in computational complexity makes this approach eligible for real-time applications.
    Comment: ICPR 2014, International Conference on Pattern Recognition, Stockholm, Sweden (2014)
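
The interaction between temporal down-sampling and a quadratic-cost elastic distance can be sketched as follows. This is only an illustration under stated assumptions: plain dynamic time warping (DTW) stands in for the elastic kernels of the paper, uniform frame skipping stands in for its down-sampling scheme, and the function names `downsample` and `dtw_distance` are hypothetical.

```python
import math

def downsample(frames, step):
    """Uniform temporal down-sampling: keep every `step`-th frame (assumed scheme)."""
    return frames[::step]

def dtw_distance(a, b):
    """Plain DTW between two sequences of equal-dimension vectors.

    Fills a len(a) x len(b) table, so the cost is quadratic in sequence length.
    """
    m = len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]
```

Because the DTW table has one cell per pair of frames, keeping every k-th frame of both sequences shrinks the table by roughly a factor of k squared, which is why down-sampling along the time axis matters for elastic distances.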

    ModDrop: adaptive multi-modal gesture recognition

    We present a method for gesture detection and localisation based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at three temporal scales. Key to our technique is a training strategy which exploits: i) careful initialization of individual modalities; and ii) gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving the uniqueness of each modality-specific representation. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams. Fusing multiple modalities at several spatial and temporal scales leads to a significant increase in recognition rates, allowing the model to compensate for errors of the individual classifiers as well as noise in the separate channels. Furthermore, the proposed ModDrop training technique makes the classifier robust to missing signals in one or several channels, so that it produces meaningful predictions from any number of available modalities. In addition, we demonstrate the applicability of the proposed fusion scheme to modalities of arbitrary nature by experiments on the same dataset augmented with audio.
    Comment: 14 pages, 7 figures
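
The "random dropping of separate channels" idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mod_drop`, the independent per-modality drop probability `p_drop`, and the choice to replace a dropped channel with zeros are all assumptions made for illustration.

```python
import random

def mod_drop(modalities, p_drop, rng):
    """Return a copy of `modalities` with each channel zeroed with probability `p_drop`.

    modalities: dict mapping a modality name to a list of float features.
    rng: a random.Random instance, passed in so that runs are reproducible.
    """
    out = {}
    for name, feats in modalities.items():
        if rng.random() < p_drop:
            out[name] = [0.0] * len(feats)  # simulate a missing channel
        else:
            out[name] = list(feats)  # keep the channel unchanged
    return out

# Example: one training sample with three hypothetical modalities.
sample = {"depth": [0.2, 0.7], "skeleton": [1.0, 0.5, 0.1], "audio": [0.3]}
augmented = mod_drop(sample, p_drop=0.5, rng=random.Random(0))
```

Applied during fusion training, this kind of masking exposes the fused classifier to inputs with missing channels, which is the robustness property the abstract describes.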

    Mapping Tasks to Interactions for Graph Exploration and Graph Editing on Interactive Surfaces

    Graph exploration and editing are still mostly considered independently, and the systems that support them are not designed for today's interactive surfaces such as smartphones, tablets or tabletops. When developing a system for those devices that supports both graph exploration and graph editing, it is necessary to identify 1) what basic tasks need to be supported, 2) what interactions can be used, and 3) how to map these tasks to interactions. This technical report provides a list of basic interaction tasks for graph exploration and editing, derived from an extensive system review. Moreover, different interaction modalities of interactive surfaces are reviewed according to their interaction vocabulary, and further degrees of freedom that can be used to make interactions distinguishable are discussed. Beyond the scope of graph exploration and editing, we provide a generally applicable approach for finding and evaluating a mapping from tasks to interactions. Thus, this work acts as a guideline for developing a system for graph exploration and editing that is specifically designed for interactive surfaces.
    Comment: 21 pages, minor corrections (typos etc.)