
    Anticipating Daily Intention using On-Wrist Motion Triggered Sensing

    Anticipating human intention by observing one's actions has many applications. For instance, picking up a cellphone and then a charger (actions) implies that one wants to charge the cellphone (intention). By anticipating the intention, an intelligent system can guide the user to the closest power outlet. We propose an on-wrist motion-triggered sensing system for anticipating daily intentions, where the on-wrist sensors help us persistently observe one's actions. The core of the system is a novel Recurrent Neural Network (RNN) and Policy Network (PN): the RNN encodes visual and motion observations to anticipate intentions, while the PN parsimoniously triggers the visual observation process to reduce the computation requirement. We jointly train the whole network using policy gradient and cross-entropy loss. For evaluation, we collect the first daily "intention" dataset, consisting of 2379 videos with 34 intentions and 164 unique action sequences. Our method achieves 92.68%, 90.85%, and 97.56% accuracy on three users while processing only 29% of the visual observations on average.
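    To make the trigger mechanism concrete, here is a minimal sketch in PyTorch of the kind of jointly trained RNN plus policy network the abstract describes; all dimensions, the gating scheme, and the reward shaping are our own assumptions rather than the authors' implementation.

```python
# Sketch: an RNN fuses motion features with optionally triggered visual
# features, while a per-step policy decides whether to "pay" for the camera.
# Sizes, gating, and rewards are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class TriggeredAnticipator(nn.Module):
    def __init__(self, motion_dim=6, visual_dim=512, hidden=128, n_intentions=34):
        super().__init__()
        self.rnn = nn.GRUCell(motion_dim + visual_dim, hidden)
        self.policy = nn.Linear(hidden, 2)           # trigger / skip the camera
        self.classifier = nn.Linear(hidden, n_intentions)

    def forward(self, motion_seq, visual_seq):
        B, T, _ = motion_seq.shape
        h = motion_seq.new_zeros(B, self.rnn.hidden_size)
        logps, logits, triggers = [], [], []
        for t in range(T):
            dist = torch.distributions.Categorical(logits=self.policy(h))
            a = dist.sample()                         # 1 = use this camera frame
            logps.append(dist.log_prob(a))
            gate = a.float().unsqueeze(1)
            x = torch.cat([motion_seq[:, t], gate * visual_seq[:, t]], dim=1)
            h = self.rnn(x, h)
            logits.append(self.classifier(h))
            triggers.append(gate)
        return torch.stack(logits, 1), torch.stack(logps, 1), torch.cat(triggers, 1)

# Joint objective: cross-entropy on the intention plus a REINFORCE term that
# rewards correct predictions and penalizes camera use (cost weight assumed).
def loss_fn(logits, logps, triggers, target, cam_cost=0.1):
    ce = nn.functional.cross_entropy(logits[:, -1], target)
    correct = (logits[:, -1].argmax(1) == target).float()
    reward = correct - cam_cost * triggers.mean(1)
    pg = -(logps.sum(1) * reward.detach()).mean()
    return ce + pg
```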

    WEAR: A Multimodal Dataset for Wearable and Egocentric Video Activity Recognition

    Though research has shown the complementarity of camera- and inertial-based data, datasets that offer both modalities remain scarce. In this paper we introduce WEAR, a multimodal benchmark dataset for both vision- and wearable-based Human Activity Recognition (HAR). The dataset comprises data from 18 participants performing 18 different workout activities, with untrimmed inertial (acceleration) and camera (egocentric video) data recorded at 10 different outdoor locations. WEAR features a diverse set of activities that are low in inter-class similarity and, unlike previous egocentric datasets, neither defined by human-object interactions nor drawn from inherently distinct activity categories. The provided benchmark results reveal that single-modality architectures have different strengths and weaknesses in their prediction performance. Further, in light of the recent success of transformer-based video action detection models, we demonstrate their versatility by applying them in a plain fashion using vision, inertial, and combined (vision + inertial) features as input. Results show that vision transformers are not only able to produce competitive results using only inertial data, but can also serve as an architecture to fuse both modalities by means of simple concatenation, with the multimodal approach producing the highest average mAP and precision and close-to-best F1-scores. Until now, vision-based transformers had been explored in neither inertial nor multimodal human activity recognition, making our approach the first to do so. The dataset and code to reproduce our experiments are publicly available via mariusbock.github.io/wear. Comment: 12 pages, 2 figures, 2 tables
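    The fusion by simple concatenation reported in the abstract can be illustrated with a short sketch: time-aligned per-frame vision and inertial feature vectors are concatenated and passed through a standard transformer encoder with a per-frame classification head. The feature dimensions, encoder depth, and class count below are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: late fusion of vision and IMU features by concatenation, fed to a
# plain transformer encoder for per-frame action detection. Sizes are assumed.
import torch
import torch.nn as nn

class ConcatFusionDetector(nn.Module):
    def __init__(self, vis_dim=2048, imu_dim=128, d_model=256, n_classes=19):
        super().__init__()
        self.proj = nn.Linear(vis_dim + imu_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_classes)   # 18 activities + background assumed

    def forward(self, vis_feats, imu_feats):
        # vis_feats: (B, T, vis_dim), imu_feats: (B, T, imu_dim), time-aligned
        x = torch.cat([vis_feats, imu_feats], dim=-1)
        return self.head(self.encoder(self.proj(x)))  # (B, T, n_classes) logits
```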

    Sensing, interpreting, and anticipating human social behaviour in the real world

    Low-level nonverbal social signals like glances, utterances, facial expressions, and body language are central to human communication and have been shown to be connected to important high-level constructs such as emotions, turn-taking, rapport, or leadership. A prerequisite for creating social machines that can support humans in, e.g., education, psychotherapy, or human resources is the ability to automatically sense, interpret, and anticipate human nonverbal behaviour. While promising results have been shown in controlled settings, automatically analysing unconstrained situations, e.g. in daily-life settings, remains challenging. Furthermore, the anticipation of nonverbal behaviour in social situations is still largely unexplored. The goal of this thesis is to move closer to the vision of social machines in the real world. It makes fundamental contributions along the three dimensions of sensing, interpreting, and anticipating nonverbal behaviour in social interactions.

    First, robust recognition of low-level nonverbal behaviour lays the groundwork for all further analysis steps. Advancing human visual behaviour sensing is especially relevant, as the current state of the art is still not satisfactory in many daily-life situations. While many social interactions take place in groups, current methods for unsupervised eye contact detection can only handle dyadic interactions. We propose a novel unsupervised method for multi-person eye contact detection that exploits the connection between gaze and speaking turns. Furthermore, we make use of mobile device engagement to address the calibration drift that occurs during daily-life usage of mobile eye trackers.

    Second, we improve the interpretation of social signals in terms of higher-level social behaviours. In particular, we propose the first dataset and method for emotion recognition from the bodily expressions of freely moving, unaugmented dyads. Furthermore, we are the first to study low rapport detection in group interactions, and to investigate a cross-dataset evaluation setting for the emergent leadership detection task.

    Third, human visual behaviour is special because it functions as a social signal and also determines what a person is seeing at a given moment in time. Being able to anticipate human gaze opens up the possibility for machines to share attention with humans more seamlessly, or to intervene in a timely manner if humans are about to overlook important aspects of the environment. We are the first to propose methods for the anticipation of eye contact in dyadic conversations, as well as in the context of mobile device interactions during daily life, thereby paving the way for interfaces that can proactively intervene and support interacting humans.
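    To make the weak-supervision idea behind the multi-person eye contact method concrete: during someone's speaking turn, listeners tend to look at the speaker, so speaking turns can serve as noisy labels for a gaze classifier without manual annotation. The data layout, feature representation, and choice of classifier in the sketch below are illustrative assumptions, not the thesis's actual pipeline.

```python
# Sketch: speaking turns as weak labels for "eye contact with person X".
# Features, label rule, and classifier are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

def weak_labels(speaker_id, gaze_feats, frame_speakers):
    """Label frames where `speaker_id` holds the turn as positive (listeners
    presumably look at them), frames where someone else speaks as negative."""
    y = np.where(frame_speakers == speaker_id, 1, 0)
    keep = frame_speakers >= 0                     # drop frames with no speaker
    return gaze_feats[keep], y[keep]

# gaze_feats: (n_frames, d) gaze features of one listener;
# frame_speakers: (n_frames,) active speaker id per frame, -1 when nobody speaks.
rng = np.random.default_rng(0)
gaze_feats = rng.normal(size=(1000, 16))
frame_speakers = rng.integers(-1, 3, size=1000)

X, y = weak_labels(speaker_id=0, gaze_feats=gaze_feats, frame_speakers=frame_speakers)
clf = LogisticRegression(max_iter=1000).fit(X, y)  # noisy-label eye contact classifier
```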

    Leveraging the Present to Anticipate the Future in Videos

    Anticipating actions before they are executed is crucial for a wide range of practical applications, including autonomous driving and robotics. While most prior work in this area requires partial observation of executed actions, in this paper we focus on anticipating actions seconds before they start. Our proposed approach fuses a purely anticipatory model with a complementary model constrained to reason about the present. In particular, the latter predicts present actions and scene attributes, and reasons about how they evolve over time. By doing so, we aim to model action anticipation at a more conceptual level than directly predicting future actions. Our model outperforms previously reported methods on the EPIC-KITCHENS and Breakfast datasets.
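    One way to picture the described fusion: a direct branch maps observed features to future-action logits, while a second branch first predicts present actions and scene attributes and then how these carry over to the future; the two predictions are fused late. The module shapes below are our own assumptions, not the paper's architecture.

```python
# Sketch: fusing a purely anticipatory branch with a "reason about the
# present, then transition" branch. Dimensions and modules are assumed.
import torch
import torch.nn as nn

class PresentToFuture(nn.Module):
    def __init__(self, feat_dim=1024, n_attrs=300, n_actions=100):
        super().__init__()
        self.anticipate = nn.Linear(feat_dim, n_actions)   # direct future prediction
        self.present = nn.Linear(feat_dim, n_attrs)        # present action/scene attributes
        self.transition = nn.Linear(n_attrs, n_actions)    # attributes -> future action

    def forward(self, feats):
        direct = self.anticipate(feats)
        via_present = self.transition(torch.sigmoid(self.present(feats)))
        return direct + via_present       # late fusion by summing logits
```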

    Continuous Semi-autonomous Prosthesis Control using a Depth Sensor on the Hand


    Proceedings of the 2nd IUI Workshop on Interacting with Smart Objects

    These are the proceedings of the 2nd IUI Workshop on Interacting with Smart Objects. The objects we use in everyday life are expanding beyond their originally restricted interaction capabilities and now provide functionality that goes far beyond their original purpose. They feature computing capabilities and are thus able to capture information, process and store it, and interact with their environments, turning them into smart objects.

    Using movement kinematics to understand the motor side of Autism Spectrum Disorder

    Besides core deficits in social interaction and communication, atypical motor patterns have often been reported in people with Autism Spectrum Disorder (ASD). It has recently been speculated that part of these sensorimotor abnormalities could be better explained by considering prospective motor control (i.e., the ability to plan actions toward future events or consider future task demands), which has been hypothesized to be crucial for higher mind functions, e.g., understanding the intentions of other people (Trevarthen and Delafield-Butt 2013). The aim of the current dissertation was to tackle the motor 'side' of ASD, exploring whether and how prospective motor control might be atypical in children with a diagnosis of autism, given that actions are directed into the future and their control is based on knowledge of what is going to happen next (von Hofsten and Rosander 2012). To do this, an integrative approach based on neuropsychological assessment, behavioural paradigms, and machine-learning modelling of kinematics recorded with motion capture techniques was applied to typically developing children and children with ASD without accompanying intellectual impairment.

    A sensorimotor network for actions and intentions reading: a series of TMS studies

    Information relevant to our social life is immediately processed by our brain. When we walk down the street, we easily and quite automatically adjust our path to avoid colliding with other people. Many social activities, such as working in a group, playing a sport, or talking with people, require the ability to carefully read others' movements. Thus, kinematic and postural information about others' bodies is a fundamental medium for thriving in our social environment. Throughout this manuscript, a series of extensive and novel studies describes the role of the sensorimotor cortices and their differential contributions to specific action observation tasks. By means of transcranial magnetic stimulation (TMS), we tested in healthy subjects both low- and high-level cognitive processes that may rely on areas of the action observation network.

    Motion-Based Video Games for Stroke Rehabilitation with Reduced Compensatory Motions

    Stroke is the leading cause of long-term disability among adults in industrialized nations, with 80% of stroke survivors experiencing motor disabilities. Recovery requires daily exercise with a high number of repetitions, often without therapist supervision. Motion-based video games can help motivate people with stroke to perform the exercises necessary to recover. We explore the design space of video games for stroke rehabilitation using Wii remotes and webcams as input devices, and share the lessons we learned about what makes games therapeutically useful. We demonstrate the feasibility of using games for home-based stroke therapy with a six-week case study, showing that exercise with games can aid recovery even 17 years after a stroke, and we share lessons for game systems to be used at home as part of outpatient therapy. As a major issue with home-based therapy, we identify that unsupervised exercises lead to compensatory motions that can impede recovery and create new health issues. We reliably detect torso compensation in shoulder exercises using a custom harness, and develop a game that meaningfully uses both exercise and compensation as inputs. We provide in-game feedback that reduces compensation in a number of ways. We evaluate alternative ways of reducing compensation in controlled experiments and show that techniques from operant conditioning are effective in significantly reducing compensatory behavior compared to existing approaches.
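    As a rough illustration of using both exercise and compensation as game inputs, the sketch below maps a shoulder angle to a game control value while a torso-lean reading from the harness gates the reward, in the spirit of the operant-conditioning feedback mentioned above. The sensor values, axes, and threshold are hypothetical.

```python
# Sketch: exercise signal drives the game, harness torso signal flags
# compensation. Sensor semantics and threshold are illustrative assumptions.
def game_inputs(shoulder_angle_deg, torso_lean_deg, lean_threshold=10.0):
    """Map one sensor reading to (game control value, compensation flag)."""
    compensating = abs(torso_lean_deg) > lean_threshold
    control = shoulder_angle_deg / 90.0            # normalize to a [0, 1] range
    return max(0.0, min(1.0, control)), compensating

# In the game loop, compensation can gate rewards rather than merely being
# displayed, so clean movements are reinforced and compensated ones are not.
control, compensating = game_inputs(shoulder_angle_deg=45.0, torso_lean_deg=12.5)
score_multiplier = 0.0 if compensating else 1.0
```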