44,687 research outputs found
Computer-based tracking, analysis, and visualization of linguistically significant nonmanual events in American Sign Language (ASL)
Our linguistically annotated American Sign Language (ASL) corpora have formed a basis for research to automate detection by
computer of essential linguistic information conveyed through facial expressions and head movements. We have tracked head position
and facial deformations, and used computational learning to discern specific grammatical markings. Our ability to detect, identify, and
temporally localize the occurrence of such markings in ASL videos has recently been improved by incorporation of (1) new techniques
for deformable model-based 3D tracking of head position and facial expressions, which provide significantly better tracking accuracy
and recover quickly from temporary loss of track due to occlusion; and (2) a computational learning approach incorporating 2-level
Conditional Random Fields (CRFs), suited to the multi-scale spatio-temporal characteristics of the data, which analyses not only
low-level appearance characteristics, but also the patterns that enable identification of significant gestural components, such as
periodic head movements and raised or lowered eyebrows. Here we summarize our linguistically motivated computational approach
and the results for detection and recognition of nonmanual grammatical markings; demonstrate our data visualizations, and discuss the
relevance for linguistic research; and describe work underway to enable such visualizations to be produced over large corpora and
shared publicly on the Web
3D face tracking and multi-scale, spatio-temporal analysis of linguistically significant facial expressions and head positions in ASL
Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body. This poses a significant challenge for computer-based sign language recognition. Here, we present new methods for the recognition of nonmanual grammatical markers in American Sign Language (ASL) based on: (1) new 3D tracking methods for the estimation of 3D head pose and facial expressions to determine the relevant low-level features; (2) methods for higher-level analysis of component events (raised/lowered eyebrows, periodic head nods and head shakes) used in grammatical markings—with differentiation of temporal phases (onset, core, offset, where appropriate), analysis of their characteristic properties, and extraction of corresponding features; (3) a 2-level learning framework to combine lowand high-level features of differing spatio-temporal scales. This new approach achieves significantly better tracking and recognition results than our previous methods
A Self-initializing Eyebrow Tracker for Binary Switch Emulation
We designed the Eyebrow-Clicker, a camera-based human computer interface system that implements a new form of binary switch. When the user raises his or her eyebrows, the binary switch is activated and a selection command is issued. The Eyebrow-Clicker thus replaces the "click" functionality of a mouse. The system initializes itself by detecting the user's eyes and eyebrows, tracks these features at frame rate, and recovers in the event of errors. The initialization uses the natural blinking of the human eye to select suitable templates for tracking. Once execution has begun, a user therefore never has to restart the program or even touch the computer. In our experiments with human-computer interaction software, the system successfully determined 93% of the time when a user raised his eyebrows.Office of Naval Research; National Science Foundation (IIS-0093367
Exploring the Affective Loop
Research in psychology and neurology shows that both body and mind are
involved when experiencing emotions (Damasio 1994, Davidson et al.
2003). People are also very physical when they try to communicate their
emotions. Somewhere in between beings consciously and unconsciously
aware of it ourselves, we produce both verbal and physical signs to make
other people understand how we feel. Simultaneously, this production of
signs involves us in a stronger personal experience of the emotions we
express.
Emotions are also communicated in the digital world, but there is little
focus on users' personal as well as physical experience of emotions in
the available digital media. In order to explore whether and how we can
expand existing media, we have designed, implemented and evaluated
/eMoto/, a mobile service for sending affective messages to others. With
eMoto, we explicitly aim to address both cognitive and physical
experiences of human emotions. Through combining affective gestures for
input with affective expressions that make use of colors, shapes and
animations for the background of messages, the interaction "pulls" the
user into an /affective loop/. In this thesis we define what we mean by
affective loop and present a user-centered design approach expressed
through four design principles inspired by previous work within Human
Computer Interaction (HCI) but adjusted to our purposes; /embodiment/
(Dourish 2001) as a means to address how people communicate emotions in
real life, /flow/ (Csikszentmihalyi 1990) to reach a state of
involvement that goes further than the current context, /ambiguity/ of
the designed expressions (Gaver et al. 2003) to allow for open-ended
interpretation by the end-users instead of simplistic, one-emotion
one-expression pairs and /natural but designed expressions/ to address
people's natural couplings between cognitively and physically
experienced emotions. We also present results from an end-user study of
eMoto that indicates that subjects got both physically and emotionally
involved in the interaction and that the designed "openness" and
ambiguity of the expressions, was appreciated and understood by our
subjects. Through the user study, we identified four potential design
problems that have to be tackled in order to achieve an affective loop
effect; the extent to which users' /feel in control/ of the interaction,
/harmony and coherence/ between cognitive and physical expressions/,/
/timing/ of expressions and feedback in a communicational setting, and
effects of users' /personality/ on their emotional expressions and
experiences of the interaction
- …