11,785 research outputs found
A Combined Semantic and Motion Capture Database for Real-Time Sign Language Synthesis
Over the past decade, many fields of discovery have begun to use motion capture data, leading to an exponential growth in the size of motion databases. Querying, indexing and retrieving motion capture data has thus become a crucial problem for the accessibility and usability of such databases. Our aim is to make this approach feasible for virtual agents signing in French Sign Language, taking into account the semantic information implicitly contained in sign language data. We propose a new methodology for accessing our database, by simultaneously using both a semantic and a captured-motion database, with different ways of indexing the two database parts. This approach is used to effectively retrieve stored motions for the purposes of producing real-time sign language animations. The complete process and its in-use efficiency are described, from querying motion in the semantic database to computing transitory segments between signs, and producing animations of a realistic virtual character
Toward a Motor Theory of Sign Language Perception
Research on signed languages still strongly dissociates linguistic issues,
related to phonological and phonetic aspects, from gesture studies for
recognition and synthesis purposes. This paper focuses on the imbrication of
motion and meaning for the analysis, synthesis and evaluation of sign language
gestures. We discuss the relevance and interest of a motor theory of perception
in sign language communication. According to this theory, we consider that
linguistic knowledge is mapped on sensory-motor processes, and propose a
methodology based on the principle of a synthesis-by-analysis approach, guided
by an evaluation process that aims to validate some hypotheses and concepts of
this theory. Examples from existing studies illustrate the different concepts
and provide avenues for future work. Comment: 12 pages. Partially funded by the ANR SignCo project
A Study on Techniques and Challenges in Sign Language Translation
Sign Language Translation (SLT) plays a pivotal role in enabling effective communication for the Deaf and Hard of Hearing (DHH) community. This review delves into the state-of-the-art techniques and methodologies in SLT, focusing on its significance, challenges, and recent advancements. The review provides a comprehensive analysis of various SLT approaches, ranging from rule-based systems to deep learning models, highlighting their strengths and limitations. Datasets specifically tailored for SLT research are explored, shedding light on the diversity and complexity of Sign Languages across the globe. The review also addresses critical issues in SLT, such as the expressiveness of generated signs, facial expressions, and non-manual signals. Furthermore, it discusses the integration of SLT into assistive technologies and educational tools, emphasizing the transformative potential in enhancing accessibility and inclusivity. Finally, the review outlines future directions, including the incorporation of multimodal inputs and the imperative need for co-creation with the Deaf community, paving the way for more accurate, expressive, and culturally sensitive Sign Language Generation systems
A survey on mouth modeling and analysis for Sign Language recognition
© 2015 IEEE. Around 70 million Deaf worldwide use Sign Languages (SLs) as their native languages. At the same time, they have limited reading/writing skills in the spoken language. This puts them at a severe disadvantage in many contexts, including education, work, usage of computers and the Internet. Automatic Sign Language Recognition (ASLR) can support the Deaf in many ways, e.g. by enabling the development of systems for Human-Computer Interaction in SL and translation between sign and spoken language. Research in ASLR usually revolves around automatic understanding of manual signs. Recently, the ASLR research community has started to appreciate the importance of non-manuals, since they are related to the lexical meaning of a sign, the syntax and the prosody. Non-manuals include body and head pose, movement of the eyebrows and the eyes, as well as blinks and squints. Arguably, the mouth is one of the most involved parts of the face in non-manuals. Mouth actions related to ASLR can be either mouthings, i.e. visual syllables with the mouth while signing, or non-verbal mouth gestures. Both are very important in ASLR. In this paper, we present the first survey on mouth non-manuals in ASLR. We start by showing why mouth motion is important in SL and the relevant techniques that exist within ASLR. Since limited research has been conducted regarding automatic analysis of mouth motion in the context of ASLR, we proceed by surveying relevant techniques from the areas of automatic mouth expression and visual speech recognition which can be applied to the task. Finally, we conclude by presenting the challenges and potentials of automatic analysis of mouth motion in the context of ASLR
Interactive Editing in French Sign Language Dedicated to Virtual Signers: Requirements and Challenges
Signing avatars are increasingly used as an interface for communication to the deaf community. In recent years, an emerging approach uses captured data to edit and generate sign language (SL) gestures. Thanks to motion editing operations (e.g., concatenation, mixing), this method offers the possibility to compose new utterances, thus facilitating the enrichment of the original corpus, enhancing the natural look of the animation, and promoting the avatar's acceptability. However, designing such an editing system raises many questions. In particular, manipulating existing movements does not guarantee the semantic consistency of the reconstructed actions. A solution is to insert the human operator in a loop for constructing new utterances and to incorporate within the utterance's structure constraints that are derived from linguistic patterns. This article discusses the main requirements for the whole pipeline design of interactive virtual signers, including: (1) the creation of corpora, (2) the needed resources for motion recording, (3) the annotation process as the heart of the SL editing process, (4) the building, indexing, and querying of a motion database, (5) the virtual avatar animation by editing and composing motion segments, and (6) the conception of a dedicated user interface according to users' knowledge and abilities. Each step is illustrated by the authors' recent work and results from the project Sign3D, i.e., an editing system of French Sign Language (LSF) content
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Gestures that accompany speech are an essential part of natural and efficient
embodied human communication. The automatic generation of such co-speech
gestures is a long-standing problem in computer animation and is considered an
enabling technology in film, games, virtual social spaces, and for interaction
with social robots. The problem is made challenging by the idiosyncratic and
non-periodic nature of human co-speech gesture motion, and by the great
diversity of communicative functions that gestures encompass. Gesture
generation has seen surging interest recently, owing to the emergence of more
and larger datasets of human gesture motion, combined with strides in
deep-learning-based generative models that benefit from the growing
availability of data. This review article summarizes co-speech gesture
generation research, with a particular focus on deep generative models. First,
we articulate the theory describing human gesticulation and how it complements
speech. Next, we briefly discuss rule-based and classical statistical gesture
synthesis, before delving into deep learning approaches. We employ the choice
of input modalities as an organizing principle, examining systems that generate
gestures from audio, text, and non-linguistic input. We also chronicle the
evolution of the related training data sets in terms of size, diversity, motion
quality, and collection method. Finally, we identify key research challenges in
gesture generation, including data availability and quality; producing
human-like motion; grounding the gesture in the co-occurring speech, in
interaction with other speakers, and in the environment; performing gesture
evaluation; and integration of gesture synthesis into applications. We
highlight recent approaches to tackling the various key challenges, as well as
the limitations of these approaches, and point toward areas of future
development. Comment: Accepted for EUROGRAPHICS 202
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypotheses assumed and, thus, the constraints imposed on the type of video
that each technique is able to address. Making the hypotheses and
constraints explicit makes the framework particularly useful for selecting a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4
table