TR-2015001: A Survey and Critique of Facial Expression Synthesis in Sign Language Animation
Sign language animations can improve the accessibility of information and services for people who are deaf and have low literacy skills in spoken/written languages. Because a sign language differs from the surrounding spoken/written language in word order, syntax, and lexicon, many deaf people find it difficult to comprehend text on a computer screen or captions on a television. Animated characters performing sign language in a comprehensible way could make this information accessible. Facial expressions and other non-manual components play an important role in the naturalness and understandability of these animations, and their coordination with the manual signs is crucial for the interpretation of the signed message. Software to advance the support of facial expressions in the generation of sign language animation could make this technology more acceptable to deaf people.
In this survey, we discuss the challenges in facial expression synthesis, and we compare and critique the state-of-the-art projects on generating facial expressions in sign language animations. We begin with an overview of the linguistics of facial expressions, sign language animation technologies, and background on animating facial expressions, followed by a discussion of the search strategy and criteria used to select the five projects that are the primary focus of this survey. We then introduce the work from the five projects under consideration. Their contributions are compared in terms of the specific sign language supported, the categories of facial expressions investigated, the focus of the animation generation, the use of annotated corpora, the input data or hypotheses for their approach, and other factors. Strengths and drawbacks of the individual projects are identified along these dimensions. The survey concludes with our current research focus in this area and future prospects.
Study of a Greek 8-dot Braille System (Μελέτη Ελληνικού Οκτάστιγμου Συστήματος Braille)
This thesis studies the 8-dot Braille system for the needs of the Greek language and proposes an 8-dot Braille code that covers monotonic and polytonic orthography, including numbers and a small set of punctuation symbols. The proposed code was developed through a transition methodology from the existing 6-dot Braille representation of the supported symbols to the 8-dot one, based on well-defined design principles: achieving an abridged representation, retaining connectivity with the 6-dot representation, maintaining similarity to the transition rules applied in other languages, removing ambiguities, and ensuring consistency and predictability, while taking future extensions into consideration. Development of the code was followed by validation and a performance study based on two metrics: the distance from the 6-dot code and the achieved character savings (condensability). Additionally, a methodology is proposed for studying the readability of the proposed code with real Braille users and readers.
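The thesis's exact distance metric is not reproduced here; as a purely illustrative sketch, one could measure the distance between a symbol's 6-dot and 8-dot cells as the number of differing dots, with cells modeled as sets of dot numbers (1–6 in the 6-dot system, 1–8 in the 8-dot system). The specific cells below are hypothetical examples, not the proposed Greek code:

```python
def dot_distance(six_dot_cell, eight_dot_cell):
    """Count the dots that differ between a 6-dot cell and its 8-dot
    counterpart (symmetric difference of the two dot sets)."""
    return len(set(six_dot_cell) ^ set(eight_dot_cell))

# Hypothetical mapping: a 6-dot cell {1, 2} carried over unchanged to
# 8-dot with dot 7 added has distance 1 from its 6-dot form.
d = dot_distance({1, 2}, {1, 2, 7})
```

A code that "retains connectivity with the 6-dot representation" would, under this sketch, keep the average per-symbol distance small.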
Eyetracking Metrics Related to Subjective Assessments of ASL Animations
Analysis of eyetracking data can serve as an alternative method of evaluation when assessing the quality of computer-synthesized animations of American Sign Language (ASL), a technology that can make information accessible to people who are deaf or hard-of-hearing and who may have lower levels of written-language literacy. In this work, we build descriptive models of the subjective scores that native signers assign to ASL animations based on eye-tracking metrics, and we evaluate the efficacy of these models.
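The abstract does not specify the model form; as a minimal sketch, a descriptive model could be a least-squares line predicting subjective score from a single eyetracking feature (e.g., a hypothetical "proportion of time fixating the face" metric — the feature name and data here are illustrative, not from the study):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Hypothetical data: face-fixation proportion vs. subjective score (1-10)
face_fixation = [0.0, 0.5, 1.0]
scores = [1.0, 3.0, 5.0]
slope, intercept = fit_line(face_fixation, scores)
```

In practice such a model would be fit on many participants' gaze metrics and validated against held-out subjective ratings.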
Best practices for conducting evaluations of sign language animation
Automatic synthesis of linguistically accurate and natural-looking American Sign Language (ASL) animations would make it easier to add ASL content to websites and media, thereby increasing information accessibility for many people who are deaf. Based on several years of studies, we identify best practices for conducting experimental evaluations of sign language animations with feedback from deaf and hard-of-hearing users. We first describe our techniques for identifying and screening participants and for controlling the experimental environment. We then discuss rigorous methodological research on how experiment design affects study outcomes when evaluating sign language animations. Our discussion focuses on stimuli design, the effect of using videos as an upper baseline, the use of videos for presenting comprehension questions, and eye-tracking as an alternative to recording question responses.
Data-Driven Synthesis and Evaluation of Syntactic Facial Expressions in American Sign Language Animation
Technology to automatically synthesize linguistically accurate and natural-looking animations of American Sign Language (ASL) would make it easier to add ASL content to websites and media, thereby increasing information accessibility for many people who are deaf and have low English literacy skills. State-of-the-art sign language animation tools focus mostly on the accuracy of manual signs rather than on facial expressions. We are investigating the synthesis of syntactic ASL facial expressions, which are grammatically required and essential to the meaning of sentences. In this thesis, we propose to: (1) explore the methodological aspects of evaluating sign language animations with facial expressions, and (2) examine data-driven modeling of facial expressions from multiple recordings of ASL signers. In Part I of this thesis, we propose to conduct rigorous methodological research on how experiment design affects study outcomes when evaluating sign language animations with facial expressions. Our research questions involve: (i) stimuli design, (ii) the effect of videos as an upper baseline and for presenting comprehension questions, and (iii) eye-tracking as an alternative to recording question responses from participants. In Part II of this thesis, we propose to use generative models to automatically uncover the underlying trace of ASL syntactic facial expressions from multiple recordings of ASL signers, and to apply these facial expressions to manual signs in novel animated sentences. We hypothesize that an annotated sign language corpus, including both the manual and non-manual signs, can be used to model and generate linguistically meaningful facial expressions, if it is combined with facial feature extraction techniques, statistical machine learning, and an animation platform with detailed facial parameterization.
To further improve sign language animation technology, we will assess the quality of the animation generated by our approach with ASL signers through the rigorous evaluation methodologies described in Part I.
Selecting Exemplar Recordings of American Sign Language Non-Manual Expressions for Animation Synthesis Based on Manual Sign Timing
Animations of sign language can increase the accessibility of information for people who are deaf or hard of hearing (DHH), but prior work has demonstrated that accurate non-manual expressions (NMEs), consisting of face and head movements, are necessary to produce linguistically accurate animations that are easy to understand. When synthesizing an animation, given a sequence of signs performed on the hands (and their timing), we must select an NME performance. Given a corpus of facial motion-capture recordings of ASL sentences with annotation of the timing of signs in each recording, we investigate methods (based on word count and on delexicalized sign timing) for selecting the best NME recording to use as a basis for synthesizing a novel animation. By comparing recordings selected using these methods to a gold-standard recording, we identify the top-performing exemplar-selection method for several NME categories.
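The paper's exact scoring functions are not given in the abstract; a minimal sketch of timing-based selection, assuming each candidate recording carries a list of annotated sign-boundary times (the `bounds` field and sample values below are hypothetical), could compare candidates to the target sentence's sign timing and pick the closest match:

```python
def timing_mismatch(target_bounds, candidate_bounds):
    """Sum of absolute differences between corresponding sign-boundary
    times; candidates with a different sign count are incompatible."""
    if len(target_bounds) != len(candidate_bounds):
        return float("inf")
    return sum(abs(t - c) for t, c in zip(target_bounds, candidate_bounds))

def select_exemplar(target_bounds, candidates):
    """Pick the recording whose delexicalized sign timing best matches
    the timing of the novel sentence being synthesized."""
    return min(candidates,
               key=lambda c: timing_mismatch(target_bounds, c["bounds"]))

# Hypothetical corpus entries: sign boundaries in seconds
corpus = [
    {"id": "rec_a", "bounds": [0.0, 0.4, 1.2]},
    {"id": "rec_b", "bounds": [0.0, 0.5, 1.0]},
]
best = select_exemplar([0.0, 0.5, 1.0], corpus)
```

A word-count-based variant would instead filter candidates by `len(c["bounds"])` alone, without comparing the boundary times.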
ViScene: A collaborative authoring tool for scene descriptions in videos
Ministry of Education, Singapore under its Academic Research Funding Tier; National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative
Data Representativeness in Accessibility Datasets: A Meta-Analysis
As data-driven systems are increasingly deployed at scale, ethical concerns have arisen around unfair and discriminatory outcomes for historically marginalized groups that are underrepresented in training data. In response, work on AI fairness and inclusion has called for datasets that are representative of various demographic groups. In this paper, we contribute an analysis of the representativeness of age, gender, and race & ethnicity in accessibility datasets — datasets sourced from people with disabilities and older adults — which can potentially play an important role in mitigating bias in inclusive AI-infused applications. We examine the current state of representation within these datasets by reviewing publicly available information on 190 datasets, which we call accessibility datasets. We find that accessibility datasets represent diverse ages but have gender and race representation gaps. Additionally, we investigate how the sensitive and complex nature of demographic variables makes classification difficult and inconsistent (e.g., gender, race & ethnicity), with the source of labeling often unknown. By reflecting on the current challenges and opportunities for representation of disabled data contributors, we hope our effort expands the space of possibility for greater inclusion of marginalized communities in AI-infused systems.
Preprint, The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS 2022), 15 pages.
Centroid-Based Exemplar Selection of ASL Non-Manual Expressions using Multidimensional Dynamic Time Warping and MPEG4 Features
We investigate a method for selecting recordings of human face and head movements from a sign language corpus to serve as a basis for generating animations of novel sentences of American Sign Language (ASL). Drawing from a collection of recordings that have been categorized into various types of non-manual expressions (NMEs), we define a method for selecting an exemplar recording of a given type using a centroid-based selection procedure, with multivariate dynamic time warping (DTW) as the distance function. Through intra- and inter-signer methods of evaluation, we demonstrate the efficacy of this technique, and we note that the DTW visualizations generated in this study may be useful to linguistic researchers collecting and analyzing sign language corpora.
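The core of centroid-based exemplar selection with multivariate DTW can be sketched as follows — a simplified illustration, not the paper's implementation (which operates on MPEG-4 facial features): each recording is a sequence of feature vectors, pairwise distances are computed with DTW using a Euclidean per-frame cost, and the exemplar is the recording minimizing the total distance to all others in its category.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Multivariate DTW distance between two sequences of feature vectors,
    using Euclidean distance as the per-frame cost."""
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

def centroid_exemplar(recordings):
    """Index of the recording with minimal total DTW distance to all
    recordings in the category (the medoid/centroid exemplar)."""
    totals = [sum(dtw_distance(rec, other) for other in recordings)
              for rec in recordings]
    return min(range(len(recordings)), key=totals.__getitem__)
```

With toy 2-D "facial feature" sequences, the recording nearest to the center of the collection is selected as the exemplar; real usage would substitute per-frame MPEG-4 facial animation parameters.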