
    Signing for the deaf using virtual humans

    Research at Televirtual (Norwich) and the University of East Anglia, funded predominantly by the Independent Television Commission and more recently also by the UK Post Office, has investigated the feasibility of using virtual signing as a communication medium for presenting information to the deaf. We describe and demonstrate the underlying virtual-signer technology, and discuss the language processing techniques and discourse models that have been investigated for communicating information in a transaction application in Post Offices, and for presenting more general textual material such as subtitles accompanying television programmes.

    Multimodal Dialogue Management for Multiparty Interaction with Infants

    We present dialogue management routines for a system to engage in multiparty agent-infant interaction. The ultimate purpose of this research is to help infants learn a visual sign language by engaging them in naturalistic and socially contingent conversations, initiated by an artificial agent, during an early-life critical period for language development (ages 6 to 12 months). As a first step, we focus on creating and maintaining agent-infant engagement that elicits appropriate and socially contingent responses from the baby. Our system includes two agents, a physical robot and an animated virtual human. The system's multimodal perception includes an eye-tracker (measuring attention) and a thermal infrared imaging camera (measuring patterns of emotional arousal). A dialogue policy is presented that selects individual actions and planned multiparty sequences based on perceptual inputs about the baby's changing internal states of emotional engagement. The present version of the system was evaluated in interaction with 8 babies. All babies demonstrated spontaneous and sustained engagement with the agents for several minutes, with patterns of conversationally relevant and socially contingent behaviors. We further performed a detailed case-study analysis with annotation of all agent and baby behaviors. Results show that the baby's behaviors were generally relevant to agent conversations and contained direct evidence of socially contingent responses by the baby to specific linguistic samples produced by the avatar. This work demonstrates the potential for language learning from agents in very young babies and has especially broad implications regarding the use of artificial agents with babies who have minimal language exposure in early life.
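
    A minimal sketch of how such a perception-driven dialogue policy could be structured follows; the state fields, thresholds, and action names are illustrative assumptions, not the policy the paper implements.

    # Hypothetical engagement-driven action selection: gaze comes from the
    # eye-tracker, arousal from the thermal infrared camera (normalized here).
    from dataclasses import dataclass

    @dataclass
    class PerceptualState:
        gaze_on_agent: bool   # is the infant currently attending to an agent?
        arousal: float        # emotional arousal estimate in [0, 1]

    def select_action(state: PerceptualState) -> str:
        """Map the infant's inferred engagement to the next agent action."""
        if not state.gaze_on_agent:
            return "robot_wave"            # bid for attention with the physical robot
        if state.arousal > 0.8:
            return "avatar_soothe"         # over-aroused: calm, slow signing
        if state.arousal < 0.2:
            return "avatar_sign_rhyme"     # under-engaged: lively signing by the avatar
        return "avatar_continue_turn"      # engaged and calm: continue the conversation

    The full system additionally plans multiparty sequences across the robot and the virtual human; this sketch covers only single-action selection.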

    Assessing a 3D digital Prototype for Teaching the Brazilian Sign Language Alphabet: an Alternative for Non-programming Designers

    This study aims to analyse users’ perceptions of a prototype of 3D digital artifacts for teaching the fingerspelling alphabet of Brazilian Sign Language (LIBRAS). For this purpose, a high-fidelity prototype was developed with a non-programming method, and a usability test was conducted using a structured questionnaire with 31 participants, including Deaf and hearing people. Most users (96.7%) rated the learning experience with the tool as positive, with 67.7% rating the experience as "good", 12.9% as "very good", and 16.1% as "excellent". Comparing the evaluations of Deaf and hearing people showed that both groups mostly rated it positively. However, most hearing people rated it "good", while the majority of the Deaf rated it "excellent" (29%) or "outstanding" (14%), compared with 13% and 12%, respectively, among the hearing. In summary, considering the variables presented, the experience was rated well and met no substantial obstacles or resistance.

    A Representation of Selected Nonmanual Signals in American Sign Language

    Computer-generated three-dimensional animation holds great promise for synthesizing utterances in American Sign Language (ASL) that are not only grammatical but also believable by members of the Deaf community. Animation poses several challenges stemming from the massive amounts of data necessary to specify the movement of three-dimensional geometry, and no current system facilitates the synthesis of nonmanual signals. However, the linguistics of ASL can aid in surmounting this challenge by providing structure and rules for organizing the data. This work presents a first method for representing ASL linguistic and extralinguistic processes that involve the face. Any such representation must be capable of expressing the subtle nuances of ASL. Further, it must be able to represent co-occurrences, because many ASL signs require that two or more nonmanual signals be used simultaneously; in fact, multiple nonmanual signals can co-occur on the same facial feature. Additionally, such a system should allow both binary and incremental nonmanual signals, to display the full range of adjectival and adverbial modifiers. Validating such a representation requires both affirming that nonmanual signals are indeed necessary in the animation of ASL and evaluating the effectiveness of the new representation in synthesizing nonmanual signals. In this study, members of the Deaf community viewed animations created with the new representation and answered questions concerning the influence of selected nonmanual signals on the perceived meaning of the synthesized utterances. Results reveal not only that the representation is capable of effectively portraying nonmanual signals, but also that it can be used to combine various nonmanual signals in the synthesis of complete ASL sentences. In a study with Deaf users, participants viewing synthesized animations consistently identified the intended nonmanual signals correctly.
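
    The co-occurrence and intensity requirements above can be made concrete with a small data-structure sketch; the field and signal names are hypothetical and are not the representation developed in this work.

    # Each nonmanual signal occupies a time span on a facial feature; several
    # spans may overlap, even on the same feature, and intensity is 1.0 for
    # binary signals or graded for adjectival/adverbial modifiers.
    from dataclasses import dataclass

    @dataclass
    class NonmanualSignal:
        feature: str      # e.g. "brows", "head", "mouth"
        signal: str       # e.g. "brow_raise" (yes/no question), "head_shake" (negation)
        onset: float      # start time in seconds
        offset: float     # end time in seconds
        intensity: float  # 1.0 = binary; values in (0, 1) = incremental modifier strength

    def cooccurring(track: list[NonmanualSignal], t: float) -> list[NonmanualSignal]:
        """Return every signal active at time t, however many overlap."""
        return [s for s in track if s.onset <= t < s.offset]

    # A yes/no question signed with a simultaneous mild negation:
    utterance = [
        NonmanualSignal("brows", "brow_raise", 0.0, 1.2, 1.0),
        NonmanualSignal("head", "head_shake", 0.3, 1.0, 0.6),
    ]
    print([s.signal for s in cooccurring(utterance, 0.5)])  # both signals are active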

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    This PhD thesis studies computer vision methods for the recognition of unconstrained, natural gestures in the context of sign language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. A continuous SL utterance consists of a sequence of signs performed by the hands, accompanied by facial expressions and upper-body movements, which convey information in parallel within the discourse. Even though standard signs are defined in dictionaries, their realization shows huge context-dependent variability, and signs are often linked by movement epenthesis, meaningless transitional gestures between signs. This extreme variability and the co-articulation effect are a major problem in automatic SL processing. Numerous annotated SL videos are therefore necessary in order to study this language and to apply machine learning methods such as statistical machine translation. Annotation of SL videos is generally performed manually by linguists or SL experts, which is error-prone, unreproducible and extremely time-consuming, and the quality of the annotations depends on the annotator's knowledge of SL. Combining the annotator's expertise with automatic processing facilitates the task and brings gains in both time and robustness. The goal of our research is to study image processing methods that assist the annotation of video corpora: body-part tracking, hand segmentation, temporal segmentation and gloss recognition. In this thesis we address the problem of gloss annotation of SL video corpora. We first seek to detect the boundaries marking the beginning and end of each sign. This annotation method requires several low-level steps to segment the signs and to extract motion and hand-shape features. We first propose a particle-filter-based method for tracking body parts that is robust to occlusions. Then, a hand segmentation algorithm is developed to extract the hand region even when the hand is in front of the face. Motion features are used to produce an initial temporal segmentation of the signs, which is subsequently improved using hand-shape features; these make it possible to remove segmentation boundaries detected in the middle of a sign. Once the signs are segmented, visual features are extracted to recognize them as glosses using lexical descriptions of signs based on phonological models. We evaluated our algorithms on international corpora in order to show their advantages and limitations. The evaluation demonstrates the robustness of our methods with respect to high dynamics and the large number of occlusions between body parts. The resulting annotation is independent of the annotator and represents a substantial gain in annotation consistency.
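
    As a hedged illustration of the two-stage temporal segmentation outlined above, the sketch below places candidate sign boundaries at hand-motion minima and then uses hand-shape change to discard boundaries that fall in the middle of a sign; the thresholds and window size are invented, and per-frame motion and shape features are assumed to be already extracted.

    import numpy as np

    def segment_signs(motion: np.ndarray, shape: np.ndarray,
                      motion_thresh: float = 0.1,
                      shape_thresh: float = 0.3, win: int = 5) -> list[int]:
        """motion[t]: hand speed per frame; shape[t]: hand-shape descriptor per frame."""
        # Stage 1: local minima of hand speed are boundary candidates, since
        # movement epenthesis between signs begins and ends at low speed.
        candidates = [t for t in range(1, len(motion) - 1)
                      if motion[t] < motion_thresh
                      and motion[t] <= motion[t - 1]
                      and motion[t] <= motion[t + 1]]
        # Stage 2: keep a candidate only if the hand shape changes across it;
        # a stable shape suggests a pause inside a single sign, not a boundary.
        return [t for t in candidates
                if np.linalg.norm(shape[min(t + win, len(shape) - 1)]
                                  - shape[max(t - win, 0)]) > shape_thresh]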

    Loyalty cards and the problem of CAPTCHA: 2nd tier security and usability issues for senior citizens

    Information security often works against access and usability in communities of older citizens. Whilst security features are required to prevent the disclosure of information, some security tools have a deleterious effect upon users, resulting in insecure practices. Security becomes unfit for purpose where users prefer to abandon applications and online benefits in favour of non-digital authentication and verification requirements. For some, reading letters and symbols from a distorted image is a decidedly more difficult task than for others, so the level of security that CAPTCHA tests provide is not consistent from person to person. This paper discusses the changing paradigm regarding second-tier applications, where non-essential benefits are forgone in order to avoid the frustration, uncertainty and humiliation of repeated failed attempts to access online software by means of CAPTCHA.

    Making academia more accessible

    Academia can be a challenging place to work, and academics who have a disability, neurodiversity or chronic illness are further disadvantaged, as non-stereotypical ways of working are not necessarily supported or catered for. The remit of this paper is to provide practical ideas and recommendations for addressing accessibility issues in events and conferences as a first step towards improving existing working conditions. We start by providing a brief overview of and background to the issues of ableism, disabilities, chronic illnesses and neurodiversities in academia. We then offer a detailed description of the organisational and developmental strategies relating to the Ableism in Academia conference to demonstrate practically how accessibility can be achieved. Despite the vast literature available on theorisations of reasonable adjustments, and some individual handbooks on conference accessibility, we noted the absence of a systematic write-up of a case study demonstrating the thought processes required to organise a fully accessible and inclusive event. This paper provides an almost step-by-step rationale and rundown of the decisions that had to be taken in order to facilitate an accessible event. After a brief consideration of the challenges we encountered along the way, we share personal reflections on the event and future developments.