
    Toward a typeface for the transcription of facial actions in sign languages

    Non-manual actions, and more specifically facial actions (FA), can be found in all sign languages (SL). These actions involve all the different facial parts and can have varied and intricate linguistic relations with manual signs. Unlike in vocal languages, FA in SL convey more meaning than simple expressions of feelings and emotions. Yet non-manual parameters are among the least understood formal features in SL studies. Over the past 30 years, some studies have begun to question the meanings, linguistic values, and relations between manual and non-manual signs (Crasborn et al. 2008; Crasborn & Bank 2014); more recently, SL corpora have been analysed, segmented, and transcribed to help study FA (Vogt-Svendsen 2008; Bergman et al. 2008; Sutton-Spence & Day 2008). Moreover, to fill the lack of an annotation system for FA, a few manual annotation systems have integrated facial glyphs, such as HamNoSys (Prillwitz et al. 1989) and SignWriting (Sutton 1995). On the one hand, HamNoSys was developed to describe all existing SLs at a phonetic level; it allows a formal, linear, highly detailed, and searchable description of manual parameters. For non-manual parameters, HamNoSys allows the hands to be replaced by other articulators: non-manual parameters can be written as "eyes" or "mouth" and described with the same symbols developed for hands (Hanke 2004). Unfortunately, only a limited number of manual symbols can be translated into FA, and the annotation system remains incomplete. On the other hand, SignWriting describes SL with iconic symbols placed in a 2D space representing the signer's body. Facial expressions are divided into mouth, eyes, nose, eyebrows, etc., and are drawn in a circular "head", much like emoticons.
SignWriting offers a detailed description of the postures and actions of non-manual parameters, but it is not compatible with the annotation software most commonly used by SL linguists (e.g., ELAN). Typannot, an interdisciplinary project led by linguists, designers, and developers that aims to set up a complete transcription system for SL covering every SL parameter (handshape, location, movement, FA), has developed a different methodology. As mentioned earlier, FA have various linguistic values (mouthings, adverbial mouth gestures, semantically empty, enacting, whole face) and also carry prosodic and emotional meanings. In this regard, they can be more variable and signer-dependent than manual parameters. To offer the best annotation tool, Typannot's approach has been to define facial parameters and all their possible tangible configurations. The goal is to establish the most efficient, simple, yet complete and universal formula to describe all possible FA. This formula is based on a three-dimensional grid: every configuration of a FA can be described by its position on the X, Y, and Z axes. As a result, all FA can be described and encoded using a restricted list of 39 qualifiers. Based on this model, and to help streamline the annotation process, a set of generic glyphs has been developed: each qualifier has its own symbolic "generic" glyph. This methodical decomposition of all facial components enables a precise and accurate transcription of a complex FA using only a few glyphs. This formula and its generic glyphs have gone through a series of tests and revisions. Recently, an 18m20s FA corpus of two deaf signers was recorded using two different cameras: the first, an RGB HQ camera, captures a high-quality image, and the second, an infrared Kinect, captures depth. The latter was linked to Brekel Pro Face 2 (Leong et al. 2015), 3D animation software that enables automatic recognition of FA.
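To make the formula concrete, a facial-action configuration can be modelled as a small set of qualifiers positioned on the three axes. The sketch below is purely illustrative: the abstract does not list the 39 qualifiers, so the component and qualifier names here are invented placeholders, not Typannot's actual inventory.

```python
# Hypothetical sketch of Typannot's FA formula: each facial component is
# described by qualifiers on three spatial axes (X, Y, Z). The component
# and value names below are invented for illustration; the actual system
# uses a closed list of 39 qualifiers.
from dataclasses import dataclass

@dataclass(frozen=True)
class Qualifier:
    component: str  # e.g. "eyebrows", "cheeks", "eyelids"
    axis: str       # "X", "Y" or "Z"
    value: str      # e.g. "raised", "puffed", "closed"

def encode_fa(qualifiers):
    """Encode a facial action as a compact, searchable string of qualifiers."""
    ordered = sorted(qualifiers, key=lambda q: (q.component, q.axis))
    return "+".join(f"{q.component}:{q.axis}:{q.value}" for q in ordered)

# A complex facial action transcribed with only a few qualifiers:
surprise = [
    Qualifier("eyebrows", "Y", "raised"),
    Qualifier("eyelids", "Y", "wide_open"),
    Qualifier("mouth", "Y", "open"),
]
print(encode_fa(surprise))
# eyebrows:Y:raised+eyelids:Y:wide_open+mouth:Y:open
```

Because the qualifier list is closed and the encoding is linear, a transcription of this kind stays machine-searchable, which is the property the abstract emphasizes.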
This corpus has been fully annotated using Typannot's generic glyphs. These annotations have made it possible to validate the general structure of Typannot's FA formula and to identify some minor corrections to be made. For instance, it was shown that the description of the air used to puff out or suck in the cheeks is too restrictive, while the description of the opening and closing of the eyelids is unnecessarily precise. Once these changes are implemented, our next task will be to develop a morphological glyphic system that combines the different generic glyphs used for each facial parameter into one unique morphological glyph. This means that, for any given FA, all the information contained in Typannot's descriptive formula will be contained within one legible glyph. Some early research on this topic has already begun, but further development is needed before a statement on its typographic structure can be made. When this system is completed, it will be released with its own virtual keyboard (Typannot Keyboard, currently in development for handshapes) to facilitate transcription and improve annotation processes.

    Bibliography:
    - Chételat-Pelé, E. (2010). Les Gestes Non Manuels en Langue des Signes Française ; Annotation, analyse et formalisation : application aux mouvements des sourcils et aux clignements des yeux. Université de Provence - Aix-Marseille I.
    - Crasborn, O., Van Der Kooij, E., Waters, D., Woll, B., & Mesch, J. (2008). Frequency distribution and spreading behavior of different types of mouth actions in three sign languages. Sign Language & Linguistics, 11(1), 45–67.
    - Crasborn, O. A., & Bank, R. (2014). An annotation scheme for the linguistic study of mouth actions in sign languages. http://repository.ubn.ru.nl/handle/2066/132960
    - Fontana, S. (2008). Mouth actions as gesture in sign language. Gesture, 8(1), 104–123.
    - Hanke, T. (2004). HamNoSys—Representing sign language data in language resources and language processing contexts. In Workshop on the Representation and Processing of Sign Languages, Fourth International Conference on Language Resources and Evaluation (pp. 1–6).
    - Leong, C. W., Chen, L., Feng, G., Lee, C. M., & Mulholland, M. (2015). Utilizing depth sensors for analyzing multimodal presentations: Hardware, software and toolkits (pp. 547–556). In Proceedings of the 2015 ACM International Conference on Multimodal Interaction. ACM.
    - Prillwitz, S., Leven, R., Zienert, H., Hanke, T., & Henning, J. (1989). Hamburg Notation System for Sign Languages: An Introductory Guide. Signum Press, Hamburg.
    - Sandler, W. (2009). Symbiotic symbolization by hand and mouth in sign language. Semiotica, 2009(174), 241–275. http://doi.org/10.1515/semi.2009.035
    - Sutton, V. (1995). Lessons in Sign Writing: Textbook. DAC, La Jolla (CA).
    - Sutton-Spence, R., & Boyes-Braem, P. (2001). The Hands Are the Head of the Mouth: The Mouth as Articulator in Sign Languages. Signum Press, Hamburg.

    SystÚmes graphématiques et écritures des langues signées

    Over the last few decades, various writing systems for sign languages (SLs) have been developed in a context where vocal languages (VLs) obviously prevail. VLs are characterized by a switch of modality between writing and speaking, while SLs present a brand new situation with the possibility of sharing modalities: in the acts of speaking and writing, SLs use similar modalities of production (upper-limb movement) and of reception (vision). This unprecedented situation of shared visuo-gestural modalities allows writing and speaking to meet in a form of cohabitation, sharing semiotic features. Scripturality may emerge from the formal (highly graphical in many situations) and gestural dimensions that are inherent to oral expression in SLs.
The goal of this article is to establish an approach that links visuo-gestural modalities. On the one hand, this approach is founded in 'orality,' i.e., in gesture; on the other hand, it is rooted in the act of drawing (tracing) as a link between language and writing. Just as SLs are able to assign meaning to movements, the meaning represented in the marks intended to be read can be traced back to the body in action. Considering the scriptural experience in this way resonates with the cognitive theory of enaction (Varela et al., 1991) and, more broadly, with the hypotheses of embodied cognition. First, we present the three main notation systems currently in use, almost exclusively among researchers in SL linguistics. We discuss their specific characteristics: the visual principles on which they are built with regard to their legibility, and their ability to be written and used (focusing on the principle of linearity found in other writing systems). In the second part, we explore in detail some issues of writing SLs: how can the modalities of the act of writing be articulated with the semiotic modalities of the language itself? What analogies are available to build a glyphic system? Can the act of tracing and the trace itself foster new semiotic relations in the writing of SLs? In the third part, we look at some theoretical aspects of existing writing systems and put them into the perspective of writing SLs. Finally, this matter is dealt with through the presentation of Typannot, a notation system on which we are currently working, focusing on the graphematic and typographic principles of this new system. Several conceptual levels are envisioned in order to justify the coupling between technical aspects and the writing/tracing activity, with the goal of obtaining a system aimed at readability, modularity, writability and searchability. These criteria are an attempt to translate concepts from the writing systems of VLs to issues specific to the visual and gestural modalities of SLs.

    Reconnaissance de parole beatboxée à l'aide d'un système HMM-GMM inspiré de la reconnaissance automatique de la parole

    Human beatboxing is a vocal art that uses the speech organs to produce percussive sounds and imitate musical instruments. Classifying beatbox sounds is currently a challenge. We propose a beatbox sound recognition system inspired by automatic speech recognition (ASR). We rely on the Kaldi toolkit, which is widely used in ASR. Our corpus consists of isolated sounds produced by two beatboxers and comprises 80 different sounds. We focused on decoding with monophone HMM-GMM acoustic models. The transcription relies on a writing system specific to beatboxers, called Vocal Grammatics (VG), which is based on the concepts of articulatory phonetics.

    Human Beatbox Sound Recognition using an Automatic Speech Recognition Toolkit

    Human beatboxing is a vocal art making use of the speech organs to produce vocal drum sounds and imitate musical instruments. Beatbox sound classification is a current challenge with applications in automatic database annotation and music-information retrieval. In this study, a large-vocabulary human-beatbox sound recognition system was developed by adapting Kaldi, a widely used automatic speech recognition toolkit. The corpus consisted of eighty boxemes, recorded repeatedly by two beatboxers. The sounds were annotated and transcribed for the system by means of a beatbox-specific morphographic writing system (Vocal Grammatics). The robustness of the recognition system to recording conditions was assessed on recordings from six different microphones and settings. Decoding was performed with monophone acoustic models trained with a classical HMM-GMM model. A change of acoustic features (MFCC, PLP, Fbank) and variations of several parameters of the beatbox recognition system were tested: i) the number of HMM states, ii) the number of MFCCs, iii) the presence or absence of a pause boxeme in the right and left contexts in the lexicon, and iv) the silence probability. Our best model was obtained with a pause added in the left and right contexts of each boxeme in the lexicon, a 0.8 silence probability, 22 MFCCs, and three-state HMMs. The boxeme error rate in this configuration was lowered to 13.65%, i.e., 8.6 out of 10 boxemes were correctly recognized. The recording settings did not greatly affect system performance, apart from recordings made with the closed-cup technique.
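The boxeme error rate reported above is the beatbox analogue of the word error rate in ASR: an edit-distance measure between the reference and the decoded boxeme sequence. A minimal sketch (the boxeme labels are invented placeholders, not Vocal Grammatics symbols):

```python
# Boxeme error rate, computed like the word error rate in ASR:
# (substitutions + deletions + insertions) / length of the reference,
# via Levenshtein distance over boxeme sequences.
def boxeme_error_rate(reference, hypothesis):
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = edit distance between reference[:i] and hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[n][m] / n

# Invented boxeme labels, for illustration only:
ref = ["kick", "hihat", "snare", "kick"]
hyp = ["kick", "hihat", "rimshot", "kick"]   # one substitution
print(boxeme_error_rate(ref, hyp))  # 0.25
```

Like the word error rate, this measure can exceed 1.0 when the decoder inserts many spurious boxemes, which is why it complements, rather than equals, a simple accuracy figure.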

    Coding Movement in Sign Languages: the Typannot Approach

    Typannot is an innovative transcription system (TranSys) for sign languages (SLs), based on robust graphematic and coherent typographic formulas. It is characterized by readability, writability, searchability, genericity and modularity. Typannot can be used to record handshapes, mouth actions, facial expressions, initial locations (LOCini) and movements of the upper limbs (MOV). For LOCini and MOV, Typannot uses intrinsic frames of reference (iFoR) to describe the position of each segment (arm, forearm, hand) in terms of degrees of freedom (DoF). It assumes that the motion is subdivided into a complex moment of initial preparation, leading to the stabilization of a LOCini, and a subsequent phase of MOV deployment based on simple motor patterns. The goal of Typannot is not only to create a new TranSys, but also to provide an instrument to advance knowledge about SLs. Observation of SLs makes it possible to formulate various hypotheses, among which: 1) MOV follows a simple motor scheme that aims at minimizing motor control during MOV; 2) proximal→distal flows of MOV are predominant in SLs. Only the use of a TranSys based on iFoR and the description of the DoF makes it possible to explore the data in order to test these hypotheses. CCS CONCEPTS: ‱ Human-centered computing~Interaction devices ‱ Human-centered computing~Gestural input ‱ Human-centered computing~Accessibility ‱ Social and professional topics~People with disabilities
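To make the iFoR/DoF idea concrete: each upper-limb segment can be described by joint-angle values (its degrees of freedom) in a frame of reference intrinsic to that segment, and a LOCini is simply the stabilized set of those values before MOV deployment. The segment inventory matches the abstract (arm, forearm, hand), but the DoF names and angle values below are assumptions for illustration only.

```python
# Hypothetical sketch of an intrinsic-frame-of-reference (iFoR) description:
# each segment of the upper limb is recorded as the values of its own
# degrees of freedom (DoF), so an initial location (LOCini) is the set of
# stabilized DoF values before movement (MOV) deployment.
from dataclasses import dataclass, field

@dataclass
class Segment:
    name: str                                # "arm", "forearm" or "hand"
    dof: dict = field(default_factory=dict)  # DoF name -> angle in degrees

# LOCini: stabilized posture before the movement phase (invented values).
loc_ini = [
    Segment("arm", {"flexion": 45, "abduction": 10, "rotation": 0}),
    Segment("forearm", {"flexion": 90, "pronation": 20}),
    Segment("hand", {"flexion": 0, "deviation": 5}),
]

def mov_flow(segments, changed):
    """Return the segments involved in a MOV in proximal-to-distal order,
    to help inspect whether the flow is proximal->distal (hypothesis 2)."""
    order = {"arm": 0, "forearm": 1, "hand": 2}
    return sorted((s.name for s in segments if s.name in changed),
                  key=order.get)

print(mov_flow(loc_ini, {"hand", "arm"}))  # ['arm', 'hand']
```

Because each segment carries its own DoF values rather than absolute coordinates, queries such as "all MOV with a proximal→distal flow" reduce to comparisons over this structure, which is what makes the hypotheses above testable on transcribed data.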

    La codifica del movimento nelle lingue dei segni: l'approccio di Typannot

    See publication #018 (HAL-2342251). Typannot is an innovative transcription system (TranSys) for sign languages (SLs), based on robust graphematic and coherent typographic formulas. It is characterized by readability, writability, searchability, genericity and modularity. Typannot can be used to record handshapes, mouth actions, facial expressions, initial locations (LOCini) and movements of the upper limbs (MOV). For LOCini and MOV, Typannot uses intrinsic frames of reference (iFoR) to describe the position of each segment (arm, forearm, hand) in terms of degrees of freedom (DoF). It assumes that the motion is subdivided into a complex moment of initial preparation, leading to the stabilization of a LOCini, and a subsequent phase of MOV deployment based on simple motor patterns. The goal of Typannot is not only to create a new TranSys, but also to provide an instrument to advance knowledge about SLs. Observation of SLs makes it possible to formulate various hypotheses, among which: i) MOV follows a simple motor scheme that aims at minimizing motor control during MOV; ii) proximal→distal flows of MOV are predominant in SLs. Only the use of a TranSys based on an iFoR and the description of the DoF makes it possible to explore the data in order to test these hypotheses.