1,174 research outputs found

    NEW shared & interconnected ASL resources: SignStream® 3 Software; DAI 2 for web access to linguistically annotated video corpora; and a sign bank

    2017 marked the release of a new version of SignStream® software, designed to facilitate linguistic analysis of ASL video. SignStream® provides an intuitive interface for labeling and time-aligning manual and non-manual components of the signing. Version 3 has many new features. For example, it enables representation of morpho-phonological information, including display of handshapes. An expanding ASL video corpus, annotated through use of SignStream®, is shared publicly on the Web. This corpus (video plus annotations) is Web-accessible—browsable, searchable, and downloadable—thanks to a new, improved version of our Data Access Interface: DAI 2. DAI 2 also offers Web access to a brand new Sign Bank, containing about 10,000 examples of about 3,000 distinct signs, as produced by up to 9 different ASL signers. This Sign Bank is also directly accessible from within SignStream®, thereby boosting the efficiency and consistency of annotation; new items can also be added to the Sign Bank. Soon to be integrated into SignStream® 3 and DAI 2 are visualizations of computer-generated analyses of the video: graphical display of eyebrow height, eye aperture, an
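    The abstract describes time-aligned manual and non-manual annotation tiers over ASL video. Below is a minimal sketch, in Python, of how such tiers might be represented and queried for co-occurrence; the class and field names are hypothetical and do not reflect the actual SignStream® or DAI 2 data formats.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    label: str     # e.g. a gloss ("BOOK") or a non-manual value ("raised")
    start_ms: int  # onset within the video, in milliseconds
    end_ms: int    # offset within the video, in milliseconds

@dataclass
class Tier:
    name: str                                   # e.g. "dominant hand gloss", "eyebrows"
    annotations: List[Annotation] = field(default_factory=list)

def overlapping(tier_a: Tier, tier_b: Tier):
    """Yield pairs of annotations from the two tiers whose time spans overlap."""
    for a in tier_a.annotations:
        for b in tier_b.annotations:
            if a.start_ms < b.end_ms and b.start_ms < a.end_ms:
                yield a, b

# Example query: which glosses co-occur with raised eyebrows?
gloss = Tier("dominant hand gloss", [Annotation("BOOK", 1200, 1650)])
brows = Tier("eyebrows", [Annotation("raised", 1100, 1700)])
for g, nm in overlapping(gloss, brows):
    print(g.label, "co-occurs with", nm.label)
```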

    Read My Lips: Continuous Signer Independent Weakly Supervised Viseme Recognition

    This work presents a framework to recognise signer-independent mouthings in continuous sign language, with no manual annotations needed. Mouthings represent lip movements that correspond to pronunciations of words or parts of them during signing. Research on sign language recognition has focused extensively on the hands as features. But sign language is multi-modal, and a full understanding, particularly with respect to its lexical variety, language idioms and grammatical structures, is not possible without further exploring the remaining information channels. To our knowledge no previous work has explored dedicated viseme recognition in the context of sign language recognition. The approach is trained on over 180,000 unlabelled frames and reaches 47.1% precision on the frame level. Generalisation across individuals and the influence of context-dependent visemes are analysed.
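    The reported 47.1% figure is a frame-level precision. As an illustration of what such a metric involves, here is a hedged sketch of a micro-averaged frame-level precision computation; the handling of a garbage/ignore class is an assumption, not the authors' exact evaluation protocol.

```python
def frame_precision(predicted, reference, ignore_label="garbage"):
    """Micro-averaged frame-level precision: of the frames where a viseme was
    predicted (i.e. not the garbage/ignore class), the fraction whose label
    matches the reference. Illustrative only; the paper's protocol may differ."""
    assert len(predicted) == len(reference)
    scored = [(p, r) for p, r in zip(predicted, reference) if p != ignore_label]
    if not scored:
        return 0.0
    return sum(p == r for p, r in scored) / len(scored)

# Toy example with six frames and an explicit garbage class.
pred = ["A", "A", "garbage", "O", "O", "O"]
ref  = ["A", "O", "A",       "O", "O", "U"]
print(f"frame-level precision: {frame_precision(pred, ref):.3f}")  # 0.600
```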

    Articulatory features for robust visual speech recognition

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 99-105). By Ekaterina Saenko.
    This thesis explores a novel approach to visual speech modeling. Visual speech, or a sequence of images of the speaker's face, is traditionally viewed as a single stream of contiguous units, each corresponding to a phonetic segment. These units are defined heuristically by mapping several visually similar phonemes to one visual phoneme, sometimes referred to as a viseme. However, experimental evidence shows that phonetic models trained from visual data are not synchronous in time with acoustic phonetic models, indicating that visemes may not be the most natural building blocks of visual speech. Instead, we propose to model the visual signal in terms of the underlying articulatory features. This approach is a natural extension of feature-based modeling of acoustic speech, which has been shown to increase robustness of audio-based speech recognition systems. We start by exploring ways of defining visual articulatory features: first in a data-driven manner, using a large, multi-speaker visual speech corpus, and then in a knowledge-driven manner, using the rules of speech production. Based on these studies, we propose a set of articulatory features and describe a computational framework for feature-based visual speech recognition. Multiple feature streams are detected in the input image sequence using Support Vector Machines, and then incorporated in a Dynamic Bayesian Network to obtain the final word hypothesis. Preliminary experiments show that our approach increases viseme classification rates in visually noisy conditions, and improves visual word recognition through feature-based context modeling.
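    The thesis detects multiple articulatory feature streams per frame with Support Vector Machines before combining them in a Dynamic Bayesian Network. The sketch below illustrates only the SVM front end on synthetic data, using scikit-learn; the feature names and settings are assumptions, and the DBN decoding stage is not reproduced.

```python
import numpy as np
from sklearn.svm import SVC

# One SVM per articulatory feature stream (e.g. lip opening, lip rounding),
# each trained on per-frame appearance vectors. Data below are synthetic
# stand-ins for image features; real systems would extract them from video.
rng = np.random.default_rng(0)
n_frames, dim = 200, 32
frames = rng.normal(size=(n_frames, dim))
labels = {
    "lip_opening":  rng.integers(0, 3, n_frames),   # closed / mid / wide
    "lip_rounding": rng.integers(0, 2, n_frames),   # unrounded / rounded
}

detectors = {}
for feat, y in labels.items():
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(frames, y)
    detectors[feat] = clf

# Per-frame posteriors for each articulatory stream, ready to be fed to a
# sequence model (DBN/HMM) that would produce the word hypothesis.
posteriors = {feat: clf.predict_proba(frames) for feat, clf in detectors.items()}
print({feat: p.shape for feat, p in posteriors.items()})
```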

    A new visual speech modelling approach for visual speech recognition

    In this paper we propose a new learning-based representation, referred to as the Visual Speech Unit (VSU), for visual speech recognition (VSR). The Visual Speech Unit extends the standard viseme model currently applied in VSR by including in the representation not only the data associated with the visemes but also the transitory information between consecutive visemes. The developed speech recognition system consists of several computational stages: (a) lip segmentation, (b) construction of Expectation-Maximization Principal Component Analysis (EM-PCA) manifolds from the input video images, (c) registration between the VSU models and the EM-PCA data constructed from the input image sequence, and (d) recognition of the VSUs using a standard Hidden Markov Model (HMM) classification scheme. In this paper we were particularly interested in evaluating the classification accuracy obtained for our new VSU models compared with that attained for standard (MPEG-4) viseme models. The experimental results indicate a 90% recognition rate when the system was applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 52%.
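    Stages (b) through (d) of the pipeline reduce lip-region frames to a low-dimensional manifold and classify the resulting sequences with HMMs. The following sketch, on synthetic data, substitutes ordinary PCA for EM-PCA and uses per-class Gaussian HMMs from the third-party hmmlearn package; it omits the registration stage and is not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA          # stand-in for the paper's EM-PCA
from hmmlearn.hmm import GaussianHMM           # one HMM per hypothetical VSU class

rng = np.random.default_rng(1)

def synthetic_sequences(n_seq, mean, dim=64, length=20):
    """Fake 'lip-region' frame sequences; a real system would extract these from video."""
    return [rng.normal(loc=mean, size=(length, dim)) for _ in range(n_seq)]

train = {"vsu_a": synthetic_sequences(5, 0.0), "vsu_b": synthetic_sequences(5, 0.5)}

# Project every frame onto a low-dimensional manifold.
pca = PCA(n_components=8).fit(np.vstack([s for seqs in train.values() for s in seqs]))

# Train one Gaussian HMM per class on the projected frames.
models = {}
for label, seqs in train.items():
    X = np.vstack([pca.transform(s) for s in seqs])
    lengths = [len(s) for s in seqs]
    models[label] = GaussianHMM(n_components=3, covariance_type="diag",
                                n_iter=20).fit(X, lengths)

# Classify a new sequence by maximum log-likelihood over the class HMMs.
test = pca.transform(synthetic_sequences(1, 0.5)[0])
print(max(models, key=lambda lab: models[lab].score(test)))
```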

    A vision-based approach for human hand tracking and gesture recognition.

    Hand gesture interfaces have become an active topic in human-computer interaction (HCI). The use of hand gestures in a human-computer interface enables human operators to interact with computer environments in a natural and intuitive manner. In particular, bare-hand interpretation techniques free users from the cumbersome devices typically required for communicating with computers, offering ease and naturalness in HCI. Meanwhile, virtual assembly (VA) applies virtual reality (VR) techniques to mechanical assembly. It constructs computer tools that help product engineers plan, evaluate, optimize, and verify the assembly of mechanical systems without the need for physical objects. However, traditional devices such as keyboards and mice are no longer adequate due to their inefficiency in handling three-dimensional (3D) tasks, so special VR devices, such as data gloves, have been mandatory in VA. This thesis proposes a novel gesture-based interface for VA applications. It develops a hybrid approach that combines an appearance-based hand localization technique with a skin-tone filter in support of gesture recognition and hand tracking in 3D space. With this interface, bare hands become a convenient substitute for special VR devices. Experimental results demonstrate the flexibility and robustness that the proposed method brings to HCI.
    Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .L8. Source: Masters Abstracts International, Volume: 43-03, page: 0883. Adviser: Xiaobu Yuan. Thesis (M.Sc.)--University of Windsor (Canada), 2004.
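    A skin-tone filter is one common ingredient of the kind of appearance-based hand localization described above. The sketch below thresholds in the YCrCb colour space with OpenCV and keeps the largest skin-coloured blob; the threshold values and the assumption that the hand is the largest blob are illustrative, not taken from the thesis.

```python
import cv2
import numpy as np

def skin_mask(bgr_image):
    """Return a binary skin mask and the bounding box of the largest skin blob."""
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    # Commonly used YCrCb skin thresholds; illustrative defaults only.
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    # Remove speckle noise, then keep the largest connected component,
    # assuming the hand dominates the skin-coloured regions in the frame.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    if not contours:
        return mask, None
    hand = max(contours, key=cv2.contourArea)
    return mask, cv2.boundingRect(hand)   # (x, y, w, h) of the candidate hand

frame = cv2.imread("frame.png")           # hypothetical input frame
if frame is not None:
    mask, bbox = skin_mask(frame)
    print("candidate hand bounding box:", bbox)
```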

    The future of dialects: Selected papers from Methods in Dialectology XV

    Traditional dialects have been encroached upon by the increasing mobility of their speakers and by the onslaught of national languages in education and mass media. Typically, older dialects are “leveling” to become more like national languages. This is regrettable when the last articulate traces of a culture are lost, but it also promotes a complex dynamics of interaction as speakers shift from dialect to standard and to intermediate compromises between the two in their forms of speech. Varieties of speech thus live on in modern communities, where they still function to mark provenance, but increasingly cultural and social provenance as opposed to pure geography. They arise at times from the need to function throughout the different groups in society, but they also may have roots in immigrants’ speech, and just as certainly from the ineluctable dynamics of groups wishing to express their identity to themselves and to the world. The future of dialects is a selection of the papers presented at Methods in Dialectology XV, held in Groningen, the Netherlands, 11-15 August 2014. While the focus is on methodology, the volume also includes specialized studies on varieties of Catalan, Breton, Croatian, (Belgian) Dutch, English (in the US, the UK and in Japan), German (including Swiss German), Italian (including Tyrolean Italian), Japanese, and Spanish as well as on heritage languages in Canada
