The ATIS sign language corpus
Systems that automatically process sign language rely on appropriate data. We therefore present the ATIS sign language corpus, which is based on the domain of air travel information. It is available in five languages: English, German, Irish Sign Language, German Sign Language and South African Sign Language. The corpus can be used for different tasks, such as statistical machine translation and automatic sign language recognition, and it allows the specific modelling of spatial references in signing space.
Analysis of the visual spatiotemporal properties of American Sign Language.
Careful measurements of the temporal dynamics of speech have provided important insights into phonetic properties of spoken languages, which are important for understanding auditory perception. By contrast, analytic quantification of the visual properties of signed languages is still largely uncharted. Exposure to sign language is a unique experience that could shape and modify low-level visual processing for those who use it regularly (i.e., what we refer to as the Enhanced Exposure Hypothesis). The purpose of the current study was to characterize the visual spatiotemporal properties of American Sign Language (ASL) so that future studies can test the Enhanced Exposure Hypothesis in signers, with the prediction that altered vision should be observed within, more so than outside, the range of properties found in ASL. Using an ultrasonic motion tracking system, we recorded the hand position in 3-dimensional space over time during sign language production of signs, sentences, and narratives. From these data, we calculated several metrics: hand position and eccentricity in space and hand motion speed. For individual signs, we also measured total distance travelled by the dominant hand and total duration of each sign. These metrics were found to fall within a selective range, suggesting that exposure to signs is a specific and unique visual experience, which might alter visual perceptual abilities in signers for visual information within the experienced range, even for non-language stimuli.
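The metrics described above (path length, duration, mean speed, and eccentricity relative to a reference point) can be derived from regularly sampled 3-D position data. The sketch below is illustrative only: the function and field names are invented, and it assumes a fixed sampling interval, which may differ from the study's actual recording pipeline.

```python
import math

def movement_metrics(positions, dt, origin=(0.0, 0.0, 0.0)):
    """Simple spatiotemporal metrics from 3-D hand positions sampled
    every dt seconds. Names and choices here are illustrative, not
    the study's actual analysis pipeline."""
    # Total distance travelled: sum of Euclidean steps between samples.
    dist = sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
    duration = dt * (len(positions) - 1)
    mean_speed = dist / duration if duration > 0 else 0.0
    # "Eccentricity" here: mean distance of the hand from a reference
    # point (e.g. a torso-centred origin).
    ecc = sum(math.dist(p, origin) for p in positions) / len(positions)
    return {"distance": dist, "duration": duration,
            "mean_speed": mean_speed, "eccentricity": ecc}

# Toy trajectory: hand moves 0.1 m along x per 10 ms sample.
track = [(0.1 * i, 0.0, 0.0) for i in range(5)]
print(movement_metrics(track, dt=0.01))
```

Real tracking data would additionally need filtering and segmentation into individual signs before such per-sign metrics are meaningful.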
Building a sign language corpus for use in machine translation
In recent years data-driven methods of machine translation (MT) have overtaken rule-based approaches as the predominant means of automatically translating between languages. A prerequisite for such an approach is a parallel corpus of the source and target languages. Technological developments in sign language (SL) capturing, analysis and processing tools now mean that SL corpora are becoming increasingly available. With transcription and language analysis tools being mainly designed and used for linguistic purposes, we describe the process of creating a multimedia parallel corpus specifically for the purposes of English to Irish Sign Language (ISL) MT. As part of our larger project on localisation, our research is focussed on developing assistive technology for patients with limited English in the domain of healthcare. Focussing on the first point of contact a patient has with a GP's office, the medical secretary, we sought to develop a corpus from the dialogue between the two parties when scheduling an appointment. Throughout the development process we have created one parallel corpus in six different modalities from this initial dialogue. In this paper we discuss the multi-stage process of the development of this parallel corpus as individual and interdependent entities, both for our own MT purposes and their usefulness in the wider MT and SL research domains.
Detection of major ASL sign types in continuous signing for ASL recognition
In American Sign Language (ASL), as in other signed languages, different classes of signs (e.g., lexical signs, fingerspelled signs, and classifier constructions) have different internal structural properties. Continuous sign recognition accuracy can be improved through use of distinct recognition strategies, as well as different training datasets, for each class of signs. For these strategies to be applied, continuous signing video needs to be segmented into parts corresponding to particular classes of signs. In this paper we present a multiple instance learning-based segmentation system that accurately labels 91.27% of the video frames of 500 continuous utterances (including 7 different subjects) from the publicly accessible NCSLGR corpus (Neidle and Vogler, 2012). The system uses novel feature descriptors derived from both motion and shape statistics of the regions of high local motion. The system does not require a hand tracker.
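The headline figure above is a frame-level labeling accuracy: the fraction of video frames whose predicted sign class matches the reference annotation. A minimal sketch of that evaluation is below; the class labels and data are invented for illustration, not drawn from the NCSLGR corpus.

```python
def frame_accuracy(predicted, reference):
    """Fraction of frames whose predicted sign-class label matches
    the reference annotation (labels are illustrative)."""
    assert len(predicted) == len(reference)
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(predicted)

# One toy utterance: LEX = lexical sign, FS = fingerspelled sign.
ref  = ["LEX", "LEX", "FS", "FS", "FS", "LEX"]
pred = ["LEX", "LEX", "FS", "LEX", "FS", "LEX"]
print(frame_accuracy(pred, ref))  # 5 of 6 frames correct
```

Reported over a corpus, the counts would simply be pooled across all utterances before dividing.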
Lost in translation: the problems of using mainstream MT evaluation metrics for sign language translation
In this paper we consider the problems of applying corpus-based techniques to minority languages that are neither politically recognised nor have a formally accepted writing system, namely sign languages. We discuss the adoption of an annotated form of sign language data as a suitable corpus for the development of a data-driven machine translation (MT) system, and deal with issues that arise from its use. Useful software tools that facilitate easy annotation of video data are also discussed. Furthermore, we address the problems of using traditional MT evaluation metrics for sign language translation. Based on the candidate translations produced from our example-based machine translation system, we discuss why standard metrics fall short of providing an accurate evaluation and suggest more suitable evaluation methods.
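One concrete way to see why surface-overlap metrics struggle here: BLEU-style modified n-gram precision punishes legitimate reorderings, and sign language glosses often permit orderings that a single written reference does not capture. The sketch below implements the standard modified n-gram precision for a single reference; the gloss sentences are invented for illustration and are not from the paper's data.

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Modified n-gram precision (single reference), as used inside
    BLEU: each candidate n-gram is credited at most as many times as
    it appears in the reference."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    refs = Counter(tuple(reference[i:i + n])
                   for i in range(len(reference) - n + 1))
    hits = sum(min(c, refs[g]) for g, c in Counter(cand).items())
    return hits / len(cand) if cand else 0.0

ref = "APPOINTMENT MONDAY MORNING POSSIBLE".split()
# An equally valid gloss-order variant of the same sentence:
hyp = "MONDAY MORNING APPOINTMENT POSSIBLE".split()
print(ngram_precision(hyp, ref, 1))  # every unigram matches
print(ngram_precision(hyp, ref, 2))  # the reordering wrecks bigram overlap
```

With every word present, unigram precision is perfect, yet only one of three bigrams survives the reordering, so an acceptable translation is scored as a poor one.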
The development of complex verb constructions in British Sign Language
This study focuses on the mapping of events onto verb-argument structures in British Sign Language (BSL). The development of complex sentences in BSL is described in a group of 30 children, aged 3;2–12;0, using data from comprehension measures and elicited sentence production. The findings support two interpretations: firstly, in the mapping of concepts onto language, children acquiring BSL overgeneralize the use of argument structure related to perspective shifting; secondly, these overgeneralizations are predicted by the typological characteristics of the language and modality. Children under age 6;0, in attempting to produce sentences encoded through a perspective shift, begin by breaking down double-verb constructions (AB verbs) into components, producing only the part of the verb phrase which describes the perspective of the patient. There is also a prolonged period of development of non-manual features, with the full structure not seen in its adult form until after 9;0. The errors in the use of AB verbs and the subsequent protracted development of correct usage are explained in terms of the conceptual-linguistic interface.
NEW shared & interconnected ASL resources: SignStream® 3 Software; DAI 2 for web access to linguistically annotated video corpora; and a sign bank
2017 marked the release of a new version of SignStream® software, designed to facilitate linguistic analysis of ASL video. SignStream® provides an intuitive interface for labeling and time-aligning manual and non-manual components of the signing. Version 3 has many new features. For example, it enables representation of morpho-phonological information, including display of handshapes. An expanding ASL video corpus, annotated through use of SignStream®, is shared publicly on the Web. This corpus (video plus annotations) is Web-accessible (browsable, searchable, and downloadable) thanks to a new, improved version of our Data Access Interface: DAI 2. DAI 2 also offers Web access to a brand new Sign Bank, containing about 10,000 examples of about 3,000 distinct signs, as produced by up to 9 different ASL signers. This Sign Bank is also directly accessible from within SignStream®, thereby boosting the efficiency and consistency of annotation; new items can also be added to the Sign Bank. Soon to be integrated into SignStream® 3 and DAI 2 are visualizations of computer-generated analyses of the video: graphical display of eyebrow height, eye aperture, an