
    Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. First, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified, and each proposed application was rated on a number of criteria to produce an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The first three categories were rated as the most promising for flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot-selectable, and many were rated as potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those of a recent NASA study of a 1995 transport concept.

    The left superior temporal gyrus is a shared substrate for auditory short-term memory and speech comprehension: evidence from 210 patients with stroke

    Competing theories of short-term memory function make specific predictions about the functional anatomy of auditory short-term memory and its role in language comprehension. We analysed high-resolution structural magnetic resonance images from 210 stroke patients and employed a novel voxel-based analysis to test the relationship between auditory short-term memory and speech comprehension. Using digit span as an index of auditory short-term memory capacity, we found that the structural integrity of a posterior region of the superior temporal gyrus and sulcus predicted auditory short-term memory capacity, even when performance on a range of other measures was factored out. We show that the integrity of this region also predicts the ability to comprehend spoken sentences. Our results therefore support cognitive models that posit a shared substrate between auditory short-term memory capacity and speech comprehension ability. The method applied here will be particularly useful for modelling structure–function relationships within other complex cognitive domains.

    A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition

    We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.

    Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models

    Common noise compensation techniques use vector Taylor series (VTS) to approximate the mismatch function. Recent work shows that the approximation accuracy may be improved by sampling. One such sampling technique is the unscented transform (UT), which draws samples deterministically from the clean speech and noise models to derive the noise-corrupted speech parameters. This paper applies UT to noise compensation of the subspace Gaussian mixture model (SGMM). Since UT requires a relatively small number of samples for accurate estimation, it has significantly lower computational cost than random sampling techniques. However, the number of surface Gaussians in an SGMM is typically very large, making the direct application of UT to compensate individual Gaussian components computationally impractical. In this paper, we avoid the computational burden by employing UT in the framework of joint uncertainty decoding (JUD), which groups all the Gaussian components into a small number of classes that share compensation parameters by class. We evaluate the JUD-UT technique for an SGMM system using the Aurora 4 corpus. Experimental results indicate that UT can lead to increased accuracy compared to the VTS approximation if the JUD phase factor is untuned, and to similar accuracy if the phase factor is tuned empirically.
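
    The deterministic sampling idea behind the unscented transform can be illustrated with a minimal one-dimensional sketch. It is not the paper's JUD-UT implementation: it assumes independent Gaussian clean speech and noise in a log-spectral domain, and it assumes the simplified mismatch function y = log(exp(x) + exp(n)) with no channel or phase term. The function names and the `kappa` parameter are hypothetical, chosen for illustration only.

```python
import math


def mismatch(x, n):
    """Assumed log-domain mismatch y = log(exp(x) + exp(n)), computed stably."""
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def unscented_transform(mu_x, var_x, mu_n, var_n, kappa=1.0):
    """Estimate mean and variance of y = mismatch(x, n) from 2d + 1 sigma points.

    x ~ N(mu_x, var_x) and n ~ N(mu_n, var_n) are assumed independent, so the
    square root of the joint covariance is diagonal and no Cholesky
    factorisation is needed.
    """
    d = 2  # dimension of the joint (speech, noise) variable
    scale = math.sqrt(d + kappa)

    # Centre point plus one symmetric pair of points per dimension.
    points = [(mu_x, mu_n)]
    weights = [kappa / (d + kappa)]
    w = 1.0 / (2.0 * (d + kappa))
    for sign in (1.0, -1.0):
        points.append((mu_x + sign * scale * math.sqrt(var_x), mu_n))
        points.append((mu_x, mu_n + sign * scale * math.sqrt(var_n)))
        weights.extend([w, w])

    # Push every sigma point through the nonlinearity, then take weighted moments.
    ys = [mismatch(x, n) for x, n in points]
    mu_y = sum(wi * yi for wi, yi in zip(weights, ys))
    var_y = sum(wi * (yi - mu_y) ** 2 for wi, yi in zip(weights, ys))
    return mu_y, var_y


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(unscented_transform(mu_x=10.0, var_x=4.0, mu_n=8.0, var_n=1.0))
```

    Only five function evaluations are needed per Gaussian here, which reflects the abstract's point that UT needs few samples. Repeating this for every surface Gaussian of an SGMM is what becomes expensive, which is why the abstract shares compensation parameters across a small number of classes.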

    The adult literacy evaluator: An intelligent computer-aided training system for diagnosing adult illiterates

    An important part of NASA's mission involves the secondary application of its technologies in the public and private sectors. One current application being developed is The Adult Literacy Evaluator, a simulation-based diagnostic tool designed to assess the operant literacy abilities of adults having difficulties in learning to read and write. Using ICAT system technology in addition to speech recognition, closed-captioned television (CCTV), live video and other state-of-the-art graphics and storage capabilities, this project attempts to overcome the negative effects of adult literacy assessment by allowing the client to interact with an intelligent computer system which simulates real-life literacy activities and materials and which measures literacy performance in the actual context of its use. The specific objectives of the project are as follows: (1) to develop a simulation-based diagnostic tool to assess adults' prior knowledge about reading and writing processes in actual contexts of application; (2) to provide a profile of readers' strengths and weaknesses; and (3) to suggest instructional strategies and materials which can be used as a beginning point for remediation. In the first, developmental phase of the project, descriptions of literacy events and environments are being written and functional literacy documents analyzed for their components. Examples of literacy events and situations being considered include interactions with environmental print (e.g., billboards, street signs, commercial marquees, storefront logos), functional literacy materials (e.g., newspapers, magazines, telephone books, bills, receipts) and employment-related communication (e.g., job descriptions, application forms, technical manuals, memorandums, newsletters).
    Each of these situations and materials is being analyzed for its literacy requirements in terms of written display (i.e., knowledge of printed forms and conventions), meaning demands (i.e., comprehension and word knowledge) and social situation. From these descriptions, scripts are being generated which define the interaction between the student, an on-screen guide and the simulated literacy environment. The proposed outcome of the Evaluator is a diagnostic profile which will present broad classifications of literacy behaviors across the major areas of metacognitive abilities, word recognition, vocabulary knowledge, comprehension and writing. From these classifications, suggestions will be made for materials and instructional strategies with which to begin corrective action. The focus of the Literacy Evaluator will essentially be to provide an expert diagnosis and an interpretation of that assessment, which a human tutor can then use to further design and individualize a remedial program as needed through the use of an authoring system.