7,457 research outputs found

    A continuous speech recognition approach for the design of a dictation machine

    The oral entry of texts (the dictation machine) remains an important potential application of automatic speech recognition. The RFIA group of CRIN/INRIA has been investigating this research area for the French language for the past ten years. This paper gives a general presentation of the current state of our MAUD system, which is built on four major interacting components: an acoustic-phonetic decoder, a lexical component, a linguistic model, and a user interface.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

    Language Modeling for Multi-Domain Speech-Driven Text Retrieval

    We report experimental results for speech-driven text retrieval, which supports retrieving information in multiple domains with spoken queries. Since users speak about content related to a target collection, we build the language models used for speech recognition from that collection, so as to improve both recognition and retrieval accuracy. Experiments using existing test collections combined with dictated queries showed the effectiveness of our method.

    Speech technology for medical reporting : consequences for the correction process


    Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models

    Speech recognition is a computationally demanding task, particularly the stage that uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC's processor is advantageous. Hence we present FPGA implementations of the decoder based alternately on discrete and continuous hidden Markov models (HMMs) representing monophones. The discrete version can process speech nearly 5,000 times faster than real time using just 12% of the slices of a Xilinx Virtex XCV1000, but with a lower recognition rate than the continuous implementation, which is 75 times faster than real time and occupies 45% of the same device.

    Translating On the Go? : Investigating the Potential of Multimodal Mobile Devices for Interactive Translation Dictation

    This article provides a general overview of interactive translation dictation (ITD), an emerging translation technique that involves interacting with multimodal voice-and-touch-enabled devices such as touch-screen computers, tablets and smartphones. The author discusses the interest in integrating new techniques and technologies into the translation sector, provides a brief description of a recent experiment investigating the potential and challenges of ITD, and outlines avenues for future work.