A continuous speech recognition approach for the design of a dictation machine
The oral entry of texts (dictation machine) remains an important potential field of application for automatic speech recognition. The RFIA group of CRIN/INRIA has been investigating this research area for the French language during the past ten years. In this paper we give a general presentation of the present state of our MAUD system, which is based upon four major interacting components: an acoustic-phonetic decoder, a lexical component, a linguistic model and a user interface.
Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.
Language Modeling for Multi-Domain Speech-Driven Text Retrieval
We report experimental results associated with speech-driven text retrieval,
which facilitates retrieving information in multiple domains with spoken
queries. Since users speak contents related to a target collection, we produce
language models used for speech recognition based on the target collection, so
as to improve both the recognition and retrieval accuracy. Experiments using
existing test collections combined with dictated queries showed the
effectiveness of our method.
Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models
Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC's processor, is advantageous. Hence we present FPGA implementations of the decoder based alternately on discrete and continuous hidden Markov models (HMMs) representing monophones, and demonstrate that the discrete version can process speech nearly 5,000 times faster than real time, using just 12% of the slices of a Xilinx Virtex XCV1000, but with a lower recognition rate than the continuous implementation, which is 75 times faster than real time and occupies 45% of the same device.
Translating On the Go? : Investigating the Potential of Multimodal Mobile Devices for Interactive Translation Dictation
This article provides a general overview of interactive translation dictation (ITD), an emerging translation technique that involves interacting with multimodal voice-and-touch-enabled devices such as touch-screen computers, tablets and smartphones. The author discusses the interest in integrating new techniques and technologies into the translation sector, provides a brief description of a recent experiment investigating the potential and challenges of ITD, and outlines avenues for future work.