Evaluation of automatic transcription systems for the judicial domain
This paper describes two different automatic transcription systems
developed for judicial application domains for the Polish and Italian
languages. The judicial domain requires coping with several factors
that are known to be critical for automatic speech recognition, such
as background noise, reverberation, spontaneous and accented speech,
overlapped speech, and cross-channel effects.
The two automatic speech recognition (ASR) systems have been developed
independently starting from out-of-domain data and, then, they have
been adapted to the judicial domain using a certain amount of
in-domain audio and text data.
The ASR performance has been measured on audio data acquired in the
courtrooms of Naples and Wroclaw. The resulting word error rates are
around 40% for Italian and between 30% and 50% for Polish.
This performance, similar to that reported for other comparable ASR
tasks (e.g. meeting transcriptions with distant microphone), suggests
that possible applications can address tasks such as indexing and/or
information retrieval in multimedia documents recorded during judicial
debates.
Spoken Corpora Good Practice Guide 2006
There is currently a vast amount of fundamental and applied research that is based on the exploitation of oral corpora (organized recorded collections of oral and multimodal language productions). Created as a result of linguists becoming aware of the importance of ensuring the durability of sources and diversified access to the oral documents they produce, this Guide to good practice mainly deals with “oral corpora” created for and used by linguists. But the questions raised by the creation and documentary exploitation of these corpora can be found in numerous disciplines: ethnology, anthropology, sociology, psychology, demography, and oral history notably use oral surveys, testimonies, interviews, and life stories. Based on a linguistic approach, this Guide also touches on the preoccupations of other researchers who use oral corpora (for example in the field of speech synthesis and recognition), even if their specific needs are not consistently dealt with in the present document.
The Indo-US Summit Partnership in Building India’s Infrastructure—A Summary of Events
While India’s policymakers have increasingly been giving importance to the private sector as regards participation and investment in infrastructure, not many American companies have availed themselves of this opportunity. There have been many issues regarding the regulatory framework, bureaucratic delays, etc. All this is set to change. Recent surveys have shown a persistent trend of increasing satisfaction among overseas investors with the changes that are taking place in India. The situation in the infrastructure sector is no different. The summarized proceedings of the above-mentioned conference are useful in this context and serve to update the reader on the developments that are taking place in this important sector. The broad conclusions, as understood by the authors, are reproduced in this article.
Infrastructure, India-US
Strategic Selection of Training Data for Domain-Specific Speech Recognition
Speech recognition is now a key topic in computer science with the proliferation of voice-activated assistants and voice-enabled devices. Many companies offer a speech recognition service that developers can use to enable smart devices and services. These speech-to-text systems, however, have significant room for improvement, especially in domain-specific speech. IBM's Watson speech-to-text service attempts to support domain-specific uses by allowing users to upload their own training data for making custom models that augment Watson's general model. This requires deciding on a strategy for picking the training model. This thesis experiments with different training choices for custom language models that augment Watson's speech-to-text service. The results show that using recent utterances is the best choice of training data in our use case of Digital Democracy. We are able to improve speech recognition accuracy by 2.3% over the control with no custom model. However, choosing training utterances most specific to the use case is better when large enough volumes of such training data are available.