5,684 research outputs found

    Evaluation of automatic transcription systems for the judicial domain

    This paper describes two automatic transcription systems developed for judicial application domains, one for Polish and one for Italian. The judicial domain requires coping with several factors known to be critical for automatic speech recognition, such as background noise, reverberation, spontaneous and accented speech, overlapped speech, and cross-channel effects. The two automatic speech recognition (ASR) systems were developed independently, starting from out-of-domain data, and were then adapted to the judicial domain using a certain amount of in-domain audio and text data. ASR performance was measured on audio data acquired in the courtrooms of Naples and Wroclaw. The resulting word error rates are around 40% for Italian and between 30% and 50% for Polish. This performance, similar to that reported for other comparable ASR tasks (e.g. meeting transcription with distant microphones), suggests that possible applications can address tasks such as indexing and/or information retrieval in multimedia documents recorded during judicial debates.

    Spoken Corpora Good Practice Guide 2006

    There is currently a vast amount of fundamental and applied research based on the exploitation of oral corpora (organized recorded collections of oral and multimodal language productions). Created as a result of linguists becoming aware of the importance of ensuring the durability of sources and diversified access to the oral documents they produce, this Guide to good practice mainly deals with “oral corpora” created for and used by linguists. But the questions raised by the creation and documentary exploitation of these corpora arise in numerous disciplines: ethnology, anthropology, sociology, psychology, demography, and oral history notably use oral surveys, testimonies, interviews, and life stories. Based on a linguistic approach, this Guide also touches on the concerns of other researchers who use oral corpora (for example in the field of speech synthesis and recognition), even if their specific needs are not consistently dealt with in the present document.

    The Indo-US Summit Partnership in Building India’s Infrastructure—A Summary of Events

    While India’s policymakers have increasingly been giving importance to private-sector participation and investment in infrastructure, not many American companies have availed themselves of this opportunity. There have been many issues regarding the regulatory framework, bureaucratic delays, etc. All this is set to change. Recent surveys have shown a persistent trend of increasing satisfaction among overseas investors with the changes that are taking place in India. The situation in the infrastructure sector is no different. The summarized proceedings of the above-mentioned conference are useful in this context and serve to update the reader on the developments taking place in this important sector. The broad conclusions, as understood by the authors, are reproduced in this article. Keywords: Infrastructure, India-US

    Strategic Selection of Training Data for Domain-Specific Speech Recognition

    Speech recognition is now a key topic in computer science with the proliferation of voice-activated assistants and voice-enabled devices. Many companies offer a speech recognition service for developers to use to enable smart devices and services. These speech-to-text systems, however, have significant room for improvement, especially in domain-specific speech. IBM's Watson speech-to-text service attempts to support domain-specific uses by allowing users to upload their own training data for making custom models that augment Watson's general model. This requires deciding on a strategy for picking the training model. This thesis experiments with different training choices for custom language models that augment Watson's speech-to-text service. The results show that using recent utterances is the best choice of training data in our use case of Digital Democracy. We are able to improve speech recognition accuracy by 2.3% over the control with no custom model. However, choosing the training utterances most specific to the use case is better when large enough volumes of such training data are available.