71 research outputs found
Speech Technology Services for Oral History Research
Oral history is concerned with the oral accounts of witnesses to, and
commentators on, historical events. Speech technology is an important instrument
for processing such recordings in order to obtain transcriptions and further
enhancements that structure the oral account. In this contribution we address
the transcription portal and the web services associated with speech processing
at BAS, speech solutions developed at LINDAT, how to do it yourself with
Whisper, remaining challenges, and future developments.
Comment: 5 pages plus references, 3 figures
System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive
The main objective of the work presented in this paper was to develop a complete system that would fulfil the original vision of the MALACH project: to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive of recorded testimonies of Holocaust survivors. The system has so far been developed only for the Czech part of the archive. It takes advantage of a state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech, emotionally loaded content) and of its close coupling with the actual search engine. The design of the algorithm, which adopts the spoken term detection approach, is focused on retrieval speed. The resulting system is able to search through the 1,000 hours of video constituting the Czech portion of the archive and find occurrences of a query word in a matter of seconds. The phonetic search, implemented alongside the search based on lexicon words, makes it possible to find even words outside the ASR system lexicon, such as names, geographic locations or Jewish slang.
Large vocabulary continuous speech recognition of highly inflectional language (Czech).
The thesis concerns the development of a large vocabulary continuous speech recognition (LVCSR) system for highly inflectional languages, with special emphasis on language modeling. The idea and usage of automatic speech recognition are introduced, and the basic principles of the statistical approach to speech recognition and the decomposition of the system into its basic components are explained. An overview of existing statistical language modeling techniques is given, and methods of inferring reliable probability estimates from sparse data, as well as measures of language model quality, are described. A theoretical background to finite-state machinery and the application of the finite-state machine framework to LVCSR is provided. The goals of the thesis were to build an LVCSR system for the Czech language using standard techniques that had been used for English, to analyze the system performance, and to propose and implement techniques that would improve the recognition accuracy. The development of the baseline system is described. The properties of the Czech language, especially from the automatic speech recognition point of view, were analyzed. The outcomes of this theoretical analysis are exploited, and language models that take into account the specific features of the Czech language are presented. Class-based language models are described; they strengthen the robustness of the language model and therefore reduce the perplexity and consequently improve the recognition accuracy. Finally, a model that uses subword parts (morphemes) as the basic language modeling units is introduced. Such a model offers better coverage of unseen text than standard word-based models with the same vocabulary size. Available from STL Prague, CZ / NTK - National Technical Library. SIGLE, CZ, Czech Republic
Czech translation of the EBUContentGenre thesaurus
EBUContentGenre is a thesaurus containing a hierarchical description of the various genres used in the TV broadcasting industry. The thesaurus is part of a complex metadata specification called EBUCore, intended for multifaceted description of audiovisual content. EBUCore (http://tech.ebu.ch/docs/tech/tech3293v1_3.pdf) is a set of descriptive and technical metadata based on the Dublin Core and adapted to media. EBUCore is the flagship metadata specification of the European Broadcasting Union, the largest professional association of broadcasters in the world. It is developed and maintained by the EBU's Technical Department (http://tech.ebu.ch). The translated thesaurus can be used for effective cataloguing of (mostly TV) audiovisual content and for the subsequent development of systems for automatic cataloguing (topic/genre detection).
Benefit of proper language processing for Czech speech retrieval in the CL-SR task at CLEF 2006
The paper describes the system built by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We decided to concentrate only on monolingual searching in the Czech test collection and to investigate the effect of proper language processing on retrieval performance. We employed a Czech morphological analyser and tagger for that purpose. For the actual search system, we used the classical tf.idf approach with blind relevance feedback as implemented in the Lemur toolkit. The results indicate that suitable linguistic preprocessing is indeed crucial for Czech IR performance.
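The tf.idf ranking with blind relevance feedback mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Lemur toolkit's implementation or API, and it is not the authors' code: the function names, the idf smoothing, the cosine scoring, the 0.5 weight for expansion terms, and the toy documents are all assumptions made for illustration. The documents are assumed to be already lemmatized (diacritics are dropped in the toy data), which stands in for the linguistic preprocessing the abstract describes.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one {term: tf * idf} dict per tokenized document, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (an assumption): the +1 keeps ubiquitous terms above zero.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending score (ties by index)."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    return [i for _, i in sorted(scored, key=lambda s: (-s[0], s[1]))]


def search_with_feedback(query, docs, top_k=1, n_expand=2):
    """Rank, expand the query from the top_k results, then rank again."""
    doc_vecs, idf = tfidf_vectors(docs)
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    first_pass = rank(q_vec, doc_vecs)

    # Blind feedback: treat the top_k documents as relevant without asking
    # the user, and pool their term weights.
    pooled = Counter()
    for i in first_pass[:top_k]:
        pooled.update(doc_vecs[i])
    new_terms = [t for t, _ in pooled.most_common() if t not in q_vec][:n_expand]

    # Expansion terms get half weight (a hypothetical choice).
    for t in new_terms:
        q_vec[t] = 0.5 * idf[t]
    return rank(q_vec, doc_vecs), new_terms


# Hypothetical lemmatized toy collection.
docs = [
    ["ghetto", "terezin", "transport"],
    ["skola", "ucitel", "praha"],
    ["terezin", "transport", "vlak"],
]
ranking, added = search_with_feedback(["ghetto"], docs)
print(ranking, added)
```

In this toy collection the third document shares no term with the query, so it can only move ahead of the unrelated second document after the query has been expanded with terms taken from the top-ranked result.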
- …
