14 research outputs found

    The relationship of word error rate to document ranking

    This paper describes two experiments that examine the relationship between the Word Error Rate (WER) of spoken documents and their rank position when returned by a spoken document retrieval system. Previous work has demonstrated that recognition errors do not significantly affect retrieval effectiveness, but whether they adversely affect relevance judgement remains unclear. A user-based experiment was conducted to measure users' ability to judge relevance from the recognised text presented in a retrieved result list. The results indicated that users were capable of judging relevance accurately despite transcription errors. This led to an examination of the relationship between the WER of retrieved audio documents and their rank position for a particular query. Here it was shown that WER was somewhat lower for top-ranked documents than for documents retrieved further down the ranking, which offers a possible explanation for the success of the user experiment.
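
    For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recogniser output, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name and the whitespace tokenisation are assumptions made for illustration, and it is not the scoring tool used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper: tokenisation is plain whitespace splitting,
    which is an assumption, not the paper's scoring procedure.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

    Because insertions are counted, WER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.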

    Speech and hand transcribed retrieval

    This paper describes the issues and preliminary work involved in creating an information retrieval system that manages retrieval from collections composed of both speech-recognised and ordinary text documents. Previous work has shown that, because of recognition errors, ordinary documents are generally retrieved in preference to recognised ones. Means of correcting or eliminating this bias are the subject of this paper. Initial ideas and some preliminary results are presented.

    Search of spoken documents retrieves well recognized transcripts

    This paper presents a series of analyses and experiments on spoken document retrieval systems: search engines that retrieve transcripts produced by speech recognizers. Results show that transcripts that match queries well tend to be recognized more accurately than transcripts that match a query less well. This result was described in past literature; however, no study or explanation of the effect has been provided until now. This paper provides such an analysis, showing a relationship between word error rate and query length. The paper expands on past research by increasing the number of recognition systems tested and by showing the effect in an operational speech retrieval system. Potential future lines of enquiry are also described.

    Spoken content metadata and MPEG-7

    The words spoken in an audio stream form an obvious descriptor essential to most audio-visual metadata standards. When derived using automatic speech recognition systems, the spoken content fits into neither the low-level (representative) nor the high-level (semantic) metadata category. This makes it difficult to create a representation that supports interoperability between different extraction and application utilities while remaining robust to the limitations of the extraction process. In this paper, we discuss the issues encountered in the design of the MPEG-7 spoken content descriptor and their applicability to other metadata standards.

    A survey of smoothing techniques for ME models

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Diversité et système de recommandation : application à une plateforme de blogs à fort trafic (convention CIFRE n°20091274)

    Recommender systems aim to automatically suggest items related to users' interests. These tools are increasingly used on content platforms to help users access information. In this context, users' interests can be modeled from the content of visited documents or from users' actions (clicks, comments, etc.). However, these interests cannot be modeled in a cold-start situation, that is, for a user unknown to the system or for a new document. The model is therefore difficult to obtain and sometimes remains incomplete, leading to recommendations that are often far from users' real interests. In addition, existing approaches generally cannot guarantee satisfactory performance on high-traffic platforms hosting a large volume of data. To obtain more relevant recommendations, we propose a recommender system model that builds a recommendation list covering a wide range of potential interests, even when the system has little information about the user. The originality of our model is that it relies on the notion of diversity, which is obtained by aggregating the results of different selection measures to build the final recommendation list. After demonstrating the value of our approach on reference collections and through an evaluation with real users, we evaluate our model on the OverBlog blog platform, thereby validating our proposal in a large-scale industrial context.
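
    The abstract does not say which selection measures or which aggregation function the thesis uses. As one simple possibility, the sketch below merges several ranked lists by round-robin interleaving, so that each measure contributes items in turn and duplicates are skipped. The function name, the measure names and the interleaving strategy are all assumptions made for illustration, not the author's method.

```python
from typing import Dict, List


def aggregate_round_robin(ranked_lists: Dict[str, List[str]], k: int) -> List[str]:
    """Build a list of up to k items by taking one item per measure in turn.

    Hypothetical aggregation: round-robin interleaving is an assumed
    stand-in for the (unspecified) aggregation function in the abstract.
    """
    result: List[str] = []
    seen = set()
    max_depth = max((len(r) for r in ranked_lists.values()), default=0)

    for depth in range(max_depth):
        # Visit the measures in insertion order at the current rank depth.
        for ranking in ranked_lists.values():
            if len(result) >= k:
                return result
            if depth < len(ranking) and ranking[depth] not in seen:
                seen.add(ranking[depth])
                result.append(ranking[depth])
    return result[:k]


# Example with three made-up selection measures.
recommendations = aggregate_round_robin(
    {
        "content_similarity": ["a", "b", "c"],
        "popularity": ["b", "d", "e"],
        "recency": ["f", "a", "g"],
    },
    k=5,
)
```

    Interleaving guarantees that the top item of every measure appears early in the final list, which is one way to cover several potential interests when little is known about the user.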

    Diversité et système de recommandation : application à une plateforme de blogs à fort trafic

    Recommender systems aim to automatically suggest items related to users' interests. These tools are increasingly used on content platforms to help users access information. In this context, users' interests can be modeled from the content of visited documents or from users' actions (clicks, comments, etc.). However, these interests cannot be modeled in a cold-start situation, that is, for a user unknown to the system or for a new document. The model is therefore difficult to obtain and sometimes remains incomplete, leading to recommendations that are often far from users' real interests. In addition, existing approaches generally cannot guarantee satisfactory performance on high-traffic platforms hosting a large volume of data. To obtain more relevant recommendations, we propose a recommender system model that builds a recommendation list covering a wide range of potential interests, even when the system has little information about the user. The originality of our model is that it relies on the notion of diversity, which is obtained by aggregating the results of different selection measures to build the final recommendation list. After demonstrating the value of our approach on reference collections and through an evaluation with real users, we evaluate our model on the OverBlog blog platform, thereby validating our proposal in a large-scale industrial context.

    Automatic processing of computer-transcribed spoken documents from multimedia archives

    This thesis addresses the complex problem of structuring the output of an automatic speech recognition system (appropriately segmenting it, analyzing it textually and phonetically, and then modifying it) so that it is as readable as possible for humans while also being ready for efficient machine processing and search. The work was motivated by a research project supported by the Czech Ministry of Culture, which aimed to transcribe the spoken documents in the archive of Czech and Czechoslovak Radio and make them searchable. Given the size of the archive (213,000 documents from the years 1923 to 2014), it was necessary to design and implement procedures and technologies able to handle not only the vast amount of data but also specific problems arising from the varying quality of the recordings, the presence of both Czech and Slovak in the documents, speaker changes, speech interleaved with jingles, musical interludes and songs, and background noise.

    Experiments in Spoken Document Retrieval at CMU

    We describe our submission to the TREC-6 Spoken Document Retrieval (SDR) track, together with the speech recognition and information retrieval engines. We present SDR evaluation results and a brief analysis. A few developments and experiments are also described in detail, including vocabulary size experiments, which assess the effect of words missing from the speech recognition vocabulary. For our 51,000-word vocabulary the effect was minimal.