
14 research outputs found

The relationship of word error rate to document ranking

Author: Mang Shou X.
Sanderson M.
Tuffs N.
Publication date: 01/01/2003

This paper describes two experiments that examine the Word Error Rate (WER) of spoken documents returned by a spoken document retrieval system. Previous work has demonstrated that recognition errors do not significantly affect retrieval effectiveness, but whether they adversely affect relevance judgement remains unclear. A user-based experiment was conducted to measure the ability to judge relevance from the recognised text presented in a retrieved result list. The results indicated that users were capable of judging relevance accurately despite transcription errors. This led to an examination of the relationship between the WER of retrieved audio documents and their rank position when retrieved for a particular query. Here it was shown that WER was somewhat lower for top-ranked documents than for documents retrieved further down the ranking, indicating a possible explanation for the success of the user experiment.

CiteSeerX

White Rose Research Online

Speech and hand transcribed retrieval

Author: Sanderson M.
Shou X.M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

This paper describes the issues and preliminary work involved in creating an information retrieval system that manages retrieval from collections composed of both speech-recognised and ordinary text documents. Previous work has shown that, because of recognition errors, ordinary documents are generally retrieved in preference to recognised ones. Means of correcting or eliminating this observed bias are the subject of this paper. Initial ideas and some preliminary results are presented.

CiteSeerX

Crossref

White Rose Research Online

Search of spoken documents retrieves well recognized transcripts

Author: Sanderson M.
Shou X.M.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2007

This paper presents a series of analyses and experiments on spoken document retrieval systems: search engines that retrieve transcripts produced by speech recognizers. Results show that transcripts that match queries well tend to be recognized more accurately than transcripts that match a query less well. This result was described in past literature; however, no study or explanation of the effect has been provided until now. This paper provides such an analysis, showing a relationship between word error rate and query length. The paper expands on past research by increasing the number of recognition systems that are tested and by showing the effect in an operational speech retrieval system. Potential future lines of enquiry are also described.

White Rose Research Online

Spoken content metadata and MPEG-7

Author: J. P. A. Charlesworth
P. N. Garner
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2000

The words spoken in an audio stream form an obvious descriptor essential to most audio-visual metadata standards. When derived using automatic speech recognition systems, the spoken content fits into neither the low-level (representative) nor the high-level (semantic) metadata category. This makes it difficult to create a representation that supports interoperability between different extraction and application utilities while remaining robust to the limitations of the extraction process. In this paper, we discuss the issues encountered in the design of the MPEG-7 spoken content descriptor and their applicability to other metadata standards.

CiteSeerX

Crossref

A survey of smoothing techniques for ME models

Author: R. Rosenfeld
S.F. Chen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

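Diversité et système de recommandation : application à une plateforme de blogs à fort trafic (convention CIFRE n°20091274) [Diversity and recommender systems: application to a high-traffic blog platform (CIFRE agreement no. 20091274)]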

Author: Dudognon Damien
Publication date: 04/04/2014

Recommender systems aim to automatically provide users with objects related to their interests. These tools are increasingly used on content platforms to help users access information. In this context, users' interests can be modeled from the content of visited documents or from users' actions (clicks, comments, etc.). However, these interests cannot be modeled in a cold-start situation, that is, for a user unknown to the system or for a new document. Modeling is therefore complex and sometimes incomplete, and recommendations are often far from users' real interests. In addition, existing approaches are generally unable to guarantee satisfactory performance on high-traffic platforms that host a significant volume of data. To obtain more relevant recommendations, we propose a recommender system model that builds a list of recommendations covering a wide range of potential interests, even when the system has little information about the user. The originality of our model is that it is based on the notion of diversity. This diversity is obtained by aggregating the results of different selection measures to build the final list of recommendations. We demonstrate the value of our approach using reference collections and through a study with real users. Finally, we evaluate our model on the OverBlog blog platform, validating our proposal in a large-scale industrial context.

Thèses en ligne de l'Université Toulouse III - Paul Sabatier

Diversité et système de recommandation : application à une plateforme de blogs à fort trafic [Diversity and recommender systems: application to a high-traffic blog platform]

Author: Dudognon Damien
Publication venue: HAL CCSD
Publication date: 04/04/2014

Recommender systems aim to automatically provide users with objects related to their interests. These tools are increasingly used on content platforms to help users access information. In this context, users' interests can be modeled from the content of visited documents or from users' actions (clicks, comments, etc.). However, these interests cannot be modeled in a cold-start situation, that is, for a user unknown to the system or for a new document. Modeling is therefore complex and sometimes incomplete, and recommendations are often far from users' real interests. In addition, existing approaches are generally unable to guarantee satisfactory performance on high-traffic platforms that host a significant volume of data. To obtain more relevant recommendations, we propose a recommender system model that builds a list of recommendations covering a wide range of potential interests, even when the system has little information about the user. The originality of our model is that it is based on the notion of diversity. This diversity is obtained by aggregating the results of different selection measures to build the final list of recommendations. We demonstrate the value of our approach using reference collections and through a study with real users. Finally, we evaluate our model on the OverBlog blog platform, validating our proposal in a large-scale industrial context.

Thèses en Ligne

Scientific Publications of the University of Toulouse II Le Mirail

Automatic processing of computer-transcribed spoken documents from multimedia archives

Author: Boháč Marek
Publication venue: Technická Univerzita v Liberci
Publication date: 10/12/2018

This thesis addresses the complex task of structuring the output of an automatic speech recognition system (that is, dividing it appropriately, analyzing it textually and phonetically, and then modifying it) so that it is as readable as possible for humans and also ready for effective machine processing and search. The motivation was a research project supported by the Czech Ministry of Culture, which aimed to transcribe spoken documents from the archive of Czech and Czechoslovak Radio and make them available for search. Given the size of the archive (213,000 documents from the years 1923 to 2014), it was necessary to design and implement procedures and technologies able to handle not only the vast amount of data but also specific issues associated with the varying quality of the recordings, the presence of both Czech and Slovak in the documents, speaker changes, speech interleaved with jingles, musical interludes and songs, and background noise.

DSpace@TUL

Experiments in Spoken Document Retrieval at CMU

Author: A. G. Hauptmann
K. Seymore
M. A. Siegler
M. J. Witbrock
R. E. Jones
S. T. Slattery
Publication venue: NIST-SP
Publication date: 01/01/1997

We describe our submission to the TREC-6 Spoken Document Retrieval (SDR) track, together with the speech recognition and information retrieval engines. We present SDR evaluation results and a brief analysis. A few developments and experiments are also described in detail, including vocabulary size experiments, which assess the effect of words missing from the speech recognition vocabulary. For our 51,000-word vocabulary the effect was minimal.

CiteSeerX