DCU at the NTCIR-12 SpokenQuery&Doc-2 task
We describe DCU's participation in the NTCIR-12 SpokenQuery&Doc (SQD-2) task. In the context of the slide-group
retrieval sub-task, we experiment with a passage retrieval
method that re-scores each passage according to the relevance score of the document from which the passage is taken.
This is performed by linearly interpolating their relevance
scores which are calculated using the Okapi BM25 model of
probabilistic retrieval for passages and documents independently. In conjunction with this, we assess the benefits of
using pseudo-relevance feedback for expanding the textual
representation of the spoken queries with terms found in the
top-ranked documents and passages, and experiment with
a general multidimensional optimisation method to jointly
tune the BM25 and query expansion parameters with queries
and relevance data from the NTCIR-11 SQD-1 task. Retrieval experiments performed over the SQD-1 and SQD-2
queries confirm previous findings which affirm that integrating document information when ranking passages can lead
to improved passage retrieval effectiveness. Furthermore,
results indicate that no significant gains in retrieval effectiveness can be obtained by using query expansion in combination with our retrieval models over these two query sets.
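The passage re-scoring step described in this abstract can be illustrated with a minimal sketch. It assumes BM25 scores for passages and documents have already been computed independently; the function names, the weight `lam`, and the dictionary layout are hypothetical choices made for illustration and are not taken from the paper.

```python
def interpolate_scores(passage_score, document_score, lam=0.5):
    """Linearly interpolate a passage score with its parent document score.

    lam is a hypothetical mixing weight in [0, 1]: lam=1 uses only the
    passage score, lam=0 uses only the document score.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * passage_score + (1.0 - lam) * document_score


def rerank_passages(passage_scores, document_scores, passage_to_doc, lam=0.5):
    """Re-score every passage and return (passage_id, score) pairs, best first.

    passage_scores:  passage id -> precomputed passage relevance score
    document_scores: document id -> precomputed document relevance score
    passage_to_doc:  passage id -> id of the document the passage comes from
    """
    rescored = {
        pid: interpolate_scores(score, document_scores[passage_to_doc[pid]], lam)
        for pid, score in passage_scores.items()
    }
    # Sort by descending interpolated score.
    return sorted(rescored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    passages = {"p1": 2.0, "p2": 1.5}
    documents = {"d1": 1.0, "d2": 4.0}
    parent = {"p1": "d1", "p2": "d2"}
    print(rerank_passages(passages, documents, parent, lam=0.5))
```

With these toy scores, a passage from a highly scored document can overtake a passage that scores better in isolation, which is the effect the interpolation is meant to capture. The abstract also mentions tuning such parameters jointly with the BM25 and query expansion settings; that step is not shown here.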
Spoken query processing for interactive information retrieval
It has long been recognised that interactivity improves the effectiveness of information retrieval systems. Speech is the most natural and interactive medium of communication, and recent progress in speech recognition is making it possible to build systems that interact with the user via speech. However, given the typical length of queries submitted to information retrieval systems, it is easy to imagine that the effects of word recognition errors in spoken queries must be severely destructive on the system's effectiveness. The experimental work reported in this paper shows that the use of classical information retrieval techniques for spoken query processing is robust to considerably high levels of word recognition errors, in particular for long queries. Moreover, in the case of short queries, both standard relevance feedback and pseudo relevance feedback can be effectively employed to improve the effectiveness of spoken query processing.
Vocal Access to a Newspaper Archive: Design Issues and Preliminary Investigation
This paper presents the design and the current prototype implementation of an
interactive vocal Information Retrieval system that can be used to access
articles of a large newspaper archive using a telephone. The results of
preliminary investigation into the feasibility of such a system are also
presented.
Language Modeling for Multi-Domain Speech-Driven Text Retrieval
We report experimental results associated with speech-driven text retrieval,
which facilitates retrieving information in multiple domains with spoken
queries. Since users speak contents related to a target collection, we produce
language models used for speech recognition based on the target collection, so
as to improve both the recognition and retrieval accuracy. Experiments using
existing test collections combined with dictated queries showed the
effectiveness of our method.
Users' perception of relevance of spoken documents
We present the results of a study of users' perception of relevance of documents. The aim is to study experimentally how users' perception varies depending on the form in which retrieved documents are presented. Documents retrieved in response to a query are presented to users in a variety of ways, from full text to a machine-spoken, query-biased, automatically generated summary, and the difference in users' perception of relevance is studied. The experimental results suggest that the effectiveness of advanced multimedia information retrieval applications may be affected by the low level of users' perception of relevance of retrieved documents.
Search of spoken documents retrieves well recognized transcripts
This paper presents a series of analyses and experiments on spoken
document retrieval systems: search engines that retrieve transcripts produced by
speech recognizers. Results show that transcripts that match queries well tend to
be recognized more accurately than transcripts that match a query less well.
This result was described in past literature; however, no study or explanation of
the effect has been provided until now. This paper provides such an analysis,
showing a relationship between word error rate and query length. The paper
expands on past research by increasing the number of recognition systems that
are tested as well as showing the effect in an operational speech retrieval
system. Potential future lines of enquiry are also described.
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.
Subword-based Indexing for a Minimal False Positive Rate