138,744 research outputs found
Recommended from our members
Local search: A guide for the information retrieval practitioner
There are a number of combinatorial optimisation problems in information retrieval in which the use of local search methods are worthwhile. The purpose of this paper is to show how local search can be used to solve some well known tasks in information retrieval (IR), how previous research in the field is piecemeal, bereft of a structure and methodologically flawed, and to suggest more rigorous ways of applying local search methods to solve IR problems. We provide a query based taxonomy for analysing the use of local search in IR tasks and an overview of issues such as fitness functions, statistical significance and test collections when conducting experiments on combinatorial optimisation problems. The paper gives a guide on the pitfalls and problems for IR practitioners who wish to use local search to solve their research issues, and gives practical advice on the use of such methods. The query based taxonomy is a novel structure which can be used by the IR practitioner in order to examine the use of local search in IR
Spoken content retrieval: A survey of techniques and technologies
Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR
A survey on the use of relevance feedback for information access systems
Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems
WISER: A Semantic Approach for Expert Finding in Academia based on Entity Linking
We present WISER, a new semantic search engine for expert finding in
academia. Our system is unsupervised and it jointly combines classical language
modeling techniques, based on text evidences, with the Wikipedia Knowledge
Graph, via entity linking.
WISER indexes each academic author through a novel profiling technique which
models her expertise with a small, labeled and weighted graph drawn from
Wikipedia. Nodes in this graph are the Wikipedia entities mentioned in the
author's publications, whereas the weighted edges express the semantic
relatedness among these entities computed via textual and graph-based
relatedness functions. Every node is also labeled with a relevance score which
models the pertinence of the corresponding entity to author's expertise, and is
computed by means of a proper random-walk calculation over that graph; and with
a latent vector representation which is learned via entity and other kinds of
structural embeddings derived from Wikipedia.
At query time, experts are retrieved by combining classic document-centric
approaches, which exploit the occurrences of query terms in the author's
documents, with a novel set of profile-centric scoring strategies, which
compute the semantic relatedness between the author's expertise and the query
topic via the above graph-based profiles.
The effectiveness of our system is established over a large-scale
experimental test on a standard dataset for this task. We show that WISER
achieves better performance than all the other competitors, thus proving the
effectiveness of modelling author's profile via our "semantic" graph of
entities. Finally, we comment on the use of WISER for indexing and profiling
the whole research community within the University of Pisa, and its application
to technology transfer in our University
Development and evaluation of clustering techniques for finding people
Typically in a large organisation much expertise and knowledge is held informally within employees' own memories. When employees leave an organisation many documented links that go through that person are broken and no mechanism is usually available to overcome these broken links. This match making problem is related to the problem of finding potential work partners in a large and distributed organisation. This paper reports a comparative investigation into using standard information retrieval techniques to group employees together based on their webpages. This information can, hopefully, be subsequently used to redirect broken links to people who worked closely with a departed employee or used to highlight people, say indifferent departments, who work on similar topics. The paper reports the design and positive results of an experiment conducted at RisĂž National Laboratory comparing four different IR searching and clustering approaches using real users' web pages
- âŠ