392 research outputs found
Building user interest profiles from Wikipedia clusters
Users of search systems are often reluctant to explicitly build profiles to indicate their search interests. Automatically building user profiles is therefore an important research area for personalized search. One difficult component of doing this is accessing a knowledge system which provides broad coverage of user search interests. In this work, we describe a method to build category-ID-based user profiles from a user's historical search data. Our approach makes significant use of Wikipedia as an external knowledge resource.
Entity Query Feature Expansion Using Knowledge Base Links
Recent advances in automatic entity linking and knowledge base
construction have resulted in entity annotations for document and
query collections, for example annotations of entities from large
general-purpose knowledge bases such as Freebase and the Google
Knowledge Graph. Understanding how to leverage these entity
annotations of text to improve ad hoc document retrieval is an open
research area. Query expansion is a commonly used technique to
improve retrieval effectiveness. Most previous query expansion
approaches focus on text, mainly using unigram concepts. In this
paper, we propose a new technique, called entity query feature
expansion (EQFE), which enriches the query with features from
entities and their links to knowledge bases, including structured
attributes and text. We experiment using both explicit query entity
annotations and latent entities. We evaluate our technique on TREC
text collections automatically annotated with knowledge base entity
links, including the Google Freebase Annotations (FACC1) data.
We find that entity-based feature expansion results in significant
improvements in retrieval effectiveness over state-of-the-art text
expansion approaches.
Supporting aspect-based video browsing - analysis of a user study
In this paper, we present a novel video search interface based on the concept of aspect browsing. The proposed strategy is to assist the user in exploratory video search by actively suggesting new query terms and video shots. Our approach has the potential to narrow the "Semantic Gap" by allowing users to explore the data collection. First, we describe a clustering technique to identify potential aspects of a search. Then, we use the results to propose suggestions to the user to help them in their search task. Finally, we analyse this approach by exploiting the log files and the feedback from a user study.
Exploring sentence level query expansion in language modeling based information retrieval
We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from
pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query, the second a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major
findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher
mean average precision (MAP) compared to extracting terms from full documents (up to 5.9% improvement in MAP for
English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP for different parameter settings; iv) query expansion based on sentences can
improve performance even for topics with low initial retrieval precision where standard BRF fails.
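The sentence-based expansion described in this abstract can be illustrated with a minimal sketch. The abstract does not state the similarity measure or how the sentence count decreases, so the code below assumes cosine similarity over raw term counts and a count that drops by one per retrieval rank. All function names (`expand_query`, `fixed_schedule`, `decreasing_schedule`) are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count Counters (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(doc):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def fixed_schedule(num_docs, k):
    """First variant: the same number of sentences from every document."""
    return [k] * num_docs


def decreasing_schedule(num_docs, k_start):
    """Second variant (assumed form): one sentence fewer per rank, floored at 0."""
    return [max(k_start - rank, 0) for rank in range(num_docs)]


def expand_query(query, ranked_docs, sentences_per_doc):
    """Append the terms of the most query-similar sentences to the query.

    ranked_docs are the pseudo-relevant documents in retrieval order;
    sentences_per_doc[i] is how many sentences to take from the i-th one.
    """
    query_vec = Counter(tokenize(query))
    expanded = tokenize(query)
    for doc, k in zip(ranked_docs, sentences_per_doc):
        sentences = split_sentences(doc)
        # Stable sort: ties keep their original order in the document.
        ranked = sorted(
            sentences,
            key=lambda s: cosine(query_vec, Counter(tokenize(s))),
            reverse=True,
        )
        for sentence in ranked[:k]:
            expanded.extend(tokenize(sentence))
    return expanded
```

Calling `expand_query(query, docs, fixed_schedule(len(docs), 2))` corresponds to the fixed-count variant, while passing `decreasing_schedule(len(docs), 3)` takes fewer sentences from lower-ranked documents. The expanded term list would then be handed to whatever retrieval model is in use.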
Optimising content clarity for human-machine systems
This paper details issues associated with the production of clearly expressed and comprehensible technical documentation for domestic appliances and human-machine systems, and describes an approach to optimising the clarity of such content. The aim is to develop support for authors in checking the likely comprehensibility of chosen forms of expression by reference to an external measure of 'likely familiarity'. Our DOcumentation Support Tool (DoST) will assist in identifying words and expression forms that are likely to be unfamiliar to end users.
Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches
Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the sheer scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the current techniques that we are investigating to combine these top-down and bottom-up approaches.
Modèle de langue pour l'ordonnancement conjoint d'entités pertinentes dans un réseau d'informations hétérogènes (Language model for jointly ranking relevant entities in a heterogeneous information network)
In this paper, we propose a new model, called BibRank, which aims to jointly rank heterogeneous resources (documents and authors) of a bibliographic network according to their degree of relevance to a query. The model relies on propagating entity scores while taking into account both the structure of the network and the topic of the query. In addition, the model introduces two indicators of topical proximity between connected entities, chosen according to the types of the linked entities. For relations between homogeneous entities, the indicator detects marginal citations, whereas for relations between heterogeneous entities, it uses two sources of evidence: the topic of the document and the expertise of the author. Experiments conducted on the CiteSeerX bibliographic network show the effectiveness of the proposed ranking model.