26,030 research outputs found

Entity Query Feature Expansion Using Knowledge Base Links

Author: Allan James
Dalton Jeffrey
Dietz Laura
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2014

Recent advances in automatic entity linking and knowledge base construction have resulted in entity annotations for document and query collections. Examples include annotations of entities from large general-purpose knowledge bases, such as Freebase and the Google Knowledge Graph. Understanding how to leverage these entity annotations of text to improve ad hoc document retrieval is an open research area. Query expansion is a commonly used technique to improve retrieval effectiveness. Most previous query expansion approaches focus on text, mainly using unigram concepts. In this paper, we propose a new technique, called entity query feature expansion (EQFE), which enriches the query with features from entities and their links to knowledge bases, including structured attributes and text. We experiment using both explicit query entity annotations and latent entities. We evaluate our technique on TREC text collections automatically annotated with knowledge base entity links, including the Google Freebase Annotations (FACC1) data. We find that entity-based feature expansion results in significant improvements in retrieval effectiveness over state-of-the-art text expansion approaches.

CiteSeerX

Crossref

Enlighten

Using the Annotated Bibliography as a Resource for Indicative Summarization

Author: Kan Min-Yen
Klavans Judith L.
McKeown Kathleen R.
Publication date: 01/01/2002

We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus and give an overview of the document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus and its methods of collection, and show its use in finding the distribution of types of information included in indicative summaries and their relative ordering within the summaries. Comment: 8 pages, 3 figures.

arXiv.org e-Print Archive

CiteSeerX

Columbia University Academic Commons

Bridging the Semantic Gap in Multimedia Information Retrieval: Top-down and Bottom-up approaches

Author: Enser Peter G.B.
Hare Jonathon S.
Lewis Paul H.
Martinez Kirk
Sandom Christine J.
Sinclair Patrick A. S.
Publication date: 01/01/2006

Semantic representation of multimedia information is vital for enabling the kind of multimedia search capabilities that professional searchers require. Manual annotation is often not possible because of the sheer scale of the multimedia information that needs indexing. This paper explores the ways in which we are using both top-down, ontologically driven approaches and bottom-up, automatic-annotation approaches to provide retrieval facilities to users. We also discuss many of the current techniques that we are investigating to combine these top-down and bottom-up approaches.

CiteSeerX

Southampton (e-Prints Soton)

Hierarchical Event Descriptors (HED): Semi-Structured Tagging for Real-World Events in Large-Scale EEG.

Author: Bigdely-Shamlo Nima
Cockfield Jeremy
La Valle Chris
Makeig Scott
Miyakoshi Makoto
Robbins Kay A
Rognon Thomas
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Real-world brain imaging by EEG requires accurate annotation of complex subject-environment interactions in event-rich tasks and paradigms. This paper describes the evolution of the Hierarchical Event Descriptor (HED) system for systematically describing both laboratory and real-world events. HED version 2, first described here, provides the semantic capability of describing a variety of subject and environmental states. HED descriptions can include stimulus presentation events on screen or in virtual worlds, experimental or spontaneous events occurring in the real-world environment, and events experienced via one or multiple sensory modalities. Furthermore, HED 2 can distinguish between the mere presence of an object and its actual (or putative) perception by a subject. Although the HED framework has implicit ontological and linked data representations, the user interface for HED annotation is more intuitive than traditional ontological annotation. We believe that hiding the formal representations allows for a more user-friendly interface, making consistent, detailed tagging of experimental and real-world events possible for research users. HED is extensible while retaining the advantages of having an enforced common core vocabulary. We have developed a collection of tools to support HED tag assignment and validation; these are available at hedtags.org. A plug-in for EEGLAB (sccn.ucsd.edu/eeglab), CTAGGER, is also available to speed the process of tagging existing studies.

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

eScholarship - University of California

Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech

Author: Jones Gareth J.F.
Lam-Adesina Adenike M.
Newman Eamonn
Zhang Ke
Publication venue: Centre for Telematics and Information Technology, Enschede, The Netherlands
Publication date: 01/07/2007

Searching spontaneous speech can be enhanced by combining automatic speech transcriptions with semantically related metadata. An important question is what can be expected from search of such transcriptions and different sources of related metadata in terms of retrieval effectiveness. The Cross-Language Speech Retrieval (CL-SR) track at recent CLEF workshops provides a spontaneous speech test collection with manual and automatically derived metadata fields. Using this collection we investigate the comparative search effectiveness of individual fields comprising automated transcriptions and the available metadata. A further important question is how transcriptions and metadata should be combined for the greatest benefit to search accuracy. We compare simple merging of individual fields with the extended BM25 model for weighted field combination (BM25F). Results indicate that BM25F can produce improved search accuracy, but that it is currently important to tune its parameters carefully on a suitable training set.
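
To make the contrast between simple field merging and weighted field combination concrete, the following is a minimal Python sketch of a BM25F-style scorer. It is an illustration under stated assumptions, not the configuration used in the paper: the field names (`transcript`, `keywords`), the toy documents, the weights, the defaults `k1=1.2` and `b=0.75`, and the particular IDF formula are all hypothetical choices made for this example.

```python
import math

def bm25f_score(query_terms, doc, collection, weights, k1=1.2, b=0.75):
    """Score one document. Documents map field name -> list of tokens."""
    n_docs = len(collection)
    # Average length of each field over the collection (used for normalisation).
    avg_len = {
        f: sum(len(d.get(f, [])) for d in collection) / n_docs for f in weights
    }
    score = 0.0
    for term in query_terms:
        # Pseudo-frequency: length-normalised counts, weighted and summed over fields.
        pseudo_tf = 0.0
        for field, weight in weights.items():
            tokens = doc.get(field, [])
            tf = tokens.count(term)
            if tf == 0 or avg_len[field] == 0:
                continue
            norm = 1 - b + b * len(tokens) / avg_len[field]
            pseudo_tf += weight * tf / norm
        if pseudo_tf == 0:
            continue
        # Document frequency: documents containing the term in any scored field.
        df = sum(
            1 for d in collection if any(term in d.get(f, []) for f in weights)
        )
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Saturation is applied once, after the fields have been combined.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score

# Toy collection (hypothetical): the query term appears in a different field per document.
docs = [
    {"transcript": ["we", "talked", "about", "the", "war"], "keywords": ["family"]},
    {"transcript": ["we", "talked", "about", "the", "family"], "keywords": ["war"]},
    {"transcript": ["nothing", "relevant", "here", "at", "all"], "keywords": ["music"]},
]
equal_weights = {"transcript": 1.0, "keywords": 1.0}     # treats both fields alike
boosted_weights = {"transcript": 1.0, "keywords": 3.0}   # assumed: metadata trusted more

for name, w in [("equal", equal_weights), ("boosted", boosted_weights)]:
    print(name, [round(bm25f_score(["war"], d, docs, w), 4) for d in docs])
```

With equal weights, a match in the transcript and a match in the keyword field contribute the same amount in this toy setup; raising the keyword weight ranks the keyword match higher. Choosing such weights is exactly the kind of parameter setting that the abstract says needs a suitable training set.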

Irish Universities

DCU Online Research Access Service

Unravelling the voice of Willem Frederik Hermans: an oral history indexing case study

Author: Huijbregts Marijn
Jong Franciska de
Ordelman Roeland
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2009

University of Twente Research Information

Robust audio indexing for Dutch spoken-word collections

Author: Huijbregts Marijn
Jong Franciska de
Leeuwen David van
Ordelman Roeland
Publication venue: KNAW
Publication date: 01/01/2005

Whereas the growth of storage capacity is in accordance with widely acknowledged predictions, the ability to index and access the resulting archives is lagging behind. This is especially the case in the oral history domain, and much of the rich content in these collections runs the risk of remaining inaccessible for lack of robust search technologies. This paper addresses the history and development of robust audio indexing technology for searching Dutch spoken-word collections and compares Dutch audio indexing in the well-studied broadcast news domain with an oral-history case study. It is concluded that despite significant advances in Dutch audio indexing technology and demonstrated applicability in several domains, further research is indispensable for successful automatic disclosure of spoken-word collections.

University of Twente Research Information

Evaluation of a prototype interface for structured document retrieval

Author: Dunlop M.D.
Reid J.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

Document collections often display either internal structure, in the form of the logical arrangement of document components, or external structure, in the form of links between documents. Structured document retrieval systems aim to exploit this structural information to provide users with more effective access to structured documents. To do this, the associated interface must both represent this information explicitly and support users in their browsing behaviour. This paper describes the implementation and user-centred evaluation of a prototype interface, the RelevanceLinkBar interface. The results of the evaluation show that the RelevanceLinkBar interface supported users in their browsing behaviour, allowing them to find more relevant documents, and was strongly preferred over a standard results interface.

University of Strathclyde Institutional Repository

Visual exploration and retrieval of XML document collections with the generic system X2

Author: Felix Weigel
François Bry
Holger Meuss
Klaus U. Schulz
S Ceri
S Mizzaro
Simone Leonardi
T Catarci
T Schlieder
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2005

This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2, which distinguishes it from other visual query systems for XML, is that it supports various levels of detail in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.

Crossref

Open Access LMU