
    Mapping queries to the Linking Open Data cloud: A case study using DBpedia.

    We introduce the task of mapping search engine queries to DBpedia, a major linking hub in the Linking Open Data cloud. We propose and compare various methods for addressing this task, using a mixture of information retrieval and machine learning techniques. Specifically, we present a supervised machine learning-based method to determine which concepts are intended by a user issuing a query. The concepts are obtained from an ontology and may be used to provide contextual information, related concepts, or navigational suggestions to the user submitting the query. Our approach first ranks candidate concepts using a language modeling framework for information retrieval. We then extract query, concept, and search-history feature vectors for these concepts. Using manual annotations, we inform a machine learning algorithm that learns how to select concepts from the candidates given an input query. Simply performing a lexical match between queries and concepts is found to perform poorly, as does using retrieval alone, i.e., omitting the concept selection stage. Our proposed method significantly improves upon these baselines, and we find that support vector machines achieve the best performance of the machine learning algorithms evaluated.
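
    The two-stage pipeline described above lends itself to a compact illustration. The sketch below is a minimal, hypothetical rendering of the idea, not the authors' implementation: candidate concepts are ranked with a Dirichlet-smoothed query-likelihood language model, simple query-concept features are extracted, and scikit-learn's SVC stands in for the support vector machine that selects concepts. All concept texts, queries, labels, and the feature set are toy assumptions.

import math
from collections import Counter

from sklearn.svm import SVC

# Toy candidate concepts: DBpedia URI -> associated text (illustrative only).
concepts = {
    "dbpedia:Amsterdam": "amsterdam capital city netherlands canals",
    "dbpedia:Netherlands": "netherlands country europe amsterdam dutch",
    "dbpedia:Canal": "canal artificial waterway water transport",
}

# Collection statistics used for smoothing.
collection = Counter()
for text in concepts.values():
    collection.update(text.split())
collection_size = sum(collection.values())

def query_likelihood(query, concept_text, mu=100.0):
    """Dirichlet-smoothed log P(query | concept language model)."""
    doc = Counter(concept_text.split())
    doc_len = sum(doc.values())
    score = 0.0
    for term in query.split():
        if collection[term] == 0:
            continue  # ignore terms unseen in the toy collection
        p_coll = collection[term] / collection_size
        score += math.log((doc[term] + mu * p_coll) / (doc_len + mu))
    return score

def features(query, concept_text):
    """Two toy features standing in for the richer query/concept/history set."""
    q_terms, c_terms = set(query.split()), set(concept_text.split())
    return [
        query_likelihood(query, concept_text),  # retrieval score
        len(q_terms & c_terms) / len(q_terms),  # lexical overlap
    ]

# Toy manual annotations: (query, concept URI, relevant?).
train = [
    ("amsterdam canals", "dbpedia:Amsterdam", 1),
    ("amsterdam canals", "dbpedia:Netherlands", 0),
    ("dutch country", "dbpedia:Netherlands", 1),
    ("dutch country", "dbpedia:Canal", 0),
]
X = [features(q, concepts[uri]) for q, uri, _ in train]
y = [label for *_, label in train]
classifier = SVC(kernel="rbf").fit(X, y)

# Stage 1: rank candidates by query likelihood; stage 2: let the SVM select.
query = "city of canals"
ranked = sorted(concepts, key=lambda u: query_likelihood(query, concepts[u]),
                reverse=True)
selected = [u for u in ranked
            if classifier.predict([features(query, concepts[u])])[0] == 1]
print(selected)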

    AutoSeek: Towards a Fully Automated Video Search System

    The astounding rate at which digital video is becoming available has stimulated research into video retrieval systems that incorporate visual, auditory, and spatio-temporal analysis. In the beginning, these multimodal systems required intensive user interaction, but during the past few years automatic search systems that need no interaction at all have emerged, requiring only a string of natural language text and a number of multimodal examples as input. We apply ourselves to this task of automatic search, and investigate the feasibility of automatic search without multimodal examples. The result is AutoSeek, an automatic multimodal search system that requires only text as input. In our search strategy we first extract semantic concepts from text and match them to semantic concept indices using a large lexical database. Queries are then created for the semantic concept indices as well as for indices that incorporate ASR text. Finally, the result sets from the different indices are fused with a combination strategy that was created using a set of development search statements. We subject our system to an external assessment in the form of the TRECVID 2005 automatic search task, and find that our system performs competitively when compared to systems that also use multimodal examples, ranking in the top three systems for 25% of the search tasks and scoring the fourth highest in overall mean average precision. We conclude that automatic search without using multimodal examples is a realistic task, and predict that performance will improve further as semantic concept detectors increase in quantity and quality.
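
    As a rough illustration of the search strategy sketched in this abstract, the following hypothetical code maps query words to semantic-concept indices through a lexical database (a toy synonym table stands in for the large database the paper mentions), also queries an ASR-text index, and fuses the result lists with a weighted sum. The index contents, weights, and fusion scheme are all assumptions for illustration.

from collections import defaultdict

# Stand-in lexical database: surface word -> semantic concept label.
lexical_db = {
    "car": "vehicle", "automobile": "vehicle",
    "boat": "waterscape", "ship": "waterscape",
}

# Toy indices mapping a concept or ASR term to (shot_id, score) results.
concept_index = {
    "vehicle": [("shot_3", 0.9), ("shot_7", 0.6)],
    "waterscape": [("shot_2", 0.8)],
}
asr_index = {"car": [("shot_7", 0.7), ("shot_5", 0.4)]}

def run_query(text, w_concept=0.6, w_asr=0.4):
    """Fuse concept-detector and ASR result lists for a plain-text query."""
    fused = defaultdict(float)
    for word in text.lower().split():
        concept = lexical_db.get(word)  # match word to a semantic concept
        if concept:
            for shot, score in concept_index.get(concept, []):
                fused[shot] += w_concept * score
        for shot, score in asr_index.get(word, []):  # ASR transcript index
            fused[shot] += w_asr * score
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

print(run_query("find shots of a car"))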

    The University of Amsterdam at the CLEF Cross Language Speech Retrieval Track 2007

    In this paper we present the University of Amsterdam's submission to the English task of the CLEF Cross Language Speech Retrieval track 2007. We describe the effects of using character n-grams and field combinations on both monolingual English retrieval and cross-lingual Dutch-to-English retrieval.
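
    To make the character n-gram and field-combination ideas concrete, here is a minimal hypothetical sketch: each field of a document is split into overlapping character n-grams that can be indexed like ordinary terms, and fields are combined by weighting their token streams. The field names, weights, and n-gram length are illustrative assumptions, not the submitted runs.

def char_ngrams(text, n=4):
    """Overlapping character n-grams; padding keeps word boundaries visible."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def tokenize_fields(doc, weights=None):
    """Combine fields by repeating each field's n-grams by an integer weight."""
    weights = weights or {"asr": 1, "summary": 2}  # assumed field names
    tokens = []
    for field, weight in weights.items():
        tokens += char_ngrams(doc.get(field, "")) * weight
    return tokens

doc = {"asr": "interview about the liberation", "summary": "liberation story"}
print(tokenize_fields(doc)[:5])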

    Search in audiovisual broadcast archives


    Categories and Subject Descriptors: H.3.4 Systems and Software; H.4 Information Systems Applications; H.4.2 Types of Systems; H.4.m

    Video producers, in telling a news story, tend to repeat important visual and speech material multiple times in adjacent shots, thus creating a certain level of redundancy. We describe this phenomenon, and use it to develop a framework that incorporates redundancy for cross-channel retrieval of visual items using speech. Testing our models in a series of retrieval experiments, we find that incorporating this redundancy into cross-channel retrieval leads to significant improvements in retrieval performance.
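
    One simple way to operationalize this redundancy, sketched hypothetically below, is to let a shot's speech-based score include decayed contributions from the transcripts of neighbouring shots, so speech repeated in adjacent shots still supports the visual item. The scoring function, window size, and decay factor are assumptions, not the paper's exact model.

def shot_scores(query_terms, transcripts, window=2, decay=0.5):
    """Score shots by term matches in their own and neighbouring transcripts.

    transcripts: per-shot ASR strings in temporal order.
    """
    base = [sum(t.lower().split().count(q) for q in query_terms)
            for t in transcripts]
    scores = []
    for i in range(len(base)):
        s = float(base[i])
        for d in range(1, window + 1):
            w = decay ** d  # neighbours contribute with decayed weight
            if i - d >= 0:
                s += w * base[i - d]
            if i + d < len(base):
                s += w * base[i + d]
        scores.append(s)
    return scores

shots = ["the president spoke today", "crowd cheering",
         "president at the podium", "weather report"]
print(shot_scores(["president"], shots))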

    Categories and Subject Descriptors: H.4 Information Systems Applications; H.4.2 Types of Systems; H.4.m

    Anecdotal evidence suggests that story-level information is important for the speech component of video retrieval. In this paper we perform a systematic examination of the combination of shot-level and story-level speech, using a document expansion approach. We isolate speech from other retrieval features and evaluate on the 2003–2006 TRECVID test sets with a set of 94 natural language queries. Our main finding is that the use of story information significantly improves retrieval performance compared to shot-based search, increasing overall mean average precision by over 65%.
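
    The document expansion approach examined here can be sketched in a few lines: each shot's transcript is concatenated with the transcript of its enclosing story before indexing, so story-level context becomes searchable at the shot level. The story boundaries and the single expansion weight below are hypothetical simplifications.

def expand_shots(shot_texts, story_spans, story_weight=1):
    """Expand each shot with its enclosing story's text.

    story_spans: (start, end) shot-index ranges per story, end exclusive.
    """
    expanded = []
    for start, end in story_spans:
        story_text = " ".join(shot_texts[start:end])
        for i in range(start, end):
            expanded.append(shot_texts[i] + (" " + story_text) * story_weight)
    return expanded

shots = ["anchor intro", "flood footage", "reporter on scene", "sports recap"]
stories = [(0, 3), (3, 4)]  # first story covers shots 0-2, second shot 3
print(expand_shots(shots, stories)[1])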