Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval

Croft, W. Bruce; Maxwell, K. Tamsin; Oberlander, Jon

Publication date: 1 August 2013

Abstract

Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications, including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking, because parsing an entire retrieval collection is prohibitively expensive. In this paper, we introduce a flexible notion of paths that describes chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features. They are compared both to non-linguistic terms and to catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of one to two words deliver significant performance gains on three TREC collections.
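
To make the idea of short catenae concrete, the following is a minimal sketch that lists the one- and two-word chains in a small dependency parse. It is not the authors' implementation. The toy parse, the name TOY_PARSE, and the function short_catenae are hypothetical, and the supervised selection step described in the abstract is not shown. A two-word chain is taken here to be a head word and one of its direct dependents.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy parse for a query such as "effects of air pollution on health"
# with function words dropped: each token maps to its head (None marks the root).
# A real system would obtain these arcs from a dependency parser.
TOY_PARSE: Dict[str, Optional[str]] = {
    "effects": None,
    "pollution": "effects",
    "air": "pollution",
    "health": "effects",
}


def short_catenae(heads: Dict[str, Optional[str]]) -> List[Tuple[str, ...]]:
    """Return every single word plus every (head, dependent) pair.

    This covers only chains of one or two words; longer chains would
    require walking further along the dependency arcs.
    """
    singles = [(word,) for word in heads]
    pairs = [(head, word) for word, head in heads.items() if head is not None]
    return singles + pairs


if __name__ == "__main__":
    for catena in short_catenae(TOY_PARSE):
        print(" ".join(catena))
```

In this sketch, "pollution air" is produced because the two words share a dependency arc, while "air health" is not, since no arc links them directly. Candidates like these would then be passed to whatever selection method is used to decide which ones are informative.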

Available Versions

Edinburgh Research Explorer

oai:pure.ed.ac.uk:publications...

Last time updated on 08/02/2015