Entity Query Feature Expansion Using Knowledge Base Links

Author: Allan James
Dalton Jeffrey
Dietz Laura
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2014

Recent advances in automatic entity linking and knowledge base construction have resulted in entity annotations for document and query collections, for example annotations of entities from large general-purpose knowledge bases such as Freebase and the Google Knowledge Graph. Understanding how to leverage these entity annotations of text to improve ad hoc document retrieval is an open research area. Query expansion is a commonly used technique to improve retrieval effectiveness. Most previous query expansion approaches focus on text, mainly using unigram concepts. In this paper, we propose a new technique, called entity query feature expansion (EQFE), which enriches the query with features from entities and their links to knowledge bases, including structured attributes and text. We experiment using both explicit query entity annotations and latent entities. We evaluate our technique on TREC text collections automatically annotated with knowledge base entity links, including the Google Freebase Annotations (FACC1) data. We find that entity-based feature expansion results in significant improvements in retrieval effectiveness over state-of-the-art text expansion approaches.

CiteSeerX

Enlighten

Binary Models for Marginal Independence

Author: Anderson T. W.
Bergsma W.
Bertsekas D. P.
Besag J.
Cox D. R.
Dawid A. P.
Drton M.
Edwards D. M.
Ekholm A.
Erdos P.
Glonek G. F. V.
Hinton G. E.
Hojsgaard S.
Kauermann G.
Kendler K. S.
Kennes R.
Knuth D. E.
Lauritzen S. L.
Levi M.
Mao Y.
Marchetti G. M.
McCullagh P.
Moore A.
Pearl J.
Putnam R.
R Development Core Team
Sztompka P.
Wright S.
Publication venue: 'Wiley'
Publication date: 25/07/2007

Log-linear models are a classical tool for the analysis of contingency tables. In particular, the subclass of graphical log-linear models provides a general framework for modelling conditional independences. However, with the exception of special structures, marginal independence hypotheses cannot be accommodated by these traditional models. Focusing on binary variables, we present a model class that provides a framework for modelling marginal independences in contingency tables. The approach taken is graphical and draws on analogies to multivariate Gaussian models for marginal independence. For the graphical model representation we use bi-directed graphs, which are in the tradition of path diagrams. We show how the models can be parameterized in a simple fashion, and how maximum likelihood estimation can be performed using a version of the Iterated Conditional Fitting algorithm. Finally, we consider combining these models with symmetry restrictions.
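
As a rough illustration of what a marginal independence hypothesis means for binary variables, the sketch below sums a joint probability table over the unused variables and checks whether the resulting two-way margin factorizes. It is not the model class or the fitting algorithm from the abstract; the helper names `marginal` and `marginally_independent` and the toy tables are assumptions made for this example only.

```python
from fractions import Fraction
from itertools import product


def marginal(joint, keep):
    """Sum a joint table {outcome_tuple: prob} down to the positions in `keep`."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[k] for k in keep)
        out[key] = out.get(key, 0) + prob
    return out


def marginally_independent(joint, i, j):
    """True if binary variables i and j satisfy P(i, j) = P(i) * P(j) in the margin."""
    p_ij = marginal(joint, (i, j))
    p_i = marginal(joint, (i,))
    p_j = marginal(joint, (j,))
    return all(
        p_ij.get((a, b), 0) == p_i.get((a,), 0) * p_j.get((b,), 0)
        for a, b in product((0, 1), repeat=2)
    )


# Hypothetical toy table: A and B are independent fair coins, C = A xor B.
xor_joint = {(a, b, a ^ b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}

# Hypothetical toy table in which A and B always agree.
copy_joint = {(0, 0, 0): Fraction(1, 2), (1, 1, 0): Fraction(1, 2)}

print(marginally_independent(xor_joint, 0, 1))
print(marginally_independent(copy_joint, 0, 1))
```

Exact fractions are used so that the factorization check is not affected by floating-point rounding.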

arXiv.org e-Print Archive

Crossref

Research Papers in Economics

Event-based hyperspace analogue to language for query expansion

Author: Hou Yuexian
Maxwell Tamsin
Song Dawei
Yan Tingxu
Zhang Peng
Publication date: 01/07/2010

Bag-of-words approaches to information retrieval (IR) are effective but assume independence between words. The Hyperspace Analogue to Language (HAL) is a cognitively motivated and validated semantic space model that captures statistical dependencies between words by considering their co-occurrences in a surrounding window of text. HAL has been successfully applied to query expansion in IR, but has several limitations, including high processing cost and use of distributional statistics that do not exploit syntax. In this paper, we pursue two methods for incorporating syntactic-semantic information from textual ‘events’ into HAL. We build the HAL space directly from events to investigate whether processing costs can be reduced through more careful definition of word co-occurrence, and improve the quality of the pseudo-relevance feedback by applying event information as a constraint during HAL construction. Both methods significantly improve performance in comparison with the original HAL, and interpolation of HAL and relevance model expansion outperforms either method alone.
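
The window-based co-occurrence idea behind HAL can be sketched in a few lines. The version below is a minimal illustration, not the event-based method from the abstract: it assumes a plain token list, counts only preceding words, and uses a linear distance weight (`window - distance + 1`), which is one common choice and an assumption here. The function name `build_hal` is hypothetical.

```python
from collections import defaultdict


def build_hal(tokens, window=5):
    """Return hal[target][preceding_word] -> distance-weighted co-occurrence score."""
    hal = defaultdict(lambda: defaultdict(float))
    for i, target in enumerate(tokens):
        # Slide back over at most `window` words that precede the target.
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break
            # Assumed weighting: nearer words contribute more.
            hal[target][tokens[j]] += window - distance + 1
    return hal


tokens = "the cat sat on the mat".split()
hal = build_hal(tokens, window=2)
print(dict(hal["sat"]))
```

Each row of the resulting table can be read as a sparse vector for the target word; larger windows raise both the cost and the number of weak associations, which is the processing-cost limitation the abstract mentions.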

Open Research Online (The Open University)

Flexible sampling of discrete data correlations without the marginal distributions

Author: Kalaitzis Alfredo
Silva Ricardo
Publication date: 01/01/2013

Learning the joint dependence of discrete variables is a fundamental problem in machine learning, with many applications including prediction, clustering and dimensionality reduction. More recently, the framework of copula modeling has gained popularity due to its modular parametrization of joint distributions. Among other properties, copulas provide a recipe for combining flexible models for univariate marginal distributions with parametric families suitable for potentially high dimensional dependence structures. More radically, the extended rank likelihood approach of Hoff (2007) bypasses learning marginal models completely when such information is ancillary to the learning task at hand as in, e.g., standard dimensionality reduction problems or copula parameter estimation. The main idea is to represent data by their observable rank statistics, ignoring any other information from the marginals. Inference is typically done in a Bayesian framework with Gaussian copulas, and it is complicated by the fact that this implies sampling within a space where the number of constraints increases quadratically with the number of data points. The result is slow mixing when using off-the-shelf Gibbs sampling. We present an efficient algorithm based on recent advances in constrained Hamiltonian Markov chain Monte Carlo that is simple to implement and does not require paying for a quadratic cost in sample size.
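
To make the rank constraints concrete, the sketch below computes, for one observation, the interval its latent value must stay in so that the ordering of the latent values agrees with the ordering of the observed data. This only illustrates the order-preserving constraint; it is not the Hamiltonian Monte Carlo sampler described in the abstract, and the function name `latent_bounds` and the toy data are assumptions for this example.

```python
import math


def latent_bounds(y, z, i):
    """Interval for z[i] that keeps the order of z consistent with the order of y.

    y: observed values of one variable; z: current latent values (same length).
    """
    others = [j for j in range(len(y)) if j != i]
    # z[i] must exceed every latent value whose observation is strictly smaller.
    lower = max((z[j] for j in others if y[j] < y[i]), default=-math.inf)
    # z[i] must stay below every latent value whose observation is strictly larger.
    upper = min((z[j] for j in others if y[j] > y[i]), default=math.inf)
    return lower, upper


y = [0, 2, 1, 2]            # hypothetical discrete observations
z = [-1.0, 0.8, 0.1, 1.5]   # hypothetical latent values respecting the order of y
print(latent_bounds(y, z, 2))
```

Every pair of observations with different values contributes one such ordering constraint, which is why the number of constraints grows quadratically with the number of data points.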

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery