Looking at Vector Space and Language Models for IR using Density Matrices
In this work, we conduct a joint analysis of both Vector Space and Language
Models for IR using the mathematical framework of Quantum Theory. We shed light
on how both models allocate the space of density matrices. A density matrix is
shown to be a general representational tool that can leverage the strengths
of both VSM and LM representations, thus paving the way for a new generation of
retrieval models. We analyze the possible implications suggested by our
findings.
Comment: In Proceedings of Quantum Interaction 201
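The density-matrix representation described in the abstract above can be illustrated with a short sketch. This is a toy example of ours, not code from the paper: the term vectors and mixture weights are invented, and it only shows how a single density matrix carries both VSM-style directions and LM-style probability weights.

```python
import numpy as np

# Toy sketch: a density matrix built as a probability-weighted mixture
# of rank-1 projectors onto unit-norm term vectors. The vectors and
# weights are invented; the point is that one object combines
# VSM-style directions with LM-style probabilities.
terms = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0]])
terms = terms / np.linalg.norm(terms, axis=1, keepdims=True)
weights = np.array([0.5, 0.3, 0.2])   # sums to 1, like an LM distribution

rho = sum(w * np.outer(v, v) for w, v in zip(weights, terms))

# Density-matrix properties: symmetric, positive semi-definite, trace 1.
assert np.allclose(rho, rho.T)
assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)
assert np.isclose(np.trace(rho), 1.0)
```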
The Most Influential Paper Gerard Salton Never Wrote
Gerard Salton is often credited with developing the vector space model
(VSM) for information retrieval (IR). Citations to Salton give the impression
that the VSM must have been articulated as an IR model sometime between
1970 and 1975. However, the VSM as it is understood today evolved over a
longer time period than is usually acknowledged, and an articulation of the
model and its assumptions did not appear in print until several years after
those assumptions had been criticized and alternative models proposed. An
often cited overview paper titled "A Vector Space Model for Information
Retrieval" (alleged to have been published in 1975) does not exist, and
citations to it represent a confusion of two 1975 articles, neither of which
was an overview of the VSM as a model of information retrieval. Until the
late 1970s, Salton did not present vector spaces as models of IR generally
but rather as models of specific computations. Citations to the phantom
paper reflect an apparently widely held misconception that the operational
features and explanatory devices now associated with the VSM must have
been introduced at the same time it was first proposed as an IR model.
Personalized content retrieval in context using ontological knowledge
Personalized content retrieval aims at improving the retrieval process by taking into account the particular interests of individual users. However, not all user preferences are relevant in all situations. It is well known that human preferences are complex, multiple, heterogeneous, changing, even contradictory, and should be understood in context with the user's goals and tasks at hand. In this paper, we propose a method to build a dynamic representation of the semantic context of ongoing retrieval tasks, which is used to activate different subsets of user interests at runtime, so that out-of-context preferences are discarded. Our approach is based on an ontology-driven representation of the domain of discourse, providing enriched descriptions of the semantics involved in retrieval actions and preferences, and enabling the definition of effective means to relate preferences and context.
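The runtime activation of preference subsets can be sketched as follows. This is a hypothetical simplification of the paper's ontology-driven method: the concept list, weights, and threshold are all invented, and the ontology is reduced to a flat vector of concept weights.

```python
import numpy as np

# Hypothetical sketch of context-based preference activation: the
# ontology is reduced to a flat list of concepts, and both long-term
# preferences and the current task context are weight vectors over it.
# All names and numbers here are invented for illustration.
concepts = ["jazz", "classical", "football", "tennis"]
preferences = {"jazz": 0.9, "football": 0.7}      # long-term interests
context = np.array([0.8, 0.6, 0.0, 0.0])          # current task: music

def activate(preferences, context, threshold=0.1):
    """Keep only the preferences that overlap the current context."""
    active = {}
    for concept, weight in preferences.items():
        score = weight * context[concepts.index(concept)]
        if score > threshold:
            active[concept] = score
    return active

print(activate(preferences, context))  # only 'jazz' survives the music context
```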
A Quantum Many-body Wave Function Inspired Language Modeling Approach
The recently proposed quantum language model (QLM) aims at a principled
approach to modeling term dependency by applying quantum probability
theory. The latest development toward a more effective QLM has adopted word
embeddings as a kind of global dependency information and integrated the
quantum-inspired idea in a neural network architecture. While these
quantum-inspired LMs are theoretically more general and also practically
effective, they have two major limitations. First, they have not taken into
account the interaction among words with multiple meanings, which is common and
important in understanding natural language text. Second, the integration of
the quantum-inspired LM with the neural network was mainly for effective
training of parameters, yet lacking a theoretical foundation accounting for
such integration. To address these two issues, in this paper, we propose a
Quantum Many-body Wave Function (QMWF) inspired language modeling approach. The
QMWF-inspired LM adopts the tensor product to model this
interaction among words. It also reveals the inherent necessity of
using Convolutional Neural Network (CNN) in QMWF language modeling.
Furthermore, our approach delivers a simple algorithm to represent and match
text/sentence pairs. Systematic evaluation shows the effectiveness of the
proposed QMWF-LM algorithm, in comparison with the state of the art
quantum-inspired LMs and a couple of CNN-based methods, on three typical
Question Answering (QA) datasets.
Comment: 10 pages, 4 figures, CIK
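The tensor-product modeling of word interaction mentioned in the abstract above can be illustrated with a toy sketch. This is our own example with invented two-dimensional embeddings; the actual QMWF-LM works with full embedding spaces and CNN-based matching.

```python
import numpy as np

# Toy sketch of the tensor product as a joint word representation: the
# outer product couples every dimension of one embedding with every
# dimension of the other. The embedding values are invented.
w1 = np.array([0.6, 0.8])
w2 = np.array([1.0, 0.0])

joint = np.tensordot(w1, w2, axes=0)   # rank-1 tensor, shape (2, 2)

# For such a product state the norm factorizes.
assert np.isclose(np.linalg.norm(joint),
                  np.linalg.norm(w1) * np.linalg.norm(w2))
```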
Probabilistic hyperspace analogue to language
Song and Bruza introduce a framework for Information Retrieval (IR) based on Gärdenfors's three-tiered cognitive model, Conceptual Spaces. They instantiate a conceptual space using the Hyperspace Analogue to Language (HAL) to generate higher-order concepts, which are later used for ad-hoc retrieval. In this poster, we propose an alternative implementation of the conceptual space using a probabilistic HAL space (pHAL). To evaluate whether converting to such an implementation is beneficial, we performed an initial investigation comparing the concept combination of HAL against pHAL for the task of query expansion. Our experiments indicate that pHAL outperforms the original HAL method and that better query term selection methods can improve performance on both HAL and pHAL.
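A minimal sketch of how a HAL space can be made probabilistic, using a toy corpus of ours: HAL accumulates distance-weighted co-occurrence counts in a sliding window, and pHAL row-normalizes those counts into conditional probabilities. The window size and weighting here are standard HAL conventions, not necessarily the poster's exact settings.

```python
import numpy as np

# Toy sketch (invented corpus): HAL counts distance-weighted left
# co-occurrences in a sliding window; pHAL row-normalizes the counts
# into conditional probabilities p(context word | word).
corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
window = 2

hal = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for d in range(1, window + 1):
        if i - d >= 0:   # closer neighbours receive higher weight
            hal[idx[w], idx[corpus[i - d]]] += window - d + 1

row_sums = hal.sum(axis=1, keepdims=True)
phal = np.divide(hal, row_sums, out=np.zeros_like(hal), where=row_sums > 0)

# Each non-empty row of pHAL is a probability distribution.
assert np.allclose(phal[phal.sum(axis=1) > 0].sum(axis=1), 1.0)
```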