153,330 research outputs found
The State-of-the-arts in Focused Search
The continuous influx of various text data on the Web requires search engines to improve their retrieval abilities for more specific information. The need for relevant results to a userās topic of interest has gone beyond search for domain or type specific documents to more focused result (e.g. document fragments or answers to a query). The introduction of XML provides a format standard for data representation, storage, and exchange. It helps focused search to be carried out at different granularities of a structured document with XML markups. This report aims at reviewing the state-of-the-arts in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It is concluded with highlight of open problems
Questions, relevance and relative entropy
What is a question? According to Cox a question can be identified with the
set of assertions that constitute possible answers. In this paper we propose a
different approach that combines the notion that questions are requests for
information with the notion that probability distributions represent
uncertainties resulting from lack of information. This suggests that to each
probability distribution one can naturally associate that particular question
which requests the information that is missing and vice-versa. We propose to
represent questions q by probability distributions Next we consider how
questions relate to each other: to what extent is finding the answer to one
question relevant to answering another? A natural measure of relevance is
derived by requiring that it satisfy three desirable features (three axioms).
We find that the relevance of a question q to another question Q turns out to
be the relative entropy S[q,Q] of the corresponding distributions. An
application to statistical physics is briefly considered.Comment: Presented at MaxEnt 2004, the 24th International Workshop on Bayesian
Inference and Maximum Entropy Methods (July 25-30, 2004, Garching bei
Munchen, Germany
Contextual Media Retrieval Using Natural Language Queries
The widespread integration of cameras in hand-held and head-worn devices as
well as the ability to share content online enables a large and diverse visual
capture of the world that millions of users build up collectively every day. We
envision these images as well as associated meta information, such as GPS
coordinates and timestamps, to form a collective visual memory that can be
queried while automatically taking the ever-changing context of mobile users
into account. As a first step towards this vision, in this work we present
Xplore-M-Ego: a novel media retrieval system that allows users to query a
dynamic database of images and videos using spatio-temporal natural language
queries. We evaluate our system using a new dataset of real user queries as
well as through a usability study. One key finding is that there is a
considerable amount of inter-user variability, for example in the resolution of
spatial relations in natural language utterances. We show that our retrieval
system can cope with this variability using personalisation through an online
learning-based retrieval formulation.Comment: 8 pages, 9 figures, 1 tabl
Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion
Fact-centric information needs are rarely one-shot; users typically ask follow-up questions to explore a topic. In such a conversational setting, the user's inputs are often incomplete, with entities or predicates left out, and ungrammatical phrases. This poses a huge challenge to question answering (QA) systems that typically rely on cues in full-fledged interrogative sentences. As a solution, we develop CONVEX: an unsupervised method that can answer incomplete questions over a knowledge graph (KG) by maintaining conversation context using entities and predicates seen so far and automatically inferring missing or ambiguous pieces for follow-up questions. The core of our method is a graph exploration algorithm that judiciously expands a frontier to find candidate answers for the current question. To evaluate CONVEX, we release ConvQuestions, a crowdsourced benchmark with 11,200 distinct conversations from five different domains. We show that CONVEX: (i) adds conversational support to any stand-alone QA system, and (ii) outperforms state-of-the-art baselines and question completion strategies
Visual Question Answering: A Survey of Methods and Datasets
Visual Question Answering (VQA) is a challenging task that has received
increasing attention from both the computer vision and the natural language
processing communities. Given an image and a question in natural language, it
requires reasoning over visual elements of the image and general knowledge to
infer the correct answer. In the first part of this survey, we examine the
state of the art by comparing modern approaches to the problem. We classify
methods by their mechanism to connect the visual and textual modalities. In
particular, we examine the common approach of combining convolutional and
recurrent neural networks to map images and questions to a common feature
space. We also discuss memory-augmented and modular architectures that
interface with structured knowledge bases. In the second part of this survey,
we review the datasets available for training and evaluating VQA systems. The
various datatsets contain questions at different levels of complexity, which
require different capabilities and types of reasoning. We examine in depth the
question/answer pairs from the Visual Genome project, and evaluate the
relevance of the structured annotations of images with scene graphs for VQA.
Finally, we discuss promising future directions for the field, in particular
the connection to structured knowledge bases and the use of natural language
processing models.Comment: 25 page
Negative Statements Considered Useful
Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities
- ā¦