44,704 research outputs found
ANTIQUE: A Non-Factoid Question Answering Benchmark
Considering the widespread use of mobile and voice search, answer passage
retrieval for non-factoid questions plays a critical role in modern information
retrieval systems. Despite the importance of the task, the community still
feels the significant lack of large-scale non-factoid question answering
collections with real questions and comprehensive relevance judgments. In this
paper, we develop and release a collection of 2,626 open-domain non-factoid
questions from a diverse set of categories. The dataset, called ANTIQUE,
contains 34,011 manual relevance annotations. The questions were asked by real
users in a community question answering service, i.e., Yahoo! Answers.
Relevance judgments for all the answers to each question were collected through
crowdsourcing. To facilitate further research, we also include a brief analysis
of the data as well as baseline results on both classical and recently
developed neural IR models
Finding Structured and Unstructured Features to Improve the Search Result of Complex Question
-Recently, search engine got challenge deal with such a natural language questions.
Sometimes, these questions are complex questions. A complex question is a question that
consists several clauses, several intentions or need long answer.
In this work we proposed that finding structured features and unstructured features of
questions and using structured data and unstructured data could improve the search result
of complex questions. According to those, we will use two approaches, IR approach and
structured retrieval, QA template.
Our framework consists of three parts. Question analysis, Resource Discovery and
Analysis The Relevant Answer. In Question Analysis we used a few assumptions, and
tried to find structured and unstructured features of the questions. Structured feature
refers to Structured data and unstructured feature refers to unstructured data. In the
resource discovery we integrated structured data (relational database) and unstructured
data (webpage) to take the advantaged of two kinds of data to improve and reach the
relevant answer. We will find the best top fragments from context of the webpage In the
Relevant Answer part, we made a score matching between the result from structured data
and unstructured data, then finally used QA template to reformulate the question.
In the experiment result, it shows that using structured feature and unstructured
feature and using both structured and unstructured data, using approach IR and QA
template could improve the search result of complex questions
EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets
This article introduces a new language-independent approach for creating a
large-scale high-quality test collection of tweets that supports multiple
information retrieval (IR) tasks without running a shared-task campaign. The
adopted approach (demonstrated over Arabic tweets) designs the collection
around significant (i.e., popular) events, which enables the development of
topics that represent frequent information needs of Twitter users for which
rich content exists. That inherently facilitates the support of multiple tasks
that generally revolve around events, namely event detection, ad-hoc search,
timeline generation, and real-time summarization. The key highlights of the
approach include diversifying the judgment pool via interactive search and
multiple manually-crafted queries per topic, collecting high-quality
annotations via crowd-workers for relevancy and in-house annotators for
novelty, filtering out low-agreement topics and inaccessible tweets, and
providing multiple subsets of the collection for better availability. Applying
our methodology on Arabic tweets resulted in EveTAR , the first
freely-available tweet test collection for multiple IR tasks. EveTAR includes a
crawl of 355M Arabic tweets and covers 50 significant events for which about
62K tweets were judged with substantial average inter-annotator agreement
(Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating
existing algorithms in the respective tasks. Results indicate that the new
collection can support reliable ranking of IR systems that is comparable to
similar TREC collections, while providing strong baseline results for future
studies over Arabic tweets
Reading Wikipedia to Answer Open-Domain Questions
This paper proposes to tackle open- domain question answering using Wikipedia
as the unique knowledge source: the answer to any factoid question is a text
span in a Wikipedia article. This task of machine reading at scale combines the
challenges of document retrieval (finding the relevant articles) with that of
machine comprehension of text (identifying the answer spans from those
articles). Our approach combines a search component based on bigram hashing and
TF-IDF matching with a multi-layer recurrent neural network model trained to
detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA
datasets indicate that (1) both modules are highly competitive with respect to
existing counterparts and (2) multitask learning using distant supervision on
their combination is an effective complete system on this challenging task.Comment: ACL2017, 10 page
Machine learning research 1989-90
Multifunctional knowledge bases offer a significant advance in artificial intelligence because they can support numerous expert tasks within a domain. As a result they amortize the costs of building a knowledge base over multiple expert systems and they reduce the brittleness of each system. Due to the inevitable size and complexity of multifunctional knowledge bases, their construction and maintenance require knowledge engineering and acquisition tools that can automatically identify interactions between new and existing knowledge. Furthermore, their use requires software for accessing those portions of the knowledge base that coherently answer questions. Considerable progress was made in developing software for building and accessing multifunctional knowledge bases. A language was developed for representing knowledge, along with software tools for editing and displaying knowledge, a machine learning program for integrating new information into existing knowledge, and a question answering system for accessing the knowledge base
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed on the
design of the test system and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion the project dissemination programme and future work are
outlined
Planning strategically, designing architecturally : a framework for digital library services
In an era of unprecedented technological innovation and evolving user expectations and information seeking behaviour, we are arguably now an online society, with digital services increasingly common and increasingly preferred. As a trusted information provider, libraries are in an advantageous position to respond, but this requires integrated strategic and enterprise architecture planning, for information technology (IT) has evolved from a support role to a strategic role, providing the core management systems, communication networks, and delivery channels of the modern library. Further, IT components do not function in isolation from one another, but are interdependent elements of distributed and multidimensional systems encompassing people, processes, and technologies, which must consider social, economic, legal, organisational, and ergonomic requirements and relationships, as well as being logically sound from a technical perspective. Strategic planning provides direction, while enterprise architecture strategically aligns and holistically integrates business and information system architectures. While challenging, such integrated planning should be regarded as an opportunity for the library to evolve as an enterprise in the digital age, or at minimum, to simply keep pace with societal change and alternative service providers. Without strategy, a library risks being directed by outside forces with independent motivations and inadequate understanding of its broader societal role. Without enterprise architecture, it risks technological disparity, redundancy, and obsolescence. Adopting an interdisciplinary approach, this conceptual paper provides an integrated framework for strategic and architectural planning of digital library services. The concept of the library as an enterprise is also introduced
- …