Information fusion for automated question answering
Until recently, research efforts in automated Question Answering (QA) have mainly
focused on getting a good understanding of questions to retrieve correct answers. This
includes deep parsing, lookups in ontologies, question typing and machine learning
of answer patterns appropriate to question forms. In contrast, I have focused on the
analysis of the relationships between answer candidates as provided in open domain
QA on multiple documents. I argue that such candidates have intrinsic properties,
partly regardless of the question, and those properties can be exploited to provide better
quality and more user-oriented answers in QA.

Information fusion refers to the technique of merging pieces of information from
different sources. In QA over free text, it is motivated by the frequency with which
different answer candidates are found in different locations, leading to a multiplicity
of answers. The reason for such multiplicity is, in part, the massive amount of data
used for answering, and also its unstructured and heterogeneous content: besides
ambiguities in user questions leading to heterogeneity in extractions, systems have to deal
with redundancy, granularity and possible contradictory information. Hence the need
for answer candidate comparison. While frequency has proved to be a significant
characteristic of a correct answer, I evaluate the value of other relationships characterizing
answer variability and redundancy.

Partially inspired by recent developments in multi-document summarization, I
redefine the concept of "answer" within an engineering approach to QA based on the
Model-View-Controller (MVC) pattern of user interface design. An "answer model"
is a directed graph in which nodes correspond to entities projected from extractions
and edges convey relationships between such nodes. The graph represents the fusion
of information contained in the set of extractions. Different views of the answer model
can be produced, capturing the fact that the same answer can be expressed and
presented in various ways: picture, video, sound, written or spoken language, or a formal
data structure. Within this framework, an answer is a structured object contained in the
model and retrieved by a strategy to build a particular view depending on the end user
(or task)'s requirements.

I describe shallow techniques to compare entities and enrich the model by
discovering four broad categories of relationships between entities in the model: equivalence,
inclusion, aggregation and alternative. Quantitatively, answer candidate modeling
improves answer extraction accuracy. It also proves to be more robust to incorrect answer
candidates than traditional techniques. Qualitatively, models provide meta-information
encoded by relationships that allow shallow reasoning to help organize and generate
the final output.
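As a minimal sketch of the data structure described above (illustrative names only, not the thesis's actual implementation), the answer model can be represented as a directed graph whose nodes are entities projected from extractions and whose edges carry one of the four relationship categories:

```python
# Sketch of an "answer model": a directed graph with labelled edges.
# The four relation labels come from the abstract; entity names are
# hypothetical examples.
RELATIONS = {"equivalence", "inclusion", "aggregation", "alternative"}

class AnswerModel:
    def __init__(self):
        self.nodes = set()   # entities projected from extractions
        self.edges = {}      # (source, target) -> relation label

    def relate(self, src, dst, relation):
        """Add a directed, labelled edge between two entities."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.nodes.update((src, dst))
        self.edges[(src, dst)] = relation

    def view(self, relation):
        """One possible 'view' of the model: all pairs linked by a relation."""
        return [pair for pair, r in self.edges.items() if r == relation]

model = AnswerModel()
model.relate("J. F. Kennedy", "John Fitzgerald Kennedy", "equivalence")
model.relate("1963", "November 1963", "inclusion")
print(model.view("equivalence"))
```

Different views (spoken, written, structured) would then be built by strategies that traverse this graph rather than by returning a single extracted string.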
Ontological Matchmaking in Recommender Systems
The electronic marketplace offers great potential for the recommendation of
supplies. In so-called recommender systems, it is crucial to apply
matchmaking strategies that faithfully satisfy the predicates specified in the
demand, and take into account as much as possible the user preferences. We
focus on real-life ontology-driven matchmaking scenarios and identify a number
of challenges inspired by them. A key challenge is that of
presenting the results to the users in an understandable and clear-cut fashion
in order to facilitate their analysis. Indeed, such scenarios
call for the ability to rank and group the results according to specific
criteria. A further challenge consists of presenting the results to the user in
an asynchronous fashion, i.e. a 'push' mode alongside the usual 'pull' mode, in
which the user explicitly issues a query and the system displays the results. Moreover,
an important issue to consider in real-life cases is the possibility of
submitting a query to multiple providers, and collecting the various results.
We have designed and implemented an ontology-based matchmaking system that
suitably addresses the above challenges. We have conducted a comprehensive
experimental study to investigate the usability of the system and the
performance and effectiveness of the matchmaking strategies on real
ontological datasets.

Comment: 28 pages, 8 figures
Motivation to Respond on Stack Overflow Q&A Website
Abstract. The importance of using Q&A sites such as Stack Overflow and Code Project as a way to solve developers' potential problems is obvious to everyone. The objective of this research was to increase the participation rate and responsiveness of developers on the Stack Overflow website by improving its gamification methods. To present the proposed solution, a tool called Stack Overflow Super Gamification (SSG) was proposed, which is an extension for Eclipse. The purpose of this extension is to create an ongoing competition and motivation among developers to participate in answering questions on the Stack Overflow site. In this extension, the ranking practices for active users on the site are improved so that continued participation in the site earns more privileges. Also, the ranking structure of users of various nationalities who have gained privileges was used to create motivation and competition among developers of different countries. Rewards for users in this extension, for example offering superior job opportunities based on higher privileges, as well as providing a free opportunity to advertise products or businesses and to demonstrate personal abilities and talent, will make them more willing to participate and will provide the incentive to stay active on the site. The proposed solutions will not only lead to more activity and answers to more questions, but will also bring valuable achievements to developers active on the site. According to the evaluations, the performance of the proposed solution in motivating developers to participate and answer questions on the Stack Overflow site is acceptable. Since the purpose of this strategy is to encourage developers to participate effectively on the site, the evaluation results clearly reflect the usefulness of this solution in motivating developers.
The results indicate that while the developers are actively involved, the number of unanswered questions, as well as of unacceptable responses, is reduced, while the quality of the responses given is acceptable in terms of brevity, completeness, and accuracy.

Keywords: Gamification, Motivation on Stack Overflow website, Q&A sites
Concept-based Interactive Query Expansion Support Tool (CIQUEST)
This report describes a three-year project (2000-03) undertaken in the Information Studies
Department at The University of Sheffield and funded by Resource, The Council for
Museums, Archives and Libraries. The overall aim of the research was to provide user
support for query formulation and reformulation in searching large-scale textual resources
including those of the World Wide Web. More specifically the objectives were: to investigate
and evaluate methods for the automatic generation and organisation of concepts derived from
retrieved document sets, based on statistical methods for term weighting; and to conduct
user-based evaluations on the understanding, presentation and retrieval effectiveness of
concept structures in selecting candidate terms for interactive query expansion.
The TREC test collection formed the basis for the seven evaluative experiments conducted in
the course of the project. These formed four distinct phases in the project plan. In the first
phase, a series of experiments was conducted to investigate further techniques for concept
derivation and hierarchical organisation and structure. The second phase was concerned with
user-based validation of the concept structures. Results of phases 1 and 2 informed the
design of the test system, and the user interface was developed in phase 3. The final phase
entailed a user-based summative evaluation of the CiQuest system.
The main findings demonstrate that concept hierarchies can effectively be generated from
sets of retrieved documents and displayed to searchers in a meaningful way. The approach
provides the searcher with an overview of the contents of the retrieved documents, which in
turn facilitates the viewing of documents and selection of the most relevant ones. Concept
hierarchies are a good source of terms for query expansion and can improve precision. The
extraction of descriptive phrases as an alternative source of terms was also effective. With
respect to presentation, cascading menus were easy to browse for selecting terms and for
viewing documents. In conclusion, the project dissemination programme and future work are
outlined.
ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters
To bridge the gap between the capabilities of the state-of-the-art in factoid
question answering (QA) and what users ask, we need large datasets of real user
questions that capture the various question phenomena users are interested in,
and the diverse ways in which these questions are formulated. We introduce
ComQA, a large dataset of real user questions that exhibit different
challenging aspects such as compositionality, temporal reasoning, and
comparisons. ComQA questions come from the WikiAnswers community QA platform,
which typically contains questions that are not satisfactorily answerable by
existing search engine technology. Through a large crowdsourcing effort, we
clean the question dataset, group questions into paraphrase clusters, and
annotate clusters with their answers. ComQA contains 11,214 questions grouped
into 4,834 paraphrase clusters. We detail the process of constructing ComQA,
including the measures taken to ensure its high quality while making effective
use of crowdsourcing. We also present an extensive analysis of the dataset and
the results achieved by state-of-the-art systems on ComQA, demonstrating that
our dataset can be a driver of future research on QA.

Comment: 11 pages, NAACL 201
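A paraphrase-clustered dataset of the kind described can be pictured as follows (field names and examples are hypothetical, not the actual ComQA schema): each cluster groups paraphrases of one question and is annotated with a shared answer set.

```python
# Sketch of a paraphrase-clustered QA dataset in the spirit of ComQA.
# Field names and questions are illustrative, not the real schema.
clusters = [
    {
        "cluster_id": "c1",
        "questions": [
            "who was the first person to walk on the moon?",
            "first man that walked on the moon?",
        ],
        "answers": ["Neil Armstrong"],
    },
    {
        "cluster_id": "c2",
        "questions": ["which river is longer, the nile or the amazon?"],
        "answers": ["Nile"],
    },
]

def answer_for(question, clusters):
    """Look up the annotated answers for any paraphrase of a question."""
    for cluster in clusters:
        if question in cluster["questions"]:
            return cluster["answers"]
    return None

print(answer_for("first man that walked on the moon?", clusters))
```

Grouping paraphrases this way lets a single answer annotation cover every formulation of the question, which is what makes the clusters useful for both training and evaluation.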
Neural Response Ranking for Social Conversation: A Data-Efficient Approach
The overall objective of 'social' dialogue systems is to support engaging,
entertaining, and lengthy conversations on a wide variety of topics, including
social chit-chat. Apart from raw dialogue data, user-provided ratings are the
most common signal used to train such systems to produce engaging responses. In
this paper we show that social dialogue systems can be trained effectively from
raw unannotated data. Using a dataset of real conversations collected in the
2017 Alexa Prize challenge, we developed a neural ranker for selecting 'good'
system responses to user utterances, i.e. responses which are likely to lead to
long and engaging conversations. We show that (1) our neural ranker
consistently outperforms several strong baselines when trained to optimise for
user ratings; (2) when trained on larger amounts of data and only using
conversation length as the objective, the ranker performs better than the one
trained using ratings -- ultimately reaching a Precision@1 of 0.87. This
advance will make data collection for social conversational agents simpler and
less expensive in the future.

Comment: 2018 EMNLP Workshop SCAI: The 2nd International Workshop on
Search-Oriented Conversational AI. Brussels, Belgium, October 31, 201
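The Precision@1 figure quoted above measures how often the ranker's top-ranked candidate is a 'good' response. A minimal sketch of the metric on toy data (illustrative, not the paper's evaluation code):

```python
def precision_at_1(ranked_lists, good_labels):
    """Fraction of utterances whose top-ranked candidate is labelled good.

    ranked_lists: per-utterance candidate responses, best-first.
    good_labels:  per-utterance sets of responses labelled 'good'.
    """
    hits = sum(1 for ranked, good in zip(ranked_lists, good_labels)
               if ranked and ranked[0] in good)
    return hits / len(ranked_lists)

# Toy data: 4 utterances; the ranker's top pick is good for 3 of them.
ranked = [["r1", "r2"], ["r3", "r4"], ["r5"], ["r6", "r7"]]
good = [{"r1"}, {"r4"}, {"r5"}, {"r6"}]
print(precision_at_1(ranked, good))  # 0.75 on this toy data
```

A Precision@1 of 0.87 therefore means the top-ranked response was a good one for 87% of the evaluated utterances.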