10,557 research outputs found
Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems
With the development of Natural Language Processing (NLP), more and more systems adopt NLP in their user interface modules to process user input, in order to communicate with users in a natural way. However, this raises a speed problem: if the NLP module cannot process sentences within a tolerable delay, users will not use the system. As a result, systems with strict processing-time requirements, such as dialogue systems, web search systems, automatic customer service systems, and especially real-time systems, have had to abandon the NLP module in order to obtain a faster system response. This paper aims to solve the speed problem. First, the construction of a syntactic parser based on corpus-driven machine learning and statistical models is introduced, and a speed analysis is then performed on the parser and its algorithms. Based on this analysis, two accelerating methods, Compressed POS Set and Syntactic Patterns Pruning, are proposed, which effectively improve the time efficiency of parsing in the NLP module. To evaluate different parameters of the accelerating algorithms, two new factors, PT and RT, are introduced and explained in detail. Experiments confirm the effectiveness of these methods, which should contribute to practical applications of NLP. Comment: 7 pages, International Conference on Artificial Intelligence (ICAI'07)
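The Compressed POS Set idea can be illustrated with a toy sketch. The coarse tag mapping below is invented for illustration and is not the paper's actual tag set; it only shows how collapsing fine-grained POS tags shrinks the tag inventory a parser must search over.

```python
# Illustrative sketch of a "compressed POS set": fine-grained tags are
# mapped onto a coarse set so the parser's search space shrinks.
# The tag inventory and mapping below are hypothetical examples.
COARSE = {
    "NN": "N", "NNS": "N", "NNP": "N",   # all nouns collapse to N
    "VB": "V", "VBD": "V", "VBZ": "V",   # all verbs collapse to V
    "JJ": "A", "JJR": "A",               # adjectives collapse to A
}

def compress(tags):
    """Replace each fine-grained tag with its coarse class."""
    return [COARSE.get(t, t) for t in tags]

print(compress(["NNP", "VBZ", "JJ", "NNS"]))  # ['N', 'V', 'A', 'N']
```

Fewer distinct tags means fewer distinct tag sequences, so fewer syntactic patterns need to be matched per sentence.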
A Hybrid Approach using Ontology Similarity and Fuzzy Logic for Semantic Question Answering
One of the challenges in information retrieval is providing accurate answers to a user's question, which is often expressed with uncertainty words. Most answers are based on a syntactic approach rather than a semantic analysis of the query. In this paper, our objective is to present a hybrid approach for a semantic question answering retrieval system using ontology similarity and fuzzy logic. We use a fuzzy co-clustering algorithm to retrieve the collection of documents based on ontology similarity. The fuzzy scale uses type-1 fuzzy sets for documents and type-2 fuzzy sets for words to prioritize answers. The objective of this work is to provide a retrieval system with more accurate answers than a non-fuzzy semantic ontology approach.
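As a rough sketch of how fuzzy membership can rank answers to a question containing an uncertainty word: the triangular type-1 membership function and all scores below are invented for illustration, not the paper's actual model.

```python
# Hedged sketch: rank candidate documents by combining an ontology-similarity
# score with a type-1 (triangular) fuzzy membership for an uncertainty word
# such as "about 100". All numbers and names here are illustrative.
def triangular(x, a, b, c):
    """Type-1 triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def rank(docs):
    """docs: list of (name, ontology_similarity, value_mentioned)."""
    # Membership of the mentioned value in the fuzzy set "about 100".
    scored = [(name, sim * triangular(v, 80, 100, 120)) for name, sim, v in docs]
    return sorted(scored, key=lambda p: -p[1])

docs = [("d1", 0.9, 95), ("d2", 0.7, 150), ("d3", 0.6, 100)]
print(rank(docs))  # d1 ranks first: high similarity and value near 100
```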
Why we have switched from building full-fledged taxonomies to simply detecting hypernymy relations
The study of taxonomies and hypernymy relations has been extensive in the Natural Language Processing (NLP) literature. However, the evaluation of taxonomy learning approaches has traditionally been troublesome, as it mainly relies on ad-hoc experiments which are hardly reproducible and costly in manual effort. Partly because of this, current research has lately been focusing on the hypernymy detection task. In this paper we reflect on this trend, analyzing issues related to current evaluation procedures. Finally, we propose three potential avenues for future work so that is-a relations, and resources based on them, play a more important role in downstream NLP applications. Comment: Discussion paper. 6 pages, 1 figure
Boosting Question Answering by Deep Entity Recognition
In this paper an open-domain factoid question answering system for Polish, RAFAEL, is presented. The system goes beyond finding an answering sentence; it also extracts a single string corresponding to the required entity. Herein the focus is placed on different approaches to entity recognition, essential for retrieving information matching question constraints. Apart from the traditional approach, which includes named entity recognition (NER) solutions, a novel technique called Deep Entity Recognition (DeepER) is introduced and implemented. It allows a comprehensive search of all forms of entity references matching a given WordNet synset (e.g. an impressionist), based on a previously assembled entity library, created by analysing the first sentences of encyclopaedia entries as well as disambiguation and redirect pages. DeepER also provides automatic evaluation, which makes numerous experiments possible, including over a thousand questions from a quiz TV show answered using Polish Wikipedia. The final results of a manual evaluation on a separate question set show that the strength of the DeepER approach lies in its ability to answer questions that demand answers beyond the traditional categories of named entities.
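The entity-library idea can be sketched in miniature: surface forms are indexed under the WordNet-style synsets they belong to, so a question constraint such as "an impressionist" can be matched against all known reference forms. The library contents below are invented examples, not DeepER's actual data.

```python
# Toy entity library: synset label -> set of known surface forms.
# Contents are fabricated for illustration only.
ENTITY_LIBRARY = {
    "impressionist": {"Claude Monet", "Monet", "Edgar Degas", "Degas"},
    "composer": {"Fryderyk Chopin", "Chopin"},
}

def find_answer(constraint_synset, candidate_strings):
    """Return candidates whose surface form matches the required synset."""
    forms = ENTITY_LIBRARY.get(constraint_synset, set())
    return [c for c in candidate_strings if c in forms]

print(find_answer("impressionist", ["Chopin", "Monet", "Warsaw"]))  # ['Monet']
```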
A survey of Community Question Answering
With the advent of numerous community forums, the tasks associated with them have gained importance in the recent past. Given the influx of new questions every day on these forums, identifying methods to find answers to those questions, or even to detect duplicate questions, is of practical importance and challenging in its own right. This paper surveys some of these issues and the methods proposed for tackling them.
QuASE: Question-Answer Driven Sentence Encoding
Question-answering (QA) data often encodes essential information in many
facets. This paper studies a natural question: Can we get supervision from QA
data for other tasks (typically, non-QA ones)? For example, {\em can we use
QAMR (Michael et al., 2017) to improve named entity recognition?} We suggest
that simply further pre-training BERT is often not the best option, and propose
the {\em question-answer driven sentence encoding (QuASE)} framework. QuASE
learns representations from QA data, using BERT or other state-of-the-art
contextual language models. In particular, we observe the need to distinguish
between two types of sentence encodings, depending on whether the target
task takes a single- or multi-sentence input; in both cases, the resulting
encoding is shown to be an easy-to-use plugin for many downstream tasks. This
work may point out an alternative way to supervise NLP tasks.
Learning to Ask: Neural Question Generation for Reading Comprehension
We study automatic question generation for sentences from text passages in
reading comprehension. We introduce an attention-based sequence learning model
for the task and investigate the effect of encoding sentence- vs.
paragraph-level information. In contrast to all previous work, our model does
not rely on hand-crafted rules or a sophisticated NLP pipeline; it is instead
trainable end-to-end via sequence-to-sequence learning. Automatic evaluation
results show that our system significantly outperforms the state-of-the-art
rule-based system. In human evaluations, questions generated by our system are
also rated as being more natural (i.e., grammaticality, fluency) and as more
difficult to answer (in terms of syntactic and lexical divergence from the
original text and reasoning needed to answer). Comment: Accepted to ACL 2017, 11 pages
A Syntactic Approach to Domain-Specific Automatic Question Generation
Factoid questions are questions that require short fact-based answers.
Automatic question generation (AQG) from a given text can
contribute to educational activities, interactive question answering systems,
search engines, and other applications. The goal of our research is to generate
factoid source-question-answer triplets based on a specific domain. We propose
a four-component pipeline, which obtains as input a training corpus of
domain-specific documents, along with a set of declarative sentences from the
same domain, and generates as output a set of factoid questions that refer to
the source sentences but are slightly different from them, so that a
question-answering system or a person can be asked a question that requires a
deeper understanding and knowledge than simple word matching. Contrary to
existing domain-specific AQG systems that utilize the template-based approach
to question generation, we propose to transform each source sentence into a set
of questions by applying a series of domain-independent rules (a
syntactic-based approach). Our pipeline was evaluated in the domain of cyber
security using a series of experiments on each component of the pipeline
separately and on the end-to-end system. The proposed approach generated a
higher percentage of acceptable questions than a prior state-of-the-art AQG
system.
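A single domain-independent transformation rule of the kind described can be sketched as follows. Real systems of this type operate on parse trees; the regex below is only an illustrative stand-in, and the example sentence is invented.

```python
import re

# Minimal sketch of a syntactic, domain-independent rule: a declarative
# "X is Y." sentence becomes the factoid question "What is X?" with Y as
# the answer. Regexes here stand in for parse-tree operations.
RULE = re.compile(r"^(?P<subj>.+?) is (?P<obj>.+?)\.$")

def sentence_to_qa(sentence):
    m = RULE.match(sentence)
    if not m:
        return None
    subj = m.group("subj")
    question = f"What is {subj[0].lower() + subj[1:]}?"
    return question, m.group("obj")

print(sentence_to_qa("A firewall is a network security device."))
# ('What is a firewall?', 'a network security device')
```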
Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs
Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and being able to cope with rapidly evolving ad hoc topics and formulation style in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers by an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines.
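The quasi-KG idea can be illustrated with a toy graph. The graph below is fabricated, and a simple closest-node heuristic stands in for the weighted Group Steiner Tree computation that QUEST actually uses; it only shows how an answer entity can emerge as the node connecting the question's terminals.

```python
from collections import deque

# Toy quasi KG: nodes are entity names and relational phrases pulled from
# (here, invented) documents. As a crude stand-in for Group Steiner Trees,
# pick the entity node with the smallest total BFS distance to all
# question terminals.
GRAPH = {
    "Nolan":     {"directed"},
    "directed":  {"Nolan", "Inception", "Dunkirk"},
    "Inception": {"directed", "stars"},
    "Dunkirk":   {"directed"},
    "stars":     {"Inception", "DiCaprio"},
    "DiCaprio":  {"stars"},
}
ENTITY_NODES = {"Nolan", "Inception", "Dunkirk", "DiCaprio"}

def dist_from(src):
    """BFS distances from src to every reachable node."""
    seen = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in GRAPH[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                q.append(v)
    return seen

def best_answer(terminals):
    """Entity node (not itself a terminal) closest, in total, to all terminals."""
    dists = [dist_from(t) for t in terminals]
    cands = [n for n in ENTITY_NODES - set(terminals) if all(n in d for d in dists)]
    return min(cands, key=lambda n: sum(d[n] for d in dists))

# Terminals from a question joining two documents' evidence:
print(best_answer(["Nolan", "DiCaprio"]))  # Inception
```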
Creating and Characterizing a Diverse Corpus of Sarcasm in Dialogue
The use of irony and sarcasm in social media allows us to study them at scale
for the first time. However, their diversity has made it difficult to construct
a high-quality corpus of sarcasm in dialogue. Here, we describe the process of
creating a large-scale, highly-diverse corpus of online debate forums
dialogue, and our novel methods for operationalizing classes of sarcasm in the
form of rhetorical questions and hyperbole. We show that we can use
lexico-syntactic cues to reliably retrieve sarcastic utterances with high
accuracy. To demonstrate the properties and quality of our corpus, we conduct
supervised learning experiments with simple features, and show that we achieve
both higher precision and F-score than previous work on sarcasm in debate forums
dialogue. We apply a weakly-supervised linguistic pattern learner and
qualitatively analyze the linguistic differences in each class. Comment: 11 pages, 4 figures, SIGDIAL 201
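Lexico-syntactic cue retrieval of the kind described can be sketched with simple patterns. The cue lists below are invented illustrations, not the paper's actual patterns; they only show how surface cues can flag the two operationalized classes of sarcasm.

```python
import re

# Sketch of lexico-syntactic cue retrieval for two classes of sarcasm:
# rhetorical questions (a question mark with the speaker continuing) and
# hyperbole (extreme-degree expressions). Cue lists are illustrative only.
RHETORICAL_Q = re.compile(r"\?\s+\w")  # question mark, then the speaker goes on
HYPERBOLE = re.compile(r"\b(totally|absolutely|literally the (best|worst))\b", re.I)

def cue_classes(utterance):
    """Return the sarcasm cue classes whose patterns fire on the utterance."""
    cues = []
    if RHETORICAL_Q.search(utterance):
        cues.append("rhetorical_question")
    if HYPERBOLE.search(utterance):
        cues.append("hyperbole")
    return cues

print(cue_classes("Oh really? Because that's literally the best argument ever."))
# ['rhetorical_question', 'hyperbole']
```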