
    Accelerating and Evaluation of Syntactic Parsing in Natural Language Question Answering Systems

    With the development of Natural Language Processing (NLP), more and more systems adopt NLP in the user interface module to process user input, in order to communicate with users in a natural way. However, this raises a speed problem: if the NLP module cannot process sentences within a tolerable delay, users will not use the system. As a result, systems that are strict about processing time, such as dialogue systems, web search systems, and automatic customer service systems, and especially real-time systems, have to abandon the NLP module in order to obtain a faster system response. This paper aims to solve this speed problem. First, the construction of a syntactic parser based on corpus machine learning and a statistical model is introduced; a speed analysis is then performed on the parser and its algorithms. Based on this analysis, two accelerating methods, Compressed POS Set and Syntactic Patterns Pruning, are proposed, which effectively improve the time efficiency of parsing in the NLP module. To evaluate different parameters in the accelerating algorithms, two new factors, PT and RT, are introduced and explained in detail. Experiments are also completed to test and validate these methods, which will contribute to the application of NLP.
    Comment: 7 pages, International Conference on Artificial Intelligence (ICAI'07)
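The Compressed POS Set idea can be illustrated with a small sketch: merging fine-grained part-of-speech tags into coarser classes shrinks the tag alphabet the parser must consider, which reduces the number of syntactic patterns to match. The tag names and mapping below are illustrative assumptions, not the paper's actual compressed set.

```python
# Hypothetical compression table: fine-grained POS tags collapse into
# coarse classes, shrinking the parser's search space.
COMPRESSED_POS = {
    "NN": "N", "NNS": "N", "NNP": "N",   # noun variants -> one noun class
    "VB": "V", "VBD": "V", "VBZ": "V",   # verb variants -> one verb class
    "JJ": "A", "JJR": "A",               # adjective variants
    "DT": "F", "IN": "F",                # function words
}

def compress(tag_sequence):
    """Map a fine-grained POS sequence to its compressed form."""
    return [COMPRESSED_POS.get(t, t) for t in tag_sequence]

print(compress(["DT", "JJ", "NN", "VBZ", "NNP"]))  # ['F', 'A', 'N', 'V', 'N']
```

With 4 compressed classes instead of 8 fine-grained tags, any pattern table indexed by tag sequences shrinks combinatorially, which is where the speed gain would come from.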

    A Hybrid Approach using Ontology Similarity and Fuzzy Logic for Semantic Question Answering

    One of the challenges in information retrieval is providing accurate answers to a user's question, which is often expressed with uncertainty words. Most answers are based on a syntactic approach rather than a semantic analysis of the query. In this paper, our objective is to present a hybrid approach for a semantic question answering retrieval system using ontology similarity and fuzzy logic. We use a fuzzy co-clustering algorithm to retrieve the collection of documents based on ontology similarity. The fuzzy scale uses fuzzy type-1 for documents and fuzzy type-2 for words to prioritize answers. The objective of this work is to provide a retrieval system with more accurate answers than a non-fuzzy semantic ontology approach.
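A minimal sketch of how a type-1 fuzzy membership could prioritize documents by their ontology-similarity score; the triangular membership function and its breakpoints are assumptions for illustration, not the paper's actual fuzzy scale.

```python
def triangular_membership(x, a, b, c):
    """Degree to which x belongs to a fuzzy set with support (a, c) and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Rank documents by the fuzzy degree of their ontology-similarity score
# belonging to a "relevant" set (breakpoints 0.2 / 0.8 / 1.0 are invented).
docs = {"d1": 0.82, "d2": 0.35, "d3": 0.6}
relevant = {d: triangular_membership(s, 0.2, 0.8, 1.0) for d, s in docs.items()}
ranking = sorted(relevant, key=relevant.get, reverse=True)
print(ranking)  # ['d1', 'd3', 'd2']
```

A type-2 extension would make the membership degree itself fuzzy (an interval rather than a point), which is how the abstract describes word-level scoring.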

    Why we have switched from building full-fledged taxonomies to simply detecting hypernymy relations

    The study of taxonomies and hypernymy relations has been extensive in the Natural Language Processing (NLP) literature. However, the evaluation of taxonomy learning approaches has traditionally been troublesome, as it mainly relies on ad-hoc experiments which are hardly reproducible and expensive in manual effort. Partly because of this, recent research has been focusing on the hypernymy detection task. In this paper we reflect on this trend, analyzing issues related to current evaluation procedures. Finally, we propose three potential avenues for future work so that is-a relations, and resources based on them, play a more important role in downstream NLP applications.
    Comment: Discussion paper. 6 pages, 1 figure

    Boosting Question Answering by Deep Entity Recognition

    In this paper an open-domain factoid question answering system for Polish, RAFAEL, is presented. The system goes beyond finding an answering sentence; it also extracts a single string corresponding to the required entity. Herein the focus is placed on different approaches to entity recognition, essential for retrieving information matching question constraints. Apart from the traditional approach, including named entity recognition (NER) solutions, a novel technique called Deep Entity Recognition (DeepER) is introduced and implemented. It allows a comprehensive search of all forms of entity references matching a given WordNet synset (e.g. an impressionist), based on a previously assembled entity library, created by analysing the first sentences of encyclopaedia entries as well as disambiguation and redirect pages. DeepER also provides automatic evaluation, which makes numerous experiments possible, including over a thousand questions from a quiz TV show answered on the grounds of Polish Wikipedia. The final results of a manual evaluation on a separate question set show that the strength of the DeepER approach lies in its ability to answer questions that demand answers beyond the traditional categories of named entities.
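The entity-library lookup at the heart of the DeepER idea can be sketched as a mapping from a WordNet-style synset to entities and their known surface forms, so that a question constraint like "an impressionist" retrieves concrete candidates from the answering text. The library contents below are toy examples, not the system's actual data.

```python
# Toy entity library: synset -> {entity -> known surface forms}.
ENTITY_LIBRARY = {
    "impressionist.n.01": {
        "Claude Monet": ["Monet", "Claude Monet"],
        "Edgar Degas": ["Degas", "Edgar Degas"],
    },
}

def candidates_for(synset, text):
    """Return entities whose surface forms occur in the answering text."""
    hits = []
    for entity, forms in ENTITY_LIBRARY.get(synset, {}).items():
        if any(form in text for form in forms):
            hits.append(entity)
    return hits

print(candidates_for("impressionist.n.01", "A painting by Monet was sold."))
# ['Claude Monet']
```

The real library would be assembled automatically from encyclopaedia first sentences and redirect pages, as the abstract describes; the lookup step itself stays this simple.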

    A survey of Community Question Answering

    With the advent of numerous community forums, tasks associated with them have gained importance in the recent past. With the influx of new questions every day on these forums, the issues of identifying methods to find answers to said questions, or even trying to detect duplicate questions, are of practical importance and are challenging in their own right. This paper aims at surveying some of the aforementioned issues and the methods proposed for tackling them.

    QuASE: Question-Answer Driven Sentence Encoding

    Question-answering (QA) data often encodes essential information in many facets. This paper studies a natural question: Can we get supervision from QA data for other tasks (typically, non-QA ones)? For example, can we use QAMR (Michael et al., 2017) to improve named entity recognition? We suggest that simply further pre-training BERT is often not the best option, and propose the question-answer driven sentence encoding (QuASE) framework. QuASE learns representations from QA data, using BERT or other state-of-the-art contextual language models. In particular, we observe the need to distinguish between two types of sentence encodings, depending on whether the target task takes a single- or multi-sentence input; in both cases, the resulting encoding is shown to be an easy-to-use plugin for many downstream tasks. This work may point to an alternative way to supervise NLP tasks.
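At the interface level, the distinction between the two sentence-encoding modes can be sketched as follows; the function names and the dummy encoder below are placeholders standing in for a BERT-style model, not the paper's actual API.

```python
def encode_single(sentence, encoder):
    """Single-sentence mode: encode the sentence independently of any context."""
    return encoder(sentence)

def encode_paired(sentence, context, encoder):
    """Multi-sentence mode: encode the sentence jointly with a second sequence."""
    return encoder(sentence + " [SEP] " + context)

# Dummy encoder: one "feature" per token (its length), just to show the shapes.
dummy = lambda text: [float(len(tok)) for tok in text.split()]

print(encode_single("Monet painted.", dummy))   # [5.0, 8.0]
```

The practical point from the abstract is that a downstream task plugs in whichever encoding matches its input shape, rather than fine-tuning the whole language model.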

    Learning to Ask: Neural Question Generation for Reading Comprehension

    We study automatic question generation for sentences from text passages in reading comprehension. We introduce an attention-based sequence learning model for the task and investigate the effect of encoding sentence- vs. paragraph-level information. In contrast to all previous work, our model does not rely on hand-crafted rules or a sophisticated NLP pipeline; it is instead trainable end-to-end via sequence-to-sequence learning. Automatic evaluation results show that our system significantly outperforms the state-of-the-art rule-based system. In human evaluations, questions generated by our system are also rated as being more natural (i.e., grammaticality, fluency) and as more difficult to answer (in terms of syntactic and lexical divergence from the original text and reasoning needed to answer).
    Comment: Accepted to ACL 2017, 11 pages
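The attention step at the core of such a sequence-to-sequence model can be sketched in a few lines: the decoder state scores each encoder position, a softmax turns the scores into weights, and the weighted sum of encoder states forms the context vector. The toy vectors below are illustrative, not the paper's trained model.

```python
import math

def attention(decoder_state, encoder_states):
    """Dot-product attention: score, softmax, then weighted-sum context."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(len(decoder_state))]
    return weights, context

# Decoder state aligned with the first of two encoder positions.
w, c = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(round(w[0], 3))  # 0.731
```

Paragraph-level encoding, as investigated in the paper, simply gives this step more encoder positions to attend over.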

    A Syntactic Approach to Domain-Specific Automatic Question Generation

    Factoid questions are questions that require short fact-based answers. Automatic question generation (AQG) of factoid questions from a given text can contribute to educational activities, interactive question answering systems, search engines, and other applications. The goal of our research is to generate factoid source-question-answer triplets based on a specific domain. We propose a four-component pipeline, which takes as input a training corpus of domain-specific documents, along with a set of declarative sentences from the same domain, and generates as output a set of factoid questions that refer to the source sentences but are slightly different from them, so that a question-answering system or a person can be asked a question that requires deeper understanding and knowledge than simple word matching. Contrary to existing domain-specific AQG systems that utilize a template-based approach to question generation, we propose to transform each source sentence into a set of questions by applying a series of domain-independent rules (a syntax-based approach). Our pipeline was evaluated in the domain of cyber security using a series of experiments on each component of the pipeline separately and on the end-to-end system. The proposed approach generated a higher percentage of acceptable questions than a prior state-of-the-art AQG system.
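One domain-independent rule of the kind such a pipeline might apply can be sketched on raw strings: a copular sentence "X is Y." yields the question "What is X?" with Y as the answer. A real implementation would operate on parse trees rather than surface text; this regex version is only an illustration.

```python
import re

def copula_rule(sentence):
    """Turn 'X is Y.' into ('What is X?', Y) when the pattern matches."""
    m = re.match(r"^(.+?) is (.+?)\.$", sentence)
    if not m:
        return None
    subject, answer = m.groups()
    return f"What is {subject}?", answer

print(copula_rule("Phishing is a social engineering attack."))
# ('What is Phishing?', 'a social engineering attack')
```

Stacking many such rules over parsed sentences, rather than filling fixed templates, is what distinguishes the syntax-based approach the abstract describes.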

    Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs

    Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and coping with rapidly evolving ad hoc topics and formulation styles in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers using an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines.
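The quasi-KG construction can be sketched as a weighted graph whose nodes are entity names and relational phrases; a breadth-first search then finds a path joining two question terms, standing in (very loosely) for the Group Steiner Tree computation over multiple terminal groups. The nodes and weights below are invented examples, not QUEST's output.

```python
from collections import defaultdict, deque

# Quasi KG: entity names and relational phrases as nodes, confidence weights
# on edges (weights are illustrative and unused by the toy search below).
graph = defaultdict(dict)

def add_edge(a, b, weight):
    graph[a][b] = weight
    graph[b][a] = weight

add_edge("Marie Curie", "won", 0.9)
add_edge("won", "Nobel Prize", 0.8)
add_edge("Marie Curie", "born in", 0.7)
add_edge("born in", "Warsaw", 0.85)

def connect(source, target):
    """BFS for a shortest path joining two question terms through the graph."""
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(connect("Marie Curie", "Warsaw"))
# ['Marie Curie', 'born in', 'Warsaw']
```

A Group Steiner Tree generalizes this to connecting several groups of candidate nodes at once with minimum total edge cost, which is what makes multi-entity questions answerable from joined evidence.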

    Creating and Characterizing a Diverse Corpus of Sarcasm in Dialogue

    The use of irony and sarcasm in social media allows us to study them at scale for the first time. However, their diversity has made it difficult to construct a high-quality corpus of sarcasm in dialogue. Here, we describe the process of creating a large-scale, highly diverse corpus of online debate forums dialogue, and our novel methods for operationalizing classes of sarcasm in the form of rhetorical questions and hyperbole. We show that we can use lexico-syntactic cues to reliably retrieve sarcastic utterances with high accuracy. To demonstrate the properties and quality of our corpus, we conduct supervised learning experiments with simple features, and show that we achieve both higher precision and F than previous work on sarcasm in debate forums dialogue. We apply a weakly-supervised linguistic pattern learner and qualitatively analyze the linguistic differences in each class.
    Comment: 11 pages, 4 figures, SIGDIAL 201
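Lexico-syntactic cues of the kind used for high-precision retrieval of sarcasm candidates can be sketched as regular expressions; the patterns below are invented examples, not the corpus's actual cue inventory.

```python
import re

# Illustrative cue patterns; a real inventory would be learned or curated.
CUES = [
    r"\boh,? (really|wow|sure)\b",          # mock surprise
    r"\b(yeah|sure),? right\b",             # dismissive agreement
    r"\bthanks for\b.*\b(nothing|that)\b",  # ironic gratitude
]

def is_sarcasm_candidate(utterance):
    """Flag an utterance if any lexico-syntactic cue matches."""
    text = utterance.lower()
    return any(re.search(cue, text) for cue in CUES)

print(is_sarcasm_candidate("Oh really, that's your best argument?"))  # True
print(is_sarcasm_candidate("The meeting starts at noon."))            # False
```

Cue-based retrieval trades recall for precision, which suits corpus construction: every retrieved utterance is a strong candidate for annotation, even though many sarcastic utterances carry no surface cue.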