    A Comparative analysis: QA evaluation questions versus real-world queries

    This paper presents a comparative analysis of user queries to a web search engine, questions to a Q&A service (answers.com), and questions employed in question answering (QA) evaluations at TREC and CLEF. The analysis shows that user queries to search engines contain mostly content words (i.e. keywords) but lack structure words (i.e. stopwords) and capitalization. Thus, they resemble natural language input after case folding and stopword removal. In contrast, topics for QA evaluation and questions to answers.com mainly consist of fully capitalized and syntactically well-formed questions. Classification experiments using a naïve Bayes classifier show that stopwords play an important role in determining the expected answer type. A classification based on stopwords is considerably more accurate (47.5% accuracy) than a classification based on all query words (40.1% accuracy) or on content words (33.9% accuracy). To simulate user input, questions are preprocessed by case folding and stopword removal. Additional classification experiments aim at reconstructing the syntactic wh-word frame of a question, i.e. the embedding of the interrogative word. Results indicate that this part of questions can be reconstructed with moderate accuracy (25.7%), but for a classification problem with a much larger number of classes compared to classifying queries by expected answer type (2096 classes vs. 130 classes). Furthermore, eliminating stopwords can lead to multiple reconstructed questions with a different or with the opposite meaning (e.g. if negations or temporal restrictions are included). In conclusion, question reconstruction from short user queries can be seen as a new realistic evaluation challenge for QA systems.
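The preprocessing described in this abstract (case folding followed by stopword removal) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stopword list, the tokenizer, and the function name `simulate_user_query` are all assumptions made for the example. It also shows how two questions with opposite meanings can collapse into the same keyword query once a negation is treated as a stopword.

```python
import re

# Illustrative stopword list (an assumption; the paper's actual list is not given here).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "do", "does", "did",
    "of", "in", "on", "at", "to", "for", "what", "who", "when", "where",
    "which", "how", "why", "not", "no", "before", "after",
}

def simulate_user_query(question: str) -> str:
    """Turn a well-formed question into a keyword-style query.

    Case folding: lowercase the text.
    Stopword removal: drop every token found in STOPWORDS.
    """
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return " ".join(t for t in tokens if t not in STOPWORDS)

q1 = "Which countries did Napoleon invade?"
q2 = "Which countries did Napoleon not invade?"

print(simulate_user_query(q1))  # countries napoleon invade
print(simulate_user_query(q2))  # countries napoleon invade
```

Both questions reduce to the same string, so a system that reconstructs the question from the keyword query cannot tell from the query alone which of the two the user meant.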

    Scalable, Efficient and Precise Natural Language Processing in the Semantic Web

    The Internet of Things (IoT) is an emerging phenomenon in the public space. Users with accessibility needs could especially benefit from these “smart” devices if they were able to interact with them through speech. This thesis presents a Compositional Semantics and framework for developing extensible and expressive Natural Language Query Interfaces to the Semantic Web, addressing privacy and auditability needs in the process. This could be particularly useful in healthcare or legal applications, where confidentiality of information is a key concern.

    A graphical user interface for Boolean query specification

    On-line information repositories commonly provide keyword search facilities via textual query languages based on Boolean logic. However, there is evidence to suggest that the syntactical demands of such languages can lead to user errors and adversely affect the time that it takes users to form queries. Users also face difficulties because the semantics of AND and OR in Boolean logic conflict with their meaning in the English language. We suggest that graphical query languages, in particular Venn-like diagrams, can alleviate the problems that users experience when forming Boolean expressions with textual languages. We describe Vquery, a Venn-diagram based user interface to the New Zealand Digital Library (NZDL). The design of Vquery has been partly motivated by analysis of NZDL usage. We found that few queries contain more than three terms, that use of the intersection operator dominates, and that query refinement is common. A study of the utility of Venn diagrams for query specification indicates that with little or no training users can interpret and form Venn-like diagrams which accurately correspond to Boolean expressions. The utility of Vquery is considered and directions for future work are proposed.
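The correspondence between Boolean operators and Venn-diagram regions can be illustrated with set operations over a toy inverted index. This is only a sketch under stated assumptions: the index contents, the document IDs, and the helper `docs` are invented for the example and have no connection to Vquery or NZDL internals.

```python
# Toy inverted index mapping each term to the set of document IDs containing it.
# Terms and IDs are hypothetical, chosen only to make the regions easy to see.
index = {
    "venn": {1, 2, 4},
    "boolean": {2, 3, 4},
    "library": {4, 5},
}

def docs(term: str) -> set:
    """Return the documents containing `term` (empty set if the term is unknown)."""
    return index.get(term, set())

# Boolean AND is the overlap of the two circles (set intersection).
and_result = docs("venn") & docs("boolean")

# Boolean OR is everything covered by either circle (set union).
or_result = docs("venn") | docs("boolean")

print(sorted(and_result))  # [2, 4]
print(sorted(or_result))   # [1, 2, 3, 4]
```

A user who types "venn and boolean" in the everyday English sense of "documents about Venn diagrams and documents about Boolean logic" may expect the larger union, whereas Boolean AND returns the smaller intersection.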

    KARL: A Knowledge-Assisted Retrieval Language

    Data classification and storage are tasks typically performed by application specialists. In contrast, information users are primarily non-computer specialists who use information in their decision-making and other activities. Interaction efficiency between such users and the computer is often reduced by machine requirements and resulting user reluctance to use the system. This thesis examines the problems associated with information retrieval for non-computer specialist users, and proposes a method for communicating in restricted English that uses knowledge of the entities involved, relationships between entities, and basic English language syntax and semantics to translate the user requests into formal queries. The proposed method includes an intelligent dictionary, syntax and semantic verifiers, and a formal query generator. In addition, the proposed system has a learning capability that can improve portability and performance. With the increasing demand for efficient human-machine communication, the significance of this thesis becomes apparent. As human resources become more valuable, software systems that will assist in improving the human-machine interface will be needed and research addressing new solutions will be of utmost importance. This thesis presents an initial design and implementation as a foundation for further research and development into the emerging field of natural language database query systems.

    A negation query engine for complex query transformations

    Natural language interfaces to ontologies allow users to query the system using natural language queries. These systems take a natural language query as input and transform it into an equivalent formal query to retrieve the desired information from ontologies. Existing natural language interfaces to ontologies handle negation queries, but their support for them is limited. This paper proposes a negation query handling engine which can handle more complex natural language queries than the existing systems. The proposed engine interprets the intent of the user query using an algorithm governed by a set of techniques and transformation rules. The proposed engine was evaluated using the Mooney dataset and the AquaLog dataset, and it showed encouraging results.

    Self-adaptive Based Model for Ambiguity Resolution of The Linked Data Query for Big Data Analytics

    Integration of heterogeneous data sources is a crucial step in big data analytics, although it creates ambiguity issues during mapping between the sources due to variation in the query terms, data structure and granularity conflicts. However, there is limited research on effective big data integration that addresses the ambiguity issue for big data analytics. This paper introduces a self-adaptive model for big data integration that exploits the data structure during querying in order to mitigate and resolve ambiguities. An assessment of preliminary work on the Geography and Quran dataset is reported to illustrate the feasibility of the proposed model and to motivate future work such as solving complex queries.