2,967 research outputs found

    Cooperative answers in database systems

    Get PDF
    A major concern of researchers who seek to improve human-computer communication involves how to move beyond literal interpretations of queries to a level of responsiveness that takes the user's misconceptions, expectations, desires, and interests into consideration. At Maryland, we are investigating how to better meet a user's needs within the framework of the cooperative answering system of Gal and Minker. We have been exploring how to use semantic information about the database to formulate coherent and informative answers. The work has two main thrusts: (1) the construction of a logic formula which embodies the content of a cooperative answer; and (2) the presentation of the logic formula to the user in a natural language form. The information that is available in a deductive database system for building cooperative answers includes integrity constraints, user constraints, the search tree for answers to the query, and false presuppositions that are present in the query. The basic cooperative answering theory of Gal and Minker forms the foundation of a cooperative answering system that integrates the new construction and presentation methods. This paper provides an overview of the cooperative answering strategies used in the CARMIN cooperative answering system, an ongoing research effort at Maryland. Section 2 gives some useful background definitions. Section 3 describes techniques for collecting cooperative logical formulae. Section 4 discusses which natural language generation techniques are useful for presenting the logic formula in natural language text. Section 5 presents a diagram of the system

    COOPERATIVE QUERY ANSWERING FOR APPROXIMATE ANSWERS WITH NEARNESS MEASURE IN HIERARCHICAL STRUCTURE INFORMATION SYSTEMS

    Get PDF
    Cooperative query answering for approximate answers has been utilized in various problem domains. Many challenges in manufacturing information retrieval, such as: classifying parts into families in group technology implementation, choosing the closest alternatives or substitutions for an out-of-stock part, or finding similar existing parts for rapid prototyping, could be alleviated using the concept of cooperative query answering. Most cooperative query answering techniques proposed by researchers so far concentrate on simple queries or single table information retrieval. Query relaxations in searching for approximate answers are mostly limited to attribute value substitutions. Many hierarchical structure information systems, such as manufacturing information systems, store their data in multiple tables that are connected to each other using hierarchical relationships - "aggregation", "generalization/specialization", "classification", and "category". Due to the nature of hierarchical structure information systems, information retrieval in such domains usually involves nested or jointed queries. In addition, searching for approximate answers in hierarchical structure databases not only considers attribute value substitutions, but also must take into account attribute or relation substitutions (i.e., WIDTH to DIAMETER, HOLE to GROOVE). For example, shape transformations of parts or features are possible and commonly practiced. A bar could be transformed to a rod. Such characteristics of hierarchical information systems, simple query or single-relation query relaxation techniques used in most cooperative query answering systems are not adequate. In this research, we proposed techniques for neighbor knowledge constructions, and complex query relaxations. We enhanced the original Pattern-based Knowledge Induction (PKI) and Distribution Sensitive Clustering (DISC) so that they can be used in neighbor hierarchy constructions at both tuple and attribute levels. We developed a cooperative query answering model to facilitate the approximate answer searching for complex queries. Our cooperative query answering model is comprised of algorithms for determining the causes of null answer, expanding qualified tuple set, expanding intersected tuple set, and relaxing multiple condition simultaneously. To calculate the semantic nearness between exact-match answers and approximate answers, we also proposed a nearness measuring function, called "Block Nearness", that is appropriate for the query relaxation methods proposed in this research

    Object linking in repositories

    Get PDF
    This topic is covered in three sections. The first section explores some of the architectural ramifications of extending the Eichmann/Atkins lattice-based classification scheme to encompass the assets of the full life cycle of software development. A model is considered that provides explicit links between objects in addition to the edges connecting classification vertices in the standard lattice. The second section gives a description of the efforts to implement the repository architecture using a commercially available object-oriented database management system. Some of the features of this implementation are described, and some of the next steps to be taken to produce a working prototype of the repository are pointed out. In the final section, it is argued that design and instantiation of reusable components have competing criteria (design-for-reuse strives for generality, design-with-reuse strives for specificity) and that providing mechanisms for each can be complementary rather than antagonistic. In particular, it is demonstrated how program slicing techniques can be applied to customization of reusable components

    No-But-Semantic-Match: Computing Semantically Matched XML Keyword Search Results

    Get PDF
    Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-kk results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.Comment: 24 pages, 21 figures, 6 tables, submitted to The VLDB Journal for possible publicatio

    Schema-agnostic entity retrieval in highly heterogeneous semi-structured environments

    Get PDF
    [no abstract

    On the Query Refinement in Searching a Bibliographic Database

    Get PDF

    Work out the semantic web search: The cooperative way

    Get PDF
    In this paper we propose a Cooperative Question Answering System that takes as input natural language queries and is able to return a cooperative answer based on semantic web resources, more specifically DBpedia represented in OWL/RDF as knowledge base and WordNet to build similar questions. Our system resorts to ontologies not only for reasoning but also to find answers and is independent of prior knowledge of the semantic resources by the user. The natural language question is translated into its semantic representation and then answered by consulting the semantics sources of information. The system is able to clarify the problems of ambiguity and helps finding the path to the correct answer. If there are multiple answers to the question posed (or to the similar questions for which DBpedia contains answers), they will be grouped according to their semantic meaning, providing a more cooperative and clarified answer to the user

    Supporting Multi-Criteria Decision Support Queries over Disparate Data Sources

    Get PDF
    In the era of big data revolution, marked by an exponential growth of information, extracting value from data enables analysts and businesses to address challenging problems such as drug discovery, fraud detection, and earthquake predictions. Multi-Criteria Decision Support (MCDS) queries are at the core of big-data analytics resulting in several classes of MCDS queries such as OLAP, Top-K, Pareto-optimal, and nearest neighbor queries. The intuitive nature of specifying multi-dimensional preferences has made Pareto-optimal queries, also known as skyline queries, popular. Existing skyline algorithms however do not address several crucial issues such as performing skyline evaluation over disparate sources, progressively generating skyline results, or robustly handling workload with multiple skyline over join queries. In this dissertation we thoroughly investigate topics in the area of skyline-aware query evaluation. In this dissertation, we first propose a novel execution framework called SKIN that treats skyline over joins as first class citizens during query processing. This is in contrast to existing techniques that treat skylines as an add-on, loosely integrated with query processing by being placed on top of the query plan. SKIN is effective in exploiting the skyline characteristics of the tuples within individual data sources as well as across disparate sources. This enables SKIN to significantly reduce two primary costs, namely the cost of generating the join results and the cost of skyline comparisons to compute the final results. Second, we address the crucial business need to report results early; as soon as they are being generated so that users can formulate competitive decisions in near real-time. On top of SKIN, we built a progressive query evaluation framework ProgXe to transform the execution of queries involving skyline over joins to become non-blocking, i.e., to be progressively generating results early and often. By exploiting SKIN\u27s principle of processing query at multiple levels of abstraction, ProgXe is able to: (1) extract the output dependencies in the output spaces by analyzing both the input and output space, and (2) exploit this knowledge of abstract-level relationships to guarantee correctness of early output. Third, real-world applications handle query workloads with diverse Quality of Service (QoS) requirements also referred to as contracts. Time sensitive queries, such as fraud detection, require results to progressively output with minimal delay, while ad-hoc and reporting queries can tolerate delay. In this dissertation, by building on the principles of ProgXe we propose the Contract-Aware Query Execution (CAQE) framework to support the open problem of contract driven multi-query processing. CAQE employs an adaptive execution strategy to continuously monitor the run-time satisfaction of queries and aggressively take corrective steps whenever the contracts are not being met. Lastly, to elucidate the portability of the core principle of this dissertation, the reasoning and query processing at different levels of data abstraction, we apply them to solve an orthogonal research question to auto-generate recommendation queries that facilitate users in exploring a complex database system. User queries are often too strict or too broad requiring a frustrating trial-and-error refinement process to meet the desired result cardinality while preserving original query semantics. Based on the principles of SKIN, we propose CAPRI to automatically generate refined queries that: (1) attain the desired cardinality and (2) minimize changes to the original query intentions. In our comprehensive experimental study of each part of this dissertation, we demonstrate the superiority of the proposed strategies over state-of-the-art techniques in both efficiency, as well as resource consumption

    Gunrock: GPU Graph Analytics

    Full text link
    For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms, to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU
    • …
    corecore