
    Opportunistic linked data querying through approximate membership metadata

    Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can be achieved, among other means, through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments involve a significant proportion of membership subqueries, which check for the presence of a specific triple rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions make it possible to achieve full result recall earlier when temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests, with only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface.
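
To illustrate the idea of approximate membership metadata described in this abstract, here is a minimal Bloom filter sketch in Python. It is not the paper's implementation: the class `BloomFilter`, the helper `triple_key`, the `ex:` triples, and the choice of SHA-256 with double hashing are all assumptions made for illustration. The sizing formulas are the standard ones for Bloom filters.

```python
import hashlib
import math


class BloomFilter:
    """Approximate set membership: no false negatives, tunable false positives."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n = max(1, expected_items)
        self.size = max(1, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2))
        self.hash_count = max(1, round(self.size / n * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two parts of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.hash_count):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def triple_key(subject: str, predicate: str, obj: str) -> str:
    """Hypothetical serialization of a triple into a single filter key."""
    return f"{subject} {predicate} {obj}"


# Server side (assumed): build a filter over the triples matching a pattern.
bf = BloomFilter(expected_items=100, false_positive_rate=0.01)
bf.add(triple_key("ex:alice", "ex:knows", "ex:bob"))

# Client side (assumed): a negative answer is certain, so the membership
# request can be skipped; a positive answer may still be a false positive.
print(triple_key("ex:alice", "ex:knows", "ex:bob") in bf)  # True (no false negatives)
```

In this sketch, a client that finds a candidate triple absent from the filter can discard it without an HTTP request, while a positive answer would still need confirmation if exact precision is required.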

    QUARQ: QUick approximate and relaxed querying

    Executing queries over Linked Open Data (LOD) is a complex task. Neither the total number of sources triggered by a single query nor the reasoning complexity applied to each source can be known in advance. To avoid this uncertainty, practitioners download full replicas of the open data and build applications on top of the datasets in a controlled environment. With this centralized approach, they miss dynamic data changes, and often they cannot account for the inference capabilities defined in the associated ontologies. In this work, we explore the feasibility of predicting the performance of Flexible Querying over Linked Open Data [1]. Concretely, we propose QUARQ: QUick Approximate and Relaxed Querying, a tool that uses machine learning (ML) to inform the process of generating alternative queries that run more efficiently than the original ones. With this tool, we propose avoiding the use of replicated Linked Data by exploiting the shareable nature of Linked Data, thereby avoiding both the impracticality of keeping copies up to date and the need to work with outdated data.

    Querying and Merging Heterogeneous Data by Approximate Joins on Higher-Order Terms


    Learning to Predict the Wisdom of Crowds

    The problem of "approximating the crowd" is that of estimating the crowd's majority opinion by querying only a subset of it. Algorithms that approximate the crowd can intelligently stretch a limited budget for a crowdsourcing task. We present an algorithm, "CrowdSense," that works in an online fashion to dynamically sample subsets of labelers based on an exploration/exploitation criterion. The algorithm produces a weighted combination of a subset of the labelers' votes that approximates the crowd's opinion. Comment: Presented at the Collective Intelligence conference, 2012 (arXiv:1204.2991).
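
The general idea of a weighted vote over a sampled subset of labelers can be sketched as follows. This is not the published CrowdSense algorithm: the function `approximate_majority`, the fixed `budget`, the `explore_prob` parameter, and the example weights are hypothetical, and the exploration step here is a simple random swap rather than the criterion used in the paper.

```python
import random


def approximate_majority(votes, weights, budget, explore_prob=0.1, rng=None):
    """Estimate the crowd's majority label (+1 or -1) from a subset of labelers.

    votes:   dict mapping labeler -> +1 or -1 (only chosen labelers are read)
    weights: dict mapping labeler -> estimated reliability (assumed given)
    budget:  number of labelers to query
    """
    rng = rng or random.Random(0)
    # Exploit: prefer the labelers currently believed to be most reliable.
    ranked = sorted(weights, key=weights.get, reverse=True)
    chosen = ranked[:budget]
    rest = ranked[budget:]
    # Explore: occasionally swap in a labeler outside the top group.
    if rest and rng.random() < explore_prob:
        chosen[-1] = rng.choice(rest)
    # Weighted combination of the sampled votes.
    score = sum(weights[labeler] * votes[labeler] for labeler in chosen)
    return (1 if score >= 0 else -1), chosen


votes = {"a": 1, "b": 1, "c": -1, "d": -1, "e": -1}
weights = {"a": 0.9, "b": 0.8, "c": 0.6, "d": 0.5, "e": 0.5}
label, queried = approximate_majority(votes, weights, budget=3, explore_prob=0.0)
print(label, queried)
```

A real online method would also update the weights after each item, based on how often each labeler agreed with the estimated majority; that update is omitted here.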