
    Facetize: An Interactive Tool for Cleaning and Transforming Datasets for Facilitating Exploratory Search

    There is a plethora of datasets in various formats, which are usually stored in files, hosted in catalogs, or accessed through SPARQL endpoints. In most cases, these datasets cannot be straightforwardly explored by end users to satisfy recall-oriented information needs. To fill this gap, in this paper we present the design and implementation of Facetize, an editor that allows users to interactively transform datasets, either static (i.e., stored in files) or dynamic (i.e., the results of SPARQL queries), into datasets that they or other users can directly and effectively explore. Exploration is achieved through the familiar interaction paradigm of Faceted Search (and Preference-enriched Faceted Search). Specifically, we describe the requirements, introduce the required set of transformations, and then detail the functionality and implementation of the Facetize editor that realizes these transformations. The supported operations cover a wide range of tasks (selection, visibility, deletions, edits, definition of hierarchies, intervals, derived attributes, and others), and Facetize enables the user to carry them out in a user-friendly and guided manner, without presupposing any technical background in data representation or query languages. Finally, we present the results of an evaluation with users. To the best of our knowledge, this is the first editor for this kind of task.
    Comment: 10 pages, 4 figures, 1 table (systems x functionalities matrix)

    Efficiently Charting RDF

    We propose a visual query language for interactively exploring large-scale knowledge graphs. Starting from an overview, the user explores bar charts through three interactions: class expansion, property expansion, and subject/object expansion. A major challenge is performance: a state-of-the-art SPARQL engine may require tens of minutes to compute the multiway join, grouping, and counting required to render a bar chart. A promising alternative is to apply approximation through online aggregation, trading precision for performance. However, state-of-the-art online aggregation algorithms such as Wander Join have two limitations for our exploration scenario: (1) a high number of rejected paths slows the convergence of the count estimations, and (2) no unbiased estimator exists for counts under the distinct operator. We thus devise a specialized algorithm for online aggregation that augments Wander Join with exact partial computations to reduce the number of rejected paths encountered, as well as a novel estimator that we prove to be unbiased in the case of the distinct operator. In an experimental study with random interactions exploring two large-scale knowledge graphs, our algorithm shows a clear reduction in error for a given computation time compared with Wander Join.
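
To make the online-aggregation idea concrete, the following is a minimal sketch of plain Wander Join-style sampling for a two-step path count. It is not the specialized algorithm or the distinct estimator from the abstract. The function names, the in-memory triple index, and the two-predicate query shape are all assumptions made for illustration. Each random walk picks a first edge uniformly, then a successor edge uniformly, and contributes the inverse of its sampling probability; a walk with no successor is rejected and contributes zero, which is the source of slow convergence mentioned above.

```python
import random
from collections import defaultdict


def build_index(triples):
    """Index (subject, predicate, object) triples as predicate -> subject -> [objects]."""
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[p][s].append(o)
    return index


def exact_path_count(index, p1, p2):
    """Exactly count two-step paths s -p1-> m -p2-> o (the costly baseline)."""
    return sum(
        len(index[p2].get(mid, ()))
        for objects in index[p1].values()
        for mid in objects
    )


def wander_join_count(index, p1, p2, walks, rng):
    """Estimate the same count with random walks (hypothetical helper).

    Returns (estimate, number_of_rejected_walks).
    """
    first_edges = [(s, mid) for s, objects in index[p1].items() for mid in objects]
    if not first_edges or walks <= 0:
        return 0.0, 0
    total, rejected = 0.0, 0
    for _ in range(walks):
        # Step 1: pick a p1 edge uniformly, probability 1 / len(first_edges).
        _, mid = rng.choice(first_edges)
        successors = index[p2].get(mid, [])
        if not successors:
            # Dead end: the walk is rejected and contributes 0 to the estimate.
            rejected += 1
            continue
        # Step 2: pick a p2 edge uniformly, probability 1 / len(successors).
        rng.choice(successors)
        # Horvitz-Thompson weight: inverse of the probability of this walk.
        total += len(first_edges) * len(successors)
    return total / walks, rejected


if __name__ == "__main__":
    # Toy graph; the predicate names are illustrative only.
    triples = [
        ("a", "knows", "b"), ("a", "knows", "c"), ("d", "knows", "e"),
        ("b", "likes", "x"), ("b", "likes", "y"),
        ("c", "likes", "x"), ("c", "likes", "z"),
    ]
    idx = build_index(triples)
    print("exact:", exact_path_count(idx, "knows", "likes"))
    print("estimate, rejected:", wander_join_count(idx, "knows", "likes", 5000, random.Random(1)))
```

Averaging the per-walk weights gives an unbiased estimate of the plain count, but the estimate converges more slowly as the share of rejected walks grows. Counting under the distinct operator is not handled by this sketch.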