Facetize: An Interactive Tool for Cleaning and Transforming Datasets for Facilitating Exploratory Search
Datasets come in a plethora of formats and are usually stored in files,
hosted in catalogs, or accessed through SPARQL endpoints. In most cases,
end users cannot straightforwardly explore these datasets to satisfy
recall-oriented information needs. To fill this gap, in this paper
we present the design and implementation of Facetize, an editor that allows
users to transform (in an interactive manner) datasets, either static (i.e.
stored in files), or dynamic (i.e. being the results of SPARQL queries), to
datasets that they or other users can explore directly and effectively.
Exploration is achieved through the familiar interaction paradigm of
Faceted Search (and Preference-enriched Faceted Search). Specifically, we
describe the requirements, introduce the required set of transformations,
and then detail the functionality and implementation of Facetize, the
editor that realizes these transformations. The
supported operations cover a wide range of tasks (selection, visibility,
deletions, edits, definition of hierarchies, intervals, derived attributes, and
others) and Facetize enables the user to carry them out in a user-friendly and
guided manner, without presupposing any technical background (regarding data
representation or query languages). Finally, we present the results of an
evaluation with users. To the best of our knowledge, this is the first editor
for this kind of task.
Comment: 10 pages, 4 figures, 1 table (systems x functionalities matrix)
Efficiently Charting RDF
We propose a visual query language for interactively exploring large-scale
knowledge graphs. Starting from an overview, the user explores bar charts
through three interactions: class expansion, property expansion, and
subject/object expansion. A major challenge is performance: a
state-of-the-art SPARQL engine may require tens of minutes to compute the
multiway join, grouping and counting required to render a bar chart. A
promising alternative is to apply approximation through online aggregation,
trading precision for performance. However, state-of-the-art online aggregation
algorithms such as Wander Join have two limitations for our exploration
scenario: (1) a high number of rejected paths slows the convergence of the
count estimations, and (2) no unbiased estimator exists for counts under the
distinct operator. We thus devise a specialized algorithm for online
aggregation that augments Wander Join with exact partial computations to reduce
the number of rejected paths encountered, as well as a novel estimator that we
prove to be unbiased in the case of the distinct operator. In an experimental
study with random interactions exploring two large-scale knowledge graphs, our
algorithm shows a clear reduction in error with respect to computation time
compared with Wander Join.
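
To illustrate the baseline idea that the abstract builds on, the following is a minimal sketch of Wander Join-style online aggregation for a COUNT over a chain of predicates. It is not the augmented algorithm or the distinct-count estimator proposed in the paper. The function names, the in-memory triple index, and the example graph are all assumptions made for illustration. Each random walk picks a first edge uniformly, then a uniformly random matching edge at each later step. A completed walk contributes the inverse of its sampling probability, and a rejected walk (one that reaches a dead end) contributes zero, so the average over walks is an unbiased estimate of the number of join paths.

```python
import random
from collections import defaultdict


def build_index(triples):
    """Index (subject, predicate, object) triples as predicate -> subject -> objects."""
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[p][s].append(o)
    return index


def exact_path_count(index, predicates):
    """Exact number of paths that follow the predicate chain (for comparison only)."""
    frontier = [o for objs in index[predicates[0]].values() for o in objs]
    for p in predicates[1:]:
        frontier = [o2 for o in frontier for o2 in index[p].get(o, [])]
    return len(frontier)


def wander_join_count(index, predicates, walks, rng):
    """Estimate the path count with random walks (hypothetical helper).

    A completed walk is weighted by 1 / P(walk); a rejected walk counts as 0.
    """
    first_edges = [(s, o) for s, objs in index[predicates[0]].items() for o in objs]
    if not first_edges:
        return 0.0
    total = 0.0
    for _ in range(walks):
        _, node = rng.choice(first_edges)
        weight = float(len(first_edges))  # inverse probability of the first edge
        for p in predicates[1:]:
            candidates = index[p].get(node, [])
            if not candidates:  # dead end: the path is rejected
                weight = 0.0
                break
            weight *= len(candidates)  # inverse probability of this step
            node = rng.choice(candidates)
        total += weight
    return total / walks


if __name__ == "__main__":
    # Illustrative toy graph; the edge ("x", "p", "y") leads to a rejected walk.
    triples = [
        ("a", "p", "b"), ("c", "p", "d"), ("x", "p", "y"),
        ("b", "q", "e"), ("b", "q", "f"), ("d", "q", "g"), ("d", "q", "h"),
    ]
    idx = build_index(triples)
    print("exact:", exact_path_count(idx, ["p", "q"]))
    print("estimate:", wander_join_count(idx, ["p", "q"], 5000, random.Random(0)))
```

In this toy graph one of the three starting edges is a dead end, so roughly a third of the walks are rejected and contribute nothing. This is the kind of rejection that the abstract identifies as slowing the convergence of count estimates.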