Model Theory and Entailment Rules for RDF Containers, Collections and Reification
An RDF graph is, at its core, just a set of statements consisting of subjects, predicates and objects. Nevertheless, since its inception
practitioners have asked for richer data structures such as containers (for
open lists, sets and bags), collections (for closed lists) and reification (for
quoting and provenance). Though this desire has been addressed in the
RDF primer and RDF Schema specification, these structures are explicitly ignored
in the RDF model theory. In this paper we first formalize the intuitive semantics
(as suggested by the RDF primer, the RDF Schema and RDF semantics specifications) of these compound data structures by two orthogonal
extensions of the RDFS model theory (RDFCC for RDF containers and
collections, and RDFR for RDF reification). Second, we give a set of
entailment rules that is sound and complete for the RDFCC and RDFR
model theories. We show that the complexity of RDFCC and RDFR entailment remains the same as that of simple RDF entailment.
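As a minimal illustration of the compound structures this abstract refers to, the sketch below encodes a closed list (an RDF collection) and a reified statement as plain triples, using the standard rdf:first, rdf:rest, rdf:nil, rdf:Statement, rdf:subject, rdf:predicate and rdf:object terms. The helper names `collection_triples` and `reify`, the blank-node labels and the `ex:` terms are assumptions made for illustration; the sketch shows only the triple encoding and does not implement the RDFCC or RDFR model theories or their entailment rules.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def collection_triples(prefix, items):
    """Encode a closed list as rdf:first / rdf:rest triples ending in rdf:nil.

    Returns the head node of the list and the triples that describe it.
    Blank-node labels are built from the (hypothetical) prefix argument.
    """
    nodes = [f"_:{prefix}{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        # The last cell points to rdf:nil, which closes the list.
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    head = nodes[0] if nodes else RDF + "nil"
    return head, triples


def reify(statement_node, s, p, o):
    """Describe the triple (s, p, o) with the four standard reification triples."""
    return [
        (statement_node, RDF + "type", RDF + "Statement"),
        (statement_node, RDF + "subject", s),
        (statement_node, RDF + "predicate", p),
        (statement_node, RDF + "object", o),
    ]


head, cells = collection_triples("l", ["ex:a", "ex:b"])
quoted = reify("_:s", "ex:a", "ex:knows", "ex:b")
```

Note that the reification triples describe a statement without asserting it: `quoted` does not contain the triple `("ex:a", "ex:knows", "ex:b")` itself.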
RDF-TR: Exploiting structural redundancies to boost RDF compression
The number and volume of semantic data have grown impressively over the last decade, promoting compression as an essential tool for RDF preservation, sharing and management. In contrast to universal compressors, RDF compression techniques are able to detect and exploit specific forms of redundancy in RDF data. Thus, state-of-the-art RDF compressors excel at exploiting syntactic and semantic redundancies, i.e., repetitions in the serialization format and information that can be inferred implicitly. However, little attention has been paid to the existence of structural patterns within the RDF dataset; i.e. structural redundancy. In this paper, we analyze structural regularities in real-world datasets, and show three schema-based sources of redundancies that underpin the schema-relaxed nature of RDF. Then, we propose RDF-Tr (RDF Triples Reorganizer), a preprocessing technique that discovers and removes this kind of redundancy before the RDF dataset is effectively compressed. In particular, RDF-Tr groups subjects that are described by the same predicates, and locally re-codes the objects related to these predicates. Finally, we integrate
RDF-Tr with two RDF compressors, HDT and k2-triples. Our experiments show that using RDF-Tr with these compressors improves their original effectiveness by up to 2.3 times, outperforming the most prominent state-of-the-art techniques.
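The two steps named in this abstract (grouping subjects that are described by the same predicates, and locally re-coding the objects of each predicate) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the RDF-Tr implementation: the function names and the sample data are hypothetical, and the integration with HDT or k2-triples is not shown.

```python
from collections import defaultdict


def group_by_predicate_set(triples):
    """Group subjects by the exact set of predicates that describe them."""
    predicates = defaultdict(set)
    for s, p, _ in triples:
        predicates[s].add(p)
    groups = defaultdict(list)
    for s, ps in predicates.items():
        groups[frozenset(ps)].append(s)
    return {ps: sorted(subjects) for ps, subjects in groups.items()}


def local_object_ids(triples):
    """Give each predicate its own small ID space for the objects it relates to."""
    ids = defaultdict(dict)
    for _, p, o in triples:
        table = ids[p]
        if o not in table:
            table[o] = len(table)  # IDs are assigned in order of first appearance
    return dict(ids)


sample = [
    ("a", "name", "x"), ("a", "age", "1"),
    ("b", "name", "y"), ("b", "age", "1"),
    ("c", "name", "z"),
]
groups = group_by_predicate_set(sample)
codes = local_object_ids(sample)
```

Here subjects `a` and `b` fall into one group because both are described by `name` and `age`, while `c` forms its own group. Per-predicate IDs stay small because each predicate typically relates to far fewer distinct objects than the dataset as a whole.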
Compacting frequent star patterns in RDF graphs
Knowledge graphs have become a popular formalism for representing entities and their properties using a graph data model, e.g., the Resource Description Framework (RDF). An RDF graph comprises entities of the same type connected to objects or other entities using labeled edges annotated with properties. RDF graphs usually contain entities that share the same objects in a certain group of properties, i.e., they match star patterns composed of these properties and objects. In case the number of these entities or properties in these star patterns is large, the size of the RDF graph and query processing are negatively impacted; we refer to these star patterns as frequent star patterns. We address the problem of identifying frequent star patterns in RDF graphs and devise the concept of factorized RDF graphs, which denote compact representations of RDF graphs where the number of frequent star patterns is minimized. We also develop computational methods to identify frequent star patterns and generate a factorized RDF graph, where compact RDF molecules replace frequent star patterns. A compact RDF molecule of a frequent star pattern denotes an RDF subgraph that instantiates the corresponding star pattern. Instead of having all the entities matching the original frequent star pattern, a surrogate entity is added and related to the properties of the frequent star pattern; it is linked to the entities that originally match the frequent star pattern. Since the edges between the entities and the objects in the frequent star pattern are replaced by edges between these entities and the surrogate entity of the compact RDF molecule, the size of the RDF graph is reduced. We evaluate the performance of our factorization techniques on several RDF graph benchmarks and compare with a baseline built on top of gSpan, a state-of-the-art algorithm to detect frequent patterns.
The outcomes evidence the efficiency of the proposed approach and show that our techniques are able to reduce the execution time of the baseline approach by at least three orders of magnitude. Additionally, RDF graph size can be reduced by up to 66.56% while data represented in the original RDF graph is preserved.
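The replacement of a star pattern by a surrogate entity, as described in this abstract, can be sketched as below. This is a minimal sketch under stated assumptions rather than the authors' algorithm: the property set is given by the caller instead of being discovered, and the function name `factorize`, the linking property `ex:instanceOfMolecule`, the surrogate labels and the sample data are all hypothetical.

```python
from collections import defaultdict


def factorize(triples, properties, min_entities=2, link="ex:instanceOfMolecule"):
    """Replace star patterns shared by several entities with one surrogate entity."""
    props = set(properties)
    star = defaultdict(set)
    for s, p, o in triples:
        if p in props:
            star[s].add((p, o))

    # Entities match the same star pattern when they have identical
    # (property, object) pairs covering every property in the chosen set.
    by_pattern = defaultdict(list)
    for s, pairs in star.items():
        if {p for p, _ in pairs} == props:
            by_pattern[frozenset(pairs)].append(s)

    result = set(triples)
    ordered = sorted(by_pattern.items(), key=lambda kv: sorted(kv[0]))
    for n, (pattern, entities) in enumerate(ordered):
        if len(entities) < min_entities:
            continue  # too few entities for the surrogate to pay off
        surrogate = f"_:molecule{n}"
        for p, o in pattern:
            result.add((surrogate, p, o))  # the pattern is stored once
        for e in entities:
            result.add((e, link, surrogate))
            for p, o in pattern:
                result.discard((e, p, o))  # per-entity copies are dropped
    return result


original = []
for i in range(4):
    original += [
        (f"e{i}", "unit", "C"),
        (f"e{i}", "sensor", "s1"),
        (f"e{i}", "value", str(20 + i)),
    ]
compact = factorize(original, ["unit", "sensor"])
```

In this toy example the 12 original triples shrink to 10: four `value` triples, four links to the surrogate, and two triples on the surrogate itself. The saving grows with the number of entities and properties that share a pattern, since each additional entity then costs one link instead of one triple per property.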
Compacting Frequent Star Patterns in RDF Graphs
Knowledge graphs have become a popular formalism for representing entities
and their properties using a graph data model, e.g., the Resource Description
Framework (RDF). An RDF graph comprises entities of the same type connected to
objects or other entities using labeled edges annotated with properties. RDF
graphs usually contain entities that share the same objects in a certain group
of properties, i.e., they match star patterns composed of these properties and
objects. In case the number of these entities or properties in these star
patterns is large, the size of the RDF graph and query processing are
negatively impacted; we refer to these star patterns as frequent star patterns. We
address the problem of identifying frequent star patterns in RDF graphs and
devise the concept of factorized RDF graphs, which denote compact
representations of RDF graphs where the number of frequent star patterns is
minimized. We also develop computational methods to identify frequent star
patterns and generate a factorized RDF graph, where compact RDF molecules
replace frequent star patterns. A compact RDF molecule of a frequent star
pattern denotes an RDF subgraph that instantiates the corresponding star
pattern. Instead of having all the entities matching the original frequent star
pattern, a surrogate entity is added and related to the properties of the
frequent star pattern; it is linked to the entities that originally match the
frequent star pattern. We evaluate the performance of our factorization
techniques on several RDF graph benchmarks and compare with a baseline built on
top of gSpan, a state-of-the-art algorithm to detect frequent patterns. The
outcomes evidence the efficiency of the proposed approach and show that our
techniques are able to reduce the execution time of the baseline approach by at
least three orders of magnitude while reducing the RDF graph size by up to 66.56%.
Storing RDF as a Graph
RDF is the first W3C standard for enriching information resources of the Web with detailed metadata. The semantics of RDF data is defined using an RDF schema. The most expressive language for querying RDF is RQL, which enables querying of semantics. In order to support RQL, an RDF storage system has to map the RDF graph model onto its storage structure. Several storage systems for RDF data have been developed, which store the RDF data as triples in a relational database. To evaluate an RQL query on those triple structures, the graph model has to be rebuilt from the triples.
In this paper, we present a new approach to store RDF data as a graph in an object-oriented database. Our approach avoids the costly rebuilding of the graph and efficiently queries the storage structure directly. The advantages of our approach have been shown by performance tests on our prototype implementation OO-Store.
Prototyping Information Visualization in 3D City Models: a Model-based Approach
When creating 3D city models, selecting relevant visualization techniques is
a particularly difficult user interface design task. A first obstacle is that
current geodata-oriented tools, e.g. ArcGIS, have limited 3D capabilities and
limited sets of visualization techniques. Another important obstacle is the
lack of unified description of information visualization techniques for 3D city
models. Although many techniques have been devised for different types of data or
information (wind flows, air quality fields, historic or legal texts, etc.),
they are generally described in articles and not really formalized. In this
paper we address the problem of visualizing information in (rich) 3D city
models by presenting a model-based approach for the rapid prototyping of
visualization techniques. We propose to represent visualization techniques as
the composition of graph transformations. We show that these transformations
can be specified with SPARQL construction operations over RDF graphs. These
specifications can then be used in a prototype generator to produce 3D scenes
that contain the 3D city model augmented with data represented using the
desired technique. Comment: Proc. of 3DGeoInfo 2014 Conference, Dubai, November 201
Compressed k2-Triples for Full-In-Memory RDF Engines
The current "data deluge" has flooded the Web of Data with very large RDF
datasets. They are hosted and queried through SPARQL endpoints which act as
nodes of a semantic net built on the principles of the Linked Data project.
Although this is a realistic philosophy for global data publishing, its query
performance is diminished when the RDF engines (behind the endpoints) manage
these huge datasets. Their indexes cannot be fully loaded in main memory, hence
these systems need to perform slow disk accesses to solve SPARQL queries. This
paper addresses this problem by a compact indexed RDF structure (called
k2-triples) applying compact k2-tree structures to the well-known
vertical-partitioning technique. It obtains an ultra-compressed representation
of large RDF graphs and allows SPARQL queries to be performed fully in memory
without decompression. We show that k2-triples clearly outperforms
state-of-the-art compressibility and traditional vertical-partitioning query
resolution, remaining very competitive with multi-index solutions. Comment: In Proc. of AMCIS'201
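The vertical-partitioning idea that this abstract builds on can be sketched as one sparse subject-object relation per predicate. The sketch below is only an illustration under stated assumptions: it stores each relation as a plain Python set of ID pairs where k2-triples would use a compressed k2-tree, it uses one shared ID space for subjects and objects for simplicity, and the class name `VerticalPartition` and its methods are hypothetical.

```python
from collections import defaultdict


class VerticalPartition:
    """One sparse (subject, object) relation per predicate, over integer IDs."""

    def __init__(self, triples):
        self.ids = {}                  # term -> integer ID
        self.cells = defaultdict(set)  # predicate -> {(subject ID, object ID)}
        for s, p, o in triples:
            self.cells[p].add((self._id(s), self._id(o)))
        self.names = {i: term for term, i in self.ids.items()}

    def _id(self, term):
        # Assign the next free integer the first time a term is seen.
        return self.ids.setdefault(term, len(self.ids))

    def objects(self, s, p):
        """Answer the pattern (s, p, ?o) by scanning only the relation of p."""
        sid = self.ids.get(s)
        pairs = self.cells.get(p, ())
        return sorted(self.names[o] for x, o in pairs if x == sid)


vp = VerticalPartition([
    ("a", "knows", "b"),
    ("a", "knows", "c"),
    ("b", "name", "B"),
])
friends = vp.objects("a", "knows")
```

Each non-empty cell of a predicate's relation corresponds to a 1 in that predicate's subject-object binary matrix. A query with a bound predicate touches only that predicate's relation, which is what makes the layout attractive for compact per-predicate structures.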
E-Learning and microformats: a learning object harvesting model and a sample application
In order to support interoperability of learning tools and reusability of resources, this paper introduces a framework for harvesting learning objects from web-based content. To this end, commonly known web technologies are examined with respect to their suitability for harvesting embedded metadata. Then, a lightweight application profile and a microformat for learning objects are proposed based on well-known learning object metadata standards. Additionally, we describe a web service which utilizes XSL transformation (GRDDL) to extract learning objects from different web pages, and provide an SQI target as a retrieval facility using a more complex query language called SPARQL. Finally, we outline the applicability of our framework on the basis of a search client employing the new SQI service for searching and retrieving learning objects.