2,850 research outputs found
Compressed k2-Triples for Full-In-Memory RDF Engines
Current "data deluge" has flooded the Web of Data with very large RDF
datasets. They are hosted and queried through SPARQL endpoints which act as
nodes of a semantic net built on the principles of the Linked Data project.
Although this is a realistic philosophy for global data publishing, its query
performance is diminished when the RDF engines (behind the endpoints) manage
these huge datasets. Their indexes cannot be fully loaded in main memory, hence
these systems need to perform slow disk accesses to solve SPARQL queries. This
paper addresses this problem by a compact indexed RDF structure (called
k2-triples) applying compact k2-tree structures to the well-known
vertical-partitioning technique. It obtains an ultra-compressed representation
of large RDF graphs and allows SPARQL queries to be full-in-memory performed
without decompression. We show that k2-triples clearly outperforms
state-of-the-art compressibility and traditional vertical-partitioning query
resolution, remaining very competitive with multi-index solutions.Comment: In Proc. of AMCIS'201
Ontology Summit 2008 Communiqué: Towards an open ontology repository
Each annual Ontology Summit initiative makes a statement appropriate to each Summits theme as part of our general advocacy designed to bring ontology science and engineering into the mainstream. The theme this year is "Towards an Open Ontology Repository". This communiqué represents the joint position of those who were engaged in the year's summit discourse on an Open Ontology Repository (OOR) and of those who endorse below. In this discussion, we have agreed that an "ontology repository is a facility where ontologies and related information artifacts can be stored, retrieved and managed."
We believe in the promise of semantic technologies based on logic, databases and the Semantic Web, a Web of exposed data and of interpretations of that data (i.e., of semantics), using common standards. Such technologies enable distinguishable, computable, reusable, and sharable meaning of Web and other artifacts, including data, documents, and services. We also believe that making that vision a reality requires additional supporting resources and these resources should be open, extensible, and provide common services over the ontologies
Semantic processing of EHR data for clinical research
There is a growing need to semantically process and integrate clinical data
from different sources for clinical research. This paper presents an approach
to integrate EHRs from heterogeneous resources and generate integrated data in
different data formats or semantics to support various clinical research
applications. The proposed approach builds semantic data virtualization layers
on top of data sources, which generate data in the requested semantics or
formats on demand. This approach avoids upfront dumping to and synchronizing of
the data with various representations. Data from different EHR systems are
first mapped to RDF data with source semantics, and then converted to
representations with harmonized domain semantics where domain ontologies and
terminologies are used to improve reusability. It is also possible to further
convert data to application semantics and store the converted results in
clinical research databases, e.g. i2b2, OMOP, to support different clinical
research settings. Semantic conversions between different representations are
explicitly expressed using N3 rules and executed by an N3 Reasoner (EYE), which
can also generate proofs of the conversion processes. The solution presented in
this paper has been applied to real-world applications that process large scale
EHR data.Comment: Accepted for publication in Journal of Biomedical Informatics, 2015,
preprint versio
A spiral model for adding automatic, adaptive authoring to adaptive hypermedia
At present a large amount of research exists into the design and implementation of adaptive systems. However, not many target the complex task of authoring in such systems, or their evaluation. In order to tackle these problems, we have looked into the causes of the complexity. Manual annotation has proven to be a bottleneck for authoring of adaptive hypermedia. One such solution is the reuse of automatically generated metadata. In our previous work we have proposed the integration of the generic Adaptive Hypermedia authoring environment, MOT ( My Online Teacher), and a semantic desktop environment, indexed by Beagle++. A prototype, Sesame2MOT Enricher v1, was built based upon this integration approach and evaluated. After the initial evaluations, a web-based prototype was built (web-based Sesame2MOT Enricher v2 application) and integrated in MOT v2, conforming with the findings of the first set of evaluations. This new prototype underwent another evaluation. This paper thus does a synthesis of the approach in general, the initial prototype, with its first evaluations, the improved prototype and the first results from the most recent evaluation round, following the next implementation cycle of the spiral model [Boehm, 88]
Decentralized provenance-aware publishing with nanopublications
Publication and archival of scientific results is still commonly considered the responsability of classical publishing companies. Classical forms of publishing, however, which center around printed narrative articles, no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. In this article, we propose to design scientific data publishing as a web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used as a low-level data publication layer to serve the Semantic Web in general. Our evaluation of the current network shows that this system is efficient and reliable
- …