Compressed k2-Triples for Full-In-Memory RDF Engines
The current "data deluge" has flooded the Web of Data with very large RDF
datasets. They are hosted and queried through SPARQL endpoints which act as
nodes of a semantic net built on the principles of the Linked Data project.
Although this is a realistic philosophy for global data publishing, its query
performance is diminished when the RDF engines (behind the endpoints) manage
these huge datasets. Their indexes cannot be fully loaded in main memory, hence
these systems need to perform slow disk accesses to solve SPARQL queries. This
paper addresses this problem with a compact indexed RDF structure (called
k2-triples) that applies compact k2-tree structures to the well-known
vertical-partitioning technique. It obtains an ultra-compressed representation
of large RDF graphs and allows SPARQL queries to be resolved fully in memory,
without decompression. We show that k2-triples clearly improves on
state-of-the-art compressibility and outperforms traditional
vertical-partitioning query resolution, while remaining very competitive with
multi-index solutions.
Comment: In Proc. of AMCIS'201
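The combination described in this abstract can be illustrated with a minimal sketch. It assumes subjects and objects are already mapped to small integer IDs, splits the triples into one binary subject-by-object matrix per predicate (vertical partitioning), and stores each matrix as a level-order k2-tree bitmap. The function names (vertical_partition, build_k2tree, has_cell) and the sample data are hypothetical, and rank is computed with a plain linear scan here, where a practical implementation would use a constant-time rank structure. This is not the authors' implementation.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Group (subject, object) ID pairs by predicate."""
    parts = defaultdict(set)
    for s, p, o in triples:
        parts[p].add((s, o))
    return parts


def build_k2tree(cells, size, k=2):
    """Return the level-order bitmap of a k2-tree for a size x size
    binary matrix whose 1-cells are given in `cells`.
    `size` is assumed to be a power of k."""
    bits = []
    level = [(0, 0, set(cells))]  # (row offset, column offset, cells inside)
    sub = size
    while sub > 1:
        sub //= k
        next_level = []
        for r0, c0, pts in level:
            for i in range(k):
                for j in range(k):
                    rr, cc = r0 + i * sub, c0 + j * sub
                    inside = {(r, c) for r, c in pts
                              if rr <= r < rr + sub and cc <= c < cc + sub}
                    bits.append(1 if inside else 0)
                    # Only non-empty submatrices larger than one cell are expanded.
                    if inside and sub > 1:
                        next_level.append((rr, cc, inside))
        level = next_level
    return bits


def has_cell(bits, size, r, c, k=2):
    """Check whether cell (r, c) is set, navigating the bitmap top-down."""
    base = 0  # position of the first child of the current node
    sub = size
    while True:
        sub //= k
        x = base + (r // sub) * k + (c // sub)
        if bits[x] == 0:
            return False
        if sub == 1:
            return True
        r, c = r % sub, c % sub
        # Children of position x start at rank1(bits, x) * k^2 (linear-time rank).
        base = sum(bits[:x + 1]) * k * k


# Hypothetical data: (subject ID, predicate, object ID), IDs in 0..3.
triples = [(0, "knows", 1), (2, "knows", 3), (3, "knows", 3), (1, "likes", 0)]
trees = {p: build_k2tree(cells, size=4)
         for p, cells in vertical_partition(triples).items()}
print(has_cell(trees["knows"], 4, 3, 3))
print(has_cell(trees["likes"], 4, 3, 3))
```

A triple pattern with a bound predicate, such as (3, knows, 3), then reduces to a single cell lookup in that predicate's tree, with no decompression of the bitmap.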
Interchanging lexical resources on the Semantic Web
Lexica and terminology databases play a vital role in many NLP applications, but most such resources are currently published in application-specific formats or behind custom access interfaces, so much of this data sits in "data silos" and is hard to access. The Semantic Web, and in particular the Linked Data initiative, provides effective solutions to this problem, as well as possibilities for data reuse through inter-lexicon linking and the incorporation of data categories via dereferenceable URIs. The Semantic Web focuses on the use of ontologies to describe semantics on the Web, but there is currently no standard for providing complex lexical information for such ontologies or for describing the relationship between the lexicon and the ontology. We present our model, lemon, which aims to address these gap
Effective reorganization and self-indexing of big semantic data
In this thesis we have analyzed the structural redundancy present in RDF graphs and proposed a preprocessing technique, RDF-Tr, which groups, reorganizes, and recodes triples, addressing two sources of structural redundancy underlying the nature of the RDF schema. We have integrated RDF-Tr into HDT and k2-triples, reducing the size achieved by the original compressors and outperforming the most prominent state-of-the-art techniques. We call the results of applying RDF-Tr to each compressor HDT++ and k2-triples++.
In the field of RDF compression, compact structures are used to build RDF self-indexes, which provide efficient access to the data without decompressing it. HDT-FoQ is used to publish and consume large RDF data collections. We have extended HDT++, calling the result iHDT++, to resolve SPARQL patterns; it uses less memory than HDT-FoQ while speeding up the resolution of most queries, improving the space-time trade-off of the other self-indexes.
Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Doctorado en Informátic
Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of Data
The current data deluge is flooding the web with large volumes of data represented in RDF, giving rise to the so-called "Web of Data". In this thesis we propose, first, an in-depth study of those texts that allow us to gain a global understanding of the real structure of RDF datasets, and HDT, which addresses the efficient representation of large volumes of RDF data through structures optimized for storage and network transmission. HDT effectively represents an RDF dataset by dividing it into three components: the Header, the Dictionary, and the structure of RDF statements (Triples). We then focus on providing efficient structures for these components, occupying compressed space while allowing direct access to any dat
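The three-way split named in this abstract can be pictured with a minimal sketch. It is a deliberate simplification and not the HDT format itself: the header is a small metadata record, the dictionary maps each RDF term to an integer ID, and the triples component holds sorted ID triples. The function names (encode, decode), the header fields, and the sample terms are hypothetical.

```python
def encode(triples):
    """Split a list of (s, p, o) term triples into header, dictionary, ID triples."""
    terms = sorted({term for triple in triples for term in triple})
    dictionary = {term: i for i, term in enumerate(terms, start=1)}
    id_triples = sorted(
        (dictionary[s], dictionary[p], dictionary[o]) for s, p, o in triples
    )
    # Hypothetical header fields, chosen only for illustration.
    header = {"num_triples": len(id_triples), "num_terms": len(terms)}
    return header, dictionary, id_triples


def decode(dictionary, id_triples):
    """Map ID triples back to their original terms."""
    inverse = {i: term for term, i in dictionary.items()}
    return [(inverse[s], inverse[p], inverse[o]) for s, p, o in id_triples]


data = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:alice"),
]
header, dictionary, id_triples = encode(data)
print(header)
print(id_triples)
print(decode(dictionary, id_triples))
```

Replacing long, repeated terms with integer IDs is what lets the triples component be stored compactly and accessed directly.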
The Distributed Ontology Language (DOL): Use Cases, Syntax, and Extensibility
The Distributed Ontology Language (DOL) is currently being standardized
within the OntoIOp (Ontology Integration and Interoperability) activity of
ISO/TC 37/SC 3. It aims at providing a unified framework for (1) ontologies
formalized in heterogeneous logics, (2) modular ontologies, (3) links between
ontologies, and (4) annotation of ontologies. This paper presents the current
state of DOL's standardization. It focuses on use cases where distributed
ontologies enable interoperability and reusability. We demonstrate relevant
features of the DOL syntax and semantics and explain how these integrate into
existing knowledge engineering environments.
Comment: Terminology and Knowledge Engineering Conference (TKE) 2012-06-20 to
2012-06-21 Madrid, Spai