Co-evolution of RDF Datasets
Linked Data initiatives have fostered the publication of a large number of RDF
datasets in the Linked Open Data (LOD) cloud, as well as the development of
query processing infrastructures to access these data in a federated fashion.
However, several experimental studies have shown that the availability of LOD
datasets cannot always be ensured, making RDF data replication necessary for
reliable federated query frameworks. Albeit enhancing data
availability, RDF data replication requires synchronization and conflict
resolution when replicas and source datasets are allowed to change data over
time, i.e., co-evolution management needs to be provided to ensure consistency.
In this paper, we tackle the problem of RDF data co-evolution and devise an
approach for conflict resolution during co-evolution of RDF datasets. Our
proposed approach is property-oriented and allows for exploiting semantics
about RDF properties during co-evolution management. The quality of our
approach is empirically evaluated in different scenarios on the DBpedia-live
dataset. Experimental results suggest that the proposed techniques have a
positive impact on the quality of data in source datasets and replicas.
Comment: 18 pages, 4 figures, Accepted in ICWE, 201
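To illustrate the idea of property-oriented conflict resolution, the sketch below merges a diverged source and replica by choosing a strategy per RDF property. All property names and the particular strategies (source-wins for functional properties, union otherwise) are hypothetical examples, not the paper's actual algorithm.

```python
# Hypothetical sketch of property-oriented conflict resolution between a
# source dataset and a replica that have diverged. Properties declared
# functional keep at most one value (the source wins on conflict); all
# other properties are treated as multi-valued and merged by union.

FUNCTIONAL = {"dbo:birthDate"}  # illustrative; real semantics come from the ontology

def resolve(source, replica):
    """Merge two {(subject, property): set(values)} maps, picking a
    resolution strategy based on the property's semantics."""
    merged = {}
    for key in set(source) | set(replica):
        _, prop = key
        src_vals = source.get(key, set())
        rep_vals = replica.get(key, set())
        if prop in FUNCTIONAL and src_vals:
            merged[key] = set(src_vals)        # functional: source wins
        else:
            merged[key] = src_vals | rep_vals  # multi-valued: keep all
    return merged
```

Exploiting per-property semantics this way avoids a single global policy (e.g. "source always wins") that would needlessly discard replica contributions to multi-valued properties.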
A decade of Semantic Web research through the lenses of a mixed methods approach
The identification of research topics and trends is an important scientometric activity, as it can help guide the direction of future research. In the Semantic Web area, topic and trend detection was initially performed primarily through qualitative, top-down approaches that rely on expert knowledge. More recently, data-driven, bottom-up approaches have been proposed that offer a quantitative analysis of the evolution of a research domain. In this paper, we aim to provide a broader and more complete picture of Semantic Web topics and trends by adopting a mixed methods methodology, which allows for the combined use of both qualitative and quantitative approaches. Concretely, we build on a qualitative analysis of the main seminal papers, which adopts a top-down approach, and on quantitative results derived with three bottom-up data-driven approaches (Rexplore, Saffron, PoolParty) on a corpus of Semantic Web papers published between 2006 and 2015. In this process, we use the latter both for “fact-checking” on the former and to derive key findings in relation to the strengths and weaknesses of top-down and bottom-up approaches to research topic identification. Although we provide a detailed study of the past decade of Semantic Web research, the findings and the methodology are relevant not only for our community but also, beyond the area of the Semantic Web, to other research fields.
Ontology: A Linked Data Hub for Mathematics
In this paper, we present an ontology of mathematical knowledge concepts that
covers a wide range of the fields of mathematics and introduces a balanced
representation between comprehensive and sensible models. We demonstrate the
applications of this representation in information extraction, semantic search,
and education. We argue that the ontology can be a core of future integration
of math-aware data sets in the Web of Data and, therefore, provide mappings
onto relevant datasets, such as DBpedia and ScienceWISE.
Comment: 15 pages, 6 images, 1 table, Knowledge Engineering and the Semantic
Web - 5th International Conference
Data Integration for Open Data on the Web
In this lecture we will discuss and introduce challenges of
integrating openly available Web data and how to solve them. Firstly,
while we will address this topic from the viewpoint of Semantic Web
research, not all data is readily available as RDF or Linked Data, so
we will give an introduction to different data formats prevalent on the
Web, namely, standard formats for publishing and exchanging tabular,
tree-shaped, and graph data. Secondly, not all Open Data is really completely
open, so we will discuss and address issues around licences, terms
of usage associated with Open Data, as well as documentation of data
provenance. Thirdly, we will discuss (meta-)data quality issues
associated with Open Data on the Web and how Semantic
Web techniques and vocabularies can be used to describe and remedy
them. Fourthly, we will address the searchability and integration
of Open Data and discuss to what extent semantic search can help overcome
these challenges. We close by briefly summarizing further issues not covered
explicitly herein, such as multi-linguality, temporal aspects (archiving,
evolution, temporal querying), as well as how/whether OWL and RDFS
reasoning on top of integrated open data could help.
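To make the first point concrete, the sketch below shows the same record in two of the shapes mentioned above (tabular CSV and a tree-shaped JSON document) and a naive conversion of the tabular form into RDF-style subject-predicate-object triples. The data and the conversion rule are illustrative assumptions, not material from the lecture itself.

```python
# Illustrative only: one record as tabular (CSV) and tree-shaped (JSON)
# data, plus a naive lifting of CSV rows into graph data (triples).
import csv
import io
import json

tabular = "name,capital\nAustria,Vienna\n"                      # tabular shape
tree = json.loads('{"name": "Austria", "capital": "Vienna"}')   # tree shape

def csv_to_triples(text, subject_col):
    """Turn each CSV row into (subject, predicate, object) triples,
    using one designated column as the subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row[subject_col]
        for col, value in row.items():
            if col != subject_col:
                triples.append((subject, col, value))
    return triples
```

Real-world lifting of tabular data to RDF additionally needs IRI minting, datatype handling, and mappings such as those standardized by R2RML/CSVW, which this toy conversion deliberately omits.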
Sextant: Visualizing time-evolving linked geospatial data
The linked open data cloud is constantly evolving as datasets get continuously updated with newer versions. As a result, representing, querying, and visualizing the temporal dimension of linked data is crucial. This is especially important for geospatial datasets that form the backbone of large-scale open data publication efforts in many sectors of the economy (e.g., the public sector, the Earth Observation sector). Although there has been some work on the representation and querying of linked geospatial data that change over time, to the best of our knowledge, there is currently no tool that offers spatio-temporal visualization of such data. This is in contrast with the many tools available for visualizing the temporal evolution of geospatial data in the GIS area. In this article, we present Sextant, a Web-based system for the visualization and exploration of time-evolving linked geospatial data and the creation, sharing, and collaborative editing of “temporally-enriched” thematic maps, which are produced by combining different sources of such data. We present the architecture of Sextant, give examples of its use, and present applications in which we have deployed it.
JRC-Names: Multilingual Entity Name variants and titles as Linked Data
Since 2004 the European Commission’s Joint Research Centre (JRC) has been analysing the online version of
printed media in over twenty languages and has automatically recognised and compiled large amounts of named
entities (persons and organisations) and their many name variants. The collected variants not only include standard
spellings in various countries, languages and scripts, but also frequently found spelling mistakes or lesser used
name forms, all occurring in real-life text (e.g. Benjamin/Binyamin/Bibi/Benyamín/Biniamin/Беньямин/ بنیامین Netanyahu/
Netanjahu/Nétanyahou/Netahnyahu/Нетаньяху/ نتنیاهو ). This entity name variant data, known as JRC-Names,
has been available for public download since 2011. In this article, we report on our efforts to render
JRC-Names as Linked Data (LD), using the lexicon model for ontologies lemon. Besides adhering to Semantic
Web standards, this new release goes beyond the initial one in that it includes titles found next
to the names, as well as date ranges when the titles and the name variants were found. It also establishes
links towards existing datasets, such as DBpedia and Talk-Of-Europe. As a multilingual linguistic linked
dataset, JRC-Names can help bridge the gap between structured data and natural languages, thus supporting
large-scale data integration, e.g. cross-lingual mapping, and web-based content processing, e.g. entity linking.
JRC-Names is publicly available through the dataset catalogue of the European Union’s Open Data Portal.
Reasoning with Data Flows and Policy Propagation Rules
Data-oriented systems and applications are at the centre of current developments of the World Wide Web. In these scenarios, assessing what policies propagate from the licenses of data sources to the output of a given data-intensive system is an important problem. Both policies and data flows can be described with Semantic Web languages. Although it is possible to define Policy Propagation Rules (PPR) by associating policies to data flow steps, this activity results in a huge number of rules to be stored and managed. In a recent paper, we introduced strategies for reducing the size of a PPR knowledge base by using an ontology of the possible relations between data objects, the Datanode ontology, and applying the (A)AAAA methodology, a knowledge engineering approach that exploits Formal Concept Analysis (FCA). In this article, we investigate whether this reasoning is feasible and how it can be performed. For this purpose, we study the impact of compressing a rule base associated with an inference mechanism on the performance of the reasoning process. Moreover, we report on an extension of the (A)AAAA methodology that includes a coherency check algorithm that makes this reasoning possible. We show how this compression, in addition to being beneficial to the management of the knowledge base, also has a positive impact on the performance and resource requirements of the reasoning process for policy propagation.
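The intuition behind compressing a rule base can be sketched as follows: if a PPR maps a data-flow relation to the set of policies it propagates, then relations with identical policy sets can share a single compressed rule. The relation and policy names below are invented for illustration, and this grouping is only the intuition behind compression; the actual (A)AAAA methodology works over the Datanode ontology with FCA.

```python
# Hypothetical illustration of rule-base compression: relations whose
# propagated policy sets coincide are collapsed into one shared rule.

def compress(rules):
    """Group data-flow relations whose policy sets are identical.
    rules: {relation: set(policies)} -> {(relations...): set(policies)}"""
    groups = {}
    for relation, policies in rules.items():
        groups.setdefault(frozenset(policies), []).append(relation)
    return {tuple(sorted(rels)): set(key) for key, rels in groups.items()}
```

Even this naive grouping shows why compression helps management: the number of stored rules drops from one per relation to one per distinct policy set.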
Linked Data Entity Summarization
On the Web, the amount of structured and Linked Data about entities is constantly growing. Descriptions of single entities often include thousands of statements and it becomes difficult to comprehend the data, unless a selection of the most relevant facts is provided. This doctoral thesis addresses the problem of Linked Data entity summarization. The contributions involve two entity summarization approaches, a common API for entity summarization, and an approach for entity data fusion.
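Entity summarization in this sense is a fact-selection problem: from an entity's many statements, keep the k most relevant. A minimal sketch under a simple informativeness assumption (rarer predicates are more distinctive) is shown below; the thesis's actual approaches are more sophisticated, and the example data is invented.

```python
# Toy entity summarization: rank an entity's (predicate, object) statements
# by how rare the predicate is across the whole dataset, and keep the top k.
# Rarity is a crude stand-in for informativeness.
from collections import Counter

def summarize(statements, all_statements, k=2):
    """Return the k statements whose predicates are rarest overall."""
    freq = Counter(pred for pred, _ in all_statements)
    return sorted(statements, key=lambda st: freq[st[0]])[:k]
```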