47 research outputs found
A General Framework for Representing, Reasoning and Querying with Annotated Semantic Web Data
We describe a generic framework for representing and reasoning with annotated
Semantic Web data, a task becoming more important with the recent increased
amount of inconsistent and non-reliable meta-data on the web. We formalise the
annotated language, the corresponding deductive system and address the query
answering problem. Previous contributions on specific RDF annotation domains
are encompassed by our unified reasoning formalism as we show by instantiating
it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we
provide a generic method for combining multiple annotation domains allowing to
represent, e.g. temporally-annotated fuzzy RDF. Furthermore, we address the
development of a query language -- AnQL -- that is inspired by SPARQL,
including several features of SPARQL 1.1 (subqueries, aggregates, assignment,
solution modifiers) along with the formal definitions of their semantics
Decentralized Control and Adaptation in Distributed Applications via Web and Semantic Web Technologies
The presented work provides an approach and an implementation for enabling decentralized control in distributed applications composed of heterogeneous components by benefiting from the interoperability provided by the Web stack and relying on semantic technologies for enabling data integration. In particular, the concept of Smart Components enables adaptability at runtime through an adaptation layer and is complemented by a reference architecture as well as a prototypical implementation
Κατανεμημένη αποτίμηση επερωτήσεων και συλλογιστική για το μοντέλο RDF σε δίκτυα ομοτίμων κόμβων
Με το ενδιαφέρον για τις εφαρμογές του Σημασιολογικού Ιστού να αυξάνεται
ραγδαία, το μοντέλο RDF και RDFS έχει γίνει ένα από τα πιο ευρέως
χρησιμοποιούμενα μοντέλα δεδομένων για την αναπαράσταση και την ενσωμάτωση
δομημένης πληροφορίας στον Ιστό. Το πλήθος των διαθέσιμων πηγών πληροφορίας RDF
συνεχώς αυξάνεται με αποτέλεσμα να υπάρχει μια επιτακτική ανάγκη για τη
διαχείριση RDF δεδομένων. Σε αυτή τη διατριβή επικεντρωνόμαστε στην
κατανεμημένη διαχείριση RDF δεδομένων σε δίκτυα ομότιμων κόμβων. Σχεδιάζουμε
και υλοποιούμε το σύστημα Atlas, ένα πλήρως κατανεμημένο σύστημα για την
αποθήκευση RDF και RDFS δεδομένων, την αποτίμηση και βελτιστοποίηση επερωτήσεων
στη γλώσσα SPARQL και τη συλλογιστική στο μοντέλο RDFS. Το σύστημα Atlas
χρησιμοποιεί κατανεμημένους πίνακες κατακερματισμού, μια δημοφιλή περίπτωση
δικτύων ομότιμων κόμβων. Αρχικά, αναλύουμε κατανεμημένους αλγόριθμους για
συλλογιστική RDFS χρησιμοποιώντας κατανεμημένους πίνακες κατακερματισμού.
Υλοποιηούμε διάφορες παραλλαγές των αλγορίθμων προς τα εμπρός αλυσίδα εκτέλεσης
και προς τα πίσω αλυσίδα εκτέλεσης καθώς και έναν αλγόριθμο που χρησιμοποιεί
την τεχνική μετασχηματισμού των κανόνων σε μαγικό σύνολο. Αποδεικνύουμε
θεωρητικά την ορθότητα των αλγορίθμων αυτών και προσφέρουμε μια συγκριτική
μελέτη τόσο αναλυτικά όσο και πειραματικά. Παράλληλα, προτείνουμε αλγορίθμους
και τεχνικές για την αποτίμηση και τη βελτιστοποίηση επερωτήσεων στη γλώσσα
SPARQL για RDF δεδομένα που είναι αποθηκευμένα σε κατανεμημένους πίνακες
κατακερματισμού. Οι τεχνικές βελτιστοποίησης βασίζονται σε εκτιμήσεις
επιλεκτικότητας και έχουν στόχο τη μείωση του χρόνου απόκρισης της επερώτησης
καθώς και της κατανάλωσης εύρους ζώνης του δικτύου. Η εκτεταμένη πειραματική
αξιολόγηση των μεθόδων βελτιστοποίησης γίνεται σε μια τοπική συστάδα
υπολογιστών χρησιμοποιώντας ένα ευρέως διαδεδομένο σημείο αναφοράς μετρήσεων.With the interest in Semantic Web applications rising rapidly, the Resource
Description Framework (RDF) and its accompanying vocabulary description
language, RDF Schema (RDFS), have become one of the most widely used data
models for representing and integrating structured information in the Web. With
the vast amount of available RDF data sources on the Web increasing rapidly,
there is an urgent need for RDF data management. In this thesis, we focus on
distributed RDF data management in peer-to-peer (P2P) networks. More
specifically, we present results that advance the state-of-the-art in the
research area of distributed RDF query processing and reasoning in P2P
networks. We fully design and implement a P2P system, called Atlas, for the
distributed query processing and reasoning of RDF and RDFS data. Atlas is built
on top of distributed hash tables (DHTs), a commonly-used case of P2P networks.
Initially, we study RDFS reasoning algorithms on top of DHTs. We design and
develop distributed forward and backward chaining algorithms, as well as an
algorithm which works in a bottom-up fashion using the magic sets
transformation technique. We study theoretically the correctness of our
reasoning algorithms and prove that they are sound and complete. We also
provide a comparative study of our algorithms both analytically and
experimentally. In the experimental part of our study, we obtain measurements
in the realistic large-scale distributed environment of PlanetLab as well as in
the more controlled environment of a local cluster. Moreover, we propose
algorithms for SPARQL query processing and optimization over RDF(S) databases
stored on top of distributed hash tables. We fully implement and evaluate a
DHT-based optimizer. The goal of the optimizer is to minimize the time for
answering a query as well as the bandwidth consumed during the query
evaluation. The optimization algorithms use selectivity estimates to determine
the chosen query plan. Our algorithms and techniques have been extensively
evaluated in a local cluster
Federated Query Processing for the Semantic Web
The recent years have witnessed a constant growth in the amount of RDF data available on the Web. This growth is largely based on the increasing rate of data publication on the Web by different actors such governments, life science researchers or geographical institutes. RDF data generation is mainly done by converting already existing legacy data resources into RDF (e.g. converting data stored in relational databases into RDF), but also by creating that RDF data directly (e.g. sensors). These RDF data are normally exposed by means of Linked Data-enabled URIs and SPARQL endpoints. Given the sustained growth that we are experiencing in the number of SPARQL endpoints available, the need to be able to send federated SPARQL queries across them has also grown. Tools for accessing sets of RDF data repositories are starting to appear, differing between them on the way in which they allow users to access these data (allowing users to specify directly what RDF data set they want to query, or making this process transparent to them). To overcome this heterogeneity in federated query processing solutions, the W3C SPARQL working group is defining a federation extension for SPARQL 1.1, which allows combining in a single query, graph patterns that can be evaluated in several endpoints. In this PhD thesis, we describe the syntax of that SPARQL extension for providing access to distributed RDF data sets and formalise its semantics. We adapt existing techniques for distributed data access in relational databases in order to deal with SPARQL endpoints, which we have implemented in our federation query evaluation system (SPARQL-DQP). We describe the static optimisation techniques that we implemented in our system and we carry out a series of experiments that show that our optimisations significantly speed up the query evaluation process in presence of large query results and optional operator
Foundations of Fuzzy Logic and Semantic Web Languages
This book is the first to combine coverage of fuzzy logic and Semantic Web languages. It provides in-depth insight into fuzzy Semantic Web languages for non-fuzzy set theory and fuzzy logic experts. It also helps researchers of non-Semantic Web languages get a better understanding of the theoretical fundamentals of Semantic Web languages. The first part of the book covers all the theoretical and logical aspects of classical (two-valued) Semantic Web languages. The second part explains how to generalize these languages to cope with fuzzy set theory and fuzzy logic
Linked Open Data - Creating Knowledge Out of Interlinked Data: Results of the LOD2 Project
Database Management; Artificial Intelligence (incl. Robotics); Information Systems and Communication Servic
Foundations of Fuzzy Logic and Semantic Web Languages
This book is the first to combine coverage of fuzzy logic and Semantic Web languages. It provides in-depth insight into fuzzy Semantic Web languages for non-fuzzy set theory and fuzzy logic experts. It also helps researchers of non-Semantic Web languages get a better understanding of the theoretical fundamentals of Semantic Web languages. The first part of the book covers all the theoretical and logical aspects of classical (two-valued) Semantic Web languages. The second part explains how to generalize these languages to cope with fuzzy set theory and fuzzy logic
Linked Research on the Decentralised Web
This thesis is about research communication in the context of the Web. I analyse literature which reveals how researchers are making use of Web technologies for knowledge dissemination, as well as how individuals are disempowered by the centralisation of certain systems, such as academic publishing platforms and social media. I share my findings on the feasibility of a decentralised and interoperable information space where researchers can control their identifiers whilst fulfilling the core functions of scientific communication: registration, awareness, certification, and archiving.
The contemporary research communication paradigm operates under a diverse set of sociotechnical constraints, which influence how units of research information and personal data are created and exchanged. Economic forces and non-interoperable system designs mean that researcher identifiers and research contributions are largely shaped and controlled by third-party entities; participation requires the use of proprietary systems.
From a technical standpoint, this thesis takes a deep look at semantic structure of research artifacts, and how they can be stored, linked and shared in a way that is controlled by individual researchers, or delegated to trusted parties. Further, I find that the ecosystem was lacking a technical Web standard able to fulfill the awareness function of research communication. Thus, I contribute a new communication protocol, Linked Data Notifications (published as a W3C Recommendation) which enables decentralised notifications on the Web, and provide implementations pertinent to the academic publishing use case. So far we have seen decentralised notifications applied in research dissemination or collaboration scenarios, as well as for archival activities and scientific experiments.
Another core contribution of this work is a Web standards-based implementation of a clientside tool, dokieli, for decentralised article publishing, annotations and social interactions. dokieli can be used to fulfill the scholarly functions of registration, awareness, certification, and archiving, all in a decentralised manner, returning control of research contributions and discourse to individual researchers.
The overarching conclusion of the thesis is that Web technologies can be used to create a fully functioning ecosystem for research communication. Using the framework of Web architecture, and loosely coupling the four functions, an accessible and inclusive ecosystem can be realised whereby users are able to use and switch between interoperable applications without interfering with existing data.
Technical solutions alone do not suffice of course, so this thesis also takes into account the need for a change in the traditional mode of thinking amongst scholars, and presents the Linked Research initiative as an ongoing effort toward researcher autonomy in a social system, and universal access to human- and machine-readable information. Outcomes of this outreach work so far include an increase in the number of individuals self-hosting their research artifacts, workshops publishing accessible proceedings on the Web, in-the-wild experiments with open and public peer-review, and semantic graphs of contributions to conference proceedings and journals (the Linked Open Research Cloud).
Some of the future challenges include: addressing the social implications of decentralised Web publishing, as well as the design of ethically grounded interoperable mechanisms; cultivating privacy aware information spaces; personal or community-controlled on-demand archiving services; and further design of decentralised applications that are aware of the core functions of scientific communication
Doctor of Philosophy
dissertationLinked data are the de-facto standard in publishing and sharing data on the web. To date, we have been inundated with large amounts of ever-increasing linked data in constantly evolving structures. The proliferation of the data and the need to access and harvest knowledge from distributed data sources motivate us to revisit several classic problems in query processing and query optimization. The problem of answering queries over views is commonly encountered in a number of settings, including while enforcing security policies to access linked data, or when integrating data from disparate sources. We approach this problem by efficiently rewriting queries over the views to equivalent queries over the underlying linked data, thus avoiding the costs entailed by view materialization and maintenance. An outstanding problem of query rewriting is the number of rewritten queries is exponential to the size of the query and the views, which motivates us to study problem of multiquery optimization in the context of linked data. Our solutions are declarative and make no assumption for the underlying storage, i.e., being store-independent. Unlike relational and XML data, linked data are schema-less. While tracking the evolution of schema for linked data is hard, keyword search is an ideal tool to perform data integration. Existing works make crippling assumptions for the data and hence fall short in handling massive linked data with tens to hundreds of millions of facts. Our study for keyword search on linked data brought together the classical techniques in the literature and our novel ideas, which leads to much better query efficiency and quality of the results. Linked data also contain rich temporal semantics. To cope with the ever-increasing data, we have investigated how to partition and store large temporal or multiversion linked data for distributed and parallel computation, in an effort to achieve load-balancing to support scalable data analytics for massive linked data