
    LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs

    The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are increasingly confronted with various "big data" problems, and query processing in the presence of inferences is one of them. For instance, to complete the answer set of SPARQL queries, RDF database systems evaluate semantic RDFS relationships (subPropertyOf, subClassOf) through time-consuming query rewriting algorithms or space-consuming data materialization solutions. To reduce the memory footprint and ease the exchange of large datasets, these systems generally apply a dictionary approach, compressing triples by replacing resource identifiers (IRIs), blank nodes, and literals with integer values. In this article, we present a structured resource identification scheme that uses a clever encoding of concept and property hierarchies to efficiently evaluate the main common RDFS entailment rules while minimizing triple materialization and query rewriting. We show how this encoding can be computed by a scalable parallel algorithm and implemented directly over the Apache Spark framework. The efficiency of our encoding scheme is demonstrated by an evaluation conducted over both synthetic and real-world datasets. Comment: 8 pages, 1 figure
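The core idea of a hierarchy-aware identifier scheme can be illustrated with a minimal sketch: assign each concept a binary-prefix identifier so that a subClassOf check reduces to a prefix test on the identifiers, with no materialization or rewriting. This is a simplified illustration of the general technique, not the authors' exact LiteMat encoding; the hierarchy and concept names are invented for the example.

```python
def encode_hierarchy(children, root):
    """Assign each concept a binary-prefix code so that
    'A subClassOf B' holds iff code(B) is a prefix of code(A)."""
    codes = {root: ""}

    def visit(node):
        kids = children.get(node, [])
        # Enough bits to give every child of this node a distinct local id.
        bits = max(1, len(kids).bit_length())
        for i, kid in enumerate(kids, start=1):
            codes[kid] = codes[node] + format(i, f"0{bits}b")
            visit(kid)

    visit(root)
    return codes


def is_subclass_of(codes, a, b):
    # a rdfs:subClassOf b (reflexively) iff b's code prefixes a's code.
    return codes[a].startswith(codes[b])


# Toy hierarchy (illustrative names only).
hier = {"Thing": ["Agent", "Place"], "Agent": ["Person", "Organization"]}
codes = encode_hierarchy(hier, "Thing")
assert is_subclass_of(codes, "Person", "Agent")
assert not is_subclass_of(codes, "Place", "Agent")
```

With such codes, answering a query over a superclass amounts to a range or prefix scan over integer identifiers, which is what makes the approach attractive in a distributed setting such as Spark.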

    Creation and extension of ontologies for describing communications in the context of organizations

    Thesis submitted to the Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfillment of the requirements for the degree of Master in Computer Science. The use of ontologies is nowadays a sufficiently mature and solid field of work to be considered an efficient alternative for knowledge representation. With the rapid growth of the Semantic Web, this alternative can be expected to become even more prominent in the near future. In the context of a collaboration established between FCT-UNL and the R&D department of a national software company, a new solution entitled ECC – Enterprise Communications Center was developed. This application manages the communications that enter, leave, or are made within an organization, and includes intelligent classification of communications and conceptual search techniques over a communications repository. As specificity may be the key to obtaining acceptable results with these processes, ontologies become crucial for representing the existing knowledge about the specific domain of an organization. This work produced a core set of ontologies capable of expressing the general context of the communications made in an organization, together with a methodology, based on a series of concrete steps, for effectively extending the ontologies to any business domain. Applying these steps minimizes the conceptualization and setup effort in new organizations and business domains. The adequacy of the chosen core set of ontologies and of the specified methodology is demonstrated in this thesis by their application to a real case study, which covered the different types of sources considered in the methodology and the activities that support its construction and evolution.

    DISTRIBUTION OF DATA ON THE WEB

    Data are most often represented [1] in tabular form, where each row represents a record being described and each column represents some property of these items; the cells of the table hold the particular values of these properties. This article works with a sample of data about a school, collected during the 2017 school-leaving examination. We analyze several different strategies for distributing these data on the web. In all of these strategies, some of the data reside on one computer while other parts reside on another.
    Keywords: school-leaving examination, distribution of data on the Web, strategies for the distribution of data on the Web, distributed data, RDF solution, semantic web
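Two of the natural distribution strategies for such tabular data can be sketched by first flattening the table into RDF-style (subject, property, value) triples and then partitioning them either by subject (each server holds everything about some records) or by property (each server holds one column). A minimal sketch, with invented school data; the partition keys stand in for servers:

```python
# Illustrative rows of a table, one record per row.
rows = [
    {"id": "student1", "name": "Ana", "score": 4},
    {"id": "student2", "name": "Boris", "score": 5},
]

# Flatten the table into (subject, property, value) triples, RDF-style.
triples = [(r["id"], k, v) for r in rows for k, v in r.items() if k != "id"]

# Strategy 1: subject partitioning - a server holds all triples about a record.
by_subject = {}
for s, p, o in triples:
    by_subject.setdefault(s, []).append((s, p, o))

# Strategy 2: property partitioning - a server holds one column's triples.
by_property = {}
for s, p, o in triples:
    by_property.setdefault(p, []).append((s, p, o))

assert len(by_subject["student1"]) == 2
assert {s for s, _, _ in by_property["score"]} == {"student1", "student2"}
```

Subject partitioning keeps each record's data together (good for record lookups); property partitioning keeps each column together (good for queries over one property across all records).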

    Adding DL-Lite TBoxes to Proper Knowledge Bases

    Levesque’s proper knowledge bases (proper KBs) correspond to infinite sets of ground positive and negative facts, with the notable property that for FOL formulas in a certain normal form, which includes conjunctive queries and positive queries possibly extended with a controlled form of negation, entailment reduces to formula evaluation. However, proper KBs represent extensional knowledge only; in description logic terms, they correspond to ABoxes. In this paper, we augment them with DL-Lite TBoxes, expressing intensional knowledge (i.e., the ontology of the domain). DL-Lite has the notable property that conjunctive query answering over TBoxes and standard description logic ABoxes is reducible to formula evaluation over the ABox only. Here, we investigate whether this property extends to ABoxes consisting of proper KBs. Specifically, we consider two DL-Lite variants: DL-Lite_rdfs, roughly corresponding to RDFS, and DL-Lite_core, roughly corresponding to OWL 2 QL. We show that when a DL-Lite_rdfs TBox is coupled with a proper KB, the TBox can be compiled away, reducing query answering to evaluation on the proper KB alone. This reduction is no longer possible when we associate proper KBs with DL-Lite_core TBoxes: we show that in the latter case, query answering even for conjunctive queries becomes coNP-hard in data complexity.
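The "compiling away" of an RDFS-style TBox can be sketched concretely: a class atom in a query is rewritten into the union of the class and all of its subclasses under the subClassOf closure, after which answering is pure evaluation over the ground facts. A minimal sketch, assuming a toy TBox and ABox with invented names; real DL-Lite_rdfs also handles subPropertyOf, domain, and range, omitted here for brevity:

```python
# Illustrative subClassOf axioms (sub, super) and ground class facts.
tbox = {("Student", "Person"), ("Professor", "Person")}
abox = {("alice", "Student"), ("bob", "Professor")}


def subclasses(cls, tbox):
    """Reflexive-transitive subClassOf closure below cls."""
    found, frontier = {cls}, {cls}
    while frontier:
        frontier = {sub for (sub, sup) in tbox if sup in frontier} - found
        found |= frontier
    return found


def answer(cls, tbox, abox):
    # Rewrite Q(x) :- cls(x) into a union over all subclasses of cls,
    # then evaluate the rewritten query on the facts alone.
    expansion = subclasses(cls, tbox)
    return {ind for (ind, c) in abox if c in expansion}


assert answer("Person", tbox, abox) == {"alice", "bob"}
```

Because the rewriting touches only the query, the fact store never needs to materialize inferred memberships; this is exactly the reduction that, per the paper, fails once DL-Lite_core constructs such as existential restrictions enter the TBox.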

    Design of an RDF vocabulary for representing COUNTER usage statistics: within the Electronic Resource Management System of the Universitätsbibliothek Leipzig

    This Master's thesis documents the creation of an RDF-based vocabulary for representing usage statistics of electronic resources compiled according to the COUNTER standard. The vocabulary's concrete application is the Electronic Resource Management System (ERMS) currently being developed by the Universitätsbibliothek Leipzig within the cooperative project AMSL. That system is based on Linked Data, is intended to model the changed management processes for electronic resources, and aims to be vendor-independent and flexible at the same time. The COUNTER vocabulary, however, is also meant to be usable beyond this application. The thesis is divided into two parts, foundations and modelling. The first part establishes the library-science need for ERM systems and focuses the discussion on the subfield of usage statistics and the COUNTER standardization. It then covers the technical foundations of the modelling, so that the thesis is accessible to readers unfamiliar with Linked Data. The modelling part follows, beginning with a requirements analysis and an analysis of the XML schema underlying the COUNTER files. This is followed by the modelling of the vocabulary using RDFS and OWL. Building on considerations about converting XML statistics to RDF and assigning URIs, real sample files are then converted manually and successfully checked in a short test. The thesis closes with a conclusion and an outlook on further work with the results.
    The resulting RDF vocabulary is available for reuse on GitHub at: https://github.com/a-nnika/counter.vocab

    Dependency Types Validation of Precedence Diagram Method Using Ontology

    The Precedence Diagram Method (PDM) is a visual representation technique that depicts the activities involved in a project. It is a scheduling tool in which nodes represent activities and connecting arrows illustrate activity dependencies. During project execution, activities may change, which can lead to inconsistent dependency types; as a consequence, dates and deadlines are missed, and such mistakes can prove very costly. Since an ontology describes the relationships between the concepts within a domain, an ontology can be used to represent knowledge about PDM. This paper proposes a PDM ontology in OWL, together with SWRL rules for inferring new knowledge from an existing PDM and validating it.
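One kind of dependency inconsistency that rules over a PDM ontology might flag is a circular precedence chain. A minimal procedural sketch of that check, independent of the paper's OWL/SWRL formulation; the activity names and the single finish-to-start dependency type are illustrative:

```python
# activity -> activities it must follow (finish-to-start, illustrative).
deps = {
    "design": [],
    "build": ["design"],
    "test": ["build"],
}


def is_consistent(deps):
    """A precedence diagram is valid only if its dependencies are acyclic."""
    seen, in_path = set(), set()

    def visit(node):
        if node in in_path:      # back edge: circular dependency found
            return False
        if node in seen:         # already verified acyclic from here
            return True
        in_path.add(node)
        ok = all(visit(pred) for pred in deps.get(node, []))
        in_path.discard(node)
        seen.add(node)
        return ok

    return all(visit(n) for n in deps)


assert is_consistent(deps)
deps["design"] = ["test"]        # introduce a circular dependency
assert not is_consistent(deps)
```

In the ontology-based approach of the paper, the same kind of violation is detected declaratively by a reasoner applying SWRL rules rather than by an explicit graph traversal.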

    Using Ontologies for Semantic Data Integration

    While big data analytics is considered one of the most important paths to competitive advantage for today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phases of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past two decades, the idea of using semantics for data integration has become increasingly important and has received much attention in the AI, database, web, and data mining communities. Here, we focus on a specific paradigm for semantic data integration, called Ontology-Based Data Access (OBDA). The goal of this paper is to provide an overview of OBDA, pointing out both the techniques at the basis of the paradigm and the main challenges that remain to be addressed.
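The central OBDA idea can be sketched in a few lines: a mapping turns rows of a relational source into facts over the ontology's vocabulary, so that queries are posed against ontology terms rather than table schemas. A minimal sketch with invented table, class, and individual names; real OBDA systems express such mappings declaratively (e.g. in R2RML) and keep the facts virtual:

```python
# An illustrative relational source.
employees = [
    {"eid": 1, "name": "Ada", "role": "manager"},
    {"eid": 2, "name": "Lin", "role": "engineer"},
]


def virtual_facts(rows):
    """Mapping: each row yields ontology-level facts; rows whose role
    is 'manager' populate the class Manager, all rows populate Employee."""
    for r in rows:
        subject = f"emp/{r['eid']}"
        yield (subject, "rdf:type", "Employee")
        if r["role"] == "manager":
            yield (subject, "rdf:type", "Manager")
        yield (subject, "name", r["name"])


facts = set(virtual_facts(employees))
assert ("emp/1", "rdf:type", "Manager") in facts
assert ("emp/2", "rdf:type", "Employee") in facts
```

A query over the class Manager then needs no knowledge of the underlying table layout; in a full OBDA system it would be rewritten, via the mapping, into SQL over the source instead of materializing the facts as done here.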