Using Provenance for Quality Assessment and Repair in Linked Open Data
As the number of data sources publishing their data on the Web of Data is growing, we are experiencing an immense growth of the Linked Open Data cloud. The lack of control over the published sources, which could be untrustworthy or unreliable, along with their dynamic nature that often invalidates links and causes conflicts or other discrepancies, could lead to poor quality data. In order to judge data quality, a number of quality indicators have been proposed, coupled with quality metrics that quantify the “quality level” of a dataset. In addition to the above, some approaches address how to improve the quality of datasets through a repair process that focuses on correcting invalidities caused by constraint violations, by either removing or adding triples. In this paper we argue that provenance is a critical factor that should be taken into account during repairs to ensure that the most reliable data is kept. Based on this idea, we propose quality metrics that take provenance into account and evaluate their applicability as repair guidelines in a particular data fusion setting.
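The idea of a provenance-guided repair can be illustrated with a minimal sketch. The names below (`repair`, the trust map, the example triples) are purely illustrative, not the paper's actual metrics or data: given triples that conflict on the same subject and predicate, keep the one whose source has the highest trust score.

```python
# Hypothetical sketch of provenance-guided repair, NOT the paper's method:
# when triples conflict, keep the one from the most trusted source.
from collections import defaultdict

def repair(triples, trust):
    """For each (subject, predicate) with conflicting objects, keep only
    the triple whose provenance source has the highest trust score."""
    groups = defaultdict(list)
    for (s, p, o, source) in triples:
        groups[(s, p)].append((o, source))
    kept = []
    for (s, p), candidates in groups.items():
        # Resolve the conflict by provenance: pick the most trusted source.
        o, source = max(candidates, key=lambda c: trust.get(c[1], 0.0))
        kept.append((s, p, o, source))
    return kept

# Two sources disagree on Berlin's population; the trusted one wins.
triples = [
    ("db:Berlin", "pop", "3600000", "src:A"),
    ("db:Berlin", "pop", "1000000", "src:B"),
]
trust = {"src:A": 0.9, "src:B": 0.4}
print(repair(triples, trust))
```

The same selection rule extends naturally to removal-based repairs: instead of deleting an arbitrary triple from a violating set, delete the one with the least trustworthy provenance.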
Empowering Knowledge Bases: a Machine Learning Perspective
The construction of Knowledge Bases quite often requires the intervention of knowledge engineers and domain experts, resulting in a time-consuming task. Alternative approaches have been developed for building knowledge bases from existing sources of information such as web pages and crowdsourcing; seminal examples are NELL, DBpedia, YAGO and several others. With the goal of building very large sources of knowledge, as recently for the case of Knowledge Graphs, even more complex integration processes have been set up, involving multiple sources of information, human expert intervention, and crowdsourcing. Despite significant efforts to make Knowledge Graphs as comprehensive and reliable as possible, they tend to suffer from incompleteness and noise, due to the complex building process. Nevertheless, even in highly human-curated knowledge bases, cases of incompleteness can be found; for instance, disjointness axioms are quite often missing. Machine learning methods have been proposed with the purpose of refining, enriching, completing and possibly raising potential issues in existing knowledge bases while showing the ability to cope with noise. The talk will concentrate on classes of mostly symbol-based machine learning methods, specifically focusing on concept learning, rule learning and disjointness axiom learning problems, showing how the developed methods can be exploited for enriching existing knowledge bases. The talk will also highlight that a key element of the illustrated solutions is the integration of background knowledge, deductive reasoning and the evidence coming from the mass of the data. The last part of the talk will be devoted to the presentation of an approach for injecting background knowledge into numeric-based embedding models to be used for predictive tasks on Knowledge Graphs.
Universal OWL Axiom Enrichment for Large Knowledge Bases
The Semantic Web has seen a rise in the availability and usage of knowledge bases over the past years, in particular in the Linked Open Data initiative. Despite this growth, there is still a lack of knowledge bases that consist of high-quality schema information and instance data adhering to this schema. Several knowledge bases consist only of schema information, while others are, to a large extent, a mere collection of facts without a clear structure. The combination of rich schema and instance data would allow powerful reasoning, consistency checking, and improved querying possibilities, as well as provide more generic ways to interact with the underlying data. In this article, we present a light-weight method to enrich knowledge bases accessible via SPARQL endpoints with almost all types of OWL 2 axioms. This allows schemata to be created semi-automatically, which we evaluate and discuss using DBpedia.
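Enrichment methods of this kind typically score a candidate axiom against the instance data before suggesting it. As a hedged illustration (this is not the article's actual algorithm, and the function names are invented), one can score a candidate `owl:FunctionalProperty` axiom by the fraction of subjects that use the property with exactly one object value, computed over (subject, object) pairs as a SPARQL endpoint might return them:

```python
# Illustrative sketch, not the article's method: score a candidate
# owl:FunctionalProperty axiom from instance data. A score near 1.0
# would support suggesting the axiom to the knowledge engineer.
from collections import Counter

def functionality_score(subject_object_pairs):
    """Fraction of subjects that have exactly one object value
    for the property under consideration."""
    counts = Counter(s for s, _ in subject_object_pairs)
    if not counts:
        return 0.0
    functional = sum(1 for n in counts.values() if n == 1)
    return functional / len(counts)

# ex:b has two values, so 2 of 3 subjects behave functionally.
pairs = [("ex:a", "1"), ("ex:b", "2"), ("ex:b", "3"), ("ex:c", "4")]
print(functionality_score(pairs))
```

A semi-automatic workflow would then present high-scoring candidates to a human for confirmation rather than asserting them outright, since noisy instance data can make a non-functional property look functional.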
Integration of Scientific Information through Linked Data : Preliminary Report
By implementing Linked Open Data principles throughout the scientific community, it is possible to make publications more visible and foster collaboration among universities, research groups and partners. However, data should be curated and published, and a mature infrastructure needs to be provided to support it. In this work, we analyse Linked Data's weaknesses and propose an application prototype to evaluate state-of-the-art methodologies and tools. As a case study, data from the authors' research groups will be made publicly available in an attempt to integrate the scientific data of the Argentinian community, which is an open issue. We plan to use this data to generate bottom-up methodological guidelines and thus enrich ontology-based conceptual models.
An unsupervised approach to disjointness learning based on terminological cluster trees
In the context of the Semantic Web regarded as a Web of Data, research efforts have been devoted to improving the quality of the ontologies that are used as vocabularies to enable complex services based on automated reasoning. From various surveys it emerges that many domains would require better ontologies that include non-negligible constraints for properly conveying the intended semantics. In this respect, disjointness axioms are representative of this general problem: these axioms are essential for making the negative knowledge about the domain of interest explicit, yet they are often overlooked during the modeling process (thus affecting the efficacy of the reasoning services). To tackle this problem, automated methods for discovering these axioms can be used as a tool for supporting knowledge engineers in modeling new ontologies or evolving existing ones. The current solutions, either based on statistical correlations or relying on external corpora, often do not fully exploit the terminology. Stemming from this consideration, we have been investigating alternative methods to elicit disjointness axioms from existing ontologies based on the induction of terminological cluster trees, which are logic trees in which each node stands for a cluster of individuals that emerges as a sub-concept. The growth of such trees relies on a divide-and-conquer procedure that assigns, to the cluster representing the root node, one of the concept descriptions generated via a refinement operator and selected according to a heuristic based on the minimization of the risk of overlap between the candidate sub-clusters (quantified in terms of the distance between two prototypical individuals). Preliminary works have shown some shortcomings that are tackled in this paper.
To tackle the task of disjointness axiom discovery, we have extended the terminological cluster tree induction framework with various contributions: 1) the adoption of different distance measures for clustering the individuals of a knowledge base; 2) the adoption of different heuristics for selecting the most promising concept descriptions; 3) a modified version of the refinement operator to prevent the introduction of inconsistency during the elicitation of the new axioms. A wide empirical evaluation showed the feasibility of the proposed extensions and their improvement over alternative approaches.
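The divide-and-conquer growth of such a tree can be sketched in simplified form. This is an assumption-laden toy, not the authors' algorithm: real terminological cluster trees select a concept description per node via a refinement operator, whereas here each cluster is simply split around its two most distant individuals (the "prototypes"), with each individual assigned to the nearer one.

```python
# Hedged toy sketch of divide-and-conquer cluster tree growth.
# Unlike the real framework, no concept descriptions are involved:
# clusters are split purely by distance to two prototype individuals.

def farthest_pair(points, dist):
    """Return the two most distant individuals, used as split prototypes."""
    best = None
    for i, a in enumerate(points):
        for b in points[i + 1:]:
            d = dist(a, b)
            if best is None or d > best[0]:
                best = (d, a, b)
    return best[1], best[2]

def grow_tree(points, dist, min_size=2):
    """Recursively split a cluster; leaves are small clusters of individuals."""
    if len(points) < 2 * min_size:
        return points  # leaf cluster
    p, q = farthest_pair(points, dist)
    left = [x for x in points if dist(x, p) <= dist(x, q)]
    right = [x for x in points if dist(x, p) > dist(x, q)]
    if not left or not right:
        return points
    return (grow_tree(left, dist, min_size), grow_tree(right, dist, min_size))

# With a 1-D distance, two well-separated groups become sibling leaves;
# in the real setting, such siblings suggest candidate disjoint concepts.
dist = lambda a, b: abs(a - b)
print(grow_tree([1, 2, 3, 10, 11, 12], dist))  # ([1, 2, 3], [10, 11, 12])
```

In the actual framework, the distance is defined over individuals of the knowledge base, and sibling clusters whose emerging sub-concepts do not overlap become candidates for a disjointness axiom.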
Problem-based learning supported by semantic techniques
Problem-based learning has been applied over the last three decades to a diverse range of learning environments. In this educational approach, different problems are posed to the learners so that they can develop different solutions while learning about the problem domain. When applied to conceptual modelling, and particularly to Qualitative Reasoning, the solutions to problems are models that represent the behaviour of a dynamic system. The learner's task then is to bridge the gap between their initial model, as their first attempt to represent the system, and the target models that provide solutions to that problem. We propose the use of semantic technologies and resources to help in bridging that gap by providing links to terminology and formal definitions, and matching techniques to allow learners to benefit from existing models.
PatOMat - Versatile Framework for Pattern-Based Ontology Transformation
The purpose of the PatOMat transformation framework is to bridge between different modeling styles of web ontologies. We provide a formal model of pattern-based ontology transformation, explain its implementation in PatOMat, and demonstrate the flexibility of the framework on diverse use cases.
Annual reports of the selectmen, treasurer and town clerk of the town of Weare, together with the report of the school board, for the year ending February 15, 1895.
This is an annual report containing vital statistics for a town/city in the state of New Hampshire.
- …