Foundational Ontologies meet Ontology Matching: A Survey
Ontology matching is a research area aimed at finding ways to make different ontologies interoperable. Solutions to the problem have been proposed from different disciplines, including databases, natural language processing, and machine learning. Foundational ontologies play an important, multifaceted role in ontology matching, with considerable room for development. This paper presents an overview of the different tasks involved in ontology matching that consider foundational ontologies. We discuss the strengths and weaknesses of existing proposals and highlight the challenges to be addressed in the future.
Completing and Debugging Ontologies: state of the art and challenges
As semantically-enabled applications require high-quality ontologies, developing and maintaining ontologies that are as correct and complete as possible is an important although difficult task in ontology engineering. A key step is ontology debugging and completion, which in general involves two sub-steps: detecting defects and repairing them. In this paper we discuss the state of the art regarding the repairing step. We do this by formalizing repairing as an abduction problem and situating the state of the art with respect to this framework. We show that there are still many open research problems and point out opportunities for further work and for advancing the field.
An Interactive Guidance Process Supporting Consistent Updates of RDFS Graphs
With existing tools, when creating a new object in the Semantic Web, users benefit neither from existing objects and their properties, nor from the already known properties of the new object. We propose UTILIS, an interactive process to help users add new objects. While a new object is being created, relaxation rules are applied to its current description to find similar objects, whose properties serve as suggestions to expand the description. A user study conducted on a group of master's students shows that students, even those disconcerted by the unconventional interface, used the UTILIS suggestions. In most cases, they could find the element they were looking for in the first three sets of properties of similar objects. Moreover, with UTILIS users did not create any duplicates, whereas with the other tool used in the study more than half of them did.
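The relax-and-suggest loop described above can be sketched as follows (a minimal illustration, not the UTILIS implementation; the relaxation rule, function names, and data are all hypothetical):

```python
def suggest_properties(description, objects):
    """Relax the new object's description by dropping one constraint at a
    time, collect objects matching each relaxed query, and suggest the
    extra properties those similar objects carry."""
    def matches(obj, constraints):
        return all(obj.get(p) == v for p, v in constraints)

    constraints = list(description.items())
    # Toy relaxation rule: drop a single (property, value) pair.
    relaxed_queries = [constraints] + [
        constraints[:i] + constraints[i + 1:] for i in range(len(constraints))
    ]
    suggestions = {}
    for query in relaxed_queries:
        for obj in objects:
            if matches(obj, query):
                for prop in obj:
                    if prop not in description:
                        suggestions[prop] = suggestions.get(prop, 0) + 1
    # Rank suggested properties by how often similar objects carry them.
    return sorted(suggestions, key=suggestions.get, reverse=True)

films = [
    {"type": "Film", "director": "Lang", "year": 1927, "genre": "sci-fi"},
    {"type": "Film", "director": "Lang", "country": "Germany"},
]
print(suggest_properties({"type": "Film", "director": "Lang"}, films))
```

UTILIS additionally relaxes queries by generalising classes and properties along the RDFS hierarchy, which this sketch omits.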
Pattern-based design applied to cultural heritage knowledge graphs
Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good-quality ontology engineering. There are several ODP repositories where ODPs are shared, as well as ontology design methodologies recommending their reuse. Performing rigorous testing is recommended as well, for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage (CH) and has been developed using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the needs of the CH domain: for example, requirements were gathered from an open, diverse community of consumers, a new ODP was defined, and many existing ones were specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo's development.
Hybrid fuzzy multi-objective particle swarm optimization for taxonomy extraction
Ontology learning refers to the automatic extraction of an ontology to produce the ontology learning layer cake, which consists of five kinds of output: terms, concepts, taxonomy relations, non-taxonomy relations and axioms. Term extraction is a prerequisite for all aspects of ontology learning. It is the automatic mining of complete terms from the input document. Another important part of an ontology is the taxonomy, or hierarchy of concepts. It presents a tree view of the ontology and shows the inheritance between subconcepts and superconcepts. In this research, two methods were proposed for improving the performance of the extraction result. The first method uses particle swarm optimization in order to optimize the weights of features. The advantage of particle swarm optimization is that it can calculate and adjust the weight of each feature towards an appropriate value, and here it is used to improve the performance of term and taxonomy extraction. The second method uses a hybrid technique combining multi-objective particle swarm optimization and fuzzy systems, ensuring that the membership functions and fuzzy rule sets are optimized. The advantage of using a fuzzy system is that imprecise and uncertain feature-weight values can be tolerated during the extraction process. This method is used to improve the performance of taxonomy extraction. In the term extraction experiment, five extracted features were used for each term from the document. These features were represented by feature vectors consisting of domain relevance, domain consensus, term cohesion, first occurrence and length of noun phrase. For taxonomy extraction, matching Hearst lexico-syntactic patterns in documents and the web, and hypernym information from WordNet, were used as the features that represent each pair of terms from the texts. These two proposed methods are evaluated using a dataset that contains documents about tourism.
For term extraction, the proposed method is compared with benchmark algorithms such as Term Frequency Inverse Document Frequency, Weirdness, Glossary Extraction and Term Extractor, using precision as the performance evaluation measure. For taxonomy extraction, the proposed methods are compared with the benchmark Feature-based method and with weighting by Support Vector Machine, using the f-measure, precision and recall performance evaluation measures. For the first method, the experimental results showed that implementing particle swarm optimization to optimize the feature weights in term and taxonomy extraction leads to improved accuracy of the extraction results compared to the benchmark algorithms. For the second method, the results showed that the hybrid technique combining multi-objective particle swarm optimization and fuzzy systems leads to improved taxonomy extraction results compared to the benchmark methods, while adjusting the fuzzy membership functions and keeping the number of fuzzy rules to a minimum with a high degree of accuracy.
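The core idea of the first method, using particle swarm optimization to tune per-feature weights, can be sketched minimally as below (this is not the paper's exact setup; the fitness function, PSO constants, and toy data are all assumptions). The swarm searches for a weight vector whose weighted feature score ranks true terms above non-terms:

```python
import random

def pso_optimize_weights(features, labels, n_particles=20, iters=50, seed=0):
    """Learn per-feature weights with a minimal particle swarm so that
    the weighted score separates true terms (label 1) from non-terms."""
    rng = random.Random(seed)
    dim = len(features[0])

    def fitness(w):
        # Count correctly ordered (term, non-term) score pairs.
        scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in features]
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        return sum(p > n for p in pos for n in neg)

    swarm = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [w[:] for w in swarm]
    gbest = max(pbest, key=fitness)
    for _ in range(iters):
        for i, w in enumerate(swarm):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - w[d])
                             + 1.5 * r2 * (gbest[d] - w[d]))
                w[d] += vel[i][d]
            if fitness(w) > fitness(pbest[i]):
                pbest[i] = w[:]
        gbest = max(pbest, key=fitness)
    return gbest

# Toy data: feature 0 is informative, feature 1 is noise.
feats = [(0.9, 0.1), (0.8, 0.9), (0.2, 0.8), (0.1, 0.2)]
labs = [1, 1, 0, 0]
w = pso_optimize_weights(feats, labs)
```

In the paper's setting the features would be the five term-level features named above (domain relevance, domain consensus, term cohesion, first occurrence, noun-phrase length) rather than this synthetic data.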
Framing Named Entity Linking Error Types
Named Entity Linking (NEL) and relation extraction form the backbone of Knowledge Base Population tasks. The recent rise of large open-source Knowledge Bases and the continuous focus on improving NEL performance have led to the creation of automated benchmark solutions during the last decade. The benchmarking of NEL systems offers a valuable approach to understanding a NEL system's performance quantitatively. However, an in-depth qualitative analysis that helps improve NEL methods by identifying error causes usually requires a more thorough error analysis. This paper proposes a taxonomy to frame common errors and applies this taxonomy in a survey study to assess the performance of four well-known Named Entity Linking systems on three recent gold standards.
Keywords: Named Entity Linking, Linked Data Quality, Corpora, Evaluation, Error Analysis
Uncertainty-sensitive reasoning for inferring sameAs facts in linked data
Discovering whether or not two URIs described in Linked Data -- in the same or different RDF datasets -- refer to the same real-world entity is crucial for building applications that exploit the cross-referencing of open data. A major challenge in data interlinking is to design tools that effectively deal with incomplete and noisy data, and exploit uncertain knowledge. In this paper, we model data interlinking as a reasoning problem with uncertainty. We introduce a probabilistic framework for modelling and reasoning over uncertain RDF facts and rules that is based on the semantics of probabilistic Datalog. We have designed an algorithm, ProbFR, based on this framework. Experiments on real-world datasets have shown the usefulness and effectiveness of our approach for data linkage and disambiguation.
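The flavour of reasoning over uncertain sameAs facts can be conveyed with a small sketch (this is not ProbFR; the rule weights, property names, and noisy-or combination are illustrative assumptions): each rule "a shared value for property p implies sameAs" fires with its own probability, and independent pieces of evidence are combined.

```python
def same_as_probability(desc1, desc2, rule_probs):
    """Combine uncertain sameAs rules by noisy-or: each rule
    'shared value for property p implies sameAs' fires with
    probability rule_probs[p]; overall P = 1 - prod(1 - p_i)
    over the rules that fired."""
    p_not_same = 1.0
    for prop, p_rule in rule_probs.items():
        if prop in desc1 and desc1.get(prop) == desc2.get(prop):
            p_not_same *= 1.0 - p_rule   # this rule fired
    return 1.0 - p_not_same

book_a = {"isbn": "978-3-16-148410-0", "title": "Linked Data"}
book_b = {"isbn": "978-3-16-148410-0", "title": "Linked Data", "year": 2011}
rules = {"isbn": 0.95, "title": 0.6}   # hypothetical rule weights
print(same_as_probability(book_a, book_b, rules))  # about 0.98: both rules fire
```

Probabilistic Datalog additionally propagates these probabilities through chained rule derivations, which a single noisy-or step does not capture.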
On the Importance of Drill-Down Analysis for Assessing Gold Standards and Named Entity Linking Performance
Rigorous evaluations and analyses of evaluation results are key to improving Named Entity Linking systems. Nevertheless, most current evaluation tools are focused on benchmarking and comparative evaluations. Therefore, they only provide aggregated statistics such as precision, recall and F1-measure to assess system performance, and offer no means for conducting detailed analyses down to the level of individual annotations.
This paper addresses the need for transparent benchmarking and fine-grained error analysis by introducing Orbis, an extensible framework that supports drill-down analysis, multiple annotation tasks and resource versioning. Orbis complements approaches like those deployed through the GERBIL and TAC KBP tools and helps developers to better understand and address shortcomings in their Named Entity Linking tools.
We present three use cases in order to demonstrate the usefulness of Orbis for both research and production systems: (i) improving Named Entity Linking tools; (ii) detecting gold standard errors; and (iii) performing Named Entity Linking evaluations with multiple versions of the included resources.
An ontological investigation over human relations in linked data
The research presented in this article is motivated by the increasing importance of complex human relations in linked data, whether extracted from social networks or found in existing databases. The FOAF vocabulary, targeted in our research, plays a central role in those data, and is a model for lightweight ontologies largely used in linked data, such as the DBpedia ontology and schema.org. We provide an overview of FOAF and other approaches for describing human relations, followed by a detailed analysis and critique of the FOAF Relationship Vocabulary, the most important FOAF extension. We propose an explicit formal axiomatization of this vocabulary, and an ontological analysis concerning the properties used to describe human relationships. We analyze the distribution of human relations based on their epistemological status, and define an ontoepistemic meta-property as characteristic of some of these predicates. Our analysis is generalizable to the semantic modeling of social networks. Additionally, the modeling patterns used in other relevant linked data vocabularies are analyzed for comparison.