371 research outputs found
Novel Algorithms for Cross-Ontology Multi-Level Data Mining
The widespread use of ontologies in many scientific areas creates a wealth of ontology-annotated data and necessitates the development of ontology-based data mining algorithms. We have developed generalization and mining algorithms for discovering cross-ontology relationships via ontology-based data mining. We present new interestingness measures to evaluate the discovered cross-ontology relationships. The methods presented in this dissertation employ generalization as an ontology traversal technique for the discovery of interesting and informative relationships at multiple levels of abstraction between concepts from different ontologies. The generalization algorithms combine ontological annotations with the structure and semantics of the ontologies themselves to discover interesting cross-ontology relationships. The first algorithm uses the depth of ontological concepts as a guide for generalization. The ontology annotations are translated to higher levels of abstraction one level at a time, accompanied by incremental association rule mining. The second algorithm generalizes ontology terms to all their ancestors via transitive ontology relations and then mines cross-ontology multi-level association rules from the generalized transactions. Our interestingness measures use implicit knowledge conveyed by the relation semantics of the ontologies to capture the usefulness of cross-ontology relationships. We describe the use of information-theoretic metrics to capture the interestingness of cross-ontology relationships and the specificity of ontology terms with respect to an annotation dataset. Our generalization and data mining algorithms are applied to the Gene Ontology and the postnatal Mouse Anatomy Ontology. The results presented in this work demonstrate that our generalization algorithms and interestingness measures discover more interesting and higher-quality relationships than approaches that do not use generalization.
Our algorithms can be used by researchers and ontology developers to discover inter-ontology connections. Additionally, the cross-ontology relationships discovered using our algorithms can be used by researchers to understand different aspects of entities that interest them.
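The second generalization strategy described above (expand every annotated term to all of its ancestors, then mine cross-ontology rules from the generalized transactions) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the two toy ontologies, the term identifiers, the transactions, and the function names `ancestors`, `generalize`, and `cross_rules` are all hypothetical, and plain support and confidence stand in for the interestingness measures the abstract mentions.

```python
from collections import Counter
from itertools import product

# Hypothetical toy ontologies (child -> direct parents); not real GO or MA data.
GO_PARENTS = {
    "go:kinase_activity": ["go:catalytic_activity"],
    "go:catalytic_activity": ["go:molecular_function"],
    "go:molecular_function": [],
}
MA_PARENTS = {
    "ma:cortex": ["ma:brain"],
    "ma:cerebellum": ["ma:brain"],
    "ma:brain": ["ma:organ"],
    "ma:organ": [],
}


def ancestors(term, parents):
    """Return the term together with every ancestor reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def generalize(terms, parents):
    """Expand a set of annotated terms to include all of their ancestors."""
    expanded = set()
    for term in terms:
        expanded |= ancestors(term, parents)
    return expanded


def cross_rules(transactions, min_support):
    """Mine single-term rules go_term -> ma_term from generalized transactions.

    Returns {(go_term, ma_term): (support, confidence)}.
    """
    n = len(transactions)
    pair_counts = Counter()
    go_counts = Counter()
    for go_terms, ma_terms in transactions:
        go_set = generalize(go_terms, GO_PARENTS)
        ma_set = generalize(ma_terms, MA_PARENTS)
        go_counts.update(go_set)
        pair_counts.update(product(go_set, ma_set))
    rules = {}
    for (go_term, ma_term), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            rules[(go_term, ma_term)] = (support, count / go_counts[go_term])
    return rules


# Each transaction pairs the GO and MA annotations of one hypothetical gene.
TRANSACTIONS = [
    ({"go:kinase_activity"}, {"ma:cortex"}),
    ({"go:kinase_activity"}, {"ma:cerebellum"}),
    ({"go:catalytic_activity"}, {"ma:cortex"}),
]
RULES = cross_rules(TRANSACTIONS, min_support=0.6)
```

In this toy data, neither leaf-level pair involving `go:kinase_activity` reaches the support threshold on its own, but the generalized pair (`go:kinase_activity`, `ma:brain`) does, which is the kind of multi-level relationship that mining only the raw annotations would miss.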
A model for digital preservation repository risk relationships
The paper introduces the Preserved Object and Repository Risk Ontology (PORRO), a model that relates preservation functionality with associated risks and opportunities for their mitigation. Building on work undertaken in a range of EU and UK funded research projects (including the Digital Curation Centre, DigitalPreservationEurope and DELOS), this ontology illustrates relationships between fundamental digital library goals and their parameters; associated rights and responsibilities; practical activities and resources involved in their accomplishment; and risks facing digital libraries and their collections. Its purpose is to facilitate a comprehensive understanding of risk causality and to illustrate opportunities for mitigation and avoidance.
The ontology reflects evidence accumulated from a series of institutional audits and evaluations, including a specific subset of digital libraries in the DELOS project which led to the definition of a digital library preservation risk profile. Its applicability is intended to be widespread, and its coverage expected to evolve to reflect developments within the community.
Attendees will gain an understanding of the model and learn how they can use this online resource to inform their own risk management activities.
PowerAqua: fishing the semantic web
The Semantic Web (SW) offers an opportunity to develop novel, sophisticated forms of question answering (QA). Specifically, the availability of distributed semantic markup on a large scale opens the way to QA systems which can make use of such semantic information to provide precise, formally derived answers to questions. At the same time, the distributed, heterogeneous, large-scale nature of the semantic information introduces significant challenges. In this paper we describe the design of PowerAqua, a QA system that exploits semantic markup on the web to answer questions posed in natural language. PowerAqua does not assume that the user has any prior information about the semantic resources. The system takes a natural language query as input and translates it into a set of logical queries, which are then answered by consulting and aggregating information derived from multiple heterogeneous semantic sources.
DOOR: towards a formalization of ontology relations
In this paper, we describe our ongoing effort to describe and formalize the semantic relations that link ontologies with each other on the Semantic Web, in order to create an ontology, DOOR, to represent, manipulate and reason upon these relations. DOOR is a Descriptive Ontology of Ontology Relations which intends to define relations such as inclusion, versioning, similarity and agreement using ontological primitives as well as rules. Here, we provide a detailed description of the methodology used to design the DOOR ontology, as well as an overview of its content. We also describe how DOOR is used in a complete framework (called KANNEL) for detecting and managing semantic relations between ontologies in large ontology repositories. Applied in the context of a large collection of automatically crawled ontologies, DOOR and KANNEL provide a starting point for analyzing the underlying structure of the network of ontologies that is the Semantic Web.
A platform for semantic web studies
The Semantic Web can be seen as a large, heterogeneous network of ontologies and semantic documents. Characterizing these ontologies, the way they relate and the way they are organized can help in better understanding how knowledge is produced and published online. It also provides new ways to explore and exploit this large collection of ontologies. In this paper, we present the foundation of a research platform for characterizing the Semantic Web, relying on the collection of ontologies and the functionalities provided by the Watson Semantic Web search engine. We focus specifically on formalizing and monitoring relationships between ontologies online, considering a variety of relations (similarity, versioning, agreement, modularity) and how they can help us obtain meaningful overviews of the current state of the Semantic Web.
Using a domain ontology for the semantic-statistical classification of specialist hypertexts
In this feasibility study we aim to contribute to the practical use of domain ontologies for hypertext classification by introducing an algorithm that generates potential keywords. The algorithm uses structural markup information and lemmatized word lists as well as a domain ontology on linguistics. We present the calculation and ranking of keyword candidates based on ontology relationships, word position, frequency information, and statistical significance as evidenced by log-likelihood tests. Finally, the results of our machine-driven classification are validated empirically against manually assigned keywords.
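The abstract above does not say which log-likelihood formulation is used, so the following Python sketch is only one plausible reading: the two-cell log-likelihood ratio (G²) commonly used in corpus linguistics to compare a word's frequency in a document against a reference corpus. The function names `log_likelihood` and `rank_keywords` and the word counts are hypothetical, and the ontology-relationship and word-position features mentioned in the abstract are not modelled.

```python
import math


def log_likelihood(a, b, c, d):
    """G2 for a word seen a times in a c-word document and b times in a d-word reference."""
    # Expected counts if the word were equally frequent in both texts.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def rank_keywords(doc_counts, ref_counts):
    """Rank words that are relatively more frequent in the document, highest G2 first."""
    c = sum(doc_counts.values())
    d = sum(ref_counts.values())
    scored = []
    for word, a in doc_counts.items():
        b = ref_counts.get(word, 0)
        # Keep only words overused in the document relative to the reference.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical lemma counts for one hypertext and a reference corpus.
DOC = {"morpheme": 10, "the": 50}
REF = {"morpheme": 1, "the": 5000, "other": 4999}
RANKED = rank_keywords(DOC, REF)
```

A word with the same relative frequency in both texts scores zero, so the ranking rewards terms that are distinctive for the document. In a pipeline like the one described, such scores would be one signal among several alongside ontology relationships and word position.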
Combining Homolog and Motif Similarity Data with Gene Ontology Relationships for Protein Function Prediction
Uncharacterized proteins pose a challenge not just to functional genomics, but also to biology in general. Knowledge of the biochemical functions of such proteins is critical for designing efficient therapeutic techniques. The bottleneck in hypothetical protein annotation is the difficulty of collecting and aggregating enough biological information about the protein itself. In this paper, we propose and evaluate a protein annotation technique that aggregates different biological information conserved across many hypothetical proteins. To enhance performance and increase prediction accuracy, we incorporate term-specific relationships based on Gene Ontology (GO). Our method combines PPI (protein-protein interaction) data, protein motif information, protein sequence similarity and protein homology data with a context similarity measure based on Gene Ontology to accurately infer functional information for unannotated proteins. We apply our method to Saccharomyces cerevisiae proteins. The aggregation of different sources of evidence with GO relationships increases the precision and accuracy of prediction compared to other methods reported in the literature. We predicted with a precision and accuracy of 100% for more than half of the proteins in the input set and with an overall 81.35% precision and 80.04% accuracy
- …