2,019 research outputs found
Modularization for the Cell Ontology
One of the premises of the OBO Foundry is that development of an orthogonal set of ontologies will increase domain expert contributions and logical interoperability, and decrease maintenance workload. For these reasons, the Cell Ontology (CL) is being re-engineered. This process requires the extraction of sub-modules from existing OBO ontologies, which presents a number of practical engineering challenges. These extracted modules may be intended to cover a narrow or a broad set of species. In addition, applications and resources that make use of the Cell Ontology have particular modularization requirements, such as the ability to extract custom subsets or unions of the Cell Ontology with other OBO ontologies. These extracted modules may be intended to cover a narrow or a broad set of species, which presents unique complications.

We discuss some of these requirements, and present our progress towards a customizable simple-to-use modularization tool that leverages existing OWL-based tools and opens up their use for the CL and other ontologies
Knowledge-based methods for automatic extraction of domain-specific ontologies
Semantic web technology aims at developing methodologies for representing large amount of knowledge in web accessible form. The semantics of knowledge should be easy to interpret and understand by computer programs, so that sharing and utilizing knowledge across the Web would be possible. Domain specific ontologies form the basis for knowledge representation in the semantic web. Research on automated development of ontologies from texts has become increasingly important because manual construction of ontologies is labor intensive and costly, and, at the same time, large amount of texts for individual domains is already available in electronic form. However, automatic extraction of domain specific ontologies is challenging due to the unstructured nature of texts and inherent semantic ambiguities in natural language. Moreover, the large size of texts to be processed renders full-fledged natural language processing methods infeasible. In this dissertation, we develop a set of knowledge-based techniques for automatic extraction of ontological components (concepts, taxonomic and non-taxonomic relations) from domain texts. The proposed methods combine information retrieval metrics, lexical knowledge-base(like WordNet), machine learning techniques, heuristics, and statistical approaches to meet the challenge of the task. These methods are domain-independent and automatic approaches. For extraction of concepts, the proposed WNSCA+{PE, POP} method utilizes the lexical knowledge base WordNet to improve precision and recall over the traditional information retrieval metrics. A WordNet-based approach, the compound term heuristic, and a supervised learning approach are developed for taxonomy extraction. We also developed a weighted word-sense disambiguation method for use with the WordNet-based approach. An unsupervised approach using log-likelihood ratios is proposed for extracting non-taxonomic relations. Further more, a supervised approach is investigated to learn the semantic constraints for identifying relations from prepositional phrases. The proposed methods are validated by experiments with the Electronic Voting and the Tender Offers, Mergers, and Acquisitions domain corpus. Experimental results and comparisons with some existing approaches clearly indicate the superiority of our methods. In summary, a good combination of information retrieval, lexical knowledge base, statistics and machine learning methods in this study has led to the techniques efficient and effective for extracting ontological components automatically
Recommended from our members
Ontology learning for Semantic Web Services
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 18/10/2010.The expansion of Semantic Web Services is restricted by traditional ontology engineering methods. Manual ontology development is time consuming, expensive and a resource exhaustive task. Consequently, it is important to support ontology engineers by
automating the ontology acquisition process to help deliver the Semantic Web vision.
Existing Web Services offer an affluent source of domain knowledge for ontology
engineers. Ontology learning can be seen as a plug-in in the Web Service ontology
development process, which can be used by ontology engineers to develop and maintain
an ontology that evolves with current Web Services. Supporting the domain engineer
with an automated tool whilst building an ontological domain model, serves the purpose
of reducing time and effort in acquiring the domain concepts and relations from Web
Service artefacts, whilst effectively speeding up the adoption of Semantic Web Services, thereby allowing current Web Services to accomplish their full potential. With that in mind, a Service Ontology Learning Framework (SOLF) is developed and
applied to a real set of Web Services. The research contributes a rigorous method that
effectively extracts domain concepts, and relations between these concepts, from Web
Services and automatically builds the domain ontology. The method applies pattern-based
information extraction techniques to automatically learn domain concepts and
relations between those concepts. The framework is automated via building a tool that implements the techniques. Applying the SOLF and the tool on different sets of services results in an automatically built domain ontology model that represents semantic knowledge in the underlying domain. The framework effectiveness, in extracting domain concepts and relations, is evaluated
by its appliance on varying sets of commercial Web Services including the financial domain. The standard evaluation metrics, precision and recall, are employed to determine both the accuracy and coverage of the learned ontology models. Both the
lexical and structural dimensions of the models are evaluated thoroughly. The evaluation results are encouraging, providing concrete outcomes in an area that is little researched
Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources
The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach
Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings
We consider the task of inferring is-a relationships from large text corpora.
For this purpose, we propose a new method combining hyperbolic embeddings and
Hearst patterns. This approach allows us to set appropriate constraints for
inferring concept hierarchies from distributional contexts while also being
able to predict missing is-a relationships and to correct wrong extractions.
Moreover -- and in contrast with other methods -- the hierarchical nature of
hyperbolic space allows us to learn highly efficient representations and to
improve the taxonomic consistency of the inferred hierarchies. Experimentally,
we show that our approach achieves state-of-the-art performance on several
commonly-used benchmarks
The problem of learning non-taxonomic relationships of ontologies from text
Manual construction of ontologies by domain experts and knowledge engineers is a costly task. Thus, automatic and/or semi-automatic approaches to their development are needed. Ontology Learning aims at identifying its constituent elements, such as non-taxonomic relationships, from textual information sources. This article presents a discussion of the problem of Learning Non-Taxonomic Relationships of Ontologies and defines its generic process. Four techniques representing the state of the art of Learning Non-Taxonomic Relationships of Ontologies are described and the solutions they provide are discussed along with their advantages and limitations
Comparing human and automatic thesaurus mapping approaches in the agricultural domain
Knowledge organization systems (KOS), like thesauri and other controlled
vocabularies, are used to provide subject access to information systems across
the web. Due to the heterogeneity of these systems, mapping between
vocabularies becomes crucial for retrieving relevant information. However,
mapping thesauri is a laborious task, and thus big efforts are being made to
automate the mapping process. This paper examines two mapping approaches
involving the agricultural thesaurus AGROVOC, one machine-created and one human
created. We are addressing the basic question "What are the pros and cons of
human and automatic mapping and how can they complement each other?" By
pointing out the difficulties in specific cases or groups of cases and grouping
the sample into simple and difficult types of mappings, we show the limitations
of current automatic methods and come up with some basic recommendations on
what approach to use when.Comment: 10 pages, Int'l Conf. on Dublin Core and Metadata Applications 200
- …