777 research outputs found
Analysis of the human diseasome reveals phenotype modules across common, genetic, and infectious diseases
Phenotypes are the observable characteristics of an organism arising from its
response to the environment. Phenotypes associated with engineered and natural
genetic variation are widely recorded using phenotype ontologies in model
organisms, as are signs and symptoms of human Mendelian diseases in databases
such as OMIM and Orphanet. Exploiting these resources, several computational
methods have been developed for integration and analysis of phenotype data to
identify the genetic etiology of diseases or suggest plausible interventions. A
similar resource would be highly useful not only for rare and Mendelian
diseases, but also for common, complex and infectious diseases. We apply a
semantic text-mining approach to identify the phenotypes (signs and symptoms)
associated with over 8,000 diseases. We demonstrate that our method generates
phenotypes that correctly identify known disease-associated genes in mice and
humans with high accuracy. Using a phenotypic similarity measure, we generate a
human disease network in which diseases that share signs and symptoms cluster
together, and we use this network to identify phenotypic disease modules
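
The network construction described in this abstract can be illustrated with a minimal sketch. It assumes each disease is annotated with a flat set of phenotype terms and uses Jaccard overlap as a stand-in for the phenotypic similarity measure; the abstract does not specify the measure, and the disease names, phenotype terms, and threshold below are hypothetical toy values, not data from the paper.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of phenotypes shared by two diseases (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def disease_network(
    phenotypes: dict[str, set[str]], threshold: float
) -> list[tuple[str, str, float]]:
    """Return an edge for every disease pair whose similarity meets the threshold."""
    edges = []
    for d1, d2 in combinations(sorted(phenotypes), 2):
        score = jaccard(phenotypes[d1], phenotypes[d2])
        if score >= threshold:
            edges.append((d1, d2, score))
    return edges


# Hypothetical annotations, for illustration only.
toy = {
    "disease_A": {"fever", "cough", "fatigue"},
    "disease_B": {"fever", "cough", "rash"},
    "disease_C": {"tremor", "rigidity"},
}
edges = disease_network(toy, threshold=0.4)
print(edges)
```

In this toy input only disease_A and disease_B share enough phenotypes to be linked. A semantic similarity measure over an ontology would also credit related but non-identical terms, which plain set overlap ignores.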
Deploying mutation impact text-mining software with the SADI Semantic Web Services framework
Background: Mutation impact extraction is an important task designed to harvest relevant annotations from scientific documents for reuse in multiple contexts. Our previous work on text mining for mutation impacts resulted in (i) the development of a GATE-based pipeline that mines texts for information about impacts of mutations on proteins, (ii) the population of this information into our OWL DL mutation impact ontology, and (iii) establishing an experimental semantic database for storing the results of text mining. Results: This article explores the possibility of using the SADI framework as a medium for publishing our mutation impact software and data. SADI is a set of conventions for creating web services with semantic descriptions that facilitate automatic discovery and orchestration. We describe a case study exploring and demonstrating the utility of the SADI approach in our context. We describe several SADI services we created based on our text mining API and data, and demonstrate how they can be used in a number of biologically meaningful scenarios through a SPARQL interface (SHARE) to SADI services. In all cases we pay special attention to the integration of mutation impact services with external SADI services providing information about related biological entities, such as proteins, pathways, and drugs. Conclusion: We have identified that SADI provides an effective way of exposing our mutation impact data suc
Design and implementation of a filter engine for semantic web documents
This report describes our project, which addresses the challenge of changes in the semantic web. Some studies have already been done for the so-called adaptive semantic web, such as applying inference rules. In this study, we apply Event Notification System (ENS) technology. Treating changes as events, we developed a notification system for such events
An E-Learning Semantic Grid for Life science Education
Many life science databases and services are now available on the Internet, especially in life science e-science. In this paper, we present an E-Learning Semantic Grid that integrates these resources, provided by both teachers and scientists, for life science education. It uses domain ontologies to integrate these heterogeneous life science database and service resources, and supports ontology-based e-learning data sharing and service coordination for life science teachers and students in an e-learning virtual organization. Our system provides life science students with a semantically richer experience in learning activities, and also extends the function of life science e-science. It has a promising future in the domain of life science education
Ontology Alignment using Biologically-inspired Optimisation Algorithms
We investigate how biologically-inspired optimisation methods can be used to compute alignments between ontologies. Independent of particular similarity metrics, the developed techniques demonstrate anytime behaviour and high scalability. Due to the inherent parallelisability of these population-based algorithms, it is possible to exploit dynamically scalable cloud infrastructures, a step towards the provisioning of Alignment-as-a-Service solutions for future semantic applications
The consistent representation of scientific knowledge : investigations into the ontology of karyotypes and mitochondria
PhD Thesis
Ontologies are widely used in life sciences to model scientific knowledge. The engineering
of these ontologies is well-studied and there are a variety of methodologies
and techniques, some of which have been re-purposed from software engineering
methodologies and techniques. However, due to the complex nature of bio-ontologies,
they are not resistant to errors and mistakes. This is especially true for more expressive
and/or larger ontologies.
In order to improve on this issue, we explore a variety of software engineering techniques
that were re-purposed in order to aid ontology engineering. This exploration
is driven by the construction of two light-weight ontologies, The Mitochondrial Disease
Ontology and The Karyotype Ontology. These ontologies have specific and
useful computational goals, as well as providing exemplars for our methodology.
This thesis discusses the modelling decisions undertaken as well as the overall success
of each ontological model. Due to the added knowledge capture steps required
for the mitochondrial knowledge, The Karyotype Ontology is further developed than
The Mitochondrial Disease Ontology.
Specifically, this thesis explores the use of a pattern-driven and programmatic approach
to bio-medical ontology engineering. During the engineering of our biomedical
ontologies, we found many of the components of each model were similar
in logical and textual definitions. This was especially true for The Karyotype Ontology.
In software engineering a common technique to avoid replication is to abstract
through the use of patterns. Therefore we utilised localised patterns to model
these highly repetitive models. There are a variety of possible tools for the encoding
of these patterns, but we found ontology development using Graphical User
Interface (GUI) tools to be time-consuming due to the necessity of manual GUI
interaction when the ontology needed updating. With the development of Tawny-OWL,
a programmatic tool for ontology construction, we are able to overcome this
issue, with the added benefit of using a single syntax to express both simple and
patternised parts of the ontology.
Lastly, we briefly discuss how other methodologies and tools from software engineering,
namely unit tests, diffing, version control and Continuous Integration (CI) were
re-purposed and how they aided the engineering of our two domain ontologies.
Together, this knowledge increases our understanding in ontology engineering techniques.
By re-purposing software engineering methodologies, we have aided construction,
quality and maintainability of two novel ontologies, and have demonstrated
their applicability more generally
Towards linked open gene mutations data
Background: With the advent of high-throughput technologies, a great wealth of variation data is being produced. Such information may constitute the basis for correlation analyses between genotypes and phenotypes and, in the future, for personalized medicine. Several databases on gene variation exist, but this kind of information is still scarce in the Semantic Web framework. In this paper, we discuss issues related to the integration of mutation data in the Linked Open Data infrastructure, part of the Semantic Web framework. We present the development of a mapping from the IARC TP53 Mutation database to RDF and the implementation of servers publishing this data. Methods: A version of the IARC TP53 Mutation database implemented in a relational database was used as a first test set. Automatic mappings to RDF were first created by using D2RQ and later manually refined by introducing concepts and properties from domain vocabularies and ontologies, as well as links to Linked Open Data implementations of various systems of biomedical interest. Since D2RQ query performance is lower than that achievable with an RDF archive, the generated data was also loaded into a dedicated system based on tools from the Jena software suite. Results: We have implemented a D2RQ Server for TP53 mutation data, providing data on a subset of the IARC database, including gene variations, somatic mutations, and bibliographic references. The server allows browsing of the RDF graph by using links both between classes and to external systems. An alternative interface offers improved performance for SPARQL queries. The resulting data can be explored by using any Semantic Web browser or application. Conclusions: This has been the first case of a mutation database exposed as Linked Data. A revised version of our prototype, including further concepts and IARC TP53 Mutation database data sets, is under development. The publication of variation information as Linked Data opens new perspectives: the exploitation of SPARQL searches on mutation data and other biological databases may support data retrieval which is presently not possible. Moreover, reasoning on integrated variation data may support discoveries towards personalized medicine.
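
The relational-to-RDF step described in this abstract can be pictured with a small hand-rolled sketch. It is not D2RQ and does not reproduce the authors' mapping: the `http://example.org/tp53/` namespace, the table name, and the column names are all hypothetical, and the code only shows the general idea of turning one table row into N-Triples statements.

```python
from typing import Optional

# Hypothetical namespace; the real server's URIs are not given in the abstract.
BASE = "http://example.org/tp53/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslash, quote and newlines."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def row_to_ntriples(table: str, key: str, row: dict[str, Optional[str]]) -> list[str]:
    """Map one relational row to N-Triples: one subject URI, one triple per column."""
    subject = f"<{BASE}{table}/{row[key]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}vocab/{table}> ."]
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is already in the subject URI; NULLs yield no triple
        triples.append(f"{subject} <{BASE}vocab/{column}> {literal(str(value))} .")
    return triples


# Hypothetical row, for illustration only.
row = {"mutation_id": "42", "codon": "175", "effect": "missense"}
triples = row_to_ntriples("mutation", "mutation_id", row)
print("\n".join(triples))
```

The manual refinement the abstract mentions would correspond to replacing the generated `vocab/` predicates with terms from established domain vocabularies and emitting URI objects that link to external Linked Open Data resources instead of plain literals.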