210 research outputs found

A cooperative framework for molecular biology database integration using image object selection

Author: Khan N.
Publication date: 01/01/2004

The concept of 'Molecular Biology Database Integration' and the problems associated with it motivated this Ph.D. research. Available technologies make it possible to analyse data independently and discretely, but they fail to integrate the data resources to yield more meaningful information. This, together with the integration issues, defined the scope of the research. The research reviews 'database interoperability' problems and suggests a framework for integrating molecular biology databases. The framework proposes a cooperative environment in which molecular biology databases share information on the basis of a common purpose. The research also reviews other implementation and interoperability issues for laboratory-based, dedicated and target-specific databases. It addresses the following issues: the diversity of molecular biology database schemas, schema constructs and schema implementation; multi-database query using image object keying; database integration technologies using a context graph; and automated navigation among these databases.
The thesis introduces a new approach to database implementation: an interoperable component database concept for initiating multidatabase queries on gene mutation data. A number of data models are proposed for gene mutation data, and these form the basis for integrating the target-specific component database with the federated information system. The proposed data models cover genetic trait analysis, classification of gene mutation data, pathological lesion data and laboratory data. The main feature of this component database is its non-overlapping attributes, and it will follow the non-redundant integration approach explained in the thesis. This will be achieved by storing attributes that have no union or intersection with any attributes that exist in public domain molecular biology databases. Unlike the data warehousing technique, this feature is novel. The component database will be integrated with other biological data sources to share information in a cooperative environment. This involves developing new tools. The thesis explains the role of these tools: a metadata extractor, a mapping linker, a query generator and a result interpreter. They provide transparent integration without creating any global schema of the participating databases.
The thesis also establishes the concept of image object keying for multidatabase query and proposes an algorithm for matching protein spots in gel electrophoresis images. A spot in a gel electrophoresis image initiates the query when the user selects it; the selected spot is then matched against similar spots in other resource databases. This image object keying method is an alternative to conventional multidatabase query, which requires writing complex SQL scripts. The method also resolves the semantic conflicts that exist among molecular biology databases.
The research proposes a new framework, based on the context of web data, for interacting with different biological data resources. The thesis gives a formal description of the resource context. Implementing the context in the Resource Description Framework (RDF) can increase interoperability by providing a description of the resources and a navigation plan for accessing the web-based databases. A higher-level construct (has, provide and access) is developed to implement the context in RDF for web interactions. Interactions among the resources are achieved through an integration domain that extracts the required information with a single instance and without writing any query scripts. The integration domain makes it possible to navigate and to execute the query plan within the resource databases. An extractor module collects elements from different target web resources and unifies them as a whole object in a single page. The proposed framework is tested by finding specific information, e.g. information on Alzheimer's disease, in public domain biology resources such as the Protein Data Bank, the Genome Data Bank and Online Mendelian Inheritance in Man, and in a local database. Finally, the thesis sets out further propositions and plans for future work.

Middlesex University Research Repository

The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

Author: Aerts Jan
Afzal Hammad
Antezana Erick
Arakawa Kazuharu
Aranda Bruno
Asai Kiyoshi
Belleau Francois
Bolleman Jerven
Bonnal Raoul Jp
Chapman Brad
Chun Hong-Woo
Cock Peter Ja
Eriksson Tore
Gordon Paul Mk
Goto Naohisa
Hayashi Kazuhiro
Horn Heiko
Ishiwata Ryosuke
Kaminuma Eli
Kasprzyk Arek
Katayama Toshiaki
Kawaji Hideya
Kawamoto Shoko
Kawashima Shuichi
Kido Nobuhiro
Kim Young Joo
Kinjo Akira R
Konishi Fumikazu
Kwon Kyung-Hoon
Labarga Alberto
Lamprecht Anna-Lena
Lin Yu
Lindenbaum Pierre
McCarthy Luke
Micklem Gos
Morita Hideyuki
Murakami Katsuhiko
Nagao Koji
Nakao Mitsuteru
Nishida Kozo
Nishimura Kunihiro
Nishizawa Tatsuya
Ogishima Soichi
Okamoto Shinobu
Okubo Kosaku
Ono Keiichiro
Oouchida Kenta
Oshita Kazuki
Park Keun-Joon
Prins Pjotr
Saito Taro L
Samwald Matthias
Satagopam Venkata P
Shigemoto Yasumasa
Smith Richard
Splendiani Andrea
Sugawara Hideaki
Takagi Toshihisa
Taylor James
Vos Rutger A
Wilkinson Mark D
Withers David
Yamaguchi Atsuko
Yamamoto Yasunori
Yamasaki Chisato
Zmasek Christian M
Publication venue: J Biomed Semantics
Publication date: 01/01/2013

BACKGROUND: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Crossref

Springer - Publisher Connector

PubMed Central

Copenhagen University Research Information System

eScholarship - University of California

Apollo (Cambridge)

CDAOStore: A Phylogenetic Repository Using Logic Programming and Web Services

Author: Chisham Brandon
Pontelli Enrico
Son Tran Cao
Wright Ben
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Technical Communications of the 27th International Conference on Logic Programming (ICLP'11)
Publication date: 01/01/2011

The CDAOStore is a portal aimed at facilitating the storage and retrieval of data and metadata associated with studies in the field of evolutionary biology and phylogenetic analysis. The novelty of CDAOStore lies in the use of a semantic-based approach to the storage and querying of data. This enables CDAOStore to overcome the data format restrictions and complexities of other repositories (e.g., TreeBase) and to provide a domain-specific query interface, derived from studies of querying requirements for phylogenetic databases. CDAOStore represents the first full implementation of the EvoIO stack, an inter-operation stack composed of a formal ontology (the Comparative Data Analysis Ontology), an XML exchange format (NeXML), and a web services API (PhyloWS). CDAOStore has been implemented on top of an RDF triple store, using a combination of standard web technologies and logic programming technology. In particular, we employed Prolog to support some of the format transformation tasks and, more importantly, to implement several of the domain-specific queries, whose structure is beyond the reach of standard RDF query languages (e.g., SPARQL). CDAOStore is operational and already hosts over 90 million RDF triples, imported from TreeBase or submitted by other domain scientists.

Dagstuhl Research Online Publication Server

Infectious Disease Ontology

Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain

PhilPapers

CiteSeerX

Crossref

Selecting biomedical data sources according to user preferences

Author: Barillot Emmanuel
Cohen-Boulakia Sarah
Froidevaux Christine
Graziani Stephane
Lair Severine
Radvanyi Francois
Stransky Nicolas
Publication venue: ScholarlyCommons
Publication date: 01/07/2005

Motivation: Biologists are now faced with the problem of integrating information from multiple heterogeneous public sources with their own experimental data contained in individual sources. The selection of the sources to be considered is thus critically important. Results: Our aim is to support biologists by developing a module based on an algorithm that presents a selection of sources relevant to their query and matched to their own preferences. We approached this task by investigating the characteristics of biomedical data and introducing several preference criteria useful for bioinformaticians. This work was carried out in the framework of a project which aims to develop an integrative platform for the multiple parametric analysis of cancer. We illustrate our study through an elementary biomedical query occurring in a CGH analysis scenario.

ScholarlyCommons@Penn

SEBIO: A Semantic BioInformatics Platform for the New E-Science

Author: Crespo Ángel García
Gómez Juan Miguel
Han Sung Kook
Lorenzo Damaris Fuentes
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 27/01/2012

Knowledge integration and exchange of data within and among organizations is a universally recognized need in bioinformatics and genomics research in the e-science field. The main obstacle to integration is that the current Web is an environment primarily developed for human users, and micro-array data resources lack widely accepted standards; this leads to tremendous data heterogeneity. Using semantic technologies as a key technology for the interoperation of various datasets enables knowledge integration of the vast amount of biological and biomedical data. In this paper, we aim to provide a semantically-enhanced bioinformatics platform (SEBIO) that handles these issues effectively. We describe the problems that have arisen and the solutions applied so far. To that end, the SEBIO approach is presented and its main components explained, to show in more detail how it copes with the aforementioned difficulties.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Bioconductor: open software development for computational biology and bioinformatics.

Author: Bates Douglas
Bolstad Ben
Carey Vincent
Dettling Marcel
Dudoit Sandrine
Ellis Byron
Gautier Laurent
Ge Yongchao
Gentleman Robert
Gentry Jeff
Hornik Kurt
Hothorn Torsten
Huber Wolfgang
Iacus Stefano
Irizarry Rafael
Leisch Friedrich
Li Cheng
Maechler Martin
Rossini Anthony
Sawitzki Gunther
Smith Colin
Smyth Gordon
Tierney Luke
Yang Jean
Zhang Jianhua
Publication venue: eScholarship, University of California
Publication date: 01/01/2004

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples.

Repository for Publications and Research Data

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

ZHAW digitalcollection

Collection Of Biostatistics Research Archive

Online Research Database In Technology

University of Melbourne Institutional Repository