88,851 research outputs found

PathSys: integrating molecular interaction graphs for systems biology

Author: Baitaluk Michael
Godbole Shubhada
Gupta Amarnath
Qian Xufei
Raval Alpan
Ray Animesh
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The goal of information integration in systems biology is to combine information from a number of databases and data sets, which are obtained from both high and low throughput experiments, under one data management scheme such that the cumulative information provides greater biological insight than is possible with individual information sources considered separately. RESULTS: Here we present PathSys, a graph-based system for creating a combined database of interaction networks that generates an integrated view of biological mechanisms. We used PathSys to integrate over 14 curated and publicly contributed data sources for the budding yeast (S. cerevisiae) and Gene Ontology. A number of exploratory questions were formulated as a combination of relational and graph-based queries to the integrated database. Thus, PathSys is a general-purpose, scalable, graph-data warehouse of biological information, complete with a graph manipulation and query language, a storage mechanism and a generic data-importing mechanism through schema-mapping. CONCLUSION: Results from several test studies demonstrate the effectiveness of the approach in retrieving biologically interesting relations between genes and proteins and the networks connecting them, and the utility of PathSys as a scalable graph-based warehouse for interaction-network integration and as a hypothesis generator system. The PathSys client software, named BiologicalNetworks, developed for navigation and analyses of molecular networks, is available as a Java Web Start application at

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: 'SAGE Publications'
Publication date: 01/07/2003

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

Crossref

White Rose Research Online

Representing and analysing molecular and cellular function in the computer

Author: Eldridge M
Gilbert D
Helden JV
Mancuso R
Naim A
Wernisch L
Wodak SJ
Publication venue: 'American Society for Biochemistry & Molecular Biology (ASBMB)'
Publication date: 01/01/2000

Determining the biological function of a myriad of genes, and understanding how they interact to yield a living cell, is the major challenge of the post genome-sequencing era. The complexity of biological systems is such that this cannot be envisaged without the help of powerful computer systems capable of representing and analysing the intricate networks of physical and functional interactions between the different cellular components. In this review we try to provide the reader with an appreciation of where we stand in this regard. We discuss some of the inherent problems in describing the different facets of biological function, give an overview of how information on function is currently represented in the major biological databases, and describe different systems for organising and categorising the functions of gene products. In a second part, we present a new general data model, currently under development, which describes information on molecular function and cellular processes in a rigorous manner. The model is capable of representing a large variety of biochemical processes, including metabolic pathways, regulation of gene expression and signal transduction. It also incorporates taxonomies for categorising molecular entities, interactions and processes, and it offers means of viewing the information at different levels of resolution, and dealing with incomplete knowledge. The data model has been implemented in the database on protein function and cellular processes 'aMAZE' (http://www.ebi.ac.uk/research/pfbp/), which presently covers metabolic pathways and their regulation. Several tools for querying, displaying, and performing analyses on such pathways are briefly described in order to illustrate the practical applications enabled by the model

HAL AMU

DI-fusion

Brunel University Research Archive

Towards a Taxonomically Intelligent Phylogenetic Database

Author: Roderic Page
Publication date: 18/09/2007

This note outlines some of the key intellectual obstacles that stand in the way of creating a usable phylogenetic database. These challenges include the need to accommodate multiple taxonomic names and classifications, and the need for tools to query trees in biologically meaningful ways. Until these problems are addressed, and a taxonomically intelligent phylogenetic database created, much of our phylogenetic knowledge will languish in the pages of journals

Crossref

Nature Precedings

Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

Author: Bastien Olivier
Birkholtz Lyn-Marie
Breton Vincent
Grando Delphine
Hofmann-Apitius Martin
Jacq Nicolas
Joubert Fourie
Kasam Vinod
Louw Abraham I
Maréchal Eric
Ortet Philippe
Roy Sylvaine
Saïdani Nadia
Wells Gordon
Zimmermann Marc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journa

Hal - Université Grenoble Alpes

HAL AMU

Fraunhofer-ePrints

HAL Clermont Université

HAL Descartes

HAL-CEA

ProdInra

arXiv.org e-Print Archive

HAL-IN2P3

Springer - Publisher Connector

PubMed Central

UPSpace at the University of Pretoria

ProtNN: Fast and Accurate Nearest Neighbor Protein Function Prediction based on Graph Embedding in Structural and Topological Space

Author: Dhifli Wajdi
Diallo Abdoulaye Baniré
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/01/2016

Studying the function of proteins is important for understanding the molecular mechanisms of life. The number of publicly available protein structures has grown extremely large. Still, determining the function of a protein structure remains a difficult, costly, and time-consuming task. The difficulties are often due to the essential role of spatial and topological structures in the determination of protein functions in living cells. In this paper, we propose ProtNN, a novel approach for protein function prediction. Given an unannotated protein structure and a set of annotated proteins, ProtNN finds the nearest neighbor annotated structures based on protein-graph pairwise similarities. Given a query protein, ProtNN finds the nearest neighbor reference proteins based on a graph representation model and a pairwise similarity between vector embeddings of both query and reference protein-graphs in structural and topological spaces. ProtNN assigns to the query protein the function with the highest number of votes across the set of k nearest neighbor reference proteins, where k is a user-defined parameter. Experimental evaluation demonstrates that ProtNN is able to accurately classify several datasets with an extremely fast runtime compared to state-of-the-art approaches. We further show that ProtNN is able to scale up to a whole PDB dataset in a single-process mode with no parallelization, with a runtime gain of thousands of orders of magnitude compared to state-of-the-art approaches

arXiv.org e-Print Archive

Springer - Publisher Connector
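
The voting step described in the ProtNN abstract above (assign the query the function with the most votes among its k nearest reference proteins) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional embeddings, the function labels, the Euclidean distance, and the name `knn_predict` are all assumptions made for illustration, and the abstract does not specify how ProtNN builds its graph embeddings or measures similarity.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(query, references, k=3):
    """Return the majority label among the k references closest to `query`.

    `references` is a list of (embedding, label) pairs. The embeddings stand in
    for vector embeddings of protein graphs; here they are toy values.
    """
    ranked = sorted(references, key=lambda ref: euclidean(query, ref[0]))
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, most_common keeps first-seen order, so the nearer label wins.
    return votes.most_common(1)[0][0]


# Hypothetical annotated reference set: (embedding, function label).
references = [
    ((0.0, 0.0), "hydrolase"),
    ((0.1, 0.2), "hydrolase"),
    ((1.0, 1.0), "kinase"),
    ((0.9, 1.1), "kinase"),
    ((1.2, 0.8), "kinase"),
]

if __name__ == "__main__":
    print(knn_predict((0.05, 0.1), references, k=3))
```

With k = 3, the query at (0.05, 0.1) has two "hydrolase" neighbors and one "kinase" neighbor, so the vote goes to "hydrolase". Sorting all references is linear-logarithmic in the reference set size; a system intended to scale to a whole structure database would presumably use a faster neighbor search, which this sketch does not attempt.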

IMP Science Gateway: from the Portal to the Hub of Virtual Experimental Labs in Materials Science

Author: Baskova Olexandra
Bekenev Lev
Gatsenko Olexander
Gordienko Yuri
Stirenko Sergii
Zasimchuk Elena
Publication date: 22/04/2014

The "science gateway" (SG) ideology means a user-friendly, intuitive interface between scientists (or scientific communities) and different software components plus various distributed computing infrastructures (DCIs) (like grids, clouds, clusters), where researchers can focus on their scientific goals and less on the peculiarities of the software or DCI. The "IMP Science Gateway Portal" (http://scigate.imp.kiev.ua) for complex workflow management and integration of distributed computing resources (like clusters, service grids, desktop grids, clouds) is presented. It is built on WS-PGRADE and gUSE technologies, where WS-PGRADE is designed for scientific workflow operation and gUSE for smooth integration of available resources for parallel and distributed computing in various heterogeneous DCIs. Typical scientific workflows, with possible scenarios for their preparation and usage, are presented. Several typical use cases for these science applications (scientific workflows) are considered for molecular dynamics (MD) simulations of the complex behavior of various nanostructures (nanoindentation of graphene layers, defect system relaxation in metal nanocrystals, thermal stability of boron nitride nanotubes, etc.). The user experience is analyzed in the context of its practical applications for MD simulations in materials science, physics and nanotechnologies with available heterogeneous DCIs.
In conclusion, the "science gateway" approach, which combines a workflow manager (like WS-PGRADE) with a DCI resources manager (like gUSE), makes it possible to use the SG portal (like the "IMP Science Gateway Portal") in a very promising way, namely, as a hub of various virtual experimental labs (different software components plus various resource requirements) in the context of its practical MD applications in materials science, physics, chemistry, biology, and nanotechnologies. Comment: 6 pages, 5 figures, 3 tables; 6th International Workshop on Science Gateways, IWSG-2014 (Dublin, Ireland, 3-5 June, 2014). arXiv admin note: substantial text overlap with arXiv:1404.545

arXiv.org e-Print Archive

Crossref

Bioconductor: open software development for computational biology and bioinformatics.

Author: Bates Douglas
Bolstad Ben
Carey Vincent
Dettling Marcel
Dudoit Sandrine
Ellis Byron
Gautier Laurent
Ge Yongchao
Gentleman Robert
Gentry Jeff
Hornik Kurt
Hothorn Torsten
Huber Wolfgang
Iacus Stefano
Irizarry Rafael
Leisch Friedrich
Li Cheng
Maechler Martin
Rossini Anthony
Sawitzki Gunther
Smith Colin
Smyth Gordon
Tierney Luke
Yang Jean
Zhang Jianhua
Publication venue: eScholarship, University of California
Publication date: 01/01/2004

The Bioconductor project is an initiative for the collaborative creation of extensible software for computational biology and bioinformatics. The goals of the project include: fostering collaborative development and widespread use of innovative software, reducing barriers to entry into interdisciplinary scientific research, and promoting the achievement of remote reproducibility of research results. We describe details of our aims and methods, identify current challenges, compare Bioconductor to other open bioinformatics projects, and provide working examples

Repository for Publications and Research Data

AIR Universita degli studi di Milano

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

ZHAW digitalcollection

Collection Of Biostatistics Research Archive

Online Research Database In Technology

University of Melbourne Institutional Repository