22 research outputs found

    Human metabolic atlas: an online resource for human metabolism

    Human tissue-specific genome-scale metabolic models (GEMs) provide a comprehensive understanding of human metabolism, which is of great value to the biomedical research community. To make this kind of data easily accessible to the public, we have designed and deployed the Human Metabolic Atlas (HMA) website (http://www.metabolicatlas.org). This online resource provides comprehensive information about human metabolism, including the results of metabolic network analyses. We hope that it can also serve as an information exchange interface for human metabolism knowledge within the research community. The HMA consists of three major components: Repository, Hreed (Human REaction Entities Database) and Atlas. Repository is a collection of GEMs for specific human cell types and human-related microorganisms in SBML (Systems Biology Markup Language) format. The current release consists of several types of GEMs: a generic human GEM, 82 GEMs for normal cell types, 16 GEMs for different cancer cell types, 2 curated GEMs and 5 GEMs for human gut bacteria. Hreed contains detailed information about biochemical reactions. A web interface for Hreed facilitates access to the reaction data, which can be retrieved using specific keywords or the names of related genes, proteins, compounds and cross-references. The Atlas web interface can be used to visualize the GEM collection overlaid on KEGG metabolic pathway maps with a zoom/pan user interface. The HMA is a unique tool for studying human metabolism, ranging in scope from an individual cell, to a specific organ, to the overall human body. This resource is freely available under a Creative Commons Attribution-NonCommercial 4.0 International License.

    Robust and consistent biomarker candidates identification by a machine learning approach applied to pancreatic ductal adenocarcinoma metastasis.

    Machine Learning (ML) plays a crucial role in biomedical research. Nevertheless, it still faces limitations in data integration and reproducibility. To address these challenges, robust methods are needed. Pancreatic ductal adenocarcinoma (PDAC), a highly aggressive cancer with low early detection and survival rates, is used as a case study. PDAC lacks reliable diagnostic biomarkers, especially metastatic biomarkers, which remains an unmet need. In this study, we propose an ML-based approach for discovering disease biomarkers, apply it to the identification of a PDAC metastatic composite biomarker candidate, and demonstrate the advantages of harnessing data resources. We utilised primary tumour RNAseq data from five public repositories, pooling samples to maximise statistical power and integrating data by correcting for technical variance. Data were split into train and validation sets. The train dataset underwent variable selection via a 10-fold cross-validation process that combined three algorithms in 100 models per fold. Genes found in at least 80% of models and five folds were considered robust and were used to build a consensus multivariate model. A random forest model was constructed using the selected genes from the train dataset and tested in the validation set. We also assessed the goodness of prediction by recalibrating a model using only the validation data. The biological context and relevance of the signals were explored through enrichment and pathway analyses using QIAGEN Ingenuity Pathway Analysis and GeneMANIA. We developed a pipeline that can detect robust signatures to build composite biomarkers. We tested the pipeline in PDAC, exploiting transcriptomics data from different sources, and propose a composite biomarker candidate comprising fifteen consistently selected genes that showed very promising predictive capability. Biological contextualisation revealed links with cancer progression and metastasis, underscoring their potential relevance.
All code is available on GitHub. This study establishes a robust framework for identifying composite biomarkers across various disease contexts. We demonstrate its potential by proposing a plausible composite biomarker candidate for PDAC metastasis. By reusing data from public repositories, we highlight the sustainability of our research and the wider applications of our pipeline. The preliminary findings shed light on a promising validation and application path.
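The consensus rule described in this abstract (genes found in at least 80% of models and in five folds) can be illustrated with a minimal sketch. This is not the authors' code: the function name `consensus_genes`, the input layout (a list of folds, each a list of per-model gene sets) and the reading of the rule (a gene counts for a fold when it appears in at least 80% of that fold's models, and is kept when it counts for at least five folds) are all assumptions made for illustration.

```python
from collections import Counter


def consensus_genes(folds, model_frac=0.8, min_folds=5):
    """Return genes that are stable across models and folds.

    folds: list of folds; each fold is a list of models; each model is
    a set of gene names selected by that model (hypothetical layout).
    """
    fold_hits = Counter()
    for fold in folds:
        # Count in how many models of this fold each gene was selected.
        counts = Counter(gene for model in fold for gene in set(model))
        for gene, n in counts.items():
            if n / len(fold) >= model_frac:
                fold_hits[gene] += 1
    # Keep genes that passed the per-fold threshold in enough folds.
    return sorted(g for g, n in fold_hits.items() if n >= min_folds)


def toy_fold(b_models, n_models=100):
    """Build a toy fold: A in every model, B in b_models, C in 79 models."""
    fold = []
    for i in range(n_models):
        genes = {"A"}
        if i < b_models:
            genes.add("B")
        if i < 79:
            genes.add("C")
        fold.append(genes)
    return fold


# Ten toy folds: B reaches 80% of models in only five of them.
folds = [toy_fold(80) for _ in range(5)] + [toy_fold(0) for _ in range(5)]
robust = consensus_genes(folds)
print(robust)
```

In this toy data, gene C is selected in 79% of models in every fold and so never passes the per-fold threshold, while gene B passes it in exactly five folds and is retained.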

    A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae

    RNA-seq has recently become an attractive method of choice in the study of transcriptomes, promising several advantages compared with microarrays. In this study, we sought to assess the contribution of the different analytical steps involved in the analysis of RNA-seq data generated with the Illumina platform, and to perform a cross-platform comparison based on the results obtained with Affymetrix microarrays. As a case study for our work, we used the Saccharomyces cerevisiae strain CEN.PK 113-7D, grown under two different conditions (batch and chemostat). Here, we assess the influence of genetic variation on the estimation of gene expression levels using three different aligners for read mapping (Gsnap, Stampy and TopHat) against the S288c genome, evaluate the capabilities of five different statistical methods to detect differential gene expression (baySeq, Cuffdiff, DESeq, edgeR and NOISeq), and explore the consistency between RNA-seq analysis using a reference genome and a de novo assembly approach. High reproducibility among biological replicates (correlation >= 0.99) and high consistency between the two platforms for analysis of gene expression levels (correlation >= 0.91) are reported. The results from differential gene expression identification derived from the different statistical methods, as well as their integrated analysis results based on gene ontology annotation, are in good agreement. Overall, our study provides a useful and comprehensive comparison between the two platforms (RNA-seq and microarrays) for gene expression analysis and addresses the contribution of the different steps involved in the analysis of RNA-seq data.

    Global copy number profiling of cancer genomes

    Summary: In this article, we introduce a robust and efficient strategy for deriving global and allele-specific copy number alterations (CNA) from cancer whole exome sequencing data based on Log R ratios and B-allele frequencies. Applying the approach to the analysis of over 200 skin cancer samples, we demonstrate its utility for discovering distinct CNA events and for deriving ancillary information such as tumor purity.
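The summary does not define its two input quantities, so the sketch below uses the commonly cited definitions as an assumption: the Log R ratio as the log2 ratio of library-size-normalised tumour to normal read depth, and the B-allele frequency as the fraction of reads supporting the non-reference allele at a heterozygous site. The function names and the normalisation by total read count are illustrative choices, not the article's method.

```python
import math


def log_r_ratio(tumor_depth, normal_depth, tumor_total, normal_total):
    """log2 ratio of tumour to normal depth, each scaled by library size."""
    tumor_norm = tumor_depth / tumor_total
    normal_norm = normal_depth / normal_total
    return math.log2(tumor_norm / normal_norm)


def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a site that support the non-reference allele."""
    return alt_reads / (ref_reads + alt_reads)


# Toy values: 1.5x relative coverage and a 1:2 allelic imbalance, which is
# what a single-copy gain would look like in a hypothetical pure tumour.
lrr = log_r_ratio(tumor_depth=150, normal_depth=100,
                  tumor_total=1_000_000, normal_total=1_000_000)
baf = b_allele_frequency(ref_reads=60, alt_reads=30)
print(round(lrr, 3), round(baf, 3))
```

Under these definitions, a balanced diploid region gives a Log R ratio near 0 and a B-allele frequency near 0.5 at heterozygous sites; deviations in both together are what allow allele-specific events to be distinguished.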

    Reconstruction of Genome-Scale Active Metabolic Networks for 69 Human Cell Types and 16 Cancer Types Using INIT

    The development of high-throughput analytical methods has given physicians potential access to extensive and patient-specific data sets, such as gene sequences, gene expression profiles or metabolite footprints. This opens the way for a new approach in health care, which is both personalized and based on system-level analysis. Genome-scale metabolic networks provide a mechanistic description of the relationships between different genes, which is valuable for the analysis and interpretation of large experimental data sets. Here we describe the generation of genome-scale active metabolic networks for 69 different cell types and 16 cancer types using the INIT (Integrative Network Inference for Tissues) algorithm. The INIT algorithm uses cell type specific information about protein abundances contained in the Human Proteome Atlas as the main source of evidence. The generated models constitute the first step towards establishing a Human Metabolic Atlas, which will be a comprehensive description (accessible online) of the metabolism of different human cell types, and will allow for tissue-level and organism-level simulations in order to achieve a better understanding of complex diseases. A comparative analysis between the active metabolic networks of cancer types and healthy cell types allowed for identification of cancer-specific metabolic features that constitute generic potential drug targets for cancer treatment.

    Database and Visualization for Advanced Systems Biology

    In the information age, there is plenty of information available publicly in the field of biology. Utilization of biological data is still slow and inefficient compared to the amount of data generated. This problem arises from the specific characteristics of biological data, which are complex, dynamic and variable. With the introduction of high-throughput technologies, the gap between data creation and utilization has become wider. This issue is critical and poses a challenge in the field of systems biology, where data from several sources are needed for model construction and analysis. In order to build a data ecosystem to support human tissue specific genome reconstruction and further analysis, a collection of libraries, applications and a web site has been developed. A dedicated database management system was designed specifically for metabolic and related data to support human tissue specific genome scale metabolic model reconstruction, providing data standardization and data integration. Two database APIs, Corgi and Dactyls, were developed following the object-oriented data model to fulfill the database management system’s functions. This database management system was used to manage, provide and exchange information, particularly concerning human metabolism. Furthermore, a visualization system, Ondine, was developed that allows overlaying of data and information on metabolic pathway maps with a zoom/pan user interface. In order to efficiently deploy human tissue specific metabolic information from a collection of genome-scale metabolic models (GEMs), the Human Metabolic Atlas (HMA) website was created as an online resource to provide comprehensive human metabolic information as models and as a database for further specific analysis. In addition, the Atlas also serves as a tool for communicating with the wider research community.
The Atlas, providing a visualization of the metabolic map implemented on the Ondine engine, provides comparative information on metabolism among deposited GEMs. Hreed is intended to provide accurate information about human metabolism in order to exchange data with the community and to support metabolic network based modeling and analysis through both the graphical and application programming interfaces. The development and implementation of this data ecosystem is the starting step for the enhancement of data utilization in systems biology.

    Mol-Zero-GAN: Zero-Shot Adaptation of Molecular Generative Adversarial Network for Specific Protein Targets

    Drug discovery is a process that finds new potential drug candidates for curing diseases and is also vital to improving the wellness of people. Enhancing deep learning approaches, e.g., molecular generation models, increases the drug discovery process's efficiency. However, it remains difficult to create drug candidates with desired properties such as the quantitative estimate of druglikeness (QED), synthesis accessibility (SA), and binding affinity (BA), and it is challenging to train generative models for specific protein targets that have little pharmaceutical data. In this research, we present Mol-Zero-GAN, a framework that aims to solve this problem by using Bayesian optimization (BO) to find the optimal singular values of the model's weights, factorized by singular value decomposition, and that can generate drug candidates with desired properties with no additional data. The proposed framework can produce drugs with the desired properties on protein targets of interest by optimizing the model's weights. Our framework outperforms the state-of-the-art methods sharing the same objectives. Mol-Zero-GAN is publicly available at https://github.com/cucpbioinfo/Mol-Zero-GA
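The idea of adapting a model by adjusting only the singular values of its weights can be sketched as below. This is an illustration under stated assumptions, not the Mol-Zero-GAN implementation: the function `rescale_singular_values` and the choice of a multiplicative scale per singular value are hypothetical, and the Bayesian optimizer that would propose the scales by scoring generated molecules is deliberately left out.

```python
import numpy as np


def rescale_singular_values(weight, scales):
    """Rebuild a weight matrix after scaling each of its singular values.

    weight: 2-D array. scales: 1-D array with one multiplier per singular
    value (a hypothetical low-dimensional search space for an optimizer).
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # (u * v) scales column j of u by v[j], i.e. computes u @ diag(v).
    return (u * (s * scales)) @ vt


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3))        # stand-in for one layer's weights

unchanged = rescale_singular_values(w, np.ones(3))
doubled = rescale_singular_values(w, np.full(3, 2.0))
print(np.allclose(unchanged, w), np.allclose(doubled, 2 * w))
```

The appeal of such a parameterization is that a 4 x 3 layer is controlled by only three numbers here, which keeps the search space small enough for a sample-efficient method such as Bayesian optimization.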