27 research outputs found

    Critical assessment of human metabolic pathway databases: a stepping stone for future integration

    Background: Multiple pathway databases are available that describe the human metabolic network and have proven their usefulness in many applications, ranging from the analysis and interpretation of high-throughput data to their use as a reference repository. However, so far the various human metabolic networks described by these databases have not been systematically compared and contrasted, nor has the extent to which they differ been quantified. For a researcher using these databases for particular analyses of human metabolism, it is crucial to know the extent of the differences in content and their underlying causes. Moreover, the outcomes of such a comparison are important for ongoing integration efforts.
    Results: We compared the genes, EC numbers and reactions of five frequently used human metabolic pathway databases. The overlap is surprisingly low, especially on reaction level, where the databases agree on 3% of the 6968 reactions they have combined. Even for the well-established tricarboxylic acid cycle the databases agree on only 5 out of the 30 reactions in total. We identified the main causes for the lack of overlap. Importantly, the databases are partly complementary. Other explanations include the number of steps a conversion is described in and the number of possible alternative substrates listed. Missing metabolite identifiers and ambiguous names for metabolites also affect the comparison.
    Conclusions: Our results show that each of the five networks compared provides us with a valuable piece of the puzzle of the complete reconstruction of the human metabolic network. To enable integration of the networks, next to a need for standardizing the metabolite names and identifiers, the conceptual differences between the databases should be resolved. Considerable manual intervention is required to reach the ultimate goal of a unified and biologically accurate model for studying the systems biology of human metabolism. Our comparison provides a stepping stone for such an endeavor.
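    The reaction-level agreement reported above can be read as the number of reactions present in every database divided by the number of reactions present in at least one of them. A minimal sketch of that arithmetic in Python, assuming each database has already been reduced to a set of normalized reaction identifiers; the database names and reaction labels are hypothetical toy data, not taken from the study:

```python
from functools import reduce


def consensus_fraction(reaction_sets):
    """Return (shared, combined, fraction) for a dict of name -> set of reaction IDs."""
    sets = list(reaction_sets.values())
    if not sets:
        return 0, 0, 0.0
    shared = reduce(set.intersection, sets)   # reactions every database contains
    combined = reduce(set.union, sets)        # reactions found in at least one database
    fraction = len(shared) / len(combined) if combined else 0.0
    return len(shared), len(combined), fraction


# Hypothetical toy data: three made-up databases with made-up reaction labels.
toy = {
    "db_a": {"r1", "r2", "r3"},
    "db_b": {"r1", "r3", "r4"},
    "db_c": {"r1", "r5"},
}

shared, combined, fraction = consensus_fraction(toy)
print(f"{shared} of {combined} reactions shared ({fraction:.0%})")
```

    Applied to the tricarboxylic acid cycle figures quoted in the abstract, the same ratio is 5/30, or about 17%. In practice the hard part is the normalization step this sketch takes for granted, since the abstract notes that missing identifiers and ambiguous metabolite names affect the comparison.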

    A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns.

    In cancer, the primary tumour's organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumour samples and 88% and 83% respectively on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006
    The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

    The road to knowledge: from biology to databases and back again

    The differences between five databases that describe human metabolism are unexpectedly large. This may affect the results of computational analyses that rely on these databases. Databases describing human metabolism are an increasingly important basis for further research, and Mirande Stobbe studied them. For one specific metabolic process, the citric acid cycle, Stobbe resolved the differences between ten databases and thereby improved the description of the process. She also developed a web application (www.c2cards.nl) that makes the differences between databases visible. This lays the foundation for resolving differences between databases and for a more accurate description of human metabolism.

    Building the future of bioinformatics through student-facilitated conferencing.

    Sharing results, techniques, and challenges is paramount to advance our understanding of any field of science. In the scientific community this exchange of ideas is mainly made possible through national and international conferences. Scientists have the opportunity to showcase their work, receive feedback, and improve their presentation skills. However, conferences can be large and intimidating for young researchers. In addition, for many of the more prestigious conferences, the very high number of submissions and low selection rate are major limitations to aspiring young researchers aiming to present their work to the scientific community. To improve student participation and proliferation of information, regional student groups have successfully organized conferences and symposia specifically aimed at students. This gives more students the opportunity to present their work and receive valuable experience and insight from peers and leaders in the field. At the same time, it is an ideal way for students to gain familiarity with the conference experience. In this paper, we highlight some of the benefits of participating in such student conferences, and we review the challenges we have encountered when organizing them. Both topics are illustrated in detail with examples from different ISCB Student Council Regional Student Groups

    Ten simple rules for a successful international consortium in big data omics

    We acknowledge the support of the Spanish Ministry of Science and Innovation through the Instituto de Salud Carlos III, the Centro de Excelencia Severo Ochoa (CEX2020-001049-S, MCIN/AEI/10.13039/501100011033), and the co-funding with funds from the European Regional Development Fund corresponding to the Programa Operativo FEDER Plurirregional de España (POPE) 2014-2020 and MINECO/FEDER BIO2015-71792-P (awarded to IGG). We thank the Departament de Salut and Departament de Recerca i Universitats of the Generalitat de Catalunya for their support through the CERCA programme and the co-funding with funds from the European Regional Development Fund corresponding to the Programa Operatiu FEDER de Catalunya 2014-2020. This project has received funding from the European Union's Horizon 2020 research and innovation programme through the EUCANCan project under grant agreement No 825835. The Institute for Research in Biomedicine Barcelona is a recipient of a Severo Ochoa Centre of Excellence Award (SEV-2015-0500) from the Spanish Ministry of Economy and Competitiveness and is supported by Centres de Recerca de Catalunya (Generalitat de Catalunya). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Consensus and conflict cards for metabolic pathway databases

    Background: The metabolic network of H. sapiens and many other organisms is described in multiple pathway databases. The level of agreement between these descriptions, however, has proven to be low. We can use these different descriptions to our advantage by identifying conflicting information and combining their knowledge into a single, more accurate, and more complete description. This task is, however, far from trivial.
    Results: We introduce the concept of Consensus and Conflict Cards (C(2)Cards) to provide concise overviews of what the databases do or do not agree on. Each card is centered at a single gene, EC number or reaction. These three complementary perspectives make it possible to distinguish disagreements on the underlying biology of a metabolic process from differences that can be explained by different decisions on how and in what detail to represent knowledge. As a proof-of-concept, we implemented C(2)Cards(Human), as a web application http://www.molgenis.org/c2cards, covering five human pathway databases.
    Conclusions: C(2)Cards can contribute to ongoing reconciliation efforts by simplifying the identification of consensus and conflicts between pathway databases and lowering the threshold for experts to contribute. Several case studies illustrate the potential of the C(2)Cards in identifying disagreements on the underlying biology of a metabolic process. The overviews may also point out controversial biological knowledge that should be subject of further research. Finally, the examples provided emphasize the importance of manual curation and the need for a broad community involvement.
