43 research outputs found

    Erratum to: MINE: Module Identification in Networks

    Identifying Pathway Proteins in Networks using Convergence

    One of the key goals of systems biology concerns the analysis of experimental biological data available to the scientific public. New technologies are rapidly developed to observe and report whole-scale biological phenomena; however, few methods exist with the ability to produce specific, testable hypotheses from this noisy ‘big’ data. In this work, we propose an approach that combines the power of data-driven network theory with knowledge-based ontology to tackle this problem. Network models are especially powerful due to their ability to display elements of interest and their relationships as internetwork structures. Additionally, ontological data supplements the confidence of relationships within the model without clouding critical structure identification. As such, we postulate that given a (gene/protein) marker set of interest, we can systematically identify the core of their interactions (if they are indeed working together toward a biological function) via elimination of original markers and addition of further necessary markers. This concept, which we refer to as “convergence,” harnesses the idea of “guilt-by-association” and recursion to identify whether a core of relationships exists between markers. In this study, we test graph-theoretic concepts (shortest-path, k-nearest-neighbor, and clustering) to identify cores iteratively in data- and knowledge-based networks in the canonical yeast Pheromone Mating Response pathway. Additionally, we provide results for convergence application in virus infection, hearing loss, and Parkinson’s disease. Our results indicate that if a marker set has a common discrete function, this approach is able to identify that function, its interacting markers, and any new elements necessary to complete the structural core of that function.
The results below show that the shortest-path function is the best approach of those used, finding small target sets that contain a majority or all of the markers in the gold standard pathway. The power of this approach lies in its ability to be used in investigative studies to inform decisions concerning target selection.

    Using ILP to Identify Pathway Activation Patterns in Systems Biology

    We show a logical aggregation method that, combined with propositionalization methods, can construct novel structured biological features from gene expression data. We do this to gain understanding of pathway mechanisms, for instance, those associated with a particular disease. We illustrate this method on the task of distinguishing between two types of lung cancer: Squamous Cell Carcinoma (SCC) and Adenocarcinoma (AC). We identify pathway activation patterns in pathways previously implicated in the development of cancers. Our method identified a model with comparable predictive performance to the winning algorithm of a recent challenge, while providing biologically relevant explanations that may be useful to a biologist.

    Enhancing Next-Generation Sequencing-Guided Cancer Care Through Cognitive Computing

    Background: Using next-generation sequencing (NGS) to guide cancer therapy has created challenges in analyzing and reporting large volumes of genomic data to patients and caregivers. Specifically, providing current, accurate information on newly approved therapies and open clinical trials requires considerable manual curation performed mainly by human “molecular tumor boards” (MTBs). The purpose of this study was to determine the utility of cognitive computing as performed by Watson for Genomics (WfG) compared with a human MTB. Materials and Methods: One thousand eighteen patient cases that previously underwent targeted exon sequencing at the University of North Carolina (UNC) and subsequent analysis by the UNCseq informatics pipeline and the UNC MTB between November 7, 2011, and May 12, 2015, were analyzed with WfG, a cognitive computing technology for genomic analysis. Results: Using a WfG-curated actionable gene list, we identified additional genomic events of potential significance (not discovered by traditional MTB curation) in 323 (32%) patients. The majority of these additional genomic events were considered actionable based upon their ability to qualify patients for biomarker-selected clinical trials. Indeed, the opening of a relevant clinical trial within 1 month prior to WfG analysis provided the rationale for identification of a new actionable event in nearly a quarter of the 323 patients. This automated analysis took <3 minutes per case. Conclusion: These results demonstrate that the interpretation and actionability of somatic NGS results are evolving too rapidly to rely solely on human curation. Molecular tumor boards empowered by cognitive computing could potentially improve patient care by providing a rapid, comprehensive approach for data analysis and consideration of up-to-date availability of clinical trials. 
Implications for Practice: The results of this study demonstrate that the interpretation and actionability of somatic next-generation sequencing results are evolving too rapidly to rely solely on human curation. Molecular tumor boards empowered by cognitive computing can significantly improve patient care by providing a fast, cost-effective, and comprehensive approach for data analysis in the delivery of precision medicine. Patients and physicians who are considering enrollment in clinical trials may benefit from the support of such tools applied to genomic data.

    Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
