8,727 research outputs found

    Computational Models for Transplant Biomarker Discovery.

    Translational medicine holds considerable promise for improved diagnostics and drug discovery in transplantation, where unmet diagnostic and therapeutic needs persist. The recent advent of genomic and proteomic profiling ("omics") provides new resources for developing novel biomarkers for routine clinical use. Establishing such a marker system depends heavily on the appropriate application of computational algorithms and software, which are in turn based on mathematical theories and models. Understanding these theories helps in choosing appropriate algorithms and so in building successful biomarker systems. Here, we review the key advances in theories and mathematical models relevant to transplant biomarker development. The advantages and limitations inherent in these models are discussed. The principles of key computational approaches for efficiently selecting the best subset of biomarkers from high-dimensional omics data are highlighted. Prediction models are also introduced, and the integration of multi-microarray data is discussed. Appreciating these key advances should help to accelerate the development of clinically reliable biomarker systems.
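
    As a rough illustration of selecting a biomarker subset from high-dimensional data, the sketch below ranks features by the absolute Welch t-statistic between two groups and keeps the top k. This univariate filter is chosen here only because it is simple; it is an assumption, not necessarily one of the approaches the review covers, and it ignores correlations between markers. The function names and the toy data are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def top_k_features(cases, controls, k):
    """Rank features by |t| and return the indices of the k highest.

    cases, controls: lists of samples, each sample a list of feature values.
    """
    n_features = len(cases[0])
    scores = []
    for j in range(n_features):
        a = [sample[j] for sample in cases]
        b = [sample[j] for sample in controls]
        scores.append((abs(welch_t(a, b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


# Hypothetical toy data: 4 samples per group, 3 features.
# Only feature 1 differs clearly between the groups.
cases = [[1.0, 5.1, 0.3], [1.2, 4.9, 0.1], [0.9, 5.3, 0.2], [1.1, 5.0, 0.4]]
controls = [[1.1, 2.0, 0.2], [0.8, 2.2, 0.3], [1.0, 1.9, 0.1], [1.2, 2.1, 0.4]]
selected = top_k_features(cases, controls, k=1)
```

    In real omics data the number of features far exceeds the number of samples, so a ranking like this would normally be computed inside cross-validation to avoid selecting markers on the same samples used to judge them.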

    Computational solutions for addressing heterogeneity in DNA methylation data

    DNA methylation, a reversible epigenetic modification, has been implicated in various biological processes including gene regulation. Due to the multitude of datasets available, it is a premier candidate for computational tool development, especially for investigating heterogeneity within and across samples. We differentiate between three levels of heterogeneity in DNA methylation data: between-group, between-sample, and within-sample heterogeneity. Here, we separately address these three levels and present new computational approaches to quantify and systematically investigate heterogeneity. Epigenome-wide association studies relate a DNA methylation aberration to a phenotype and therefore address between-group heterogeneity. To facilitate such studies, which necessarily include data processing, exploratory data analysis, and differential analysis of DNA methylation, we extended the R-package RnBeads. We implemented novel methods for calculating the epigenetic age of individuals, novel imputation methods, and differential variability analysis. A use-case of the new features is presented using samples from Ewing sarcoma patients. As an important driver of epigenetic differences between phenotypes, we systematically investigated associations between donor genotypes and DNA methylation states in methylation quantitative trait loci (methQTL). To that end, we developed a novel computational framework, MAGAR, for determining statistically significant associations between genetic and epigenetic variations. We applied the new pipeline to samples obtained from sorted blood cells and complex bowel tissues of healthy individuals and found that tissue-specific and common methQTLs have distinct genomic locations and biological properties.
To investigate cell-type-specific DNA methylation profiles, which are the main drivers of within-group heterogeneity, computational deconvolution methods can be used to dissect DNA methylation patterns into latent methylation components. Deconvolution methods require profiles of high technical quality and the identified components need to be biologically interpreted. We developed a computational pipeline to perform deconvolution of complex DNA methylation data, which implements crucial data processing steps and facilitates result interpretation. We applied the protocol to lung adenocarcinoma samples and found indications of tumor infiltration by immune cells and associations of the detected components with patient survival. Within-sample heterogeneity (WSH), i.e., heterogeneous DNA methylation patterns at a genomic locus within a biological sample, is often neglected in epigenomic studies. We present the first systematic benchmark of scores quantifying WSH genome-wide using simulated and experimental data. Additionally, we created two novel scores that quantify DNA methylation heterogeneity at single CpG resolution with improved robustness toward technical biases. WSH scores describe different types of WSH in simulated data, quantify differential heterogeneity, and serve as a reliable estimator of tumor purity. Due to the broad availability of DNA methylation data, the levels of heterogeneity in DNA methylation data can be comprehensively investigated. We contribute novel computational frameworks for analyzing DNA methylation data with respect to different levels of heterogeneity. We envision that this toolbox will be indispensable for understanding the functional implications of DNA methylation patterns in health and disease.
DNA methylation is a reversible epigenetic modification that is associated with various biological processes, such as gene regulation.
The large number of available DNA methylation datasets forms an ideal basis for developing software applications, in particular for describing heterogeneity within and between samples. We distinguish three levels of heterogeneity in DNA methylation data: between groups, between samples, and within a sample. Here, we consider the three levels independently of one another and present new approaches for describing and quantifying heterogeneity. Epigenome-wide association studies link a change in DNA methylation to a phenotype and describe heterogeneity between groups. To simplify such studies, which involve data processing as well as exploratory and differential data analysis, we extended the R-based software application RnBeads. The extensions include new methods for predicting epigenetic age, new imputation methods for missing data points, and a differential variability analysis. The analysis of Ewing sarcoma patient data was chosen as a use case for the newly developed methods. We investigated associations between genotypes and the DNA methylation of individual CpGs in order to define so-called methylation quantitative trait loci (methQTL). These represent an important factor inducing epigenetic differences between groups. To this end, we developed a new software package (MAGAR) to identify statistically significant associations between genetic and epigenetic variation. We applied this pipeline to blood cell types and complex biopsies from healthy individuals and were able to localize common and tissue-specific methQTLs in different regions of the genome, which are linked to different biological properties. The main cause of heterogeneity within a group is cell-type-specific DNA methylation patterns.
To investigate these more closely, deconvolution software can decompose the DNA methylation matrix into independent components of variation. Deconvolution methods based on DNA methylation require profiles of high technical quality, and the identified components must be interpreted biologically. In this work, we developed a computational pipeline for performing deconvolution experiments, which covers data processing and the interpretation of results. We applied the developed protocol to lung adenocarcinomas and found indications of tumor infiltration by immune cells as well as associations with patient survival. Heterogeneity within a sample (within-sample heterogeneity, WSH), i.e., heterogeneous methylation patterns within a sample at a genomic position, is mostly neglected in epigenomic studies. We present the first comparison of different genome-wide WSH scores on simulated and experimental data. In addition, we developed two new scores for computing WSH at individual CpGs, which show improved robustness to technical factors. WSH scores describe different types of WSH, quantify differential heterogeneity, and predict tumor purity. Owing to the broad availability of DNA methylation data, the levels of heterogeneity can be described comprehensively. In this work, we present new software solutions for analyzing DNA methylation data with respect to the different levels of heterogeneity. We are convinced that the presented software tools will be indispensable for understanding DNA methylation in disease and health.
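
    To make the idea of a within-sample heterogeneity score concrete, here is a minimal sketch of two read-level scores from the literature: the proportion of discordant reads (PDR) and epipolymorphism. These are not the two novel scores the abstract describes, and whether they match the scores in the authors' benchmark is an assumption. Reads are represented, hypothetically, as strings of '1' (methylated) and '0' (unmethylated) over the same CpGs.

```python
from collections import Counter


def proportion_discordant_reads(reads):
    """Fraction of reads that mix methylated and unmethylated CpGs.

    reads: list of strings over {'1', '0'}, one character per CpG,
    all covering the same CpGs.
    """
    discordant = sum(1 for r in reads if len(set(r)) > 1)
    return discordant / len(reads)


def epipolymorphism(reads):
    """1 - sum(p_i^2), where p_i is the frequency of each distinct pattern."""
    counts = Counter(reads)
    n = len(reads)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical read sets at one locus with four CpGs.
homogeneous = ["1111"] * 8                      # one pattern only
two_populations = ["1111"] * 4 + ["0000"] * 4   # two concordant patterns
mixed = ["1111", "0000", "1111", "0000", "1100", "0011", "1010", "0101"]

pdr_two = proportion_discordant_reads(two_populations)
epi_two = epipolymorphism(two_populations)
pdr_mixed = proportion_discordant_reads(mixed)
epi_mixed = epipolymorphism(mixed)
```

    The `two_populations` case shows why different scores capture different kinds of heterogeneity: every read is internally concordant, so PDR is zero, while epipolymorphism is 0.5 because two distinct patterns are present in equal proportion.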

    Cross-species analysis of genetically engineered mouse models of MAPK-driven colorectal cancer identifies hallmarks of the human disease

    Effective treatment options for advanced colorectal cancer (CRC) are limited, survival rates are poor and this disease continues to be a leading cause of cancer-related deaths worldwide. Despite being a highly heterogeneous disease, a large subset of individuals with sporadic CRC typically harbor relatively few established ‘driver’ lesions. Here, we describe a collection of genetically engineered mouse models (GEMMs) of sporadic CRC that combine lesions frequently altered in human patients, including well-characterized tumor suppressors and activators of MAPK signaling. Primary tumors from these models were profiled, and individual GEMM tumors segregated into groups based on their genotypes. Unique allelic and genotypic expression signatures were generated from these GEMMs and applied to clinically annotated human CRC patient samples. We provide evidence that a Kras signature derived from these GEMMs is capable of distinguishing human tumors harboring KRAS mutation, and tracks with poor prognosis in two independent human patient cohorts. Furthermore, the analysis of a panel of human CRC cell lines suggests that high expression of the GEMM Kras signature correlates with sensitivity to targeted pathway inhibitors. Together, these findings implicate GEMMs as powerful preclinical tools with the capacity to recapitulate relevant human disease biology, and support the use of genetic signatures generated in these models to facilitate future drug discovery and validation efforts.

    Genomic analysis of macrophage gene signatures during idiopathic pulmonary fibrosis development

    Idiopathic Pulmonary Fibrosis (IPF) is a chronic, progressive, irreversible lung disease. After diagnosis, life expectancy with this interstitial condition is commonly 3-5 years if untreated. Despite their limited capacity to recapitulate IPF, animal models have been useful for identifying pathways relevant to drug discovery and the development of diagnostic tools. Using such models, several immune-related mechanisms have been implicated in IPF. For instance, subpopulations of macrophages and monocyte-derived cells are recognized as centrally active in pulmonary immunological processes. One of the most widely used technologies is high-throughput gene expression analysis, which has been available for almost two decades now. The “omics” revolution has had a major impact on macrophage and pulmonary fibrosis research. The present study aims to investigate macrophage dynamics within the context of IPF at the transcriptomic level. Using publicly available gene-expression data, we applied modern data science approaches to (1) understand longitudinal profiles within IPF models; (2) investigate the correlation between macrophage genomic dynamics and IPF development; and (3) apply longitudinal profiles uncovered through multivariate data analysis to the development of new sets of predictors able to classify IPF and control samples accordingly. Principal Component Analysis and Hierarchical Clustering showed that our pipeline was able to construct a complex set of biomarker candidates that together outperformed gene expression alone in separating treatment groups in an IPF animal model dataset. We further assessed the predictive performance of our candidates on publicly available gene expression data from IPF patients. Once again, the constructed biomarker candidates differed significantly between IPF and control samples.
The data presented in this work strongly suggest that longitudinal data analysis holds major unappreciated potential for translational medicine research.
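
    The abstract names hierarchical clustering as one way the separation of sample groups was assessed, but gives neither the linkage nor the distance metric. The sketch below assumes average linkage with Euclidean distance, written in plain Python for readability rather than speed; the function names and the toy samples are hypothetical and do not come from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(samples, n_clusters):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain. Returns a sorted list of clusters,
    each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                dists = [euclidean(samples[a], samples[b])
                         for a in clusters[i] for b in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)


# Hypothetical toy samples with two features each:
# indices 0, 1, 4 lie near the origin, indices 2, 3 lie near (5, 5).
samples = [[0.1, 0.2], [0.0, 0.3], [5.0, 5.1], [5.2, 4.9], [0.2, 0.1]]
groups = average_linkage(samples, n_clusters=2)
```

    If the recovered groups coincide with the known treatment labels, the features separate the groups well; comparing that agreement for raw expression values against constructed biomarker candidates is one way to read the claim in the abstract.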
