20 research outputs found

    Analysis of Functional Phonetic Data

    Prediction of overall survival for patients with metastatic castration-resistant prostate cancer : development of a prognostic model through a crowdsourced challenge with open clinical trial data

    Background: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge, both to identify a better prognostic model for predicting survival in patients with metastatic castration-resistant prostate cancer and to engage a community of international data scientists in studying this disease.
    Methods: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere, comprising 476 patients treated with docetaxel and prednisone in the ASCENT2 trial, 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial, 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial, and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly as training data to predict the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but data for outcome variables (overall survival and event status) were hidden from the challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone.
    Findings: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0.791; Bayes factor >5) and surpassed the reference model (iAUC 0.743; Bayes factor >20). Both the ePCR model and the reference model stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3.32, 95% CI 2.39-4.62, p [...]).
    Interpretation: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for future method development. The results of this effort show that data sharing, when combined with a crowdsourced challenge, is a robust and powerful framework for developing new prognostic models in advanced prostate cancer.

    Integrated analysis of environmental and genetic influences on cord blood DNA methylation in new-borns

    Epigenetic processes, including DNA methylation (DNAm), are among the mechanisms allowing integration of genetic and environmental factors to shape cellular function. While many studies have investigated either environmental or genetic contributions to DNAm, few have assessed their integrated effects. Here we examine the relative contributions of prenatal environmental factors and genotype to DNA methylation in neonatal blood at variably methylated regions (VMRs) in 4 independent cohorts (overall n = 2365). We use Akaike's information criterion to test which factors best explain variability of methylation in the cohort-specific VMRs: several prenatal environmental factors (E), genotypes in cis (G), or their additive (G + E) or interaction (GxE) effects. Genetic and environmental factors in combination best explain DNAm at the majority of VMRs. The CpGs best explained by either G, G + E or GxE are functionally distinct. The enrichment of genetic variants from GxE models in GWAS for complex disorders supports their importance for disease risk.
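
    The model comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single continuous methylation value per sample, one genotype coded 0/1/2, one environmental covariate, ordinary least-squares fits, and the Gaussian form of AIC (up to an additive constant). The function names `gaussian_aic` and `best_model` and the simulated data are hypothetical.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC of an ordinary least-squares fit, up to an additive constant."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus residual variance
    return n * np.log(rss / n) + 2 * k

def best_model(y, g, e):
    """Fit the four candidate models and return the one with the lowest AIC."""
    candidates = {
        "E": np.column_stack([e]),
        "G": np.column_stack([g]),
        "G+E": np.column_stack([g, e]),
        "GxE": np.column_stack([g, e, g * e]),  # main effects plus interaction
    }
    aics = {name: gaussian_aic(y, X) for name, X in candidates.items()}
    return min(aics, key=aics.get), aics

# Simulated example (hypothetical data): methylation driven by an interaction.
rng = np.random.default_rng(0)
n = 500
g = rng.integers(0, 3, size=n).astype(float)  # genotype as allele count
e = rng.normal(size=n)                        # environmental exposure
y = 0.5 * g + 0.3 * e + 0.8 * g * e + rng.normal(scale=0.5, size=n)

winner, aics = best_model(y, g, e)
print(winner, {name: round(value, 1) for name, value in aics.items()})
```

    Because AIC penalises each extra parameter, the interaction model is selected only when the interaction term reduces the residual error enough to offset the added coefficient.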

    Inferring catalysis in biological systems

    Kondofersky I, Theis FJ, Fuchs C. Inferring catalysis in biological systems. IET Systems Biology. 2016;10(6):210-218.
    In systems biology, one is often interested in the communication patterns between several species, such as genes, enzymes or proteins. These patterns become more recognisable when temporal experiments are performed. This temporal communication can be structured by reaction networks such as gene regulatory networks or signalling pathways. Mathematical modelling of data arising from such networks can reveal important details, thus helping to understand the studied system. In many cases, however, the corresponding models still deviate from the observed data. This may be due to catalytic reactions that are present but unknown. From a modelling perspective, the question of whether a certain reaction is catalysed leads to a large increase in the number of model candidates. For large networks, the calibration of all possible models becomes computationally infeasible. We propose a method which determines a substantially reduced set of appropriate model candidates and identifies the catalyst of each reaction at the same time. This is incorporated in a multiple-step procedure which first extends the network by additional latent variables and subsequently identifies catalyst candidates using similarity analysis methods. Results from synthetic data examples suggest good performance even for non-informative data with few observations. Applied to the CD95 apoptotic pathway, our method provides new insights into apoptosis regulation.

    Identifying latent dynamic components in biological systems

    Kondofersky I, Fuchs C, Theis FJ. Identifying latent dynamic components in biological systems. IET Systems Biology. 2015;9(5):193-203.
    In computational systems biology, the general aim is to derive regulatory models from multivariate readouts, thereby generating predictions for novel experiments. In the past, many such models have been formulated for different biological applications. The authors consider the scenario where a given model fails to predict a set of observations with acceptable accuracy and ask whether this is because the model lacks important external regulations. Real-world examples of such entities range from microRNAs to metabolic fluxes. To improve the prediction, they propose an algorithm to systematically extend the network by an additional latent dynamic variable which has an exogenous effect on the considered network. This variable's time course and influence on the other species are estimated in a two-step procedure involving spline approximation, maximum-likelihood estimation and model selection. Simulation studies show that such a hidden influence can successfully be inferred. The method is also applied to a signalling pathway model where they analyse real data and obtain promising results. Furthermore, the technique can be employed to detect incomplete network structures.

    Atrx promotes heterochromatin formation at retrotransposons

    Sadic D, Schmidt K, Groh S, et al. Atrx promotes heterochromatin formation at retrotransposons. EMBO reports. 2015;16(7):836-850.
    More than 50% of mammalian genomes consist of retrotransposon sequences. Silencing of retrotransposons by heterochromatin is essential to ensure genomic stability and transcriptional integrity. Here, we identified a short sequence element in intracisternal A particle (IAP) retrotransposons that is sufficient to trigger heterochromatin formation. We used this sequence in a genome-wide shRNA screen and identified the chromatin remodeler Atrx as a novel regulator of IAP silencing. Atrx binds to IAP elements and is necessary for efficient heterochromatin formation. In addition, Atrx facilitates a robust and largely inaccessible heterochromatin structure as Atrx knockout cells display increased chromatin accessibility at retrotransposons and non-repetitive heterochromatic loci. In summary, we demonstrate a direct role of Atrx in the establishment and robust maintenance of heterochromatin.

    Three general concepts to improve risk prediction: good data, wisdom of the crowd, recalibration

    Kondofersky I, Laimighofer M, Kurz C, et al. Three general concepts to improve risk prediction: good data, wisdom of the crowd, recalibration. F1000Research. 2016;5:2671.
    In today's information age, the necessary means exist for clinical risk prediction to capitalize on a multitude of data sources, increasing the potential for greater accuracy and improved patient care. Towards this objective, the Prostate Cancer DREAM Challenge posted comprehensive information from three clinical trials recording survival for patients with metastatic castration-resistant prostate cancer treated with first-line docetaxel. A subset of an independent clinical trial was used for interim evaluation of model submissions, providing critical feedback to participating teams for tailoring their models to the desired target. Final submitted models were evaluated and ranked on the independent clinical trial. Our team, called "A Bavarian Dream", utilized many of the common statistical methods for data dimension reduction and summarization during the trial. Three general modeling principles emerged that were deemed helpful for building accurate risk prediction tools and for finishing among the winning teams of both sub-challenges. These principles were: first, good data, encompassing the collection of important variables and imputation of missing data; second, wisdom of the crowd, extending beyond the usual model ensemble notion to the inclusion of experts on specific risk ranges; and third, recalibration, entailing transfer learning to the target source. In this study, we illustrate the application and impact of these principles on data from the Prostate Cancer DREAM Challenge.