20 research outputs found

    Multi Split Conformal Prediction

    Split conformal prediction is a computationally efficient method for performing distribution-free predictive inference in regression. It involves, however, a one-time random split of the data, and the result depends on the particular split. To address this problem, we propose multi split conformal prediction, a simple method based on Markov's inequality to aggregate single split conformal prediction intervals across multiple splits. Comment: 12 pages, 1 figure, 2 tables.
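The aggregation idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that each single split interval is built at miscoverage level `alpha * (1 - tau)` and that a value is kept when more than a fraction `tau` of the intervals contain it; under that assumption Markov's inequality bounds the miscoverage of the aggregated set by `alpha`. The one-feature least-squares model, the absolute-residual score, and all function names are hypothetical choices made for the example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def split_conformal(xs, ys, x_new, beta, rng):
    """One random split: fit on one half, calibrate absolute residuals on the other."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    half = len(idx) // 2
    train, calib = idx[:half], idx[half:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    scores = sorted(abs(ys[i] - (a + b * xs[i])) for i in calib)
    # Rank of the conformal quantile; an infinite interval if the rank is out of range.
    k = math.ceil((1 - beta) * (len(calib) + 1))
    q = scores[k - 1] if k <= len(scores) else math.inf
    pred = a + b * x_new
    return pred - q, pred + q


def majority_region(intervals, tau):
    """Segments of the real line covered by more than tau * B of the B intervals."""
    # Kind 0 (interval opens) sorts before kind 1 (interval closes) at equal points.
    events = sorted([(lo, 0) for lo, _ in intervals] + [(hi, 1) for _, hi in intervals])
    need = tau * len(intervals)
    count, start, segments = 0, None, []
    for x, kind in events:
        count += 1 if kind == 0 else -1
        if start is None and count > need:
            start = x
        elif start is not None and count <= need:
            segments.append((start, x))
            start = None
    return segments


def multi_split_conformal(xs, ys, x_new, alpha=0.1, tau=0.5, n_splits=20, seed=0):
    """Aggregate n_splits single split intervals by voting (assumed rule, see lead-in)."""
    rng = random.Random(seed)
    beta = alpha * (1 - tau)  # per-split miscoverage level
    intervals = [split_conformal(xs, ys, x_new, beta, rng) for _ in range(n_splits)]
    return majority_region(intervals, tau)
```

Calling `multi_split_conformal(xs, ys, x_new)` returns a list of `(low, high)` segments. The result can in principle consist of more than one segment, because a vote over intervals need not produce a single interval.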

    Simulating gene silencing through intervention analysis

    We propose a novel method for simulating the effects of gene silencing. Our approach combines relevant subject matter information provided by biological pathways with gene expression levels measured in regular conditions to predict the behavior of the system after one of the genes has been silenced. We achieve this by modeling gene silencing as an external intervention in a causal graphical model. To account for the uncertainty associated with learning the structure of the graphical model, we adopt a bootstrap approach. We illustrate our proposal on a Drosophila melanogaster gene silencing experiment.

    Leveraging three-dimensional chromatin architecture for effective reconstruction of enhancer-target gene regulatory interactions

    A growing body of evidence in the literature suggests that germline sequence variants and somatic mutations in non-coding distal regulatory elements may be crucial for defining disease risk and prognostic stratification of patients, in genetic disorders as well as in cancer. Their functional interpretation is challenging because genome-wide enhancer-target gene (ETG) pairing is an open problem in genomics. The solutions proposed so far do not account for the hierarchy of structural domains which define chromatin three-dimensional (3D) architecture. Here we introduce a change of perspective based on the definition of multi-scale structural chromatin domains, integrated in a statistical framework to define ETG pairs. In this work (i) we develop a computational and statistical framework to reconstruct a comprehensive map of ETG pairs leveraging functional genomics data; (ii) we demonstrate that the incorporation of chromatin 3D architecture information improves ETG pairing accuracy; and (iii) we use multiple experimental datasets to extensively benchmark our method against previous solutions for the genome-wide reconstruction of ETG pairs. This solution will facilitate the annotation and interpretation of sequence variants in distal non-coding regulatory elements. We expect this to be especially helpful in clinically oriented applications of whole genome sequencing in cancer and undiagnosed genetic diseases research.

    High-throughput mediation analysis of human proteome and metabolome identifies mediators of post-bariatric surgical diabetes control

    To improve the power of mediation analysis in high-throughput studies, here we introduce High-throughput mediation analysis (Hitman), which accounts for direction of mediation and applies empirical Bayesian linear modeling. We apply Hitman in a retrospective, exploratory analysis of the SLIMM-T2D clinical trial in which participants with type 2 diabetes were randomized to Roux-en-Y gastric bypass (RYGB) or nonsurgical diabetes/weight management, and fasting plasma proteome and metabolome were assayed for up to 3 years. RYGB caused greater improvement in HbA1c, which was mediated by growth hormone receptor (GHR). GHR's mediation is more significant than that of clinical mediators, including BMI. GHR decreases at 3 months postoperatively alongside increased insulin-like growth factor binding proteins IGFBP1/BP2; plasma GH increased at 1 year. Experimental validation indicates (1) hepatic GHR expression decreases in post-bariatric rats; (2) GHR knockdown in primary hepatocytes decreases gluconeogenic gene expression and glucose production. Thus, RYGB may induce resistance to diabetogenic effects of GH signaling.

    Statistical survival analysis and the incidence function (Statistická analýza přežití a incidenční funkce)

    Competing risks occur often in survival analysis. In the present work, we study different approaches to modeling competing risks data and use examples to illustrate the most important results. In the competing risks setting it is often of interest to calculate the cumulative incidence of a specific event. We first study non-parametric estimation and then present three approaches to regression modeling. We use a simple numerical example to demonstrate the use of non-parametric methods and analyze real data from the Stanford Heart Transplant Program to illustrate and compare the chosen regression models.
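The non-parametric cumulative incidence mentioned in this abstract can be illustrated with a short sketch. It assumes the standard estimator that sums, over event times, the overall Kaplan-Meier survival just before each time multiplied by the cause-specific hazard at that time; the abstract does not say which estimator the thesis uses. The function name and the coding of `causes` (0 for censored, positive integers for event types) are hypothetical.

```python
def cumulative_incidence(times, causes, cause):
    """Return [(t, CIF(t))] for one event type, evaluated at each observed event time.

    times  -- observed follow-up times
    causes -- 0 for a censored observation, 1, 2, ... for the event type
    cause  -- the positive event type whose cumulative incidence is wanted
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    surv = 1.0  # overall event-free survival just before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d_all = d_cause = 0
        # Count events of any type and of the requested type tied at time t.
        while j < len(data) and data[j][0] == t:
            if data[j][1] != 0:
                d_all += 1
            if data[j][1] == cause:
                d_cause += 1
            j += 1
        if d_all > 0:
            cif += surv * d_cause / at_risk
            surv *= 1 - d_all / at_risk
            curve.append((t, cif))
        at_risk -= j - i  # events and censored observations both leave the risk set
        i = j
    return curve
```

When the last observation is an event, the cumulative incidences over all event types sum to one at the end of follow-up, which gives a quick sanity check on the output.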

    Estimating mediation effects in epigenomic studies

    This book collects the abstracts and short papers submitted by the authors of the International Conference of the CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS), held in Milan (Italy) on September 13-15, 201