An interpretable low-complexity machine learning framework for robust exome-based in-silico diagnosis of Crohn's disease patients
Abstract
Whole exome sequencing (WES) data are allowing researchers to pinpoint the causes of many Mendelian disorders. In time, sequencing data will be crucial to solving the genome interpretation puzzle, which aims at uncovering the genotype-to-phenotype relationship, but for the moment many conceptual and technical problems need to be addressed. In particular, very few attempts at the in-silico diagnosis of oligo-to-polygenic disorders have been made so far, due to the complexity of the challenge, the relative scarcity of the data and issues such as batch effects and data heterogeneity, which are confounding factors for machine learning (ML) methods. Here, we propose a method for the exome-based in-silico diagnosis of Crohn's disease (CD) patients which addresses many of the current methodological issues. First, we devise a rational ML-friendly feature representation for WES data based on the gene mutational burden concept, which is suitable for datasets with small sample sizes. Second, we propose a Neural Network (NN) with parameter tying and heavy regularization, in order to limit its complexity and thus the risk of over-fitting. We trained and tested our NN on three CD case-control datasets, comparing its performance with that of the participants in previous CAGI challenges. We show that, notwithstanding its limited complexity, the NN outperforms the previous approaches. Moreover, we interpret the NN predictions by analyzing the learned patterns at the variant and gene level and investigating the decision process leading to each prediction.
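The gene mutational burden idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the burden precisely, so this assumes the simplest reading: one feature per gene, equal to the number of variants a sample carries in that gene. The function name, the gene panel, and the sample data are hypothetical and only serve the illustration.

```python
from collections import Counter

def gene_burden_vector(variant_genes, gene_order):
    """Return per-gene variant counts for one sample, in a fixed gene order.

    variant_genes: list with one gene symbol per observed variant (assumed input).
    gene_order: list of gene symbols that defines the feature columns.
    """
    counts = Counter(variant_genes)
    # Genes without any variant in this sample get a burden of 0.
    return [counts.get(gene, 0) for gene in gene_order]

# Hypothetical three-gene panel and one hypothetical sample.
genes = ["NOD2", "ATG16L1", "IL23R"]
sample_variants = ["NOD2", "NOD2", "IL23R"]
features = gene_burden_vector(sample_variants, genes)
```

A representation like this gives every sample a vector of the same length regardless of how many variants it carries, which is one reason burden-style features suit small cohorts.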
Molecular Reclassification of Crohn's Disease by Cluster Analysis of Genetic Variants
Background: Crohn's Disease (CD) has a heterogeneous presentation, and is typically classified according to extent and location of disease. The genetic susceptibility to CD is well known, and genome-wide association scans (GWAS) and meta-analyses thereof have identified over 30 susceptibility loci. Except for the association between ileal CD and NOD2 mutations, efforts to link CD genetics to clinical subphenotypes have not been very successful. We hypothesized that the large number of confirmed genetic variants enables (better) classification of CD patients. Methodology/Principal Findings: To look for genetic-based subgroups, genotyping results of 46 SNPs identified from CD GWAS were analyzed by Latent Class Analysis (LCA) in CD patients and in healthy controls. Six genetic-based subgroups were identified in CD patients, which were significantly different from the five subgroups found in healthy controls. The identified CD-specific clusters are therefore likely to contribute to disease behavior. We then looked at whether we could relate the genetic-based subgroups to the currently used clinical parameters. Although modest differences in prevalence of disease location and behavior could be observed among the CD clusters, Random Forest analysis showed that patients could not be allocated to one of the six genetic-based subgroups based on the typically used clinical parameters alone. This points to a poor relationship between the genetic-based subgroups and the clinical subphenotypes in use. Conclusions/Significance: This approach serves as a first step to reclassify Crohn's disease. The technique used can be applied to other common complex diseases as well, and will help to complete patient characterization, in order to evolve towards personalized medicine.
Extended analysis of a genome-wide association study in primary sclerosing cholangitis detects multiple novel risk loci.
A limited number of genetic risk factors have been reported in primary sclerosing cholangitis (PSC). To discover further genetic susceptibility factors for PSC, we followed up on a second tier of single nucleotide polymorphisms (SNPs) from a genome-wide association study (GWAS). We analyzed 45 SNPs in 1221 PSC cases and 3508 controls. The association results from the replication analysis and the original GWAS (715 PSC cases and 2962 controls) were combined in a meta-analysis comprising 1936 PSC cases and 6470 controls. We performed an analysis of bile microbial community composition in 39 PSC patients by 16S rRNA sequencing. Seventeen SNPs representing 12 distinct genetic loci achieved nominal significance (p(replication) < 0.05) in the replication. The most robust novel association was detected at chromosome 1p36 (rs3748816; p(combined) = 2.1 × 10^-8), where the MMEL1 and TNFRSF14 genes represent potential disease genes. Eight additional novel loci showed suggestive evidence of association (p(repl) < 0.05). FUT2 at chromosome 19q13 (rs602662, p(comb) = 1.9 × 10^-6; rs281377, p(comb) = 2.1 × 10^-6; and rs601338, p(comb) = 2.7 × 10^-6) is notable due to its implication in altered susceptibility to infectious agents. We found that FUT2 secretor status and genotype defined by rs601338 significantly influence biliary microbial community composition in PSC patients. We identify multiple new PSC risk loci by extended analysis of a PSC GWAS. FUT2 genotype needs to be taken into account when assessing the influence of microbiota on biliary pathology in PSC.
Funding: Norwegian PSC Research Center; German Ministry of Education and Research (BMBF) through the National Genome Research Network (NGFN); Integrated Research and Treatment Center - Transplantation; 01EO0802; PopGen biobank; NIH; DK 8496
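As a rough illustration of how replication and discovery results can be combined, the sketch below first checks that the sample sizes reported in the abstract add up, then pools two effect estimates with a fixed-effect inverse-variance meta-analysis. The abstract does not state which pooling method was used, so the inverse-variance approach is an assumption, and the effect sizes and standard errors are hypothetical placeholders, not study results.

```python
import math

# Sample sizes as reported in the abstract (replication + original GWAS).
total_cases = 1221 + 715
total_controls = 3508 + 2962

def fixed_effect_meta(betas, standard_errors):
    """Pool effect estimates by inverse-variance weighting (assumed method).

    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / (se ** 2) for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log odds ratios and standard errors for one SNP in two studies.
pooled_beta, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
```

With equal standard errors the pooled estimate is simply the mean of the two effects, and its standard error shrinks by a factor of the square root of two, which is why combining cohorts increases power.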
Mucosal Gene Expression of Antimicrobial Peptides in Inflammatory Bowel Disease Before and After First Infliximab Treatment
Background: Antimicrobial peptides (AMPs) protect the host intestinal mucosa against microorganisms. Abnormal expression of defensins was shown in inflammatory bowel disease (IBD), but it is not clear whether this is a primary defect. We investigated the impact of anti-inflammatory therapy with infliximab on the mucosal gene expression of AMPs in IBD. Methodology/Principal Findings: Mucosal gene expression of 81 AMPs was assessed in 61 IBD patients before and 4-6 weeks after their first infliximab infusion and in 12 control patients, using Affymetrix arrays. Quantitative real-time reverse-transcription PCR and immunohistochemistry were used to confirm microarray data. The dysregulation of many AMPs in colonic IBD in comparison with control colons was widely restored by infliximab therapy, and only DEFB1 expression remained significantly decreased after therapy in the colonic mucosa of IBD responders to infliximab. In ileal Crohn's disease (CD), expression of two neuropeptides with antimicrobial activity, PYY and CHGB, was significantly decreased before therapy compared to control ileums, and ileal PYY expression remained significantly decreased after therapy in CD responders. Expression of the downregulated AMPs before and after treatment (DEFB1 and PYY) correlated with villin 1 expression, a gut epithelial cell marker, indicating that the decrease is a consequence of epithelial damage. Conclusions/Significance: Our study shows that the dysregulation of AMPs in IBD mucosa is the consequence of inflammation, but may be responsible for perpetuation of inflammation due to ineffective clearance of microorganisms.
Harmonizing genotype array data to understand genetic risk for brain amyloid burden in the AMYPAD PNHS Consortium
Funding: Innovative Medicines Initiative, Grant/Award Number: No 115952; European Union’s Horizon 2020 research and innovation programme; GE Healthcare; Springer Healthcare; MSCA, Grant/Award Number: #101108819; Alzheimer Association Research Fellowship, Grant/Award Number: #23AARF-1029663; Spanish Research Agency, Grant/Award Numbers: MICIU/AEI/10.13039/501100011033, RYC2022-038136-I; European Union FSE+, Grant/Award Number: PID2022-143106OA-I00; European Union FEDER; Alzheimer’s Disease Data Initiative; 23S06083-001; Stichting Alzheimer Onderzoek; Internal Funds KU Leuven; Flemish Research Foundation, Grant/Award Numbers: G0G1519N, G094418N; VLAIO, Grant/Award Number: HBC.2019.2523; NIHR biomedical research centre.
INTRODUCTION: We sought to harmonize genotype data from the predementia AMYPAD (Amyloid Imaging to Prevent Alzheimer's Disease) Consortium, compute polygenic risk scores (PRS), and determine their association with global amyloid deposition. METHODS: Genetic data from five AMYPAD parent cohorts were harmonized, and PRS were computed for Alzheimer's disease (AD) susceptibility, cerebrospinal fluid (CSF) amyloid beta (Aβ)42, and CSF phosphorylated tau181. Cross-sectional amyloid (Centiloid [CL]) burden was available for all participants, and regression models determined if PRS were associated with CL burden. RESULTS: After harmonization, data for 867 participants showed that high CL burden was more strongly predicted by the CSF Aβ42 PRS than by the traditional AD susceptibility PRS. DISCUSSION: This work emphasizes the importance of data harmonization and pooling of cohorts for well-powered studies. Findings suggest a genetic predisposition to amyloid pathology that may predispose individuals early in the AD continuum. This validates the potential use of PRS in clinical (trial) settings as a non-invasive tool to assess AD risk.
Peer reviewed
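For readers unfamiliar with polygenic risk scores, the following minimal sketch shows the standard additive form: a weighted sum of effect-allele dosages, with weights taken from a reference GWAS. It is not the consortium's pipeline; the variant identifiers, weights, and dosages are hypothetical, and the handling of missing variants (skipping them) is an assumption.

```python
def polygenic_risk_score(dosages, weights):
    """Compute an additive PRS as the sum of weight * effect-allele dosage.

    dosages: dict mapping variant ID to dosage (0, 1, or 2 effect alleles).
    weights: dict mapping variant ID to per-allele effect size.
    Variants missing from the dosages are skipped (an assumption of this sketch).
    """
    return sum(
        weight * dosages[variant]
        for variant, weight in weights.items()
        if variant in dosages
    )

# Hypothetical weights and one hypothetical participant.
example_weights = {"snp1": 0.5, "snp2": 0.25, "snp3": -0.125}
example_dosages = {"snp1": 2, "snp2": 1}
score = polygenic_risk_score(example_dosages, example_weights)
```

Because the score depends on which variants are present in every cohort, harmonizing genotype data across arrays before scoring keeps the PRS comparable between participants.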
Complete sequence of the 22q11.2 allele in 1,053 subjects with 22q11.2 deletion syndrome reveals modifiers of conotruncal heart defects
The 22q11.2 deletion syndrome (22q11.2DS) results from non-allelic homologous recombination between low-copy repeats termed LCR22. About 60%-70% of individuals with the typical 3 megabase (Mb) deletion from LCR22A-D have congenital heart disease, mostly of the conotruncal type (CTD), whereas others have normal cardiac anatomy. In this study, we tested whether variants in the hemizygous LCR22A-D region are associated with risk for CTDs on the basis of the sequence of the 22q11.2 region from 1,053 22q11.2DS individuals. We found a significant association (FDR p < 0.05) of the CTD subset with 62 common variants in a single linkage disequilibrium (LD) block in a 350 kb interval harboring CRKL. A total of 45 of the 62 variants were associated with increased risk for CTDs (odds ratio [OR] range: 1.64-4.75). Associations of four variants were replicated in a meta-analysis of three genome-wide association studies of CTDs in affected individuals without 22q11.2DS. One of the replicated variants, rs178252, is located in an open chromatin region and resides in the double-elite enhancer, GH22J020947, that is predicted to regulate CRKL (CRK-like proto-oncogene, cytoplasmic adaptor) expression. Approximately 23% of patients with nested LCR22C-D deletions have CTDs, and inactivation of Crkl in mice causes CTDs, thus implicating this gene as a modifier. Rs178252 and rs6004160 are expression quantitative trait loci (eQTLs) of CRKL. Furthermore, set-based tests identified an enhancer that is predicted to target CRKL and is significantly associated with CTD risk (GH22J020946, sequence kernel association test (SKAT) p = 7.21 × 10^-5) in the 22q11.2DS cohort. These findings suggest that variance in CTD penetrance in the 22q11.2DS population can be explained in part by variants affecting CRKL expression.
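The odds ratios quoted in the abstract can be read with the help of a small sketch that computes an OR from a 2×2 table of carriers and non-carriers. Because the region is hemizygous, each individual carries a single allele, so counting individuals and counting alleles coincide here. The counts below are hypothetical and are not taken from the study.

```python
def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds ratio for carrying the alternate allele in cases versus controls.

    All four arguments are counts of individuals (hypothetical in this sketch).
    """
    if case_ref == 0 or control_alt == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (case_alt * control_ref) / (case_ref * control_alt)

# Hypothetical counts: 30 of 100 CTD cases and 15 of 100 controls carry the variant.
example_or = odds_ratio(case_alt=30, case_ref=70, control_alt=15, control_ref=85)
```

An OR above 1, as in this example, means the variant is more common among cases than among controls, which is the direction reported for 45 of the 62 associated variants.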
- …
