46 research outputs found

    BALLI: Bartlett-adjusted likelihood-based linear model approach for identifying differentially expressed genes with RNA-seq data

    Background: Transcriptomic profiles can improve our understanding of the phenotypic molecular basis of biological research, and many statistical methods have been proposed to identify differentially expressed genes (DEGs) under two or more conditions with RNA-seq data. However, statistical analyses of RNA-seq data are often limited by small sample sizes, and global variance estimates of RNA expression levels have been used as prior distributions for gene-specific variance estimates, making it difficult to generalize the methods to more complicated settings. Here we propose a Bartlett-Adjusted Likelihood-based LInear mixed model approach (BALLI) to analyze more complicated RNA-seq data. The proposed method estimates the technical and biological variances with a linear mixed-effects model, with and without adjusting for small-sample bias using Bartlett's corrections. Results: We conducted extensive simulations to compare the performance of BALLI with that of existing approaches (edgeR, DESeq2, and voom). The simulation studies showed that BALLI correctly controlled the type-1 error rates at various nominal significance levels and produced better statistical power and precision estimates than the competing methods in various scenarios. Furthermore, BALLI was robust to variation in library size. It was also successfully applied to Holstein milk yield data, illustrating its practical value. Conclusions: BALLI is statistically more efficient and valid than existing methods, and we conclude that it is useful for identifying DEGs in RNA-seq analysis. This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (HI16C2037) and the National Research Foundation of Korea (2017M3A9F3046543). The funding body was not involved in the study design, data collection, analysis and interpretation, or writing of the manuscript.

    Multiphasic analysis of whole exome sequencing data identifies a novel mutation of ACTG1 in a nonsyndromic hearing loss family

    BACKGROUND: The genetic heterogeneity of sensorineural hearing loss is a major hurdle to the efficient discovery of disease-causing genes. We designed a multiphasic analysis of copy number variation (CNV), linkage, and single nucleotide variation (SNV) of whole exome sequencing (WES) data for the efficient discovery of mutations causing nonsyndromic hearing loss (NSHL). RESULTS: From WES data, we identified five distinct CNV loci in an NSHL family, but they did not co-segregate among patients. Linkage analysis based on SNVs identified six candidate loci (logarithm of odds [LOD] >1.5). We selected 15 SNVs that co-segregated with NSHL in the family and were located in the six linkage candidate loci. Finally, the novel variant p.M305T in ACTG1 (DFNA20/26) was selected as a disease-causing variant. CONCLUSIONS: Here, we present a multiphasic CNV, linkage, and SNV analysis of WES data for the identification of a candidate mutation causing NSHL. Our stepwise, multiphasic approach enabled us to expedite the discovery of disease-causing variants from a large number of patient variants.

    DeepParcellation: A novel deep learning method for robust brain magnetic resonance imaging parcellation in older East Asians

    Accurate parcellation of cortical regions is crucial for distinguishing morphometric changes in aged brains, particularly in degenerative brain diseases. Normal aging and neurodegeneration precipitate brain structural changes, leading to distinct tissue contrast and shape in people aged >60 years. Manual parcellation by trained radiologists can yield a highly accurate outline of the brain; however, analyzing large datasets this way is laborious and expensive. Alternatively, newly developed computational models can quickly and accurately conduct brain parcellation, although thus far only for the brains of Caucasian individuals. To develop a computational model for the brain parcellation of older East Asians, we trained a model on magnetic resonance images of dimensions 256 × 256 × 256 from 5,035 brains of older East Asians (Gwangju Alzheimer’s and Related Dementia) and 2,535 brains of Caucasians. A novel N-way strategy combining three memory reduction techniques (inception blocks, dilated convolutions, and attention gates) was adopted for our model to overcome the intrinsic memory requirement problem. Our method proved to be compatible with the commonly used parcellation model for Caucasians and showed higher similarity and robust reliability in older and East Asian groups. In addition, the superior parcellation of several brain regions suggests that DeepParcellation has great potential for applications in neurodegenerative diseases such as Alzheimer’s disease.

    Analysis of significant protein abundance from multiple reaction-monitoring data

    Background: Discovering reliable protein biomarkers is one of the most important issues in biomedical research. ELISA is a traditional technique for accurate quantitation of well-known proteins. Recently, multiple reaction-monitoring (MRM) mass spectrometry has been proposed for quantifying newly discovered proteins and has become a popular alternative to ELISA. Linear mixed modeling (LMM) has been used to analyze MRM data, and MSstats is one of the most widely used LMM-based tools for MRM data analysis. However, LMMs often give varying significance results depending on model specification, and it can be difficult to specify a correct LMM for the analysis of MRM data. Here, we propose a new logistic regression-based method for Significance Analysis of Multiple Reaction Monitoring (LR-SAM). Results: Through simulation studies, we demonstrate that LMM methods may not preserve type I error, thus yielding high false-positive rates, depending on how random effects are specified. Our simulation study also shows that the LR-SAM approach performs about as well as LMM approaches in most cases. However, LR-SAM performs better than the LMMs, particularly when the effect sizes of peptides from the same protein are heterogeneous. Our proposed method was applied to MRM data for identification of proteins associated with clinical responses to treatment with the tyrosine kinase inhibitor sorafenib in 115 hepatocellular carcinoma (HCC) patients. Of 124 candidate proteins, LMM approaches provided 6 results varying in significance, while LR-SAM, by contrast, yielded 18 significant results that were reproducibly consistent. Conclusion: As exemplified by an application to an HCC data set, LR-SAM identified proteins associated with clinical responses to treatment more effectively than LMM did. This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI16C2037, HI15C2165). Publication of this article was sponsored by HI16C2037 grant.

    APOE Promoter Polymorphism-219T/G is an Effect Modifier of the Influence of APOE ε4 on Alzheimer's Disease Risk in a Multiracial Sample

    Variants in the APOE gene region may explain ethnic differences in the association of Alzheimer's disease (AD) with ε4. Ethnic differences in allele frequencies for three APOE region SNPs (single nucleotide polymorphisms) were identified and tested for association in 19,398 East Asians (EastA), including Koreans and Japanese, 15,836 European ancestry (EuroA) individuals, and 4985 African Americans, and with brain imaging measures of cortical atrophy in sub-samples of Koreans and EuroAs. Among ε4/ε4 individuals, AD risk increased substantially in a dose-dependent manner with the number of APOE promoter SNP rs405509 T alleles in EastAs (TT: OR (odds ratio) = 27.02, p = 8.80 × 10^-94; GT: OR = 15.87, p = 2.62 × 10^-9) and EuroAs (TT: OR = 18.13, p = 2.69 × 10^-108; GT: OR = 12.63, p = 3.44 × 10^-64), and rs405509-T homozygotes had a younger onset and more severe cortical atrophy than those with the G allele. Functional experiments using APOE promoter fragments demonstrated that TT lowered APOE expression in human brain and serum. The modifying effect of rs405509 genotype explained much of the ethnic variability in the AD/ε4 association, and increasing APOE expression might lower AD risk among ε4 homozygotes.

    Multiancestry analysis of the HLA locus in Alzheimer's and Parkinson's diseases uncovers a shared adaptive immune response mediated by HLA-DRB1*04 subtypes

    11 pages, 4 figures, 2 tables. Datasets in the supplementary material. This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2302720120/-/DCSupplemental. Across multiancestry groups, we analyzed Human Leukocyte Antigen (HLA) associations in over 176,000 individuals with Parkinson's disease (PD) and Alzheimer's disease (AD) versus controls. We demonstrate that the two diseases share the same protective association at the HLA locus. HLA-specific fine-mapping showed that hierarchical protective effects of HLA-DRB1*04 subtypes best accounted for the association, strongest with HLA-DRB1*04:04 and HLA-DRB1*04:07, and intermediary with HLA-DRB1*04:01 and HLA-DRB1*04:03. The same signal was associated with decreased neurofibrillary tangles in postmortem brains and with reduced tau levels in cerebrospinal fluid and, to a lower extent, with increased Aβ42. Protective HLA-DRB1*04 subtypes strongly bound the aggregation-prone tau PHF6 sequence, but only when acetylated at a lysine (K311), a common posttranslational modification central to tau aggregation. An HLA-DRB1*04-mediated adaptive immune response decreases PD and AD risks, potentially by acting against tau, offering the possibility of therapeutic avenues. This work was supported by the Michael J. Fox Foundation grant MJFF-020161 (E.M., Z.G.-O.), NIH and National Institute of Aging grants AG060747 (M.D.G.), AG066206 (Z.H.), AG066515 (Z.H., M.D.G.), the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie (grant agreement No. 890650, Y.L.G.), the Alzheimer’s Association (AARF-20-683984, M.E.B.), and the Iqbal Farrukh and Asad Jamal Fund, a grant from the EU Joint Programme – Neurodegenerative Disease Research (European Alzheimer DNA BioBank, EADB; JPND), the Japan Agency for Medical Research and Development JP21dk0207045 (T.I.), JP21dk020704 (K.O., S.N.), JP21km040550 (K.O.), the Einstein Center for Neurosciences in Berlin (S.M.Y.), the Swedish Research Council (#2018-02532, H.Z.), the European Research Council (#681712, H.Z.), and the Swedish State Support for Clinical Research (#ALFGBG-720931, H.Z.). Inserm UMR1167 is also funded by the Inserm, Institut Pasteur de Lille, Lille Métropole Communauté Urbaine, and the French government’s LABEX DISTALZ program (development of innovative strategies for a transdisciplinary approach to AD). Additional funders of individual investigators and institutions who contributed to data collection and genotyping are provided in SI Appendix. Peer reviewed.

    Local-pooled-error test for RNA sequencing experiments with a small number of replicates

    RNA-Sequencing (RNA-Seq) provides valuable information for characterizing the molecular nature of cells, in particular for identifying differentially expressed transcripts on a genome-wide scale. Unfortunately, cost and limited specimen availability often lead to studies with small sample sizes, and hypothesis testing on differential expression between classes with a small number of samples is generally limited. The problem is especially challenging when only one sample per class exists. In this case, only a few of the many methods that have been developed are applicable for identifying differentially expressed transcripts. Thus, the aim of this study was to develop a method able to accurately test differential expression with a limited number of samples, in particular non-replicated samples. We propose a local-pooled-error method for RNA-Seq data (LPEseq) to account for non-replicated samples in the analysis of differential expression. Our LPEseq method extends the existing LPE method, which was proposed for microarray data, to allow examination of non-replicated RNA-Seq experiments. We demonstrated the validity of the LPEseq method using both real and simulated datasets. By comparing the results obtained using the LPEseq method with those obtained from other methods, we found that the LPEseq method outperformed the others for non-replicated datasets and showed similar performance with replicated samples; LPEseq consistently showed a high true discovery rate while not increasing the rate of false positives regardless of the number of samples. Our proposed LPEseq method can be effectively used to conduct differential expression analysis as a preliminary design step or for investigation of a rare specimen, for which a limited number of samples is available.