    Case-control genome-wide association study of rheumatoid arthritis from Genetic Analysis Workshop 16 using penalized orthogonal-components regression-linear discriminant analysis

    Currently, genome-wide association studies (GWAS) are conducted by collecting a massive number of single-nucleotide polymorphisms (SNPs; i.e., large p) for a relatively small number of individuals (i.e., small n), and associations are made between clinical phenotypes and genetic variation one SNP at a time. Univariate association approaches like this ignore the linkage disequilibrium between SNPs in regions of low recombination, which results in low reliability of candidate gene identification. Here we propose to improve the case-control GWAS approach by implementing linear discriminant analysis (LDA) through penalized orthogonal-components regression (POCRE), a newly developed variable selection method for large p, small n data. The proposed POCRE-LDA method was applied to the Genetic Analysis Workshop 16 case-control data for rheumatoid arthritis (RA). In addition to the two regions on chromosomes 6 and 9 previously associated with RA by GWAS, we identified SNPs on chromosomes 10 and 18 as potential candidates for further investigation.

    Genome-wide association analysis of GAW17 data using an empirical Bayes variable selection

    Next-generation sequencing technologies enable us to explore rare functional variants. However, most current statistical techniques are underpowered to capture signals of rare variants in genome-wide association studies. We propose a supervised coalescing of single-nucleotide polymorphisms to obtain gene-based markers that can stably reveal possible genetic effects related to rare alleles. We use a newly developed empirical Bayes variable selection algorithm to identify associations between studied traits and genetic markers. Using our novel method, we analyzed the three continuous phenotypes in the GAW17 data set across 200 replicates, with intriguing results.

    Quantitative Serum Glycomics of Esophageal Adenocarcinoma, and Other Esophageal Disease Onsets

    Aberrant glycosylation has been implicated in various types of cancers and changes in glycosylation may be associated with signaling pathways during malignant transformation. Glycomic profiling of blood serum, in which cancer cell proteins or their fragments with altered glycosylation patterns are shed, could reveal the altered glycosylation. We performed glycomic profiling of serum from patients with no known disease (N=18), patients with high grade dysplasia (HGD, N=11) and Barrett’s (N=5), and patients with esophageal adenocarcinoma (EAC, N=50) in an attempt to delineate distinct differences in glycosylation between these groups. The relative intensities of 98 features were significantly different among the disease onsets; 26 of these correspond to known glycan structures. The changes in the relative intensities of three of the known glycan structures predicted esophageal adenocarcinoma with 94% sensitivity and better than 60% specificity as determined by receiver operating characteristic (ROC) analysis. We have demonstrated that comparative glycomic profiling of EAC reveals a subset of glycans that can be selected as candidate biomarkers. These markers can differentiate disease-free from HGD, disease-free from EAC, and HGD from EAC. The clinical utility of these glycan biomarkers requires further validation.
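    The sensitivity and specificity figures quoted above come from ROC analysis. As a minimal sketch of how such figures are read off a scored classifier, the following uses made-up toy scores (not data from the study); the function names and the rule "predict case when score >= threshold" are assumptions for illustration only.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) at one threshold.

    labels: 1 = case, 0 = control. A sample is called a case
    when its score is >= threshold (an assumed convention).
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score."""
    return [(t, *sensitivity_specificity(scores, labels, t))
            for t in sorted(set(scores), reverse=True)]


# Hypothetical toy data: four cases followed by four controls.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, 0.5)
points = roc_points(scores, labels)
```

    Sweeping the threshold over all observed scores traces the ROC curve; a reported operating point such as "94% sensitivity" corresponds to one chosen threshold on that curve.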

    Genome-wide case-control study in GAW17 using coalesced rare variants

    Genome-wide association studies have successfully identified numerous loci at which common variants influence disease risks or quantitative traits of interest. Despite these successes, the variants identified by these studies have generally explained only a small fraction of the variation in the phenotype. One explanation may be that many rare variants that are not included in the common genotyping platforms contribute substantially to the genetic variation of the diseases. Next-generation sequencing, which better allows for the analysis of rare variants, is now becoming available and affordable; however, the large number of rare variants makes it statistically challenging to stably identify the disease-causing ones. We conduct a genome-wide association study of the Genetic Analysis Workshop 17 case-control data, produced by next-generation sequencing, and propose collapsing rare variants within each genetic region through a supervised dimension reduction algorithm, which yields several macrovariants per region. The phenotype is then associated simultaneously with all common variants and macrovariants using linear discriminant analysis implemented through the penalized orthogonal-components regression algorithm. The results suggest that the proposed analysis strategy shows promise but needs further development.
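    To illustrate the general idea of collapsing rare variants within a region, here is a minimal sketch of a plain, unsupervised burden count. It is a simpler stand-in and not the supervised dimension reduction the abstract proposes; the function names, the frequency cutoff, and the toy genotype matrix are all assumptions for illustration.

```python
def minor_allele_frequency(genotypes):
    """genotypes: minor-allele counts (0, 1, or 2), one per individual."""
    return sum(genotypes) / (2 * len(genotypes))


def collapse_rare_variants(genotype_matrix, maf_threshold=0.01):
    """Collapse the rare variants of one region into a burden score.

    genotype_matrix[i][j] is the minor-allele count of individual i
    at variant j. Variants with frequency below maf_threshold are
    treated as rare (an assumed cutoff). Returns the per-individual
    burden and the indices of the variants that were collapsed.
    """
    n_variants = len(genotype_matrix[0])
    rare = [
        j for j in range(n_variants)
        if minor_allele_frequency([row[j] for row in genotype_matrix]) < maf_threshold
    ]
    burden = [sum(row[j] for j in rare) for row in genotype_matrix]
    return burden, rare


# Hypothetical toy region: 4 individuals, 3 variants.
region = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 2, 0],
    [0, 0, 1],
]
burden, rare = collapse_rare_variants(region, maf_threshold=0.2)
```

    The resulting burden column can be analyzed alongside common variants as a single marker. A supervised approach would instead weight the rare variants using the phenotype, which this sketch does not attempt.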

    Empirical Bayes variable selection in high-dimensional regression

    Available high-throughput biotechnologies make it necessary to select important candidates out of massive numbers of biomarkers while exploiting their complicated relationship structures. Here we consider an empirical Bayes method for variable selection in regression models. Many previous empirical Bayes variable selection methods rely on Markov chain Monte Carlo (MCMC) algorithms for implementation. However, these MCMC-based procedures are challenged by exponentially growing numbers of biomarkers and involve intensive computing. We propose an iterated conditional modes/medians (ICM/M) algorithm to implement empirical Bayes variable selection in regression models. First, iterated conditional modes are used to optimize the values of the hyperparameters, which implements the empirical Bayes method; second, iterated conditional medians are used to estimate the model coefficients, which implements the variable selection. Our simulation studies suggest fast computation and superior performance of the proposed method. The developed algorithm has also been applied to real omics data.