84 research outputs found

    Mapping Haplotype-haplotype Interactions with Adaptive LASSO

    Background: The genetic etiology of complex human diseases is commonly viewed as a process in which genetic and environmental factors act together in complicated ways. Interactions among genetic variants often play a major role in determining an individual's susceptibility to a particular disease. Statistical methods for modeling interactions between single genetic variants (e.g., single nucleotide polymorphisms, or SNPs) underlying complex diseases have been studied extensively. Recently, haplotype-based analysis has gained popularity in genetic association studies. When multiple sequence or haplotype interactions determine an individual's susceptibility to a disease, modeling and testing the interaction effects is statistically challenging, largely because of the complexity of higher-order epistasis.
    Results: In this article, we propose a new strategy for modeling haplotype-haplotype interactions under the penalized logistic regression framework with an adaptive L1-penalty. We consider interactions of sequence variants between haplotype blocks. The adaptive L1-penalty allows simultaneous effect estimation and variable selection in a single model. We propose a new parameter estimation method that estimates and selects parameters with a modified Gauss-Seidel method nested within the EM algorithm. Simulation studies show that it has a low false positive rate and reasonable power for detecting haplotype interactions. The method is applied to test haplotype interactions between the mother and offspring genomes in a small for gestational age (SGA) neonates data set, and significant interactions between the two genomes are detected.
    Conclusions: As demonstrated by the simulation studies and real data analysis, the approach provides an efficient tool for modeling and testing haplotype interactions. The R implementation of the method can be freely downloaded from http://www.stt.msu.edu/~cui/software.html.
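For orientation, the generic form of an adaptive L1-penalized logistic regression can be written as below. This is the textbook adaptive LASSO objective, not necessarily the exact formulation used in the paper; the symbols (the log-likelihood \ell, the initial estimate \tilde{\beta}, the tuning parameters \lambda and \gamma) are assumptions introduced here for illustration.

```latex
\hat{\beta}
  = \arg\min_{\beta}
    \Bigl\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} w_j \,\lvert \beta_j \rvert \Bigr\},
\qquad
w_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}, \quad \gamma > 0 .
```

Coefficients with a small initial estimate receive a large weight w_j and are shrunk to exactly zero more readily, which is how a single fit can perform estimation and variable selection together.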

    Conservation and implications of eukaryote transcriptional regulatory regions across multiple species

    Background: Increasing evidence shows that the whole genomes of eukaryotes are almost entirely transcribed into both protein-coding genes and an enormous number of non-protein-coding RNAs (ncRNAs). Revealing the regulatory mechanisms underlying these transcripts is therefore imperative. A complete understanding of transcriptional regulatory mechanisms, however, requires identifying the regions in which they are found. We call these transcriptional regulation regions, or TRRs: functional regions containing a cluster of regulatory elements that cooperatively recruit transcription factors for binding and thereby regulate the expression of transcripts.
    Results: We constructed a hierarchical stochastic language (HSL) model for identifying core TRRs in yeast based on regulatory cooperation among TRR elements. The HSL model trained on yeast achieved comparable accuracy in predicting TRRs in other species, e.g., fruit fly, human, and rice, demonstrating the conservation of TRRs across species. The HSL model was also used to identify the TRRs of genes, such as p53 or OsALYL1, as well as microRNAs. In addition, the ENCODE regions were examined with HSL, and TRRs were found to be pervasive in the genome.
    Conclusion: Our findings indicate that 1) the HSL model can be used to accurately predict core TRRs of transcripts across species and 2) core TRRs identified by HSL are suitable candidates for further scrutiny of specific regulatory elements and mechanisms. The regulatory activity associated with the abundant ncRNAs might account for the ubiquitous presence of TRRs across the genome. We also found that the TRRs of protein-coding genes and ncRNAs are similar in structure, with the latter being more conserved than the former.

    An aggregating U-Test for a genetic association study of quantitative traits

    We propose a novel aggregating U-test for gene-based association analysis. The method considers both rare and common variants. It adaptively searches for potential disease-susceptibility rare variants and collapses them into a single “supervariant.” A forward U-test is then used to assess the joint association of the supervariant and other common variants with quantitative traits. Using 200 simulated replicates from the Genetic Analysis Workshop 17 mini-exome data, we compare the performance of the proposed method with that of a commonly used approach, QuTie. We find that our method has power equivalent to or greater than that of QuTie for detecting nine genes that influence the quantitative trait Q1. This new approach provides a powerful tool for detecting both common and rare variants associated with quantitative traits.
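The collapsing step can be illustrated with a minimal sketch. It uses a fixed minor-allele-frequency cutoff rather than the adaptive search described in the abstract, so it is a simplified stand-in and not the authors' procedure; the function names, the 0/1/2 genotype coding, and the threshold value are all assumptions made for illustration.

```python
def minor_allele_freq(column):
    """Allele frequency for one variant coded as 0/1/2 minor-allele counts."""
    return sum(column) / (2 * len(column))


def collapse_rare_variants(genotypes, maf_threshold=0.01):
    """Collapse rare variants into one binary 'supervariant' per individual.

    genotypes: list of rows (individuals), each a list of 0/1/2 counts.
    Returns (indices of variants treated as rare, supervariant vector).
    NOTE: a fixed MAF cutoff is an assumption; the abstract describes an
    adaptive search that is not reproduced here.
    """
    n_variants = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_variants)]
    rare = [j for j, col in enumerate(columns)
            if minor_allele_freq(col) < maf_threshold]
    # An individual carries the supervariant if any rare variant is present.
    supervariant = [1 if any(row[j] > 0 for j in rare) else 0
                    for row in genotypes]
    return rare, supervariant


# Toy data: variants 0 and 1 are rare, variant 2 is common.
toy = [[0, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 1]]
rare_idx, sv = collapse_rare_variants(toy, maf_threshold=0.2)
```

The resulting supervariant vector would then be tested jointly with the common variants; that testing step is not sketched here.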

    Hybridization modeling of oligonucleotide SNP arrays for accurate DNA copy number estimation

    Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to modeling the hybridization of probes to off-target allele sequences, which can greatly affect accuracy. In this study, we address this issue and demonstrate that hybridization with mismatch nucleotides (HWMMN) occurs in all SNP probe-sets and has a critical effect on the estimation of allelic concentrations (ACs). We study sequence binding through binding free energy and then binding affinity, and develop a probe intensity composite representation (PICR) model. The PICR model allows the estimation of ACs at a given SNP through statistical regression. Furthermore, we demonstrate with cell-line data of known true copy numbers that the PICR model can achieve reasonable accuracy in copy number estimation at a single SNP locus, by using the ratio of the estimated AC of each sample to that of the reference sample, and can reveal subtle genotype structure of SNPs at abnormal loci. We also demonstrate with HapMap data that the PICR model yields accurate SNP genotype calls consistently across samples, laboratories and even across array platforms.
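The ratio-based copy number estimate mentioned above can be sketched as follows. The abstract only states that a sample-to-reference AC ratio is used; the scaling by an assumed reference copy number of 2 (a normal diploid locus), the function name, and the parameter names are assumptions made for illustration, not details taken from the paper.

```python
def estimate_copy_number(ac_sample, ac_reference, reference_copies=2.0):
    """Scale the sample/reference allelic-concentration ratio to a copy number.

    ac_sample, ac_reference: estimated allelic concentrations at one SNP.
    reference_copies: copy number assumed for the reference sample
    (2.0 for a normal diploid locus; this default is an assumption).
    """
    if ac_reference <= 0:
        raise ValueError("reference allelic concentration must be positive")
    return reference_copies * ac_sample / ac_reference


# A sample with 1.5x the reference concentration suggests one extra copy.
gain = estimate_copy_number(1.5, 1.0)
loss = estimate_copy_number(0.5, 1.0)
```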

    The Neuropathology of Fatal Cerebral Malaria in Malawian Children

    We examined the brains of 50 Malawian children who satisfied the clinical definition of cerebral malaria (CM) during life; 37 children had sequestration of infected red blood cells (iRBCs) and no other cause of death, and 13 had a nonmalarial cause of death with no cerebral sequestration. For comparison, 18 patients with coma and no parasitemia were included. We subdivided the 37 CM cases into two groups based on the cerebral microvasculature pathology: iRBC sequestration only (CM1) or sequestration with intravascular and perivascular pathology (CM2). We characterized and quantified the axonal and myelin damage, blood-brain barrier (BBB) disruption, and cellular immune responses and correlated these changes with iRBC sequestration and microvascular pathology. Axonal and myelin damage was associated with ring hemorrhages and vascular thrombosis in the cerebral and cerebellar white matter and brainstem of the CM2 cases. Diffuse axonal and myelin damage was present in CM1 and CM2 cases in areas of prominent iRBC sequestration. Disruption of the BBB was associated with ring hemorrhages and vascular thrombosis in CM2 cases and with sequestration in both CM1 and CM2 groups. Monocytes with phagocytosed hemozoin accumulated within microvessels containing iRBCs in CM2 cases but were not present in the adjacent neuropil. These findings are consistent with a link between iRBC sequestration and intravascular and perivascular pathology in fatal pediatric CM, resulting in myelin damage, axonal injury, and breakdown of the BBB.