246 research outputs found

    Statistical and Computational Methods for Analyzing and Visualizing Large-Scale Genomic Datasets

    Advances in large-scale genomic data production have led to a need for better methods to process, interpret, and organize this data. Starting with raw sequencing data, generating results requires many complex data processing steps, from quality control, alignment, and variant calling to genome-wide association studies (GWAS) and characterization of expression quantitative trait loci (eQTL). In this dissertation, I present methods to address issues faced when working with large-scale genomic datasets. In Chapter 2, I present an analysis of 4,787 whole genomes sequenced for the study of age-related macular degeneration (AMD) as a follow-up fine-mapping study to previous work from the International AMD Genomics Consortium (IAMDGC). Through whole genome sequencing, we comprehensively characterized genetic variants associated with AMD in known loci to provide additional insights on the variants potentially responsible for the disease by leveraging 60,706 additional controls. Our study improved the understanding of loci associated with AMD and demonstrated the advantages and disadvantages of different approaches for fine-mapping studies with sequence-based genotypes. In Chapter 3, I describe a novel method and a software tool to perform Hardy-Weinberg equilibrium (HWE) tests for structured populations. In sequence-based genetic studies, HWE test statistics are important quality metrics to distinguish true genetic variants from artifactual ones, but they become much less informative when applied to a heterogeneous and/or structured population. As next generation sequencing studies contain samples from increasingly diverse ancestries, we developed a new HWE test which addresses both the statistical and computational challenges of modern large-scale sequencing data and implemented the method in a publicly available software tool. Moreover, we extensively evaluated our proposed method against alternative HWE tests in both simulated and real datasets.
Our method has been successfully applied to the latest variant calling QC pipeline in the TOPMed project. In Chapter 4, I describe PheGET, a web application to interactively visualize Expression Quantitative Trait Loci (eQTLs) across tissues, genes, and regions to aid functional interpretations of regulatory variants. Tissue-specific expression has become increasingly important for understanding the links between genetic variation and disease. To address this need, the Genotype-Tissue Expression (GTEx) project collected and analyzed a treasure trove of expression data. However, effectively navigating this wealth of data to find signals relevant to researchers has become a major challenge. I demonstrate the functionalities of PheGET using the newest GTEx data on our eQTL browser website at https://eqtl.pheweb.org/, allowing the user to 1) view all cis-eQTLs for a single variant; 2) view and compare single-tissue, single-gene associations within any genomic region; 3) find the best eQTL signal in any given genomic region or gene; and 4) customize the plotted data in real time. PheGET is designed to handle and display the kind of complex multidimensional data often seen in our post-GWAS era, such as multi-tissue expression data, in an intuitive and convenient interface, giving researchers an additional tool to better understand the links between genetics and disease.
    PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/162918/1/amkwong_1.pd

    Robust, flexible, and scalable tests for Hardy-Weinberg Equilibrium across diverse ancestries

    Traditional Hardy-Weinberg equilibrium (HWE) tests (the χ² test and the exact test) have long been used as a metric for evaluating genotype quality, as technical artifacts leading to incorrect genotype calls can often be identified as deviations from HWE. However, in datasets composed of individuals from diverse ancestries, HWE can be violated even without genotyping error, complicating the use of HWE testing to assess genotype data quality. In this manuscript, we present the Robust Unified Test for HWE (RUTH) to test for HWE while accounting for population structure and genotype uncertainty, and evaluate the impact of population heterogeneity and genotype uncertainty on the standard HWE tests and alternative methods using simulated and real sequence datasets. Our results demonstrate that ignoring population structure or genotype uncertainty in HWE tests can inflate false positive rates by many orders of magnitude. Our evaluations demonstrate different tradeoffs between false positives and statistical power across the methods, with RUTH consistently amongst the best across all evaluations. RUTH is implemented as a practical and scalable software tool to rapidly perform HWE tests across millions of markers and hundreds of thousands of individuals while supporting standard VCF/BCF formats. RUTH is publicly available at https://www.github.com/statgen/ruth
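
The point that HWE can be violated without any genotyping error can be seen with a small, purely illustrative calculation. The sketch below implements only the textbook χ² HWE test on hard genotype counts; it is not RUTH's method, and the function name `hwe_chi2` and the genotype counts are hypothetical values chosen for illustration. Two populations that are each exactly in HWE are pooled, and the pooled sample shows a large deviation.

```python
def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 d.f.) for HWE from genotype counts.

    Textbook test on hard genotype calls; a hypothetical helper for
    illustration, not the RUTH implementation.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts (AA, AB, BB): each population is exactly in HWE.
pop1 = (810, 180, 10)  # allele A frequency 0.9
pop2 = (10, 180, 810)  # allele A frequency 0.1
pooled = tuple(a + b for a, b in zip(pop1, pop2))

for name, counts in (("pop1", pop1), ("pop2", pop2), ("pooled", pooled)):
    print(f"{name}: chi2 = {hwe_chi2(*counts):.1f}")
```

Each population alone gives a statistic of essentially zero, while the pooled sample gives about 819, far above the 3.84 threshold for a 5% test with one degree of freedom. The pooled sample has too few heterozygotes relative to its overall allele frequency, which is the kind of structure-driven deviation the abstract describes.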

    Solar Neutrino Rates, Spectrum, and its Moments: an MSW Analysis in the Light of Super-Kamiokande Results

    We re-examine MSW solutions of the solar neutrino problem in a two flavor scenario taking (a) the results on total rates and the electron energy spectrum from the 1117-day SuperKamiokande (SK) data and (b) those on total rates from the Chlorine and Gallium experiments. We find that the SMA solution gives the best fit to the total rates data from the different experiments. One new feature of our analysis is the use of the moments of the SK electron spectrum in a χ² analysis. The best fit to the moments is broadly in agreement with that obtained from a direct fit to the spectrum data and prefers a Δm² comparable to the SMA fit to the rates, but the required mixing angle is larger. In the combined rate and spectrum analysis, apart from varying the normalization of the ⁸B flux as a free parameter and determining its best-fit value, we also obtain the best-fit parameters when correlations between the rates and the spectrum data are included and the normalization of the ⁸B flux is held fixed at its SSM value. We observe that the correlations between the rates and spectrum data are important and the goodness of fit worsens when these are included. In either case, the best fit lies in the LMA region. Comment: 17 pages, 4 figures

    Life cycle assessment for three ventilation methods

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. A sustainable ventilation method is one possible way to mitigate climate change and carbon emissions. Selecting such a method requires an analysis of environmental impact, energy performance, and economic cost-effectiveness. There are still few studies concerning the life cycle assessment (LCA) of alternative ventilation systems that combine life cycle cost (LCC) and carbon emissions in the supply-and-installation phase with energy performance in the operation phase. The supply-and-installation phase of the system materials and components contributes significantly to the total energy consumption and environmental loads of buildings. This paper presents a systematic approach to estimating their environmental impact, measured in terms of energy demand and CO2 emissions in the two phases. This approach has been applied to an actual typical classroom served by mixing ventilation (MV), displacement ventilation (DV) and stratum ventilation (SV). The results show that SV has the least environmental impact and the lowest LCC. By adopting DV and SV, it is possible to reduce CO2 emissions by up to 23.25% and 31.71% respectively, and to reduce the LCC by up to 15.52% and 23.89% respectively, in comparison with an MV system over 20 service years. This approach may be generally applied to a sustainability analysis of ventilation methods in various scales of air-conditioned spaces.

    Psychiatric Co-morbidity in Ketamine and Methamphetamine Dependence: a Retrospective Chart Review

    Both ketamine and methamphetamine (MA) have become very popular and have been abused worldwide over the past two decades. However, the relationship between dependence on ketamine or MA and psychiatric comorbidities is still unclear. This study aimed to examine the frequency of co-morbid psychiatric disorders in patients dependent on ketamine or methamphetamine who were receiving treatment at three substance abuse treatment clinics (SACs) in Hong Kong. This was a retrospective chart review. The medical records of 183 patients (103 with ketamine and 80 with MA dependence) treated between January 2008 and August 2012 were retrieved. Patients’ demographic data, patterns of substance abuse and comorbid psychiatric diagnoses were recorded. The mean age of onset and duration of substance abuse were 18.1 ± 4.7 and 9.2 ± 6.2 years for ketamine and 19.9 ± 8.8 and 10.5 ± 9.8 years for MA users, respectively. Psychotic disorders were more common in MA dependent users (76.2% vs. 28.2%, p < 0.001), whereas mood disorders were more common in ketamine dependent users (27.2% vs. 11.2%, p = 0.008). Ketamine and MA dependence have a notably different profile of psychiatric co-morbidity. Compared with MA dependence, ketamine dependence is more likely to be associated with mood disorders and less likely with psychotic disorders.

    HIV infection and stroke: current perspectives and future directions

    HIV infection can result in stroke via several mechanisms, including opportunistic infection, vasculopathy, cardioembolism, and coagulopathy. However, the occurrence of stroke and HIV infection might often be coincidental. HIV-associated vasculopathy describes various cerebrovascular changes, including stenosis and aneurysm formation, vasculitis, and accelerated atherosclerosis, and might be caused directly or indirectly by HIV infection, although the mechanisms are controversial. HIV and associated infections contribute to chronic inflammation. Combination antiretroviral therapies (cART) are clearly beneficial, but can be atherogenic and could increase stroke risk. cART can prolong life, increasing the size of the ageing population at risk of stroke. Stroke management and prevention should include identification and treatment of the specific cause of stroke and stroke risk factors, and judicious adjustment of the cART regimen. Epidemiological, clinical, biological, and autopsy studies of risk, the pathogenesis of HIV-associated vasculopathy (particularly of arterial endothelial damage), the long-term effects of cART, and ideal stroke treatment in patients with HIV are needed, as are antiretrovirals that are without vascular risk.