976 research outputs found

    SARS-CoV-2 susceptibility and COVID-19 disease severity are associated with genetic variants affecting gene expression in a variety of tissues.

    Get PDF
    Variability in SARS-CoV-2 susceptibility and COVID-19 disease severity between individuals is partly due to genetic factors. Here, we identify 4 genomic loci with suggestive associations for SARS-CoV-2 susceptibility and 19 for COVID-19 disease severity. Four of these 23 loci likely have an ethnicity-specific component. Genome-wide association study (GWAS) signals in 11 loci colocalize with expression quantitative trait loci (eQTLs) associated with the expression of 20 genes in 62 tissues/cell types (range: 1:43 tissues/gene), including lung, brain, heart, muscle, and skin as well as the digestive system and immune system. We perform genetic fine mapping to compute 99% credible SNP sets, which identify 10 GWAS loci that have eight or fewer SNPs in the credible set, including three loci with one single likely causal SNP. Our study suggests that the diverse symptoms and disease severity of COVID-19 observed between individuals is associated with variants across the genome, affecting gene expression levels in a wide variety of tissue types

    Identification of multiple rare variants associated with a disease

    Get PDF
    Identifying rare variants that are responsible for complex disease has been promoted by advances in sequencing technologies. However, statistical methods that can handle the vast amount of data generated and that can interpret the complicated relationship between disease and these variants have lagged. We apply a zero-inflated Poisson regression model to take into account the excess of zeros caused by the extremely low frequency of the 24,487 exonic variants in the Genetic Analysis Workshop 17 data. We grouped the 697 subjects in the data set as Europeans, Asians, and Africans based on principal components analysis and found the total number of rare variants per gene for each individual. We then analyzed these collapsed variants based on the assumption that rare variants are enriched in a group of people affected by a disease compared to a group of unaffected people. We also tested the hypothesis with quantitative traits Q1, Q2, and Q4. Analyses performed on the combined 697 individuals and on each ethnic group yielded different results. For the combined population analysis, we found that UGT1A1, which was not part of the simulation model, was associated with disease liability and that FLT1, which was a causal locus in the simulation model, was associated with Q1. Of the causal loci in the simulation models, FLT1 and KDR were associated with Q1 and VNN1 was correlated with Q2. No significant genes were associated with Q4. These results show the feasibility and capability of our new statistical model to detect multiple rare variants influencing disease risk

    MGMR: leveraging RNA-Seq population data to optimize expression estimation

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>RNA-Seq is a technique that uses Next Generation Sequencing to identify transcripts and estimate transcription levels. When applying this technique for quantification, one must contend with reads that align to multiple positions in the genome (multireads). Previous efforts to resolve multireads have shown that RNA-Seq expression estimation can be improved using probabilistic allocation of reads to genes. These methods use a probabilistic generative model for data generation and resolve ambiguity using likelihood-based approaches. In many instances, RNA-seq experiments are performed in the context of a population. The generative models of current methods do not take into account such population information, and it is an open question whether this information can improve quantification of the individual samples</p> <p>Results</p> <p>In order to explore the contribution of population level information in RNA-seq quantification, we apply a hierarchical probabilistic generative model, which assumes that expression levels of different individuals are sampled from a Dirichlet distribution with parameters specific to the population, and reads are sampled from the distribution of expression levels. We introduce an optimization procedure for the estimation of the model parameters, and use HapMap data and simulated data to demonstrate that the model yields a significant improvement in the accuracy of expression levels of paralogous genes.</p> <p>Conclusions</p> <p>We provide a proof of principal of the benefit of drawing on population commonalities to estimate expression. The results of our experiments demonstrate this approach can be beneficial, primarily for estimation at the gene level.</p

    Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    Get PDF
    BACKGROUND: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionary conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. RESULTS: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; ii) flexibility to adjust the parameters and re-compute the results on-the-fly; iii) ability to work with user provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at . CONCLUSION: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances. Cinteny can also be used to interactively browse syntenic blocks conserved in multiple genomes, to facilitate genome annotation and validation of assemblies for newly sequenced genomes, and to construct and assess phylogenomic trees

    Comparative analysis and visualization of multiple collinear genomes

    Get PDF
    Abstract Background Genome browsers are a common tool used by biologists to visualize genomic features including genes, polymorphisms, and many others. However, existing genome browsers and visualization tools are not well-suited to perform meaningful comparative analysis among a large number of genomes. With the increasing quantity and availability of genomic data, there is an increased burden to provide useful visualization and analysis tools for comparison of multiple collinear genomes such as the large panels of model organisms which are the basis for much of the current genetic research. Results We have developed a novel web-based tool for visualizing and analyzing multiple collinear genomes. Our tool illustrates genome-sequence similarity through a mosaic of intervals representing local phylogeny, subspecific origin, and haplotype identity. Comparative analysis is facilitated through reordering and clustering of tracks, which can vary throughout the genome. In addition, we provide local phylogenetic trees as an alternate visualization to assess local variations. Conclusions Unlike previous genome browsers and viewers, ours allows for simultaneous and comparative analysis. Our browser provides intuitive selection and interactive navigation about features of interest. Dynamic visualizations adjust to scale and data content making analysis at variable resolutions and of multiple data sets more informative. We demonstrate our genome browser for an extensive set of genomic data sets composed of almost 200 distinct mouse laboratory strains

    The geography of recent genetic ancestry across Europe

    Get PDF
    The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (the POPRES dataset) to conduct one of the first surveys of recent genealogical ancestry over the past three thousand years at a continental scale. We detected 1.9 million shared genomic segments, and used the lengths of these to infer the distribution of shared ancestors across time and geography. We find that a pair of modern Europeans living in neighboring populations share around 10-50 genetic common ancestors from the last 1500 years, and upwards of 500 genetic ancestors from the previous 1000 years. These numbers drop off exponentially with geographic distance, but since genetic ancestry is rare, individuals from opposite ends of Europe are still expected to share millions of common genealogical ancestors over the last 1000 years. There is substantial regional variation in the number of shared genetic ancestors: especially high numbers of common ancestors between many eastern populations likely date to the Slavic and/or Hunnic expansions, while much lower levels of common ancestry in the Italian and Iberian peninsulas may indicate weaker demographic effects of Germanic expansions into these areas and/or more stably structured populations. Recent shared ancestry in modern Europeans is ubiquitous, and clearly shows the impact of both small-scale migration and large historical events. Population genomic datasets have considerable power to uncover recent demographic history, and will allow a much fuller picture of the close genealogical kinship of individuals across the world.Comment: Full size figures available from http://www.eve.ucdavis.edu/~plralph/research.html; or html version at http://ralphlab.usc.edu/ibd/ibd-paper/ibd-writeup.xhtm

    Aquatic community response to volcanic eruptions on the Ecuadorian Andean flank: evidence from the palaeoecological record

    Get PDF
    Aquatic ecosystems in the tropical Andes are under increasing pressure from human modification of the landscape (deforestation and dams) and climatic change (increase of extreme events and 1.5 °C on average temperatures are projected for AD 2100). However, the resilience of these ecosystems to perturbations is poorly understood. Here we use a multi-proxy palaeoecological approach to assess the response of aquatic ecosystems to a major mechanism for natural disturbance, volcanic ash deposition. Specifically, we present data from two Neotropical lakes located on the eastern Andean flank of Ecuador. Laguna Pindo (1°27.132′S–78°04.847′W) is a tectonically formed closed basin surrounded by a dense mid-elevation forest, whereas Laguna Baños (0°19.328′S–78°09.175′W) is a glacially formed lake with an inflow and outflow in high Andean Páramo grasslands. In each lake we examined the dynamics of chironomids and other aquatic and semi-aquatic organisms to explore the effect of thick (> 5 cm) volcanic deposits on the aquatic communities in these two systems with different catchment features. In both lakes past volcanic ash deposition was evident from four large tephras dated to c.850 cal year BP (Pindo), and 4600, 3600 and 1500 cal year BP (Baños). Examination of the chironomid and aquatic assemblages before and after the ash depositions revealed no shift in composition at Pindo, but a major change at Baños occurred after the last event around 1500 cal year BP. Chironomids at Baños changed from an assemblage dominated by Pseudochironomus and Polypedilum nubifer-type to Cricotopus/Paratrichocladius type-II, and such a dominance lasted for approximately 380 years. We suggest that, despite potential changes in the water chemistry, the major effect on the chironomid community resulted from the thickness of the tephra being deposited, which acted to shallow the water body beyond a depth threshold. Changes in the aquatic flora and fauna at the base of the trophic chain can promote cascade effects that may deteriorate the ecosystem, especially when already influenced by human activities, such as deforestation and dams, which is frequent in the high Andes

    A framework for interpreting genome-wide association studies of psychiatric disorders

    Get PDF
    Genome-wide association studies (GWAS) have yielded a plethora of new findings in the past 3 years. By early 2009, GWAS on 47 samples of subjects with attention-deficit hyperactivity disorder, autism, bipolar disorder, major depressive disorder and schizophrenia will be completed. Taken together, these GWAS constitute the largest biological experiment ever conducted in psychiatry (59 000 independent cases and controls, 7700 family trios and >40 billion genotypes). We know that GWAS can work, and the question now is whether it will work for psychiatric disorders. In this review, we describe these studies, the Psychiatric GWAS Consortium for meta-analyses of these data, and provide a logical framework for interpretation of some of the conceivable outcomes
    corecore