Adventures in the Enormous: A 1.8 Million Clone BAC Library for the 21.7 Gb Genome of Loblolly Pine
Loblolly pine (LP; Pinus taeda L.) is the most economically important tree in the U.S. and a cornerstone species in southeastern forests. However, genomics research on LP and other conifers has lagged behind studies on flowering plants due, in part, to the large size of conifer genomes. As a means to accelerate conifer genome research, we constructed a BAC library for the LP genotype 7-56. The LP BAC library consists of 1,824,768 individually-archived clones making it the largest single BAC library constructed to date, has a mean insert size of 96 kb, and affords 7.6X coverage of the 21.7 Gb LP genome. To demonstrate the efficacy of the library in gene isolation, we screened macroarrays with overgos designed from a pine EST anchored on LP chromosome 10. A positive BAC was sequenced and found to contain the expected full-length target gene, several gene-like regions, and both known and novel repeats. Macroarray analysis using the retrotransposon IFG-7 (the most abundant repeat in the sequenced BAC) as a probe indicates that IFG-7 is found in roughly 210,557 copies and constitutes about 5.8% or 1.26 Gb of LP nuclear DNA; this DNA quantity is eight times the Arabidopsis genome. In addition to its use in genome characterization and gene isolation as demonstrated herein, the BAC library should hasten whole genome sequencing of LP via next-generation sequencing strategies/technologies and facilitate improvement of trees through molecular breeding and genetic engineering. The library and associated products are distributed by the Clemson University Genomics Institute (www.genome.clemson.edu)
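The library statistics quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative back-of-envelope calculation, not the authors' method; the function names are hypothetical. It uses only figures from the abstract and assumes that every clone carries a full-length nuclear insert. Under that assumption the naive estimate comes out slightly above the reported 7.6X. The abstract does not say what adjustment accounts for the difference.

```python
# Figures quoted in the abstract (not independently verified here).
CLONES = 1_824_768        # individually archived BAC clones
MEAN_INSERT_KB = 96       # mean insert size, kb
GENOME_GB = 21.7          # loblolly pine genome size, Gb
IFG7_COPIES = 210_557     # estimated IFG-7 copy number
IFG7_FRACTION = 0.058     # share of nuclear DNA attributed to IFG-7

KB_PER_GB = 1_000_000


def naive_coverage(clones, mean_insert_kb, genome_gb):
    """Genome equivalents, assuming every clone holds a full nuclear insert."""
    return clones * mean_insert_kb / (genome_gb * KB_PER_GB)


def repeat_gb(fraction, genome_gb):
    """Amount of DNA (Gb) occupied by a repeat making up `fraction` of the genome."""
    return fraction * genome_gb


def implied_copy_kb(total_gb, copies):
    """Average DNA per copy (kb) implied by a total amount and a copy number.

    This is a derived figure, not one reported in the abstract.
    """
    return total_gb * KB_PER_GB / copies


if __name__ == "__main__":
    ifg7_gb = repeat_gb(IFG7_FRACTION, GENOME_GB)
    print(f"naive coverage: {naive_coverage(CLONES, MEAN_INSERT_KB, GENOME_GB):.2f}X")
    print(f"IFG-7 DNA: {ifg7_gb:.2f} Gb")
    print(f"implied DNA per IFG-7 copy: {implied_copy_kb(ifg7_gb, IFG7_COPIES):.2f} kb")
```

The 5.8% figure reproduces the quoted 1.26 Gb directly, which is a useful consistency check on the abstract's numbers.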
Evolution of Genome Size and Complexity in Pinus
BACKGROUND: Genome evolution in the gymnosperm lineage of seed plants has given rise to many of the largest and most complex plant genomes; however, the elements involved are poorly understood. METHODOLOGY/PRINCIPAL FINDINGS: Gymny is a previously undescribed retrotransposon family in Pinus that is related to Athila elements in Arabidopsis. Gymny elements are dispersed throughout the modern Pinus genome and occupy a physical space at least the size of the Arabidopsis thaliana genome. In contrast to previously described retroelements in Pinus, the Gymny family was amplified or introduced after the divergence of pine and spruce (Picea). If retrotransposon expansions are responsible for genome size differences within the Pinaceae, as they are in angiosperms, then they have yet to be identified. In contrast, molecular divergence of Gymny retrotransposons together with other families of retrotransposons can account for the large genome complexity of pines along with protein-coding genic DNA, as revealed by massively parallel DNA sequence analysis of Cot fractionated genomic DNA. CONCLUSIONS/SIGNIFICANCE: Most of the enormous genome complexity of pines can be explained by divergence of retrotransposons; however, the elements responsible for genome size variation are yet to be identified. Genomic resources for Pinus, including those reported here, should assist in further defining whether and how the roles of retrotransposons differ in the evolution of angiosperm and gymnosperm genomes.
A large-scale genome-wide association study meta-analysis of cannabis use disorder
Summary. Background: Variation in liability to cannabis use disorder has a strong genetic component (estimated twin and family heritability about 50–70%) and is associated with negative outcomes, including increased risk of psychopathology. The aim of the study was to conduct a large genome-wide association study (GWAS) to identify novel genetic variants associated with cannabis use disorder. Methods: To conduct this GWAS meta-analysis of cannabis use disorder and identify associations with genetic loci, we used samples from the Psychiatric Genomics Consortium Substance Use Disorders working group, iPSYCH, and deCODE (20 916 case samples, 363 116 control samples in total), contrasting cannabis use disorder cases with controls. To examine the genetic overlap between cannabis use disorder and 22 traits of interest (chosen because of previously published phenotypic correlations [eg, psychiatric disorders] or hypothesised associations [eg, chronotype] with cannabis use disorder), we used linkage disequilibrium score regression to calculate genetic correlations. Findings: We identified two genome-wide significant loci: a novel chromosome 7 locus (FOXP2, lead single-nucleotide polymorphism [SNP] rs7783012; odds ratio [OR] 1·11, 95% CI 1·07–1·15, p=1·84 × 10⁻⁹) and the previously identified chromosome 8 locus (near CHRNA2 and EPHX2, lead SNP rs4732724; OR 0·89, 95% CI 0·86–0·93, p=6·46 × 10⁻⁹). Cannabis use disorder and cannabis use were genetically correlated (rg 0·50, p=1·50 × 10⁻²¹), but they showed significantly different genetic correlations with 12 of the 22 traits we tested, suggesting at least partially different genetic underpinnings of cannabis use and cannabis use disorder. Cannabis use disorder was positively genetically correlated with other psychopathology, including ADHD, major depression, and schizophrenia.
Interpretation: These findings support the theory that cannabis use disorder has shared genetic liability with other psychopathology, and there is a distinction between genetic liability to cannabis use and cannabis use disorder. Funding: National Institute of Mental Health; National Institute on Alcohol Abuse and Alcoholism; National Institute on Drug Abuse; Center for Genomics and Personalized Medicine and the Centre for Integrative Sequencing; The European Commission, Horizon 2020; National Institute of Child Health and Human Development; Health Research Council of New Zealand; National Institute on Aging; Wellcome Trust Case Control Consortium; UK Research and Innovation Medical Research Council (UKRI MRC); The Brain & Behavior Research Foundation; National Institute on Deafness and Other Communication Disorders; Substance Abuse and Mental Health Services Administration (SAMHSA); National Institute of Biomedical Imaging and Bioengineering; National Health and Medical Research Council (NHMRC) Australia; Tobacco-Related Disease Research Program of the University of California; Families for Borderline Personality Disorder Research (Beth and Rob Elliott) 2018 NARSAD Young Investigator Grant; The National Child Health Research Foundation (Cure Kids); The Canterbury Medical Research Foundation; The New Zealand Lottery Grants Board; The University of Otago; The Carney Centre for Pharmacogenomics; The James Hume Bequest Fund; National Institutes of Health: Genes, Environment and Health Initiative; National Institutes of Health; National Cancer Institute; The William T Grant Foundation; Australian Research Council; The Virginia Tobacco Settlement Foundation; The VISN 1 and VISN 4 Mental Illness Research, Education, and Clinical Centers of the US Department of Veterans Affairs; The 5th Framework Programme (FP-5) GenomEUtwin Project; The Lundbeck Foundation; NIH-funded Shared Instrumentation Grant S10RR025141; Clinical Translational Sciences Award grants; National Institute of Neurological Disorders and Stroke; National Heart, Lung, and Blood Institute; National Institute of General Medical Sciences.
A framework for human microbiome research
A variety of microbial communities and their genes (the microbiome) exist throughout the human body, with fundamental roles in human health and disease. The National Institutes of Health (NIH)-funded Human Microbiome Project Consortium has established a population-scale framework to develop metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far. In parallel, approximately 800 reference strains isolated from the human body have been sequenced. Collectively, these data represent the largest resource describing the abundance and variety of the human microbiome, while providing a framework for current and future studies
Structure, function and diversity of the healthy human microbiome
Author Posting. © The Authors, 2012. This article is posted here by permission of Nature Publishing Group. The definitive version was published in Nature 486 (2012): 207-214, doi:10.1038/nature11234. Studies of the human microbiome have revealed that even healthy individuals differ remarkably in the microbes that occupy habitats such as the gut, skin and vagina. Much of this diversity remains unexplained, although diet, environment, host genetics and early microbial exposure have all been implicated. Accordingly, to characterize the ecology of human-associated microbial communities, the Human Microbiome Project has analysed the largest cohort and set of distinct, clinically relevant body habitats so far. We found the diversity and abundance of each habitat’s signature microbes to vary widely even among healthy subjects, with strong niche specialization both within and among individuals. The project encountered an estimated 81–99% of the genera, enzyme families and community configurations occupied by the healthy Western microbiome. Metagenomic carriage of metabolic pathways was stable among individuals despite variation in community structure, and ethnic/racial background proved to be one of the strongest associations of both pathways and microbes with clinical metadata. These results thus delineate the range of structural and functional configurations normal in the microbial communities of a healthy population, enabling future characterization of the epidemiology, ecology and translational applications of the human microbiome. This research was supported in
part by National Institutes of Health grants U54HG004969 to B.W.B.; U54HG003273
to R.A.G.; U54HG004973 to R.A.G., S.K.H. and J.F.P.; U54HG003067 to E.S.Lander;
U54AI084844 to K.E.N.; N01AI30071 to R.L.Strausberg; U54HG004968 to G.M.W.;
U01HG004866 to O.R.W.; U54HG003079 to R.K.W.; R01HG005969 to C.H.;
R01HG004872 to R.K.; R01HG004885 to M.P.; R01HG005975 to P.D.S.;
R01HG004908 to Y.Y.; R01HG004900 to M.K.Cho and P. Sankar; R01HG005171 to
D.E.H.; R01HG004853 to A.L.M.; R01HG004856 to R.R.; R01HG004877 to R.R.S. and
R.F.; R01HG005172 to P. Spicer.; R01HG004857 to M.P.; R01HG004906 to T.M.S.;
R21HG005811 to E.A.V.; M.J.B. was supported by UH2AR057506; G.A.B. was
supported by UH2AI083263 and UH3AI083263 (G.A.B., C. N. Cornelissen, L. K. Eaves
and J. F. Strauss); S.M.H. was supported by UH3DK083993 (V. B. Young, E. B. Chang,
F. Meyer, T. M. S., M. L. Sogin, J. M. Tiedje); K.P.R. was supported by UH2DK083990 (J.
V.); J.A.S. and H.H.K. were supported by UH2AR057504 and UH3AR057504 (J.A.S.);
DP2OD001500 to K.M.A.; N01HG62088 to the Coriell Institute for Medical Research;
U01DE016937 to F.E.D.; S.K.H. was supported by RC1DE0202098 and
R01DE021574 (S.K.H. and H. Li); J.I. was supported by R21CA139193 (J.I. and
D. S. Michaud); K.P.L. was supported by P30DE020751 (D. J. Smith); Army Research
Office grant W911NF-11-1-0473 to C.H.; National Science Foundation grants NSF
DBI-1053486 to C.H. and NSF IIS-0812111 to M.P.; The Office of Science of the US
Department of Energy under Contract No. DE-AC02-05CH11231 for P.S.C.; LANL
Laboratory-Directed Research and Development grant 20100034DR and the US
Defense Threat Reduction Agency grants B104153I and B084531I to P.S.C.; Research
Foundation - Flanders (FWO) grant to K.F. and J.Raes; R.K. is an HHMI Early Career
Scientist; Gordon & Betty Moore Foundation funding and institutional funding from the
J. David Gladstone Institutes to K.S.P.; A.M.S. was supported by fellowships provided by
the Rackham Graduate School and the NIH Molecular Mechanisms in Microbial
Pathogenesis Training Grant T32AI007528; a Crohn’s and Colitis Foundation of
Canada Grant in Aid of Research to E.A.V.; 2010 IBM Faculty Award to K.C.W.; analysis
of the HMP data was performed using National Energy Research Scientific Computing
resources, the BluBioU Computational Resource at Rice University
Genetic effects on gene expression across human tissues
Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.
Clinical correlates of grey matter pathology in multiple sclerosis
Traditionally, multiple sclerosis has been viewed as a disease predominantly affecting white matter. However, this view has lately been revised, as new evidence of anatomical and histological changes, as well as of molecular targets within the grey matter, has arisen. This advance was driven mainly by novel imaging techniques; however, these have not yet been implemented in routine clinical practice. The changes in the grey matter are related to the physical and cognitive disability seen in individuals with multiple sclerosis. Furthermore, damage to several grey matter structures can be associated with impairment of specific functions. Therefore, we conclude that grey matter damage, both global and regional, has the potential to become a marker of disease activity, complementary to the currently used magnetic resonance markers (global brain atrophy and T2 hyperintense lesions). Furthermore, it may improve the prediction of the future disease course and response to therapy in individual patients and may also become a reliable additional surrogate marker of treatment effect.
Shared genetic risk between eating disorder- and substance-use-related phenotypes: Evidence from genome-wide association studies
First published: 16 February 202
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
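To put the quoted accuracy in concrete terms, the short sketch below converts the approximate error rate into an expected number of residual errors across Build 35. It is only an illustration using the rounded figures from the abstract; the function name is hypothetical, and the result should be read as an order-of-magnitude estimate rather than a reported value.

```python
# Approximate figures quoted in the abstract.
SEQUENCE_BASES = 2_850_000_000   # nucleotides in Build 35
BASES_PER_ERROR = 100_000        # roughly one error event per 100,000 bases


def expected_error_events(total_bases, bases_per_error):
    """Expected number of error events at a rate of one per `bases_per_error` bases."""
    return total_bases // bases_per_error


if __name__ == "__main__":
    events = expected_error_events(SEQUENCE_BASES, BASES_PER_ERROR)
    print(f"expected error events across the sequence: about {events:,}")
```

At the quoted rate, the 2.85 billion bases would contain on the order of tens of thousands of error events.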