15 research outputs found
The genetic architecture of the human cerebral cortex
The cerebral cortex underlies our complex cognitive capabilities, yet little is known about the specific genetic loci that influence human cortical structure. To identify genetic variants that affect cortical structure, we conducted a genome-wide association meta-analysis of brain magnetic resonance imaging data from 51,665 individuals. We analyzed the surface area and average thickness of the whole cortex and 34 regions with known functional specializations. We identified 199 significant loci and found significant enrichment for loci influencing total surface area within regulatory elements that are active during prenatal cortical development, supporting the radial unit hypothesis. Loci that affect regional surface area cluster near genes in Wnt signaling pathways, which influence progenitor expansion and areal identity. Variation in cortical structure is genetically correlated with cognitive function, Parkinson's disease, insomnia, depression, neuroticism, and attention deficit hyperactivity disorder
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
Complete vertebrate mitogenomes reveal widespread repeats and gene duplications
Abstract: Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization. Conclusions: Our results indicate that even in the “simple” case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone
Conservation Genomics of the Threatened Canada Lynx (Lynx canadensis) in the Northern Appalachian-Acadian Ecoregion
Canada lynx (Lynx canadensis) are habitat- and prey-specialists associated with early successional boreal forests that support an abundance of their primary prey species, snowshoe hare (Lepus americanus). The species distribution dips into the northernmost United States, where lynx are listed as threatened under the US Endangered Species Act. Within the Northern Appalachian-Acadian ecoregion, habitat in Maine supports the largest and most robust population of lynx in the contiguous United States. However, suitable habitat is typically less abundant and more patchily distributed at the trailing edge of the species distribution, where peripheral populations are more at risk of isolation and associated impacts to resilience (e.g., genetic erosion). We developed a chromosome-scale reference genome for Canada lynx and used low-coverage whole genome sequences to better understand connectivity and gene flow between Maine and adjacent Canadian provinces. We detected genetic structure and resistance to gene flow shaped by isolating water bodies including the St. Lawrence River. Genome-wide diversity was lower at the trailing edge, which suggests populations at the range periphery may already be showing genetic impacts of isolation. Northward and upslope habitat contractions driven by climate change are expected to exacerbate isolation beyond 2050. Therefore, we used an ecological genomics approach to quantify the species’ capacity to evolve in response to changing conditions (i.e., adaptive potential). We identified gene-environment associations under current climate (1970-2000) and calculated the “genetic offset” required for lynx to retain those beneficial associations under future conditions (1961-1980). Our findings suggest that mismatch between current and future adaptive optima will require dramatic allelic turnover for some populations. 
However, adaptive potential at some functional loci is weakened by available standing variation, particularly among peripheral populations where genome-wide diversity is low. Further study is required to identify and conserve corridors which facilitate connectivity and gene flow between populations at the trailing edge and at the core of the species distribution. In 2018, the US Fish and Wildlife Service recommended Canada lynx for delisting from US Endangered Species Act protections. State and federal management agencies should consider incorporating a non-invasive genetic component to their post-delisting monitoring plans
Large-scale genome sampling reveals unique immunity and metabolic adaptations in bats
Comprising more than 1,400 species, bats possess adaptations unique among mammals including powered flight, unexpected longevity, and extraordinary immunity. Some of the molecular mechanisms underlying these unique adaptations include DNA repair, metabolism and immunity. However, analyses have been limited to a few divergent lineages, reducing the scope of inferences on gene family evolution across the Order Chiroptera. We conducted an exhaustive comparative genomic study of 37 bat species, one generated in this study, encompassing a large number of lineages, with a particular emphasis on multi-gene family evolution across immune and metabolic genes. In agreement with previous analyses, we found lineage-specific expansions of the APOBEC3 and MHC-I gene families, and loss of the proinflammatory PYHIN gene family. We inferred more than 1,000 gene losses unique to bats, including genes involved in the regulation of inflammasome pathways such as epithelial defence receptors, the natural killer gene complex and the interferon-gamma induced pathway. Gene set enrichment analyses revealed genes lost in bats are involved in defence response against pathogen-associated molecular patterns and damage-associated molecular patterns. Gene family evolution and selection analyses indicate bats have evolved fundamental functional differences compared to other mammals in both the innate and adaptive immune systems, with the potential to enhance antiviral immune response while dampening inflammatory signalling. In addition, metabolic genes have experienced repeated expansions related to convergent shifts to plant-based diets. Our analyses support the hypothesis that, in tandem with flight, ancestral bats had evolved a unique set of immune adaptations whose functional implications remain to be explored
Benchmarking ultra-high molecular weight DNA preservation methods for long-read and long-range sequencing.
BACKGROUND: Studies in vertebrate genomics require sampling from a broad range of tissue types, taxa, and localities. Recent advancements in long-read and long-range genome sequencing have made it possible to produce high-quality chromosome-level genome assemblies for almost any organism. However, adequate tissue preservation for the requisite ultra-high molecular weight DNA (uHMW DNA) remains a major challenge. Here we present a comparative study of preservation methods for field and laboratory tissue sampling, across vertebrate classes and different tissue types. RESULTS: We find that storage temperature was the strongest predictor of uHMW fragment lengths. While immediate flash-freezing remains the sample preservation gold standard, samples preserved in 95% EtOH or 20-25% DMSO-EDTA showed little fragment length degradation when stored at 4°C for 6 hours. Samples in 95% EtOH or 20-25% DMSO-EDTA kept at 4°C for 1 week after dissection still yielded adequate amounts of uHMW DNA for most applications. Tissue type was a significant predictor of total DNA yield but not fragment length. Preservation solution had a smaller but significant influence on both fragment length and DNA yield. CONCLUSION: We provide sample preservation guidelines that ensure sufficient DNA integrity and amount required for use with long-read and long-range sequencing technologies across vertebrates. Our best practices generated the uHMW DNA needed for the high-quality reference genomes for phase 1 of the Vertebrate Genomes Project, whose ultimate mission is to generate chromosome-level reference genome assemblies of all ∼70,000 extant vertebrate species
Towards complete and error-free genome assemblies of all vertebrate species
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species. To address this issue, the international Genome 10K (G10K) consortium has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences
Cognitive performance and neuropsychiatric symptoms in early, untreated Parkinson's disease
This study was undertaken to determine the prevalence and correlates of cognitive impairment (CI) and neuropsychiatric symptoms (NPS) in early, untreated patients with Parkinson's disease (PD). Background: Both CI and NPS are common in PD and impact disease course and quality of life. However, limited knowledge is available about cognitive abilities and NPS. Methods: Parkinson's Progression Markers Initiative (PPMI) is a multi-site study of early, untreated PD patients and healthy controls (HCs), the latter with normal cognition. At baseline, participants were assessed with a neuropsychological battery and for symptoms of depression, anxiety, impulse control disorders (ICDs), psychosis, and apathy. Results: Baseline data of 423 PD patients and 196 HCs yielded no between-group differences in demographic characteristics. Twenty-two percent of PD patients met the PD-recommended screening cutoff for CI on the Montreal Cognitive Assessment (MoCA), but only 9% met detailed neuropsychological testing criteria for mild cognitive impairment (MCI)-level impairment. The PD patients were more depressed than HCs (P<0.001), with twice as many (14% vs. 7%) meeting criteria for clinically significant depressive symptoms. The PD patients also experienced more anxiety (P<0.001) and apathy (P<0.001) than HCs. Psychosis was uncommon in PD (3%), and no between-group difference was seen in ICD symptoms (P=0.51). Conclusions: Approximately 10% of PD patients in the early, untreated disease state met traditional criteria of CI, which is a lower frequency compared with previous studies. Multiple dopaminergic-dependent NPS are also more common in these patients compared with the general population, but others associated with dopamine replacement therapy are not or are rare. Future analyses of this cohort will examine biological predictors and the course of CI and NPS