8 research outputs found

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead

    Evolutionary constraint facilitates interpretation of genetic variation in resequenced human genomes

    Here, we demonstrate how comparative sequence analysis facilitates genome-wide base-pair-level interpretation of individual genetic variation and address two questions of importance for human personal genomics: first, whether an individual's functional variation comes mostly from noncoding or coding polymorphisms; and, second, whether population-specific or globally-present polymorphisms contribute more to functional variation in any given individual. Neither has been definitively answered by analyses of existing variation data because of a focus on coding polymorphisms, ascertainment biases in favor of common variation, and a lack of base-pair-level resolution for identifying functional variants. We resequenced 575 amplicons within 432 individuals at genomic sites enriched for evolutionary constraint and also analyzed variation within three published human genomes. We find that single-site measures of evolutionary constraint derived from mammalian multiple sequence alignments are strongly predictive of reductions in modern-day genetic diversity across a range of annotation categories and across the allele frequency spectrum from rare (<1%) to high frequency (>10% minor allele frequency). Furthermore, we show that putatively functional variation in an individual genome is dominated by polymorphisms that do not change protein sequence and that originate from our shared ancestral population and commonly segregate in human populations. These observations show that common, noncoding alleles contribute substantially to human phenotypes and that constraint-based analyses will be of value to identify phenotypically relevant variants in individual genomes

    The DNA sequence and comparative analysis of human chromosome 5

    Chromosome 5 is one of the largest human chromosomes yet has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding and syntenic conservation with non-mammalian vertebrates, suggesting they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-encoding genes including the protocadherin and interleukin gene families and the first complete versions of each of the large chromosome 5 specific internal duplications. These duplications are very recent evolutionary events and likely play a mechanistic role, since deletions of these regions are the cause of debilitating disorders including spinal muscular atrophy (SMA)

    The sequence and analysis of duplication-rich human chromosome 16

    Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,670 aligned transcripts, 19 transfer RNA genes, 341 pseudogenes and three RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukaemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences among humans. Whereas the segmental duplications of chromosome 16 are enriched in the relatively gene-poor pericentromere of the p arm, some are involved in recent gene duplication and conversion events that are likely to have had an impact on the evolution of primates and human disease susceptibility.
    Joel Martin, Cliff Han, Laurie A. Gordon, Astrid Terry, Shyam Prabhakar, Xinwei She, Gary Xie, Uffe Hellsten, Yee Man Chan, Michael Altherr, Olivier Couronne, Andrea Aerts, Eva Bajorek, Stacey Black, Heather Blumer, Elbert Branscomb, Nancy C. Brown, William J. Bruno, Judith M. Buckingham, David F. Callen, Connie S. Campbell, Mary L. Campbell, Evelyn W. Campbell, Chenier Caoile, Jean F. Challacombe, Leslie A. Chasteen, Olga Chertkov, Han C. Chi, Mari Christensen, Lynn M. Clark, Judith D. Cohn, Mirian Denys, John C. Detter, Mark Dickson, Mira Dimitrijevic-Bussod, Julio Escobar, Joseph J. Fawcett, Dave Flowers, Dea Fotopulos, Tijana Glavina, Maria Gomez, Eidelyn Gonzales, David Goodstein, Lynne A. Goodwin, Deborah L. Grady, Igor Grigoriev, Matthew Groza, Nancy Hammon, Trevor Hawkins, Lauren Haydu, Carl E. Hildebrand, Wayne Huang, Sanjay Israni, Jamie Jett, Phillip B. Jewett, Kristen Kadner, Heather Kimball, Arthur Kobayashi, Marie-Claude Krawczyk, Tina Leyba, Jonathan L. Longmire, Frederick Lopez, Yunian Lou, Steve Lowry, Thom Ludeman, Chitra F. Manohar, Graham A. Mark, Kimberly L. McMurray, Linda J. Meincke, Jenna Morgan, Robert K. Moyzis, Mark O. Mundt, A. Christine Munk, Richard D. Nandkeshwar, Sam Pitluck, Martin Pollard, Paul Predki, Beverly Parson-Quintana, Lucia Ramirez, Sam Rash, James Retterer, Darryl O. Ricke, Donna L. Robinson, Alex Rodriguez, Asaf Salamov, Elizabeth H. Saunders, Duncan Scott, Timothy Shough, Raymond L. Stallings, Malinda Stalvey, Robert D. Sutherland, Roxanne Tapia, Judith G. Tesmer, Nina Thayer, Linda S. Thompson, Hope Tice, David C. Torney, Mary Tran-Gyamfi, Ming Tsai, Levy E. Ulanovsky, Anna Ustaszewska, Nu Vo, P. Scott White, Albert L. Williams, Patricia L. Wills, Jung-Rung Wu, Kevin Wu, Joan Yang, Pieter DeJong, David Bruce, Norman A. Doggett, Larry Deaven, Jeremy Schmutz, Jane Grimwood, Paul Richardson, Daniel S. Rokhsar, Evan E. Eichler, Paul Gilna, Susan M. Lucas, Richard M. Myers, Edward M. Rubin and Len A. Pennacchio

    The DNA sequence and biology of human chromosome 19
