9 research outputs found
Recommended from our members
The complete sequence of human chromosome 5
Chromosome 5 is one of the largest human chromosomes yet has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding and syntenic conservation with non-mammalian vertebrates, suggesting they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-encoding genes including the protocadherin and interleukin gene families and the first complete versions of each of the large chromosome 5-specific internal duplications. These duplications are very recent evolutionary events and play a likely mechanistic role, since deletions of these regions are the cause of debilitating disorders including spinal muscular atrophy (SMA).
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
Pyrosequencing Strategies for cDNA Libraries
The US DOE Joint Genome Institute (JGI) is a high-throughput genomics facility involved in sequencing a variety of organisms. A major effort at JGI is the sequencing of genomes and microbial community samples of relevance to the DOE missions of carbon sequestration, bioremediation and energy production. cDNA/EST sequencing is an integral part of genomic sequencing because it provides crucial information for gene models and genome annotation. The 454 sequencing platform is an integrated system of emulsion-based PCR amplification of hundreds of thousands of DNA fragments linked to high-throughput parallel pyrosequencing in picoliter-sized wells. Several strategies have been designed and carried out at JGI to use the 454 platform for cDNA/EST sequencing. cDNA libraries constructed by conventional methods were subjected to direct 454 sequencing. In addition, special primers and adaptors were designed for library construction so that the directional sequencing feature of the 454 technology can be used to sequence a particular end of the cDNA/EST fragments. Adaptor sequences used in 454 library construction can be incorporated into the polyT primer, cap primer and/or random primer for cDNA/EST library construction. The 454 sequencing platform can deliver 200 to 400 thousand cDNA/EST reads from a single run and does not require a cloning step, potentially improving on the coverage obtained through traditional Sanger sequencing. The large numbers of short reads generated by the 454 platform can be aligned to genome assemblies to extend and confirm gene models. Results from different strategies of library construction combined with 454 sequencing will be presented. The coverage of the library and the novelty rate are compared with traditional Sanger sequencing. The possible assembly problems caused by short reads with a slightly higher error rate from 454 will also be addressed.
Isothermal strand-displacement amplification applications for high-throughput genomics
The DNA sequence and comparative analysis of human chromosome 5
Chromosome 5 is one of the largest human chromosomes yet has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding and syntenic conservation with non-mammalian vertebrates, suggesting they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-encoding genes including the protocadherin and interleukin gene families and the first complete versions of each of the large chromosome 5-specific internal duplications. These duplications are very recent evolutionary events and play a likely mechanistic role, since deletions of these regions are the cause of debilitating disorders including spinal muscular atrophy (SMA).
The sequence and analysis of duplication rich human chromosome 16
We report here the 78,884,754 base pairs of finished human chromosome 16 sequence, representing over 99.9 percent of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,637 aligned transcripts, 19 tRNA genes, 341 pseudogenes and 3 RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences across humans. One of the unique features of chromosome 16 is its high level of segmental duplication, ranked among the highest of the human autosomes. While the segmental duplications are enriched in the relatively gene-poor pericentromere of the p-arm, some are involved in recent gene duplication and conversion events which are likely to have had an impact on the evolution of primates and human disease susceptibility.
The sequence and analysis of duplication-rich human chromosome 16
Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,670 aligned transcripts, 19 transfer RNA genes, 341 pseudogenes and three RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukaemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences among humans. Whereas the segmental duplications of chromosome 16 are enriched in the relatively gene-poor pericentromere of the p arm, some are involved in recent gene duplication and conversion events that are likely to have had an impact on the evolution of primates and human disease susceptibility.
Joel Martin, Cliff Han, Laurie A. Gordon, Astrid Terry, Shyam Prabhakar, Xinwei She, Gary Xie, Uffe Hellsten, Yee Man Chan, Michael Altherr, Olivier Couronne, Andrea Aerts, Eva Bajorek, Stacey Black, Heather Blumer, Elbert Branscomb, Nancy C. Brown, William J. Bruno, Judith M. Buckingham, David F. Callen, Connie S. Campbell, Mary L. Campbell, Evelyn W. Campbell, Chenier Caoile, Jean F. Challacombe, Leslie A. Chasteen, Olga Chertkov, Han C. Chi, Mari Christensen, Lynn M. Clark, Judith D. Cohn, Mirian Denys, John C. Detter, Mark Dickson, Mira Dimitrijevic-Bussod, Julio Escobar, Joseph J. Fawcett, Dave Flowers, Dea Fotopulos, Tijana Glavina, Maria Gomez, Eidelyn Gonzales, David Goodstein, Lynne A. Goodwin, Deborah L. Grady, Igor Grigoriev, Matthew Groza, Nancy Hammon, Trevor Hawkins, Lauren Haydu, Carl E. Hildebrand, Wayne Huang, Sanjay Israni, Jamie Jett, Phillip B. Jewett, Kristen Kadner, Heather Kimball, Arthur Kobayashi, Marie-Claude Krawczyk, Tina Leyba, Jonathan L. Longmire, Frederick Lopez, Yunian Lou, Steve Lowry, Thom Ludeman, Chitra F. Manohar, Graham A. Mark, Kimberly L. McMurray, Linda J. Meincke, Jenna Morgan, Robert K. Moyzis, Mark O. Mundt, A. Christine Munk, Richard D. Nandkeshwar, Sam Pitluck, Martin Pollard, Paul Predki, Beverly Parson-Quintana, Lucia Ramirez, Sam Rash, James Retterer, Darryl O. Ricke, Donna L. Robinson, Alex Rodriguez, Asaf Salamov, Elizabeth H. Saunders, Duncan Scott, Timothy Shough, Raymond L. Stallings, Malinda Stalvey, Robert D. Sutherland, Roxanne Tapia, Judith G. Tesmer, Nina Thayer, Linda S. Thompson, Hope Tice, David C. Torney, Mary Tran-Gyamfi, Ming Tsai, Levy E. Ulanovsky, Anna Ustaszewska, Nu Vo, P. Scott White, Albert L. Williams, Patricia L. Wills, Jung-Rung Wu, Kevin Wu, Joan Yang, Pieter DeJong, David Bruce, Norman A. Doggett, Larry Deaven, Jeremy Schmutz, Jane Grimwood, Paul Richardson, Daniel S. Rokhsar, Evan E. Eichler, Paul Gilna, Susan M. Lucas, Richard M. Myers, Edward M. Rubin and Len A. Pennacchi