207 research outputs found

    Isochores Merit the Prefix 'Iso'

    The isochore concept in the human genome sequence was challenged in an analysis by the International Human Genome Sequencing Consortium (IHGSC). We argue here that a statement in the IHGSC analysis concerning the existence of isochores is incorrect, because it applied an inappropriate statistical test. Testing for the existence of isochores should be equivalent to testing the homogeneity of windowed GC%. The statistical test applied in the IHGSC's analysis, the binomial test, is however a test of whether a sequence is random at the base level. For testing the existence of isochores, or homogeneity in GC%, we propose another statistical test: the analysis of variance (ANOVA). It can be shown that DNA sequences that are rejected by the binomial test may not be rejected by the ANOVA test. Comment: 14 pages (including 1 figure), submitte
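
    The homogeneity test described in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact design, so the grouping of windows, the function names and the toy numbers below are assumptions made for illustration only: GC% is computed in non-overlapping windows, the windows are grouped into sub-regions, and a one-way ANOVA F statistic compares the variance between sub-regions with the variance within them.

```python
from statistics import mean


def gc_percent(seq):
    """Return the GC content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq, window):
    """GC% of consecutive non-overlapping windows (a trailing partial window is dropped)."""
    return [gc_percent(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square over within-group mean square.

    Each group is a list of windowed GC% values from one sub-region
    (this grouping is an assumption of the sketch, not taken from the paper).
    """
    k = len(groups)                          # number of sub-regions
    n = sum(len(g) for g in groups)          # total number of windows
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy windowed GC% values for three hypothetical sub-regions.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(anova_f(groups))
```

    A small F value means the sub-regions do not differ in mean GC% by more than the window-to-window scatter would suggest, which is the sense of homogeneity the abstract refers to. Turning F into a p-value requires the F distribution with (k - 1, n - k) degrees of freedom, which is not computed here.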

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    A History of Genomics across Species, Communities and Projects

    The Distributions of "New" and "Old" Alu Sequences in the Human Genome: The Solution of a "Mystery"

    The distribution in the human genome of the largest family of mobile elements, the Alu sequences, has been investigated for the past 30 years, and the vast majority of Alu sequences were shown to have the highest density in GC-rich isochores. Ten years ago, it was discovered, however, that the small "youngest" (most recently transposed) Alu families had a strikingly different distribution compared with the "old" families. This raised the question as to how this change took place in evolution. We solved what was considered to be a "mystery" by 1) revisiting our previous results on the integration and stability of retroviral sequences, and 2) assessing the densities of acceptor sites TTTT/AA in isochore families. We could conclude 1) that the open state of chromatin structure plays a crucial role in allowing not only the initial integration of retroviral sequences but also that of the youngest Alu sequences, and 2) that the distribution of old Alus can be explained as due to Alu sequences being unstable in the GC-poor isochores but stable in the compositionally matching GC-rich isochores, again in line with what happens in the case of retroviral sequences.
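
    As a rough illustration of what "assessing the densities of acceptor sites" could involve, the sketch below counts occurrences of the hexamer TTTTAA on both strands and reports them per kilobase. The abstract does not describe the authors' actual procedure, so the function names, the per-kb normalisation and the toy sequence are assumptions.

```python
SITE = "TTTTAA"          # acceptor site TTTT/AA written without the cut mark
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_occurrences(seq, motif):
    """Count all occurrences of motif in seq, scanning every start position."""
    return sum(1 for i in range(len(seq) - len(motif) + 1)
               if seq[i:i + len(motif)] == motif)


def count_sites(seq):
    """Count acceptor sites on the forward strand and on the reverse strand."""
    seq = seq.upper()
    return (count_occurrences(seq, SITE)
            + count_occurrences(seq, reverse_complement(SITE)))


def site_density_per_kb(seq):
    """Acceptor sites per 1,000 bases of sequence."""
    return 1000.0 * count_sites(seq) / len(seq)


# Hypothetical toy sequence: one forward site and one reverse-strand site.
toy = "TTTTAACCTTAAAA"
print(count_sites(toy), site_density_per_kb(toy))
```

    Comparing this density across sequences binned by GC level would give one density per isochore family, which is the kind of comparison the abstract describes.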

    NIH Data and Resource Sharing, Data Release and Intellectual Property Policies for Genomics Community Resource Projects

    Most observers predict significant health-related gains from genomics research. Policy and legal decisions made by government institutions, the courts and legislatures have the potential to make a significant impact on both the quantity and quality of effective and innovative healthcare-related products ultimately derived from the vibrant genomics research enterprise. In particular, the careful management of the intellectual property (IP) aspects of this promising area of research will be necessary to maximise scientific progress, provide appropriate incentives for investment, and ultimately ensure optimal public benefit. It is the mission of the US National Institutes of Health (NIH), which is comprised of 27 individual institutes and is an agency of the US Department of Health and Human Services, to facilitate the translation of basic biomedical research discoveries into useful healthcare services and products. Within the NIH, the National Human Genome Research Institute (NHGRI) is the agency’s lead entity for advancing human health through genetic research. Through its stewardship of an array of infrastructure and research projects, including several innovative public-private consortia efforts, the NHGRI seeks to contribute to the genomic tools, data and knowledge bases. In general, I believe that scientific progress in this still young field will be best served by early, open and continuing access to: i) comprehensive, high-quality data sets containing basic biological and biochemical data; and ii) critical biological materials such as animal models and genes. Data such as the complete nucleotide sequence of many different organisms’ genomes, information on genetic variation within and among populations, and results on how gene expression is regulated at the cellular and molecular level are often referred to as ‘precompetitive’ information, and in my view should be made rapidly available to all, without restrictions on use.
Adherence by data and resource producers and users to this simple strategy should ensure that industry and academic researchers will be able to build upon this strong foundation. At the NIH we are expected to support basic scientific discovery whilst simultaneously facilitating the appropriate commercial research and development of the results of our formidable research programs. A sizeable number of end users for these resource projects are employed by private sector companies. For this constituency the terms governing data use, data release, the sharing and distribution of research resources and the intellectual property rights of derivative inventions are of particular importance. Policies that limit companies’ ability to file patent applications or licence downstream uses could end up having an unintended inhibitory effect on the development of biomedical products. Government policies need to balance the important dual goals of continuing to rapidly place huge amounts of data in the public domain and encouraging restriction-free sharing of genomic tools, whilst also ensuring that more applied inventions, notably those closer to being an actual product, can be patented. US taxpayers, and especially patients, would like the government to appropriately foster the commercialisation of promising inventions derived from use of the data and reagents generated by these efforts. Currently, the NHGRI is actively involved in the development and vetting of policy options aimed at ensuring that genomic tools, resources and databases of genomic information are used in a manner that promotes scientific research and the practice of medicine. Relevant policies implemented by NIH-supported public-private consortia efforts such as the International Human Genome Sequencing Consortium (IHGSC), the Trans-NIH Mouse Initiative, the Mammalian Gene Collection (MGC) and the International Haplotype Map Project (HapMap) are specifically covered in this review.

    Multiple species comparative analysis of human chromosome 22 between markers D22S1687 and D22S419 and gene expression profiling in zebrafish.

    Major large-scale insertions or deletions that resulted in gene number differences between human and chimpanzee were discovered in the IGLL and LCR22s within this region, with four human insertions from 6 Kb to 75 Kb and three chimpanzee insertions from 12 Kb to 74 Kb observed in the IGLL region, two human insertions of 59 Kb and 36 Kb in LCR22-6, and a 67 Kb chimpanzee insertion in LCR22-8. Small-scale insertions and deletions, in addition to exon shuffling, elevated nucleotide divergence rate and positive selection were also observed in the putative genes, partially duplicated genes and pseudogenes in the IGLL and LCR22s. Thus, the second major conclusion of this study is that the major differences between human and chimpanzee in this region lie in the highly repetitive regions of the IGLL and the LCR22s. Comparison of a 4.5 Mb region of human chromosome 22 between markers D22S1687 and D22S419 with the syntenic region in chimpanzee revealed an overall DNA sequence identity of approximately 97.6% and a Ka/Ks ratio of known protein coding genes of approximately 0.25, with the majority of amino acid changes between hydrophilic amino acids, followed by changes between hydrophobic amino acids, and the fewest changes from hydrophobic to hydrophilic amino acids or vice versa. Thus, the first major conclusion of this study is that overall, this chromosomal region is highly conserved between human and chimpanzee, and the known protein coding genes are undergoing purifying selection, in which 75% of nucleotide substitutions that led to amino acid changes were eliminated by adaptive evolution. Through whole mount in situ hybridization studies, a total of 12 human orthologs in zebrafish, including 4 newly predicted putative genes with no previously known expression profile and function, showed specific expression in the developing zebrafish embryonic central nervous system, optic system, the neural crest cells, otic vesicle, liver, and notochord.
Thus, the third major conclusion from this present study is that many predicted genes which currently lack expression data and functional information are likely time and tissue specific during developmental processes.
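
    The link between the reported Ka/Ks ratio of about 0.25 and the 75% figure can be sketched as simple arithmetic. The sketch assumes that synonymous substitutions are effectively neutral, so that Ks estimates the rate at which amino-acid-changing substitutions would accumulate without selection; the function names and the example rates are hypothetical and not taken from the study.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    return ka / ks


def fraction_removed(ratio):
    """Fraction of nonsynonymous substitutions removed by selection.

    Assumes neutral synonymous sites, so the expected neutral ratio is 1
    and the shortfall below 1 is attributed to purifying selection.
    """
    return 1.0 - ratio


# Hypothetical rates chosen so that Ka/Ks = 0.25, as reported in the abstract.
ratio = ka_ks_ratio(0.005, 0.02)
print(ratio, fraction_removed(ratio))
```

    Under this assumption a ratio of 0.25 corresponds to three quarters of amino-acid-changing substitutions being removed, matching the 75% quoted above.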

    DNA methylation profiling of the human major histocompatibility complex: A pilot study for the Human Epigenome Project

    The Human Epigenome Project aims to identify, catalogue, and interpret genome-wide DNA methylation phenomena. Occurring naturally on cytosine bases at cytosine-guanine dinucleotides, DNA methylation is intimately involved in diverse biological processes and the aetiology of many diseases. Differentially methylated cytosines give rise to distinct profiles, thought to be specific for gene activity, tissue type, and disease state. The identification of such methylation variable positions will significantly improve our understanding of genome biology and our ability to diagnose disease. Here, we report the results of the pilot study for the Human Epigenome Project entailing the methylation analysis of the human major histocompatibility complex. This study involved the development of an integrated pipeline for high-throughput methylation analysis using bisulphite DNA sequencing, discovery of methylation variable positions, epigenotyping by matrix-assisted laser desorption/ionisation mass spectrometry, and development of an integrated public database available at http://www.epigenome.org. Our analysis of DNA methylation levels within the major histocompatibility complex, including regulatory exonic and intronic regions associated with 90 genes in multiple tissues and individuals, reveals a bimodal distribution of methylation profiles (i.e., the vast majority of the analysed regions were either hypo- or hypermethylated), tissue specificity, inter-individual variation, and correlation with independent gene expression data