Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model
It has become clear that a large proportion of functional DNA in the human genome does not code for protein. Identification of this non-coding functional sequence using comparative approaches is proving difficult and has previously been thought to require deep sequencing of multiple vertebrates. Here we introduce a new model and comparative method that, instead of nucleotide substitutions, uses the evolutionary imprint of insertions and deletions (indels) to infer the past consequences of selection. The model predicts the distribution of indels under neutrality, and shows an excellent fit to human–mouse ancestral repeat data. Across the genome, many unusually long ungapped regions are detected that are unaccounted for by the neutral model, and which we predict to be highly enriched in functional DNA that has been subject to purifying selection with respect to indels. We use the model to determine the proportion under indel-purifying selection to be between 2.56% and 3.25% of human euchromatin. Since annotated protein-coding genes comprise only 1.2% of euchromatin, these results lend further weight to the proposition that more than half the functional complement of the human genome is non-protein-coding. The method is surprisingly powerful at identifying selected sequence using only two or three mammalian genomes. Applying the method to the human, mouse, and dog genomes, we identify 90 Mb of human sequence under indel-purifying selection, at a predicted 10% false-discovery rate and 75% sensitivity. As expected, most of the identified sequence represents unannotated material, while the recovered proportions of known protein-coding and microRNA genes closely match the predicted sensitivity of the method. The method's high sensitivity to functional sequence such as microRNAs suggests that as yet unannotated microRNA genes are enriched among the sequences identified.
Furthermore, its independence of substitutions allowed us to identify sequence that has been subject to heterogeneous selection, that is, sequence subject to both positive selection with respect to substitutions and purifying selection with respect to indels. The ability to identify elements under heterogeneous selection enables, for the first time, the genome-wide investigation of positive selection on functional elements other than protein-coding genes.
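The idea of flagging "unusually long ungapped regions" against a neutral expectation can be illustrated with a small sketch. It is not the authors' model: it assumes, purely for illustration, that indels fall independently at each site with a fixed probability, so that ungapped segment lengths are geometrically distributed under neutrality. The function names, the parameter `p_indel`, and the toy lengths are all hypothetical.

```python
def neutral_tail_prob(length, p_indel):
    """P(ungapped segment length >= length) under the assumed geometric model.

    Lengths start at 1, so P(X >= k) = (1 - p_indel) ** (k - 1).
    """
    return (1.0 - p_indel) ** (length - 1)


def estimated_fdr(threshold, p_indel, observed_lengths):
    """Rough false-discovery estimate for calling segments >= threshold.

    Assumption: every observed segment is treated as a potential neutral
    segment, which overstates the neutral count and so gives a
    conservative (upper-bound style) estimate.
    """
    called = sum(1 for x in observed_lengths if x >= threshold)
    if called == 0:
        return None  # nothing called at this threshold
    expected_neutral = len(observed_lengths) * neutral_tail_prob(threshold, p_indel)
    return min(1.0, expected_neutral / called)


if __name__ == "__main__":
    # Hypothetical ungapped segment lengths (bp) from a pairwise alignment.
    lengths = [3, 5, 8, 12, 40, 150, 300]
    print(estimated_fdr(100, 0.05, lengths))
```

Raising the threshold lowers the expected number of neutral segments that pass it, which is the trade-off between false-discovery rate and sensitivity that the abstract reports.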
1: To Know Ourselves
AT THE END OF THE ROAD in Little Cottonwood Canyon, near Salt Lake City, Alta is a place of near-mythic renown among skiers. In time it may well assume similar status among molecular geneticists. In December 1984, a conference there, co-sponsored by the U.S. Department of Energy, pondered a single question: Does modern DNA research offer a way of detecting tiny genetic mutations—and, in particular, of observing any increase in the mutation rate among the survivors of the Hiroshima and Nagasaki bombings and their descendants? In short the answer was, Not yet. But in an atmosphere of rare intellectual fertility, the seeds were sown for a project that would make such detection possible in the future—the Human Genome Project
Transposable element insertions have strongly affected human evolution
Comparison of a full collection of vertebrate transposable element (TE) sequences with genome sequences shows that the human genome makes 655 perfect full-length matches. The cause is that the human genome contains many active TEs that have produced TE inserts in relatively recent times. These TE inserts in the human genome are several types of young Alus (AluYa5, AluYb8, AluYc1, etc.). Work in many laboratories has shown that such inserts have many effects, including changes in gene expression, increases in recombination, and unequal crossover. According to these data, these highly effective changes in the human lineage genome extend back about 4 million years, and very likely much earlier. Rapid human lineage-specific evolution, including brain size, is known to have also occurred in the last few million years. Alu insertions likely underlie rapid human lineage evolution. They are known to have many effects. Examples are listed in which TE sequences have influenced human-specific genes. The proposed model is that the many TE insertions created many potentially effective changes and those selected were responsible for a part of the striking human lineage evolution. The combination of the results of these events that were selected during human lineage evolution was apparently effective in producing a successful and rapidly evolving species.
Promoting and Managing Genome Innovation
An introduction to the symposium Promoting and Managing Genome Innovation, held in October 1995. The conference was organized by Professor Thomas G. Field, Jr. and Gianna Julian-Arnold. The conference was funded in part by the Ethical, Legal and Social Issues component of the D.O.E. Human Genome Program; Nixon, Hargrave, Devans & Doyle L.L.P., Rochester, N.Y.; and Human Genome Sciences.
Genome maps across 26 human populations reveal population-specific patterns of structural variation.
Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome
Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence
BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473, or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
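The size and identity thresholds quoted above (at least 5 kb, at least 90% identity) can be illustrated with a minimal filter over BLAST tabular output. This is only a sketch of the thresholding step, not the authors' heuristics, which the abstract does not describe in detail. It assumes the standard 12-column tabular format (`qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore`); the function names and sample rows are hypothetical.

```python
def is_candidate_duplication(pident, length, min_identity=90.0, min_length=5000):
    """Apply the two thresholds quoted in the abstract."""
    return pident >= min_identity and length >= min_length


def filter_hits(lines, min_identity=90.0, min_length=5000):
    """Return (query, subject, pident, length) for hits passing both thresholds."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident, length = float(fields[2]), int(fields[3])
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        # Ignore the trivial alignment of a region to itself.
        if qseqid == sseqid and (qstart, qend) == (sstart, send):
            continue
        if is_candidate_duplication(pident, length, min_identity, min_length):
            kept.append((qseqid, sseqid, pident, length))
    return kept


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    sample = [
        "chr1\tchr1\t100.0\t8000\t0\t0\t1\t8000\t1\t8000\t0.0\t14000",
        "chr1\tchr7\t95.2\t6200\t290\t5\t1\t6200\t500\t6699\t0.0\t9000",
        "chr1\tchr9\t88.0\t7000\t800\t9\t1\t7000\t10\t7009\t0.0\t7000",
        "chr2\tchr5\t97.0\t1200\t30\t1\t1\t1200\t40\t1239\t0.0\t2000",
    ]
    for hit in filter_hits(sample):
        print(hit)
```

In practice, overlapping hits would still need to be merged before summing duplicated bases, a step this sketch leaves out.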
From metagenomics to the metagenome: Conceptual change and the rhetoric of translational genomic research
As the international genomic research community moves from the tool-making efforts of the Human Genome Project into biomedical applications of those tools, new metaphors are being suggested as useful to understanding how our genes work – and for understanding who we are as biological organisms. In this essay we focus on the Human Microbiome Project as one such translational initiative. The HMP is a new ‘metagenomic’ research effort to sequence the genomes of human microbiological flora, in order to pursue the interesting hypothesis that our ‘microbiome’ plays a vital and interactive role with our human genome in normal human physiology. Rather than describing the human genome as the ‘blueprint’ for human nature, the promoters of the HMP stress the ways in which our primate lineage DNA is interdependent with the genomes of our microbiological flora. They argue that the human body should be understood as an ecosystem with multiple ecological niches and habitats in which a variety of cellular species collaborate and compete, and that human beings should be understood as ‘superorganisms’ that incorporate multiple symbiotic cell species into a single individual with very blurry boundaries. These metaphors carry interesting philosophical messages, but their inspiration is not entirely ideological. Instead, part of their cachet within genome science stems from the ways in which they are rooted in genomic research techniques, in what philosophers of science have called a ‘tools-to-theory’ heuristic. Their emergence within genome science illustrates the complexity of conceptual change in translational research, by showing how it reflects both aspirational and methodological influences
A Phylogenomic Study of Human, Dog, and Mouse
In recent years the phylogenetic relationship of mammalian orders has been addressed in a number of molecular studies. These analyses have frequently yielded inconsistent results with respect to some basal ordinal relationships. For example, the relative placement of primates, rodents, and carnivores has differed in various studies. Here, we attempt to resolve this phylogenetic problem by using data from completely sequenced nuclear genomes to base the analyses on the largest possible amount of data. To minimize the risk of reconstruction artifacts, the trees were reconstructed under different criteria—distance, parsimony, and likelihood. For the distance trees, distance metrics that measure independent phenomena (amino acid replacement, synonymous substitution, and gene reordering) were used, as it is highly improbable that all of the trees would be affected the same way by any reconstruction artifact. In contradiction to the currently favored classification, our results based on full-genome analysis of the phylogenetic relationship between human, dog, and mouse yielded overwhelming support for a primate–carnivore clade with the exclusion of rodents