96 research outputs found
Constraints on genes shape long-term conservation of macro-synteny in metazoan genomes
Background: Many metazoan genomes conserve chromosome-scale gene linkage relationships ("macro-synteny") from the common ancestor of multicellular animal life [1-4], but the biological explanation for this conservation is still unknown. Double cut and join (DCJ) is a simple, well-studied model of neutral genome evolution amenable to both simulation and mathematical analysis [5], but as we show here, it is not sufficient to explain long-term macro-synteny conservation. Results: We examine a family of simple (one-parameter) extensions of DCJ to identify models and choices of parameters consistent with the levels of macro- and micro-synteny conservation observed among animal genomes. Our software implements a flexible strategy for incorporating various types of genomic context into the DCJ model ("DCJ-[C]"), and is available as open source software from http://github.com/putnamlab/dcj-c. Conclusions: A simple model of genome evolution, in which DCJ moves are allowed only if they maintain chromosomal linkage among a set of constrained genes, can simultaneously account for the level of macro-synteny conservation and for correlated conservation among multiple pairs of species. Simulations under this model indicate that a constraint on approximately 7% of metazoan genes is sufficient to constrain genome rearrangement to an average rate of 25 inversions and 1.7 translocations per million years.
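The constrained-move idea in this abstract can be illustrated with a minimal sketch. It is not the DCJ-[C] implementation: the function names, the toy genome, and the acceptance rule (the grouping of constrained genes by chromosome must be identical before and after the move) are assumptions made for illustration, and only one kind of DCJ move, a reciprocal translocation, is shown.

```python
def constrained_partition(genome, constrained):
    """Group the constrained genes by the chromosome that carries them."""
    groups = (frozenset(g for g in chrom if g in constrained) for chrom in genome)
    return {grp for grp in groups if grp}


def reciprocal_translocation(genome, i, cut_i, j, cut_j):
    """Exchange the tails of chromosomes i and j (one kind of DCJ move)."""
    new_genome = [list(chrom) for chrom in genome]
    new_genome[i] = genome[i][:cut_i] + genome[j][cut_j:]
    new_genome[j] = genome[j][:cut_j] + genome[i][cut_i:]
    return new_genome


def try_move(genome, constrained, i, cut_i, j, cut_j):
    """Accept the move only if constrained genes keep their linkage (assumed rule)."""
    proposal = reciprocal_translocation(genome, i, cut_i, j, cut_j)
    if constrained_partition(proposal, constrained) == constrained_partition(genome, constrained):
        return proposal, True
    return genome, False


# Hypothetical toy genome: two chromosomes, four constrained genes.
genome = [["a", "b", "c", "d"], ["e", "f", "g", "h"]]
constrained = {"a", "b", "e", "f"}

accepted_genome, accepted = try_move(genome, constrained, 0, 2, 1, 2)
rejected_genome, rejected_ok = try_move(genome, constrained, 0, 1, 1, 2)
print(accepted, rejected_ok)
```

In this toy case the first move swaps only unconstrained genes and is accepted, while the second would separate the constrained genes a and b and is rejected, leaving the genome unchanged.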
Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication
Horseshoe crabs are marine arthropods with a fossil record extending back
approximately 450 million years. They exhibit remarkable morphological
stability over their long evolutionary history, retaining a number of ancestral
arthropod traits, and are often cited as examples of "living fossils." As
arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose
sequenced genomes (including insects and nematodes) have thus far shown more
divergence from the ancestral pattern of eumetazoan genome organization than
cnidarians, deuterostomes, and lophotrochozoans. However, much of ecdysozoan
diversity remains unrepresented in comparative genomic analyses. Here we use a
new strategy of combined de novo assembly and genetic mapping to examine the
chromosome-scale genome organization of the Atlantic horseshoe crab Limulus
polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by
sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their
parents at a mean redundancy of 1.1x per sample. The map includes 84,307
sequence markers and 5,775 candidate conserved protein coding genes. Comparison
to other metazoan genomes shows that the L. polyphemus genome preserves
ancestral bilaterian linkage groups, and that a common ancestor of modern
horseshoe crabs underwent one or more ancient whole genome duplications (WGDs)
~300 MYA, followed by extensive chromosome fusion.
Improving Phrap-Based Assembly of the Rat Using "Reliable" Overlaps
The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of "reliable" overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version PhrapUMD. Integrating PhrapUMD and our "reliable-overlap" algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the Rattus norvegicus genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at http://www.genome.umd.edu. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps.
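The "seven times better" figure can be checked with a short calculation. The sketch below assumes that "sequence missed" means 100% minus the reported coverage of finished sequence; the function name is illustrative and not taken from the paper.

```python
def assembly_quality(coverage_pct, errors_per_10kb):
    """Quality score: inverse of (error rate x percent of finished sequence missed)."""
    missed_pct = 100.0 - coverage_pct  # assumed definition of "sequence missed"
    return 1.0 / (missed_pct * errors_per_10kb)


atlas = assembly_quality(coverage_pct=93.4, errors_per_10kb=4.5)
phrap_umd = assembly_quality(coverage_pct=96.3, errors_per_10kb=1.1)

improvement = phrap_umd / atlas
print(round(improvement, 1))
```

Under that assumption the ratio is (6.6 × 4.5) / (3.7 × 1.1), roughly 7.3, which is consistent with the abstract's rounded claim.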
Whole Genome Sequencing of Mutation Accumulation Lines Reveals a Low Mutation Rate in the Social Amoeba Dictyostelium discoideum
Spontaneous mutations play a central role in evolution. Despite their importance, mutation rates are some of the most elusive parameters to measure in evolutionary biology. The combination of mutation accumulation (MA) experiments and whole-genome sequencing now makes it possible to estimate mutation rates by directly observing new mutations at the molecular level across the whole genome. We performed an MA experiment with the social amoeba Dictyostelium discoideum and sequenced the genomes of three randomly chosen lines using high-throughput sequencing to estimate the spontaneous mutation rate in this model organism. The mitochondrial mutation rate of 6.76 × 10^-9, with a Poisson confidence interval of 4.1 × 10^-9 to 9.5 × 10^-9, per nucleotide per generation is slightly lower than estimates for other taxa. The mutation rate estimate for the nuclear DNA of 2.9 × 10^-11, with a Poisson confidence interval ranging from 7.4 × 10^-13 to 1.6 × 10^-10, is the lowest reported for any eukaryote. These results are consistent with low microsatellite mutation rates previously observed in D. discoideum and low levels of genetic variation observed in wild D. discoideum populations. In addition, D. discoideum has been shown to be quite resistant to DNA damage, which suggests an efficient DNA-repair mechanism that could be an adaptation to life in soil and frequent exposure to intracellular and extracellular mutagenic compounds. The social aspect of the life cycle of D. discoideum and a large portion of the genome under relaxed selection during vegetative growth could also select for a low mutation rate. This hypothesis is supported by a significantly lower mutation rate per cell division in multicellular eukaryotes compared with unicellular eukaryotes.
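A Poisson confidence interval of the kind quoted above can be sketched as follows. The abstract does not say how many mutations were observed or which interval method was used; the code assumes an exact (Garwood-style) 95% interval and, purely for illustration, a single observed nuclear mutation. All function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0.0 for k < 0."""
    term = math.exp(-lam)
    total = 0.0
    for i in range(k + 1):
        if i > 0:
            term *= lam / i
        total += term
    return total


def _bisect(g, lo, hi, iters=200):
    """Root of a monotone function g with opposite signs at lo and hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_poisson_ci(k, alpha=0.05):
    """Exact two-sided interval for the Poisson mean given k observed events."""
    top = k + 10.0 * math.sqrt(k + 1.0) + 20.0  # generous upper bracket
    lower = 0.0
    if k > 0:
        lower = _bisect(lambda lam: (1.0 - poisson_cdf(k - 1, lam)) - alpha / 2, 0.0, top)
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2, 0.0, top)
    return lower, upper


# Assumption for illustration only: one observed mutation behind the nuclear estimate.
ci_low, ci_high = exact_poisson_ci(1)
point_estimate = 2.9e-11
print(ci_low * point_estimate, ci_high * point_estimate)
```

With one assumed event, the interval scales the point estimate by factors of about 0.025 and 5.6, which gives bounds close to the reported nuclear interval; that agreement is suggestive, not a confirmation of the authors' method.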
Evolutionary profiling reveals the heterogeneous origins of classes of human disease genes: implications for modeling disease genetics in animals
Background: The recent expansion of whole-genome sequence data available from diverse animal lineages provides an opportunity to investigate the evolutionary origins of specific classes of human disease genes. Previous studies have observed that human disease genes are of particularly ancient origin. While this suggests that many animal species have the potential to serve as feasible models for research on genes responsible for human disease, it is unclear whether this pattern has meaningful implications and whether it prevails for every class of human disease. Results: We used a comparative genomics approach encompassing a broad phylogenetic range of animals with sequenced genomes to determine the evolutionary patterns exhibited by human genes associated with different classes of disease. Our results support previous claims that most human disease genes are of ancient origin but, more importantly, we also demonstrate that several specific disease classes have a significantly large proportion of genes that emerged relatively recently within the metazoans and/or vertebrates. An independent assessment of the synonymous to non-synonymous substitution rates of human disease genes found in mammals reveals that disease classes that arose more recently also display unexpected rates of purifying selection between their mammalian and human counterparts. Conclusions: Our results reveal the heterogeneity underlying the evolutionary origins of (and selective pressures on) different classes of human disease genes. For example, some disease gene classes appear to be of uncommonly recent (i.e., vertebrate-specific) origin and, as a whole, have been evolving at a faster rate within mammals than the majority of disease classes having more ancient origins. The novel patterns that we have identified may provide new insight into cases where studies using traditional animal models were unable to produce results that translated to humans. 
Conversely, we note that the larger set of disease classes do have ancient origins, suggesting that many non-traditional animal models have the potential to be useful for studying many human disease genes. Taken together, these findings emphasize why model organism selection should be done on a disease-by-disease basis, with evolutionary profiles in mind.
Bos taurus genome assembly
We present here the assembly of the bovine genome. The assembly method combines the BAC plus WGS local assembly used for the rat and sea urchin with the whole genome shotgun (WGS) only assembly used for many other animal genomes including the rhesus macaque. The assembly process consisted of multiple phases: First, BACs were assembled with BAC generated sequence, then subsequently in combination with the individual overlapping WGS reads. Different assembly parameters were tested to separately optimize the performance for each BAC assembly of the BAC and WGS reads. In parallel, a second assembly was produced using only the WGS sequences and a global whole genome assembly method. The two assemblies were combined to create a more complete genome representation that retained the high quality BAC-based local assembly information, but with gaps between BACs filled in with the WGS-only assembly. Finally, the entire assembly was placed on chromosomes using the available map information. Over 90% of the assembly is now placed on chromosomes. The estimated genome size is 2.87 Gb which represents a high degree of completeness, with 95% of the available EST sequences found in assembled contigs. The quality of the assembly was evaluated by comparison to 73 finished BACs, where the draft assembly covers between 92.5 and 100% (average 98.5%) of the finished BACs. The assembly contigs and scaffolds align linearly to the finished BACs, suggesting that misassemblies are rare. Genotyping and genetic mapping of 17,482 SNPs revealed that more than 99.2% were correctly positioned within the Btau_4.0 assembly, confirming the accuracy of the assembly. The biological analysis of this bovine genome assembly is being published, and the sequence data is available to support future bovine research
The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.
Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. 
maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific life history.
This work was supported by the following grants: NHGRI U54HG003273 to R.A.G.; EU Marie Curie ITN #215781 "Evonet" to M.A.; a Wellcome Trust Value in People (VIP) award to C.B. and Wellcome Trust graduate studentship WT089615MA to J.E.G.; "Marine rhythms of Life" of the University of Vienna, an FWF (http://www.fwf.ac.at/) START award (#AY0041321) and HFSP (http://www.hfsp.org/) research grant (#RGY0082/2010) to KT-R; MFPL Vienna International PostDoctoral Program for Molecular Life Sciences (funded by Austrian Ministry of Science and Research and City of Vienna, Cultural Department - Science and Research) to T.K.; Direct Grant (4053034) of the Chinese University of Hong Kong to J.H.L.H.; NHGRI HG004164 to G.M.; Danish Research Agency (FNU), Carlsberg Foundation, and Lundbeck Foundation to C.J.P.G.; U.S. National Institutes of Health R01AI55624 to J.H.W.; Royal Society University Research fellowship to F.M.J.; P.D.E. was supported by the BBSRC via the Babraham Institute. This is the final version of the article. It first appeared from PLOS via http://dx.doi.org/10.1371/journal.pbio.100200
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ~99% of the euchromatic genome and is accurate to an error rate of ~1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
The Rice Thresher
A weekly student newspaper from the Rice University in Houston, Texas that includes campus news and commentaries along with advertising