207 research outputs found

Species Choice for Comparative Genomics: Being Greedy Works

Author: Fabio Pardi
International Mouse Genome Sequencing Consortium
Leonid Kruglyak
Nick Goldman
Rat Genome Sequencing Consortium
Publication venue: Public Library of Science
Publication date: 01/01/2005

Several projects investigating genetic function and evolution through sequencing and comparison of multiple genomes are now underway. These projects consume many resources, and appropriate planning should be devoted to choosing which species to sequence, potentially involving cooperation among different sequencing centres. A widely discussed criterion for species choice is the maximisation of evolutionary divergence. Our mathematical formalization of this problem surprisingly shows that the best long-term cooperative strategy coincides with the seemingly short-term “greedy” strategy of always choosing the next best single species. Other criteria influencing species choice, such as medical relevance or sequencing costs, can also be accommodated in our approach, suggesting our results' broad relevance in scientific policy decisions.
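The greedy strategy named in this abstract can be illustrated with a minimal sketch. It assumes a rooted tree and measures "evolutionary divergence" as the total branch length connecting the chosen species to the root; that rooted variant, the toy tree, its branch lengths, and the function name `greedy_species_choice` are all hypothetical and are not taken from the paper.

```python
# Hypothetical toy tree (not data from the paper): child -> parent, and the
# length of the branch joining each node to its parent.
PARENT = {
    "human": "primates", "chimp": "primates", "primates": "root",
    "mouse": "rodents", "rat": "rodents", "rodents": "root",
    "dog": "root",
}
LENGTH = {
    "human": 0.125, "chimp": 0.125, "primates": 0.5,
    "mouse": 0.25, "rat": 0.375, "rodents": 0.5,
    "dog": 0.75,
}
SPECIES = ["human", "chimp", "mouse", "rat", "dog"]


def greedy_species_choice(parent, length, species, k):
    """Pick up to k species, each time adding the one that contributes the
    most branch length not already covered by earlier picks."""
    covered = set()   # nodes whose branch to the parent is already counted
    chosen = []
    total = 0.0

    def new_branches(leaf):
        # Walk towards the root until we reach a branch that is already covered.
        path = []
        node = leaf
        while node in parent and node not in covered:
            path.append(node)
            node = parent[node]
        return path

    for _ in range(min(k, len(species))):
        candidates = [s for s in species if s not in chosen]
        best = max(candidates,
                   key=lambda s: sum(length[n] for n in new_branches(s)))
        path = new_branches(best)
        total += sum(length[n] for n in path)
        covered.update(path)
        chosen.append(best)
    return chosen, total


print(greedy_species_choice(PARENT, LENGTH, SPECIES, 3))
```

On this toy tree the sketch picks rat, then dog, then human. Other criteria mentioned in the abstract, such as sequencing cost, could be folded into the scoring function; that extension is not shown here.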

Crossref

Directory of Open Access Journals

PubMed Central

Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model

Author: Chris P Ponting
Gerton Lunter
International Chicken Genome Sequencing Consortium
International Human Genome Sequencing Consortium
Jotun Hein
Mouse Genome Sequencing Consortium
Steven Henikoff
Publication venue: Public Library of Science
Publication date: 01/01/2005

It has become clear that a large proportion of functional DNA in the human genome does not code for protein. Identification of this non-coding functional sequence using comparative approaches is proving difficult and has previously been thought to require deep sequencing of multiple vertebrates. Here we introduce a new model and comparative method that, instead of nucleotide substitutions, uses the evolutionary imprint of insertions and deletions (indels) to infer the past consequences of selection. The model predicts the distribution of indels under neutrality, and shows an excellent fit to human–mouse ancestral repeat data. Across the genome, many unusually long ungapped regions are detected that are unaccounted for by the neutral model, and which we predict to be highly enriched in functional DNA that has been subject to purifying selection with respect to indels. We use the model to determine the proportion under indel-purifying selection to be between 2.56% and 3.25% of human euchromatin. Since annotated protein-coding genes comprise only 1.2% of euchromatin, these results lend further weight to the proposition that more than half the functional complement of the human genome is non-protein-coding. The method is surprisingly powerful at identifying selected sequence using only two or three mammalian genomes. Applying the method to the human, mouse, and dog genomes, we identify 90 Mb of human sequence under indel-purifying selection, at a predicted 10% false-discovery rate and 75% sensitivity. As expected, most of the identified sequence represents unannotated material, while the recovered proportions of known protein-coding and microRNA genes closely match the predicted sensitivity of the method. The method's high sensitivity to functional sequence such as microRNAs suggests that as yet unannotated microRNA genes are enriched among the sequences identified.
Furthermore, its independence of substitutions allowed us to identify sequence that has been subject to heterogeneous selection, that is, sequence subject to both positive selection with respect to substitutions and purifying selection with respect to indels. The ability to identify elements under heterogeneous selection enables, for the first time, the genome-wide investigation of positive selection on functional elements other than protein-coding genes.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

Adaptive Evolution of Conserved Noncoding Elements in Mammals

Author: Emmanouil T Dermitzakis
International Chicken Genome Sequencing Consortium
International Human Genome Sequencing Consortium
Jonathan K Pritchard
Mouse Genome Sequencing Consortium
Su Yeon Kim
Publication venue: Public Library of Science
Publication date: 01/09/2007

Conserved noncoding elements (CNCs) are an abundant feature of vertebrate genomes. Some CNCs have been shown to act as cis-regulatory modules, but the function of most CNCs remains unclear. To study the evolution of CNCs, we have developed a statistical method called the “shared rates test” to identify CNCs that show significant variation in substitution rates across branches of a phylogenetic tree. We report an application of this method to alignments of 98,910 CNCs from the human, chimpanzee, dog, mouse, and rat genomes. We find that ∼68% of CNCs evolve according to a null model where, for each CNC, a single parameter models the level of constraint acting throughout the phylogeny linking these five species. The remaining ∼32% of CNCs show departures from the basic model including speed-ups and slow-downs on particular branches and occasionally multiple rate changes on different branches. We find that a subset of the significant CNCs have evolved significantly faster than the local neutral rate on a particular branch, providing strong evidence for adaptive evolution in these CNCs. The distribution of these signals on the phylogeny suggests that adaptive evolution of CNCs occurs in occasional short bursts of evolution. Our analyses suggest a large set of promising targets for future functional studies of adaptation.

Crossref

Directory of Open Access Journals

PubMed Central

A Phylogenomic Study of Human, Dog, and Mouse

Author: Adrian Schneider
Gaston Gonnet
Gina Cannarozzi
International Human Genome Consortium
Mouse Genome Sequencing Consortium
Philip E Bourne
Publication venue: Public Library of Science
Publication date: 01/01/2007

In recent years the phylogenetic relationship of mammalian orders has been addressed in a number of molecular studies. These analyses have frequently yielded inconsistent results with respect to some basal ordinal relationships. For example, the relative placement of primates, rodents, and carnivores has differed in various studies. Here, we attempt to resolve this phylogenetic problem by using data from completely sequenced nuclear genomes to base the analyses on the largest possible amount of data. To minimize the risk of reconstruction artifacts, the trees were reconstructed under different criteria: distance, parsimony, and likelihood. For the distance trees, distance metrics that measure independent phenomena (amino acid replacement, synonymous substitution, and gene reordering) were used, as it is highly improbable that all of the trees would be affected the same way by any reconstruction artifact. In contradiction to the currently favored classification, our results based on full-genome analysis of the phylogenetic relationship between human, dog, and mouse yielded overwhelming support for a primate–carnivore clade with the exclusion of rodents.

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

PubMed Central

Transcription Factor Map Alignment of Promoter Regions

Author: Enrique Blanco
International Chicken Genome Sequencing Consortium
International Mouse Genome Sequencing Consortium
Philip Bourne
Roderic Guigó
Temple F Smith
The Gene Ontology Consortium
Xavier Messeguer
Publication venue: Public Library of Science
Publication date: 01/01/2006

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. Our approach therefore does not compare the nucleotide sequences of promoters directly. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels, to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent, much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements that cannot be detected by typical sequence alignments.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Boston University Institutional Repository (OpenBU)

Directory of Open Access Journals

PubMed Central

UPF Digital Repository

Diposit Digital de la Universitat de Barcelona

On the Origin and Evolution of Vertebrate Olfactory Receptor Genes: Comparative Genome Analysis Among 23 Chordate Species

Olfaction is a primitive sense in organisms. Both vertebrates and insects have receptors for detecting odor molecules in the environment, but the evolutionary origins of these genes are different. Among studied vertebrates, mammals have ∼1,000 olfactory receptor (OR) genes, whereas teleost fishes have much smaller (∼100) numbers of OR genes. To investigate the origin and evolution of vertebrate OR genes, I attempted to determine near-complete OR gene repertoires by searching whole-genome sequences of 14 nonmammalian chordates, including cephalochordates (amphioxus), urochordates (ascidian and larvacean), and vertebrates (sea lamprey, elephant shark, five teleost fishes, frog, lizard, and chicken), followed by a large-scale phylogenetic analysis in conjunction with mammalian OR genes identified from nine species. This analysis showed that the amphioxus has >30 vertebrate-type OR genes though it lacks distinctive olfactory organs, whereas all OR genes appear to have been lost in the urochordate lineage. Some groups of genes (θ, κ, and λ) that are phylogenetically nested within vertebrate OR genes showed few gene gains and losses, which is in sharp contrast to the evolutionary pattern of OR genes, suggesting that they are actually non-OR genes. Moreover, the analysis demonstrated a great difference in OR gene repertoires between aquatic and terrestrial vertebrates, reflecting the necessity for the detection of water-soluble and airborne odorants, respectively. However, a minor group (β) of genes that are atypically present in both aquatic and terrestrial vertebrates was also found. These findings should provide a critical foundation for further physiological, behavioral, and evolutionary studies of olfaction in various organisms.

Crossref

PubMed Central

Comparative studies of glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1: evidence for a eutherian mammalian origin for the GPIHBP1 gene from an LY6-like gene

Glycosylphosphatidylinositol-anchored high-density lipoprotein-binding protein 1 (GPIHBP1) functions as a platform and transport agent for lipoprotein lipase (LPL), which functions in the hydrolysis of chylomicrons, principally in heart, skeletal muscle and adipose tissue capillary endothelial cells. Previous reports of genetic deficiency for this protein have described severe chylomicronemia. Comparative GPIHBP1 amino acid sequences and structures and GPIHBP1 gene locations were examined using data from several mammalian genome projects. Mammalian GPIHBP1 genes usually contain four coding exons on the positive strand. Mammalian GPIHBP1 sequences shared 41–96% identities as compared with 9–32% sequence identities with other LY6-domain-containing human proteins (LY6-like). The human N-glycosylation site was predominantly conserved among other mammalian GPIHBP1 proteins except cow, dog and pig. Sequence alignments, key amino acid residues and conserved predicted secondary structures were also examined, including the N-terminal signal peptide, the acidic amino acid sequence region which binds LPL, the glycosylphosphatidylinositol linkage group, the Ly6 domain and the C-terminal α-helix. Comparative and phylogenetic studies of mammalian GPIHBP1 suggested that it originated in eutherian mammals from a gene duplication event of an ancestral LY6-like gene and subsequent integration of exon 2, which may have been derived from BCL11A (B-cell CLL/lymphoma 11A gene) encoding an extended acidic amino acid sequence.

Crossref

Springer - Publisher Connector

PubMed Central

How repetitive are genomes?

Author: A Faiella
AE Mirsky
B Haubold
Bernhard Haubold
CA Thomas Jn
D Gusfield
D Tautz
EA Bennett
EPC Rocha
G Achaz
International Human Genome Sequencing Consortium
J Liu
JI Jordan
JM Hancock
L Zhou
LE Orgel
M Hofnung
MA Nóbrega
Mouse Genome Sequencing Consortium
N Volfovsky
OG Troyanskaya
R Development Core Team
RA Aras
Rat Genome Sequencing Consortium
RJ Britten
S Kurtz
SS Shapiro
The Chimpanzee Sequencing and Analysis Consortium
Thomas Wiehe
TR Gregory
WF Doolittle
Y Tian
YL Orlov
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Genome sequences vary strongly in their repetitiveness and the causes for this are still debated. Here we propose a novel measure of genome repetitiveness, the index of repetitiveness, I(r), which can be computed in time proportional to the length of the sequences analyzed. We apply it to 336 genomes from all three domains of life. RESULTS: The expected value of I(r) is zero for random sequences of any G/C content and greater than zero for sequences with excess repeats. We find that the I(r) of archaea is significantly smaller than that of eubacteria, which in turn is smaller than that of eukaryotes. Mouse chromosomes have a significantly higher I(r) than human chromosomes and within each genome the Y chromosome is most repetitive. A sliding window analysis reveals that the human HOXA cluster and two surrounding genes are characterized by local minima in I(r). A program for calculating the I(r) is freely available at . CONCLUSION: The general measure of DNA repetitiveness proposed in this paper can be efficiently computed on a genomic scale. This reveals a broad spectrum of repetitiveness among diverse genomes which agrees qualitatively with previous studies of repeat content. A sliding window analysis helps to analyze the intragenomic distribution of repeats.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

Mouse versus Rat: Profound Differences in Meiotic Regulation at the Level of the Isolated Oocyte

Author: Biggers
Brannstrom
Chen
Cho
Dekel
Downs
Eppig
Fagbohun
Han
Hubbard
LaRosa
Larsen
Magnusson
Makris
Masciarelli
Mouse Genome Sequencing Consortium
Mullins
Norris
Racowsky
Rat Genome Sequencing Project Consortium
Ratner
Reizel
Richard
Romero
Schultz
Sela-Abramovich
Shitsukawa
Tornell
Tsafriri
Vaccari
Vivarelli
Zeilmaker
Zhang
Zhao
Publication venue: e-Publications@Marquette
Publication date: 01/01/2011

Cumulus cell-enclosed oocytes (CEO), denuded oocytes (DO), or dissected follicles were obtained 44–48 hr after priming immature mice (20–23 days old) with 5 IU or immature rats (25–27 days old) with 12.5 IU of equine chorionic gonadotropin, and exposed to a variety of culture conditions. Mouse oocytes were more effectively maintained in meiotic arrest by hypoxanthine, dbcAMP, IBMX, milrinone, and 8-Br-cGMP. Atrial natriuretic peptide, a guanylate cyclase activator, suppressed maturation in CEO from both species, but mycophenolic acid reversed IBMX-maintained meiotic arrest in mouse CEO with little activity in rat CEO. IBMX-arrested mouse, but not rat, CEO were induced to undergo germinal vesicle breakdown (GVB) by follicle-stimulating hormone (FSH) and amphiregulin, while human chorionic gonadotropin (hCG) was ineffective in both species. Nevertheless, FSH and amphiregulin stimulated cumulus expansion in both species. FSH and hCG were both effective inducers of GVB in cultured mouse and rat follicles while amphiregulin was stimulatory only in mouse follicles. Changing the culture medium or altering macromolecular supplementation had no effect on FSH-induced maturation in rat CEO. The AMP-activated protein kinase (AMPK) activator, AICAR, was a potent stimulator of maturation in mouse CEO and DO, but only marginally stimulatory in rat CEO and ineffective in rat DO. The AMPK inhibitor, compound C, blocked meiotic induction more effectively in hCG-treated mouse follicles and heat-treated mouse CEO. Both agents produced contrasting results on polar body formation in cultured CEO in the two species. Active AMPK was detected in germinal vesicles of immature mouse, but not rat, oocytes prior to hCG-induced maturation in vivo; it colocalized with chromatin after GVB in rat and mouse oocytes, but did not appear at the spindle poles in rat oocytes as it did in mouse oocytes. 
Finally, cultured mouse and rat CEO displayed disparate maturation responses to energy substrate manipulation. These data highlight significant differences in meiotic regulation between the two species, and demonstrate a greater potential in mice for control at the level of the cumulus CEO.

epublications@Marquette

Crossref

PubMed Central

Genome-scale relationships between cytosine methylation and dinucleotide abundances in animals

Author: Aissani
Bernardi
Beutler
Bird
Blake
Burge
Cardon
Cooper
Coulondre
Duret
Fryxell
Galtier
Gentles
Hanai
Ikehata
International Human Genome Sequencing Consortium
Jabbari
Jiang
Karlin
Lefebvre
Lyko
Macleod
Marhold
Martin W. Simmen
Mouse Genome Sequencing Consortium
Ollila
Pfeifer
Regev
Russell
Salser
Shackelton
Shen
Sved
Swartz
Tweedie
Wang
Widom
Zhang
Zhao
Publication venue: Elsevier BV
Publication date: 01/01/2008

In mammalian genomes CpGs occur at one-fifth their expected frequency. This is accepted as resulting from cytosine methylation and deamination of 5-methylcytosine leading to TpG and CpA dinucleotides. The corollary that a CpG deficit should correlate with TpG excess has not hitherto been systematically tested at a genomic level. I analyzed genome sequences (human, chimpanzee, mouse, pufferfish, zebrafish, sea squirt, fruitfly, mosquito, and nematode) to do this and generally to assess the hypothesis that CpG deficit, TpG excess, and other data are accountable in terms of 5-methylcytosine mutation. In all methylated genomes local CpG deficit decreases with higher G + C content. Local TpG surplus, while positively associated with G + C level in mammalian genomes but negatively associated with G + C in nonmammalian methylated genomes, is always explicable in terms of the CpG trend under the methylation model. Covariance of dinucleotide abundances with G + C demonstrates that correlation analyses should control for G + C. Doing this reveals a strong negative correlation between local CpG and TpG abundances in methylated genomes, in accord with the methylation hypothesis. CpG deficit also correlates with CpT excess in mammals, which may reflect enhanced cytosine mutation in the context 5′-YCG-3′. Analyses with repeat-masked sequences show that the results are not attributable to repetitive elements.
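The "expected frequency" of a dinucleotide mentioned in this abstract is commonly taken to be the product of its two base frequencies. The sketch below computes an observed/expected ratio under that common convention; the function name `dinucleotide_obs_exp` is hypothetical, and the paper's own normalisation and windowing may differ.

```python
from collections import Counter


def dinucleotide_obs_exp(seq, dinuc="CG"):
    """Observed/expected ratio of a dinucleotide on a single strand.

    Expected frequency is assumed to be f(first base) * f(second base);
    this is a common convention, not necessarily the one used in the paper.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(n - 1))
    observed = pair_counts[dinuc] / (n - 1)
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    if expected == 0:
        return float("nan")  # ratio is undefined when either base is absent
    return observed / expected


# A ratio well below 1 indicates a deficit, a ratio above 1 an excess.
for name in ("CG", "TG"):
    print(name, dinucleotide_obs_exp("TGTGCACATGCA", name))
```

Applying the same function to fixed-size windows, together with each window's G + C fraction, would give the local values that the abstract correlates; that step is left out to keep the sketch short.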

Crossref

Elsevier - Publisher Connector

Edinburgh Research Explorer