    Database: The Journal of Biological Databases and Curation

    Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates make Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org

    Kinetoplastid Phylogenomics Reveals the Evolutionary Innovations Associated with the Origins of Parasitism

    The evolution of parasitism is a recurrent event in the history of life and a core problem in evolutionary biology. Trypanosomatids are important parasites and include the human pathogens Trypanosoma brucei, Trypanosoma cruzi, and Leishmania spp., which in humans cause African trypanosomiasis, Chagas disease, and leishmaniasis, respectively. Genome comparison between trypanosomatids reveals that these parasites have evolved specialized cell-surface protein families, overlaid on a well-conserved cell template. Understanding how these features evolved and which ones are specifically associated with parasitism requires comparison with related non-parasites. We have produced genome sequences for Bodo saltans, the closest known non-parasitic relative of trypanosomatids, and a second bodonid, Trypanoplasma borreli. Here we show how genomic reduction and innovation contributed to the character of trypanosomatid genomes. We show that gene loss has “streamlined” trypanosomatid genomes, particularly with respect to macromolecular degradation and ion transport, but consistent with a widespread loss of functional redundancy, while adaptive radiations of gene families involved in membrane function provide the principal innovations in trypanosomatid evolution. Gene gain and loss continued during trypanosomatid diversification, resulting in the asymmetric assortment of ancestral characters such as peptidases between Trypanosoma and Leishmania, genomic differences that were subsequently amplified by lineage-specific innovations after divergence. Finally, we show how species-specific, cell-surface gene families (DGF-1 and PSA) with no apparent structural similarity are independent derivations of a common ancestral form, which we call “bodonin.” This new evidence defines the parasitic innovations of trypanosomatid genomes, revealing how a free-living phagotroph became adapted to exploiting hostile host environments

    Assembly and annotation tools for analysis of large contiguous regions of the maize genome

    Sequencing projects continue to tackle larger and more challenging genomes. Many of the grass genomes have important agricultural and economic impacts. Sequencing of the maize genome is now underway, and other genomes important as foods and biofuels will be sequenced in the near future. The grass genomes are very large and therefore computationally complex to assemble and annotate. Long terminal repeat retrotransposons make up significant portions of many of the larger grass genomes. Their repetition across the genome, their terminal repeats, and their nested cluster configuration make assembly of sequence clones challenging and identification of gene regions difficult. Tools are needed to assist with the more difficult types of genomes that are sequenced today and will be sequenced in the future. Much is still not known about the landscape of the maize genome. While many smaller regions of maize have been sequenced, they cannot give a full picture of the structure and layout of gene islands and of repeat clusters. In addition, because the available sequenced contigs of maize are small, a true view of the relationships between maize and other grass genomes remains elusive. In this thesis I provide tools necessary for both assembly and annotation of highly repetitive genomes, and I use these tools to construct what are currently the two longest maize sequence contigs. These contigs provide a resource for addressing many of the unanswered questions about the maize genome.

    The complete plastome sequences of eleven Capsicum genotypes: Insights into DNA variation and molecular evolution

    Members of the genus Capsicum are of great economic importance, including both wild forms and cultivars of peppers and chilies. The high number of potentially informative characteristics that can be identified through next-generation sequencing technologies gave a huge boost to evolutionary and comparative genomic research in higher plants. Here, we determined the complete nucleotide sequences of the plastomes of eight Capsicum species (eleven genotypes), representing the three main taxonomic groups in the genus and estimated molecular diversity. Comparative analyses highlighted a wide spectrum of variation, ranging from point mutations to small/medium size insertions/deletions (InDels), with accD, ndhB, rpl20, ycf1, and ycf2 being the most variable genes. The global pattern of sequence variation is consistent with the phylogenetic signal. Maximum-likelihood tree estimation revealed that Capsicum chacoense is sister to the baccatum complex. Divergence and positive selection analyses unveiled that protein-coding genes were generally well conserved, but we identified 25 positive signatures distributed in six genes involved in different essential plastid functions, suggesting positive selection during evolution of Capsicum plastomes. Finally, the identified sequence variation allowed us to develop simple PCR-based markers useful in future work to discriminate species belonging to different Capsicum complexes

    Multiple species comparative analysis of human chromosome 22 between markers D22S1687 and D22S419 and gene expression profiling in zebrafish.

    Comparison of a 4.5 Mb region of human chromosome 22 between markers D22S1687 and D22S419 with the syntenic region in chimpanzee revealed an overall DNA sequence identity of approximately 97.6% and a Ka/Ks ratio for known protein-coding genes of approximately 0.25, with the majority of amino acid changes occurring between hydrophilic amino acids, followed by changes between hydrophobic amino acids, and the fewest changes from hydrophobic to hydrophilic amino acids or vice versa. Thus, the first major conclusion of this study is that, overall, this chromosomal region is highly conserved between human and chimpanzee, and the known protein-coding genes are undergoing purifying selection, in which 75% of nucleotide substitutions that led to amino acid changes were eliminated by adaptive evolution. Major large-scale insertions or deletions that resulted in gene number differences between human and chimpanzee were discovered in the IGLL and LCR22s within this region, with four human insertions from 6 Kb to 75 Kb and three chimpanzee insertions from 12 Kb to 74 Kb observed in the IGLL region, two human insertions of 59 Kb and 36 Kb in LCR22-6, and a 67 Kb chimpanzee insertion in LCR22-8. Small-scale insertions and deletions, in addition to exon shuffling, elevated nucleotide divergence rate and positive selection, were also observed in the putative genes, partially duplicated genes and pseudogenes in the IGLL and LCR22s. Thus, the second major conclusion of this study is that the major differences between human and chimpanzee in this region lie in the highly repetitive regions of the IGLL and the LCR22s. Through whole-mount in situ hybridization studies, a total of 12 human orthologs in zebrafish, including 4 newly predicted putative genes with no previously known expression profile and function, showed specific expression in the developing zebrafish embryonic central nervous system, optic system, neural crest cells, otic vesicle, liver, and notochord. Thus, the third major conclusion of this study is that many predicted genes that currently lack expression data and functional information are likely expressed in a time- and tissue-specific manner during development.

    Molecular investigation of Australian termites and their gut symbionts
