38 research outputs found

    Structural and functional advances in the evolutionary studies of cells and viruses

    Phylogenomics aims to describe evolutionary relatedness between organisms by analyzing genomic data. The common practice is to produce phylogenomic trees from molecular information in the sequence, order and content of genes in genomes. These phylogenies describe the evolution of life and have become valuable tools for taxonomy. The recent availability of structural and functional data for hundreds of genomes now offers the opportunity to study evolution using more conserved sets of molecular features. Here we report a phylogenomic (i.e. historical) and comparative (ahistorical) analysis that yields novel insights into the origin of cells (Chapters 1-3) and viruses (Chapters 4-6). We utilized conserved protein domain structure information (fold families [FFs] and fold superfamilies [FSFs]) and ontological definitions of gene products (Gene Ontology [GO]) to reconstruct rooted trees of life (ToL), taking advantage of a genomic census of molecular structure and function in the genomes of sampled organisms and viruses. The analysis revealed a global tendency in the proteomic repertoires of cellular organisms to increase domain abundance. ToLs built directly from the census of molecular functions confirmed an early origin of Archaea relative to Bacteria and Eukarya, a conclusion further supported by comparative analysis. The analysis further revealed an ancient history of viruses and their evolution by gene loss. Despite the very high levels of variability seen in the replication strategies, morphologies, and host preferences of extant viruses, we recovered a conserved and ancient structural core of protein domains that was shared between cellular organisms and distantly related viruses. This core, together with an analysis of the evolution of virion morphotypes, strongly suggests an ancient origin for the viral supergroup. 
Moreover, a large number of viral proteins lacked cellular homologs, which strongly contradicts the idea that viruses merely evolve by acquiring cellular genes. These virus-specific proteins confer pathogenic abilities to viruses and appeared late in evolution, suggesting that the shift to a parasitic mode of life happened later in viral evolution. The strong evolutionary association between viruses and cells is likely reminiscent of their ancient co-existence inside primordial cells. Moreover, the crucial dependence of viruses on an intracellular environment for replication creates fertile grounds for genetic innovation. Interestingly, protein domains shared with viruses were widespread in the proteomes of all three cellular superkingdoms, suggesting that viruses mediate gene transfer and crucially enhance biodiversity. The phylogenomic trees identify viruses as a ‘fourth supergroup’ along with the cellular superkingdoms Archaea, Bacteria, and Eukarya. The new model for the origin and evolution of viruses and cells is backed by strong molecular data and is compatible with the existing models of viral evolution. Our experiments indicate that structure and functionomic data represent a useful addition to the set of molecular characters used for tree reconstruction and that ToLs carry in deep branches considerable predictive power to explain the evolution of living organisms and viruses.

    The evolution of LOL, the secondary metabolite gene cluster for insecticidal loline alkaloids in fungal endophytes of grasses.

    LOL is a novel secondary metabolite gene cluster associated with the production of loline alkaloids (saturated 1-aminopyrrolizidine alkaloids with an oxygen bridge) exclusively in closely related grass-endophyte species in the genera Epichloë and Neotyphodium. In this study I characterize the LOL cluster in E. festucae, presenting sequence corresponding to 10 individual lol genes, defining the boundaries of the cluster, and evaluating the genomic DNA region flanking LOL. In addition, I present LOL sequence from two further species, Neotyphodium coenophialum and Neotyphodium sp. PauTG-1. Together with two recently published LOL clusters from N. uncinatum, these data allow for a powerful phylogenetic comparison of five clusters from four closely related species. There is a high degree of microsynteny (conserved gene order and orientation) among the five LOL clusters, allowing the prediction of potential transcriptional co-regulatory binding motifs in lol promoter regions. The relatedness of LOL clusters is especially interesting in light of the history of interspecific hybridizations that generated the asexual Neotyphodium lineages. In fact, three of the clusters appear to have been introduced to different Neotyphodium species by the same ancestral Epichloë species, for which present-day isolates are no longer able to produce lolines. To address the evolutionary origins of the cluster, I have investigated the phylogenetic relationships of particular lol ORFs to their paralogous primary metabolism genes (and gene families) from endophytes, other fungi and even other kingdoms. I present extensive evidence that at least two individual lol genes have evolved from primary metabolism genes within the fungal ancestors of endophytes, rather than being introduced via horizontal gene transfer. 
I also present complementation studies in Neurospora crassa exploring the functional divergence of one lol gene from its primary metabolism paralog. While it is clear that these insecticidal compounds should confer a selective advantage on the fungus and its host, thus explaining preservation of the trait, this analysis explores the evolutionary origin and maintenance of the genes that comprise LOL and of the cluster itself.

    An evolutionary genomics approach towards analysis of genes implicated in transmission of trypanosomes between tsetse fly and mammalian host

    Magister Scientiae - MSc. Human African trypanosomiasis is the world’s third most important parasitic disease affecting human health after malaria and schistosomiasis. The World Health Organization estimates approximately 60 million people at risk in sub-Saharan Africa and up to 50,000 deaths per year caused by trypanosomiasis. Current management of human African trypanosomiasis relies on active surveillance and chemotherapy of infected patients. Efforts to develop a vaccine to immunize the human host have been hampered by antigenic variation of the parasite’s cell coat. The advent of the genome era has opened up opportunities for developing novel strategies for interrupting the transmission cycle of trypanosomes, specifically using any of the three players: the human host, the tsetse fly vector and/or the parasite. The human genome has been deciphered and the genomes of several trypanosome species have been sequenced. Sequencing of additional neglected trypanosome species is in progress. The tsetse fly genome is currently being sequenced as part of the genomic activities of the International Glossina Genome Initiative (IGGI). In an attempt to support the tsetse fly sequencing effort, expressed sequence tags (ESTs) from various tissues and developmental stages of Glossina morsitans have been generated. In this study, tsetse fly EST data were analyzed using bioinformatics approaches, focusing on transcripts encoding serpin genes implicated in the immune defenses of tsetse flies. Glossina morsitans homologues of Drosophila melanogaster serpin4, serpin5, and serpin27A and Anopheles gambiae serpin10 were identified in the tsetse fly EST contigs. Comparison of the reactive center loop of tsetse fly serpins with human α-1-antitrypsin suggests that these tsetse serpins are inhibitory. Preliminary EST clustering did not succeed in assembling 3564 Tsal-encoded ESTs into one contig. In this study, these ESTs were assembled together with three published Tsal cDNAs. 
A total of 29 Tsal-encoded contigs were generated. An analysis of the sequence variation within the Tsal EST assembled contigs identified five single base mismatches namely A-T, T-A, G-T and T-G. Results from this study form a basis on which genetic and biochemical experimental studies can be designed, a process that can be carried out successfully once a reference genome is available. Specifically, such studies could aim at genetic modification of tsetse flies towards populations that are uninhabitable by trypanosomes. Ultimately, this will supplement current vector control strategies towards elimination of human African trypanosomiasis.

    Needles in a haystack of protein diversity: Interrogation of complex biological samples through specialized strategies in bottom-up proteomics uncover peptides of interest for diverse applications

    Peptide identification is at the core of bottom-up proteomics measurements. However, even with state-of-the-art mass spectrometric instrumentation, peptide-level information is still lost or missing in these types of experiments. Reasons behind missing peptide identifications in bottom-up proteomics include variable peptide ionization efficiencies, ion suppression effects, as well as the occurrence of chimeric spectra that can lower the efficacy of database search strategies. Peptides derived from naturally abundant proteins in a biological system also have better chances of being identified in comparison to the ones produced from less abundant proteins, at least in regular discovery-based proteomics experiments. This dissertation focused on the recovery of the “missing or hidden proteome” information in complex biological matrices by approaching this challenge under a peptide-centric view and implementing different liquid chromatography tandem mass spectrometry (LC-MS/MS) experimental workflows. In particular, the projects presented here covered: (1) the feasibility of applying a liquid chromatography-multiple reaction monitoring MS methodology for the targeted identification of peptides serving as surrogates of protein biomarkers in environmental matrices with unknown microbial diversities; (2) the evaluation of selecting unique tryptic peptides in silico that can distinguish groups of proteins, instead of individual proteins, for targeted proteomics workflows; (3) maximizing peptide identification in spectral data collected from different LC-MS/MS setups by applying a multi-peptide-spectrum-match algorithm; and (4) showing that LC-MS/MS combined with de novo assisted-database searches is a feasible strategy for the comprehensive identification of peptides derived from native proteolytic mechanisms in biological systems.

    Single-Cell Genomics Reveals a Diverse Metabolic Potential of Uncultivated Desulfatiglans-Related Deltaproteobacteria Widely Distributed in Marine Sediment

    Desulfatiglans-related organisms comprise one of the most abundant deltaproteobacterial lineages in marine sediments, where they occur throughout the sediment column in a gradient of increasing sulfate and organic carbon limitation with depth. Characterized Desulfatiglans isolates are dissimilatory sulfate reducers able to grow by degrading aromatic hydrocarbons. The ecophysiology of environmental Desulfatiglans populations is poorly understood; however, utilization of aromatic compounds may explain their predominance in marine subsurface sediments. We sequenced and analyzed seven Desulfatiglans-related single-cell genomes (SAGs) from Aarhus Bay sediments to characterize their metabolic potential with regard to aromatic compound degradation and energy metabolism. The average genome assembly size was 1.3 Mbp and completeness estimates ranged between 20 and 50%. Five of the SAGs (group 1) originated from the sulfate-rich surface part of the sediment while two (group 2) originated from sulfate-depleted subsurface sediment. Based on 16S rRNA gene amplicon sequencing, group 2 SAGs represent the more frequent types of Desulfatiglans populations in Aarhus Bay sediments. Genes indicative of aromatic compound degradation could be identified in both groups, but the two groups were metabolically distinct with regard to energy conservation. Group 1 SAGs carry a full set of genes for dissimilatory sulfate reduction, whereas the group 2 SAGs lack any genetic evidence for sulfate reduction. The latter may be due to incompleteness of the SAGs, but as alternative energy metabolisms group 2 SAGs carry the genetic potential for growth by acetogenesis and fermentation. Group 1 SAGs encode reductive dehalogenase genes, allowing them to access organohalides and possibly conserve energy by their reduction. Both groups, unlike their cultured relatives, possess sulfatases, allowing them to utilize sulfate esters as a source of organic carbon and sulfate. 
In conclusion, the uncultivated marine Desulfatiglans populations are metabolically diverse, likely reflecting different strategies for coping with energy and sulfate limitation in the subsurface seabed.

    Thermal stress in the Antarctic clam Laternula and the temperate mussel Mytilus.

    There is ample evidence that a period of global warming is already affecting ecosystems worldwide. In order to predict the effects of a warming climate on organism physiology and biogeography, a description of the mechanisms involved in species responses to elevated temperatures is needed. Comparative studies examining species inhabiting different environments provide important information on the relative susceptibility of ecosystems to climate change. Antarctic marine ectotherms have evolved in a stable cold environment. They live within a narrow thermal window and experience stress with small elevations in temperature. In contrast, temperate intertidal species experience considerable temperature changes on a daily basis. The Antarctic clam Laternula elliptica and temperate mussel Mytilus edulis were selected as representative species for their respective environments. This thesis presents i) a description of the construction of a cDNA microarray for L. elliptica, ii) analysis of gene expression in L. elliptica upon acute exposure to 3°C, iii) a comparative study between the two species at the protein level via two-dimensional electrophoresis, and iv) analysis of corticosteroid synthesis in Mytilus. Significant changes in the expression of 294 clones, representing 160 transcripts, were observed. Of these, 33 were identified by sequence similarity searches and classified to a variety of cellular functions including protein turnover, folding and chaperoning, intracellular signalling and trafficking and cytoskeletal activity. In addition, the expression of 264 and 375 proteins in L. elliptica and M. edulis, respectively, was studied, 14 and 26 of which presented changes in expression between treatments. Only changes in proteins involved in energy metabolism were detected in both species. A higher level of biological variation in response to stress was observed in M. edulis at the protein level. 
The relevance of the observed results in determining the relative susceptibility of these species to climate change is discussed.