The evolution, modifications and interactions of proteins and RNAs
Proteins and RNAs are two of the most versatile macromolecules that carry out almost all functions within living organisms. In this thesis I have explored evolutionary and regulatory aspects of proteins and RNAs by studying their structures, modifications and interactions. In the first chapter of my thesis I investigate domain atrophy, a term I coined to describe large-scale deletions of core structural elements within protein domains. By looking into truncated domain boundaries across several domain families using Pfam, I was able to identify rare cases of domains that showed atrophy. Given that even point mutations can be deleterious, it is surprising that proteins can tolerate such large-scale deletions. Some of the structures of atrophied domains show novel protein-protein interaction interfaces that appear to compensate and stabilise their folds. Protein-protein interactions are largely influenced by surface and charge complementarity, while RNA-RNA interactions are governed by base-pair complementarity; the two interaction types are inherently different, and these differences might be observed in their interaction networks. Based on this hypothesis I have explored the protein-protein, RNA-protein and RNA-RNA interaction networks of yeast in the second chapter. By analysing the three networks I found no major differences in their network properties, which indicates an underlying uniformity in their interactomes despite their individual differences. In the third chapter I focus on RNA-protein interactions by investigating post-translational modifications (PTMs) in RNA-binding proteins (RBPs). By comparing occurrences of PTMs, I observe that RBPs undergo significantly more PTMs than non-RBPs. I also found that within RBPs, PTMs are more frequently targeted at regions that directly interact with RNA than at regions that do not. Moreover, disorder and amino acid composition were not observed to significantly influence the differential PTMs observed between RBPs and non-RBPs. The results point to a direct regulatory role of PTMs in RNA-protein interactions of RBPs. In the last chapter, I explore regulatory RNA-RNA interactions. Using differential expression data of mRNAs and lncRNAs from mouse models of hereditary hemochromatosis, I investigated competing regulatory interactions between mRNA, lncRNA and miRNA. A mutual interaction network was created from the predicted miRNA interaction sites on mRNAs and lncRNAs to identify regulatory RNAs in the disease. I also observed interesting relations between sense-antisense mRNA-lncRNA pairs that indicate mutual regulation of expression levels through an as yet unknown mechanism.
Structure and dynamics of genome-wide diversity in Prochlorococcus
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2008. Includes bibliographical references. The capability of microbes to thrive in myriad environments has its foundation in the diversity of microbial genomes. Here we explore adaptation and diversification through the lens of the marine cyanobacterium Prochlorococcus, which comprises a group of closely related ecotypes that together perform most of the primary production in low-nutrient regions of the world oceans. Prochlorococcus was one of the first microbes in which a genomic basis for ecological differentiation was characterized, in the distinction between high- and low-light adapted ecotypes. It is clear, however, that other axes of differentiation are important, including temperature, nutrient availability, and biotic interactions. This thesis seeks to characterize salient aspects of genomic diversity in Prochlorococcus and to advance understanding of the ecological and evolutionary forces that shape this variation. We show that closely related isolates harbor remarkably dissimilar gene complements, and much of this variation is concentrated in specific genome regions, termed islands, that appear to have arisen through phage-mediated gene transfer. Several island-encoded genes likely play important metabolic roles, as inferred from their strong and specific upregulation under stress conditions. A region of the genome involved in phosphate assimilation has highly variable gene content that appears to reflect oceanic phosphate availability. Accordingly, we find extreme differences between strains in the transcriptional response to phosphate starvation. Using metagenomic approaches, we describe high coexisting diversity in natural Prochlorococcus populations. Nevertheless, this diversity is structured: a core genome of universal single-copy genes is augmented by a flexible genome. The population genome changes with water depth, reflecting genotypic variation among ecotypes and within the dominant ecotype. Finally, we show that the transcriptomes of wild Prochlorococcus correlate strongly with transcriptomes in culture as measured by microarrays. Genes of unknown function are among the most highly expressed in the wild. Several highly expressed genes show signatures of intragenic recombination, a process that likely influences their diversity and function. Overall, this work demonstrates that environmental factors such as light, temperature, nutrient availability, and interspecies interactions each leave different marks in the genome over different scales of time and space. Understanding microbial evolution requires that we dissect diversity over these multiple scales. By Maureen Lynn Coleman. Ph.D.
Structure and computational analysis of a novel protein with metallopeptidase-like and circularly permuted winged-helix-turn-helix domains reveals a possible role in modified polysaccharide biosynthesis.
Background: CA_C2195 from Clostridium acetobutylicum is a protein of unknown function. Sequence analysis predicted that part of the protein contained a metallopeptidase-related domain. There are over 200 homologs of similar size in large sequence databases such as UniProt, with pairwise sequence identities in the range of ~40-60%. CA_C2195 was chosen for crystal structure determination for structure-based function annotation of novel protein sequence space. Results: The structure confirmed that CA_C2195 contained an N-terminal metallopeptidase-like domain. The structure revealed two extra domains: an α+β domain inserted in the metallopeptidase-like domain and a C-terminal circularly permuted winged-helix-turn-helix domain. Conclusions: Based on our sequence and structural analyses using the crystal structure of CA_C2195 we provide a view into the possible functions of the protein. From contextual information from gene-neighborhood analysis, we propose that rather than being a peptidase, CA_C2195 and its homologs might play a role in biosynthesis of a modified cell-surface carbohydrate in conjunction with several sugar-modification enzymes. These results provide the groundwork for the experimental verification of the function.
Genetic analysis of bacteriophages from clinical and environmental samples
Bacteriophages, viruses infecting bacteria, are uniformly present in any location where there are high numbers of bacteria, both in the external environment and the human body. Knowledge of their diversity is limited by the difficulty of culturing the host species and by the lack of a universal marker gene present in all viruses. Metagenomics is a powerful tool that can be used to analyse viral communities in their natural environments. The aim of this study was to investigate diverse populations of uncultured viruses from clinical samples (sputum from a patient with cystic fibrosis, CF) and environmental samples (sludge from a dairy food wastewater treatment plant) containing rich bacterial populations, using genetic and metagenomic analyses. Metagenomic sequencing of viruses obtained from these samples revealed that the majority of the metagenomic reads (97-99%) were novel when compared to the NCBI protein database using BLAST. A large proportion of assembled contigs were assignable as novel phages or uncharacterised prophages, the next largest assignable group being single-stranded eukaryotic virus genomes. Sputum from a cystic fibrosis patient contained DNA typical of phages of bacteria that are traditionally involved in CF lung infections and other bacteria that are part of the normal oral flora. The only eukaryotic virus detected in the CF sputum was Torque Teno virus (TTV). A substantial number of assigned sequences from dairy wastewater could be affiliated with phages of bacteria that are typically found in soil and aquatic environments, including wastewater. Eukaryotic viral sequences were dominated by plant pathogens from the Geminiviridae and Nanoviridae families, and animal pathogens from the Circoviridae family. Antibiotic resistance genes were detected in both metagenomes, suggesting phages could be a source of transmissible antimicrobial resistance. Overall, diversity of viruses in the CF sputum was low, with 89 distinct viral genotypes predicted, and higher (409 genotypes) in the wastewater. Function-based screening of a metagenomic library constructed from DNA extracted from dairy food wastewater viruses revealed candidate promoter sequences that have the ability to drive expression of GFP in a promoter-trap vector in Escherichia coli. The majority of the cloned DNA sequences selected by the assay were related to ssDNA circular eukaryotic viruses and phages which formed a minority of the metagenome assembly, and many lacked any significant homology to known database sequences. Natural diversity of bacteriophages in wastewater samples was also examined by PCR amplification of the major capsid protein sequences, conserved within T4-type bacteriophages from the Myoviridae family. Phylogenetic analysis of capsid sequences revealed that dairy wastewater contained mainly diverse and uncharacterized phages, while some showed a high level of similarity with phages from geographically distant environments.