22 research outputs found

    Comparative genomics for studying the proteomes of mucosal microorganisms

    A tremendous number of microorganisms are known to interact with their animal hosts. The outcomes of the interactions between microbes and their animal hosts range from modulating the maintenance of homeostasis to the establishment of processes leading to pathogenesis. Of the numerous species known to inhabit humans, the great majority live on mucosal surfaces, which are highly defended. Despite their importance in human health, little is known about the molecular and cellular basis of most host-microbe interactions across the tremendous diversity of mucosal-adapted microorganisms. The ever-increasing availability of genome sequence data allows systematic comparative genomics studies to identify proteins with potentially important molecular functions at the host-microbe interface. In this study, a genome-wide analysis was performed on 3,021,490 protein sequences derived from 867 complete microbial genome sequences across the three domains of cellular life. The ability of microbes to thrive successfully in a mucosal environment was examined in relation to functional genomics data from a range of publicly available databases. Particular emphasis was placed on the extracytoplasmic proteins of microorganisms that thrive on human mucosal surfaces. These proteins form the interface between the complex host-microbe and microbe-microbe interactions. The large amounts of data involved, combined with the numerous analytical techniques that need to be performed, make the study intractable with conventional bioinformatics. The lack of habitat annotations for microorganisms further compounds the problem of identifying the microbial extracytoplasmic proteins playing important roles in mucosal environments. To address these problems, a distributed high-throughput computational workflow was developed, and a system for mining biomedical literature was trained to automatically identify microorganisms’ habitats.
The workflow integrated existing bioinformatics tools to identify and characterise protein-targeting signals, cell surface-anchoring features, protein domains and protein families. This study successfully demonstrated a large-scale comparative genomics approach utilising a system called Microbase to harness Grid and Cloud computing technologies. A number of conserved protein domains and families that are significantly associated with a specific set of mucosa-inhabiting microorganisms were identified. These conserved protein regions, whose functions were either characterised or unknown, were quite narrow in their taxonomic distribution, with only a few protein domains more widely distributed, suggesting that mucosal microorganisms evolved different strategies and mechanisms for survival in the host mucosal environments. Metabolic and biological processes common to many mucosal microorganisms included carbohydrate and amino acid metabolism, signal transduction, adhesion to host tissues or contents in mucosal environments (e.g. food remnants, mucins), and resistance to host defence mechanisms. Invasive or virulence factors were also identified in pathogenic strains. Several extracytoplasmic protein families were shared among prominent bacterial members of gut microbiota and microbial eukaryotes known to thrive in the same environment, suggesting that the ability of microbes to adapt to particular niches can be influenced by lateral gene transfer. A large number of conserved regions or protein families that potentially play important roles in the mucosa-microbe interactions were revealed by this study. Several of these candidates were proteins of unknown function. The identified candidates were subjected to more detailed computational analysis, providing hypotheses for their function that will be tested experimentally in order to contribute to our understanding of the complex host-microbe interactions.
Among the candidates of unknown function, a novel M60-like domain was identified. The domain was deposited in the Pfam database with accession number PF13402. The M60-like domain is shared amongst a broad range of mucosal microorganisms as well as their vertebrate hosts. Bioinformatics analyses of the M60-like domain suggested a potential catalytic function of the conserved motif as a gluzincin metalloprotease. Targeting signals were detected across microbial M60-like-containing proteins. A mucosa-related carbohydrate-binding module (CBM), CBM32, was also identified on several proteins containing M60-like domains encoded by known mucosal commensals and pathogens. The co-occurrence of the CBMs and the M60-like domain, together with the annotated potential peptidase function, unveiled a new functional context for the CBM, which is typically connected with carbohydrate-processing enzymes but not proteases. The CBM domains linked with members of different protease families are likely to enable these proteases to bind to specific glycoproteins from host animals, further highlighting the importance of proteases and CBMs (CBM32 and CBM5_12) in host-microbe interactions.
EThOS - Electronic Theses Online Service; Medical School, Newcastle University; GB; United Kingdom

    Understanding virus and microbial evolution in wildlife through meta-transcriptomics

    Wildlife harbors a substantial and largely undocumented diversity of RNA viruses and microbial life forms. RNA viruses and microbes are also arguably the most diverse and dynamic entities on Earth. Despite their evident importance, there are major limitations in our knowledge of the diversity, ecology, and evolution of RNA viruses and microbial communities. These gaps stem from a variety of factors, including biased sampling and the difficulty of accurately identifying highly divergent sequences through sequence similarity-based analyses alone. The implementation of meta-transcriptomic sequencing has greatly contributed to narrowing this gap. In particular, the rapid increase in the number of newly described RNA viruses over the last decade provides a glimpse of the remarkable diversity within the RNA virosphere. The central goal of this thesis was to determine the diversity of RNA viruses associated with wildlife, particularly in an Australian context. To this end I exploited cutting-edge meta-transcriptomic and bioinformatic approaches to reveal the RNA virus diversity within diverse animal taxa, tissues, and environments, with a special focus on the highly divergent "dark matter" of the virome that has largely been refractory to sequence analysis. Similarly, I used these approaches to detect targeted common microbes circulating in vertebrate and invertebrate fauna. Another important goal was to assess the diversity of RNA viruses and microbes as a cornerstone within a new eco-evolutionary framework. By doing so, this thesis encompasses multiple disciplines including virus discovery, viral host-range distributions, microbial-virus and host–parasite interactions, phylogenetic analysis, and pathogen surveillance. In sum, the research presented in this thesis expands the known RNA virosphere as well as the detection and surveillance of targeted microbes in wildlife, providing new insights into the diversity, evolution, and ecology of these agents in nature.