13 research outputs found

    In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles

    Background: In silico candidate gene prioritisation (CGP) aids the discovery of gene functions by ranking genes according to an objective relevance score. While several CGP methods have been described for identifying human disease genes, corresponding methods for prokaryotic gene function discovery are lacking. Here we present two prokaryotic CGP methods, based on phylogenetic profiles, to assist with this task. Results: Using gene occurrence patterns in sample genomes, we developed two CGP methods (statistical and inductive CGP) to assist with the discovery of bacterial gene functions. Statistical CGP exploits the differences in gene frequency between phenotypic groups, while inductive CGP applies supervised machine learning to identify gene occurrence patterns across genomes. Three rediscovery experiments were designed to evaluate the CGP frameworks. The first experiment attempted to rediscover peptidoglycan genes with 417 published genome sequences. Both CGP methods achieved best areas under the receiver operating characteristic curve (AUC) of 0.911 in Escherichia coli K-12 (EC-K12) and 0.978 in Streptococcus agalactiae 2603 (SA-2603) genomes, with an average improvement in precision of >3.2-fold and a maximum of >27-fold using statistical CGP. A median AUC of >0.95 could still be achieved with as few as 10 genome examples in each group of genome examples in the rediscovery of the peptidoglycan metabolism genes. In the second experiment, a maximum of 109-fold improvement in precision was achieved in the rediscovery of anaerobic fermentation genes in EC-K12. The last experiment attempted to rediscover genes from 31 metabolic pathways in SA-2603, where 14 pathways achieved AUC >0.9 and 28 pathways achieved AUC >0.8 with the best inductive CGP algorithms. Conclusion: Our results demonstrate that the two CGP methods can assist with the study of functionally uncategorised genomic regions and discovery of bacterial gene-function relationships.
Our rediscovery experiments also provide a set of standard tasks against which future methods may be compared.
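
The idea behind statistical CGP and its AUC-based evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (a signed difference in gene frequency between two phenotypic groups), the function names `frequency_scores` and `auc`, and the four toy genomes are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def frequency_scores(
    profiles: Dict[str, Set[str]],
    positive: List[str],
    negative: List[str],
) -> Dict[str, float]:
    """Score each gene by (frequency in positive genomes) - (frequency in negative genomes).

    Assumed scoring rule for illustration; the abstract does not specify the statistic.
    """
    genes = set().union(*profiles.values())
    scores = {}
    for gene in genes:
        pos_freq = sum(gene in profiles[g] for g in positive) / len(positive)
        neg_freq = sum(gene in profiles[g] for g in negative) / len(negative)
        scores[gene] = pos_freq - neg_freq
    return scores


def auc(scores: Dict[str, float], known: Set[str]) -> float:
    """AUC as the probability that a known gene outscores another gene (ties count 0.5)."""
    pos = [s for g, s in scores.items() if g in known]
    neg = [s for g, s in scores.items() if g not in known]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical phylogenetic profiles: genome name -> set of genes present.
profiles = {
    "genomeA": {"murA", "murB", "rpoB"},
    "genomeB": {"murA", "murB", "rpoB", "ftsZ"},
    "genomeC": {"rpoB", "ftsZ"},
    "genomeD": {"rpoB"},
}
positive = ["genomeA", "genomeB"]  # genomes assumed to show the phenotype
negative = ["genomeC", "genomeD"]  # genomes assumed to lack it

scores = frequency_scores(profiles, positive, negative)
ranking = sorted(scores, key=scores.get, reverse=True)
rediscovery_auc = auc(scores, known={"murA", "murB"})
```

In this toy setting, genes present only in the positive group receive the highest scores, and a rediscovery experiment then asks how well the ranking recovers a set of already-annotated genes.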

    Molecular signatures (unique proteins and conserved indels) that are specific for the epsilon proteobacteria (Campylobacterales)

    BACKGROUND: The epsilon proteobacteria, which include many important human pathogens, are presently recognized solely on the basis of their branching in rRNA trees. No unique molecular or biochemical characteristics specific for this group are known. RESULTS: Comparative analyses of proteins in the genomes of Wolinella succinogenes DSM 1740 and Campylobacter jejuni RM1221 against all available sequences have identified a large number of proteins that are unique to various epsilon proteobacteria (Campylobacterales) and whose homologs are not detected in other organisms. Of these proteins, 49 are uniquely found in nearly all sequenced epsilon proteobacteria (viz. Helicobacter pylori (26695 and J99), H. hepaticus, C. jejuni (NCTC 11168, RM1221, HB93-13, 84-25, CF93-6, 260.94, 11168 and 81-176), C. lari, C. coli, C. upsaliensis, C. fetus, W. succinogenes DSM 1740 and Thiomicrospira denitrificans ATCC 33889), 11 are unique for the Wolinella and Helicobacter species (i.e. Helicobacteraceae family) and many others are specific for either some or all of the species within the Campylobacter genus. The primary sequences of many of these proteins are highly conserved and provide novel resources for diagnostics and therapeutics. We also report four conserved indels (i.e. inserts or deletions) in widely distributed proteins (viz. B subunit of excinuclease ABC, phenylalanyl-tRNA synthetase, RNA polymerase β′-subunit and FtsH protein) that are specific for either all epsilon proteobacteria or different subgroups. In addition, a rare genetic event that caused fusion of the genes for the largest subunits of RNA polymerase (rpoB and rpoC) in Wolinella and Helicobacter is also described. The inter-relationships amongst Campylobacterales as deduced from these molecular signatures are in accordance with the phylogenetic trees based on the 16S rRNA and concatenated sequences for nine conserved proteins.
CONCLUSION: These molecular signatures provide novel tools for identifying and circumscribing species from the Campylobacterales order and its subgroups in molecular terms. Although sequence information for these signatures is presently limited to Campylobacterales species, it is likely that many of them will also be found in other epsilon proteobacteria. Functional studies on these proteins and conserved indels should reveal novel biochemical or physiological characteristics that are unique to these groups of epsilon proteobacteria.

    The Cyst-Dividing Bacterium Ramlibacter tataouinensis TTB310 Genome Reveals a Well-Stocked Toolbox for Adaptation to a Desert Environment

    Ramlibacter tataouinensis TTB310T (strain TTB310), a betaproteobacterium isolated from a semi-arid region of South Tunisia (Tataouine), is characterized by the presence of both spherical and rod-shaped cells in pure culture. Cell division of strain TTB310 occurs by the binary fission of spherical “cyst-like” cells (“cyst-cyst” division). The rod-shaped cells formed at the periphery of a colony (consisting mainly of cysts) are highly motile and colonize a new environment, where they form a new colony by reversion to cyst-like cells. This unique cell cycle of strain TTB310, with desiccation-tolerant cyst-like cells capable of division and desiccation-sensitive motile rods capable of dissemination, appears to be a novel adaptation for life in a hot and dry desert environment. In order to gain insights into strain TTB310's underlying genetic repertoire and possible mechanisms responsible for its unusual lifestyle, the genome of strain TTB310 was completely sequenced and subsequently annotated. The complete genome consists of a single circular chromosome of 4,070,194 bp with an average G+C content of 70.0%, the highest among the Betaproteobacteria sequenced to date, with a total of 3,899 predicted coding sequences covering 92% of the genome. We found that strain TTB310 has developed a highly complex network of two-component systems, which may utilize responses to light and perhaps a rudimentary circadian hourglass to anticipate water availability at the dew time in the middle/end of the desert winter nights and thus direct the growth window to cyclic water availability times. Other interesting features of the strain TTB310 genome that appear to be important for desiccation tolerance, including intermediary metabolism compounds such as trehalose or polyhydroxyalkanoate, and signal transduction pathways, are presented and discussed.

    An insight into the sialome of Simulium guianense (Diptera: Simuliidae), the main vector of River Blindness Disease in Brazil

    Background: Little is known about the composition and function of the saliva in black flies such as Simulium guianense, the main vector of river blindness disease in Brazil. The complex salivary potion of hematophagous arthropods counteracts their host's hemostasis, inflammation, and immunity. Results: Transcriptome analysis revealed ubiquitous salivary protein families, such as the Antigen-5, Yellow, Kunitz domain, and serine proteases, in the S. guianense sialotranscriptome. Insect-specific families were also found. About 63.4% of all secreted products revealed protein families found only in Simulium. Additionally, we found a novel peptide similar to kunitoxin with a structure distantly related to serine protease inhibitors. This study revealed a relative increase of transcripts of the SVEP protein family when compared with Simulium vittatum and S. nigrimanum sialotranscriptomes. We were able to extract coding sequences from 164 proteins associated with blood and sugar feeding, the majority of which were confirmed by proteome analysis. Conclusions: Our results contribute to understanding the role of Simulium saliva in transmission of Onchocerca volvulus and evolution of salivary proteins in black flies. They also provide a platform for mining novel anti-hemostatic compounds, vaccine candidates against filariasis, and immuno-epidemiologic markers of vector exposure.

    Cloning, annotation and developmental expression of the chicken intestinal MUC2 gene

    Intestinal mucin 2 (MUC2) encodes a heavily glycosylated, gel-forming mucin, which creates an important protective mucosal layer along the gastrointestinal tract in humans and other species. This first line of defense guards against attacks from microorganisms and is integral to the innate immune system. As a first step towards characterizing the innate immune response of MUC2 in different species, we report the cloning of a full-length, 11,359 bp chicken MUC2 cDNA, and describe the genomic organization and functional annotation of this complex, 74.5 kb locus. MUC2 contains 64 exons and demonstrates distinct spatiotemporal expression profiles throughout development in the gastrointestinal tract; expression increases with gestational age and from anterior to posterior along the gut. The chicken protein has a domain organization similar to that of the human orthologue, with a signal peptide and several von Willebrand domains in the N-terminus and the characteristic cystine knot at the C-terminus. The PTS domain of the chicken MUC2 protein spans ~1600 amino acids and is interspersed with four CysD motifs. However, the PTS domain in the chicken diverges significantly from the human orthologue; although the chicken domain is shorter, the repetitive unit is 69 amino acids in length, which is three times longer than the human unit. The amino acid composition shows very little similarity to the human motif, which potentially contributes to differences in the innate immune response between species, as glycosylation across this rapidly evolving domain provides much of the mucosal barrier. Future studies of the function of MUC2 in the innate immune response system in chicken could provide an important model organism to increase our understanding of the biological significance of MUC2 in host defense and highlight the potential of the chicken for creating new immune-based therapies.

    From Q Fever to Coxiella burnetii Infection: a Paradigm Change

    Coxiella burnetii is the agent of Q fever, or "query fever," a zoonosis first described in Australia in 1937. Since this first description, knowledge about this pathogen and its associated infections has increased dramatically. We review here all the progress made over the last 20 years on this topic. C. burnetii is classically a strict intracellular, Gram-negative bacterium. However, a major step in the characterization of this pathogen was achieved by the establishment of its axenic culture. C. burnetii infects a wide range of animals, from arthropods to humans. The genetic determinants of virulence are now better known, thanks to the determination of the genome sequences of several strains of this species and to comparative genomic analyses. Q fever can be found worldwide, but the epidemiological features of this disease vary according to the geographic area considered, including situations where it is endemic or hyperendemic, and the occurrence of large epidemic outbreaks. In recent years, a major breakthrough in the understanding of the natural history of human infection with C. burnetii was the breaking of the old dichotomy between "acute" and "chronic" Q fever. The clinical presentation of C. burnetii infection depends on both the virulence of the infecting C. burnetii strain and specific risk factors in the infected patient. Moreover, no persistent infection can exist without a focus of infection. This paradigm change should allow better diagnosis and management of primary infection and long-term complications in patients with C. burnetii infection.