    Transcript profiling in Candida albicans reveals new cellular functions for the transcriptional repressors CaTup1, CaMig1 and CaNrg1.

    The pathogenic fungus Candida albicans contains homologues of the transcriptional repressors ScTup1, ScMig1 and ScNrg1 found in budding yeast. In Saccharomyces cerevisiae, ScMig1 targets the ScTup1/ScSsn6 complex to the promoters of glucose-repressed genes to repress their transcription. ScNrg1 is thought to act in a similar manner at other promoters. We have examined the roles of their homologues in C. albicans by transcript profiling with an array containing 2002 genes, representing about one quarter of the predicted number of open reading frames (ORFs) in C. albicans. The data revealed that CaNrg1 and CaTup1 regulate a different set of C. albicans genes from CaMig1 and CaTup1. This is consistent with the idea that CaMig1 and CaNrg1 target the CaTup1 repressor to specific subsets of C. albicans genes. However, CaMig1 and CaNrg1 repress other C. albicans genes in a CaTup1-independent fashion. The targets of CaMig1 and CaNrg1 repression, and phenotypic analyses of nrg1/nrg1 and mig1/mig1 mutants, indicate that these factors play differential roles in the regulation of metabolism, cellular morphogenesis and stress responses. Hence, the data provide important information about both the modes of action of these transcriptional regulators and their cellular roles. The transcript profiling data are available at http://www.pasteur.fr/recherche/unites/RIF/transcriptdata/

    Proteome sequence features carry signatures of the environmental niche of prokaryotes

    Background: Prokaryotic environmental adaptations occur at different levels within cells to ensure the preservation of genome integrity, proper protein folding and function as well as membrane fluidity. Although specific composition and structure of cellular components suitable for the variety of extreme conditions has already been postulated, a systematic study describing such adaptations has not yet been performed. We therefore explored whether the environmental niche of a prokaryote could be deduced from the sequence of its proteome. Finally, we aimed at finding the precise differences between proteome sequences of prokaryotes from different environments. Results: We analyzed the proteomes of 192 prokaryotes from different habitats. We collected detailed information about the optimal growth conditions of each microorganism. Furthermore, we selected 42 physico-chemical properties of amino acids and computed their values for each proteome. Further, on the same set of features we applied two fundamentally different machine learning methods, Support Vector Machines and Random Forests, to successfully classify between bacteria and archaea, halophiles and non-halophiles, as well as mesophiles, thermophiles and mesothermophiles. Finally, we performed feature selection by using Random Forests. Conclusions: To our knowledge, this is the first time that three different classification cases (domain of life, halophilicity and thermophilicity) of proteome adaptation are successfully performed with the same set of 42 features. The characteristic features of a specific adaptation constitute a signature that may help understanding the mechanisms of adaptation to extreme environments.
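
The step of computing per-proteome features from amino acid sequences can be illustrated with a minimal sketch. The abstract does not list its 42 physico-chemical properties, so the features below (residue frequencies, charged fraction, aromatic fraction, acidic-minus-basic balance) and the function name `proteome_features` are assumptions chosen only for illustration; they are not the authors' feature set.

```python
from collections import Counter

# The 20 standard amino acids; any other symbol (X, B, Z, *) is ignored.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def proteome_features(proteins):
    """Return a dict of simple composition features for a list of protein sequences.

    Hypothetical feature set: these are illustrative stand-ins, not the
    42 properties used in the study.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in STANDARD)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    freq = {aa: counts[aa] / total for aa in STANDARD}
    features = {f"freq_{aa}": freq[aa] for aa in STANDARD}
    features["charged"] = freq["D"] + freq["E"] + freq["K"] + freq["R"]
    features["aromatic"] = freq["F"] + freq["W"] + freq["Y"]
    features["acidic_minus_basic"] = (freq["D"] + freq["E"]) - (freq["K"] + freq["R"])
    return features
```

A feature dictionary of this kind, computed once per organism, is the sort of fixed-length input that a classifier such as a Support Vector Machine or Random Forest would consume.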

    A probabilistic model for gene content evolution with duplication, loss, and horizontal transfer

    We introduce a Markov model for the evolution of a gene family along a phylogeny. The model includes parameters for the rates of horizontal gene transfer, gene duplication, and gene loss, in addition to branch lengths in the phylogeny. The likelihood for the changes in the size of a gene family across different organisms can be calculated in O(N+hM^2) time and O(N+M^2) space, where N is the number of organisms, h is the height of the phylogeny, and M is the sum of family sizes. We apply the model to the evolution of gene content in Proteobacteria using the gene families in the COG (Clusters of Orthologous Groups) database
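
To make the kind of process described here concrete, the sketch below simulates the size of one gene family along a single branch. It assumes a linear birth-death process with a constant gain rate standing in for horizontal transfer; that parameterization, the function name and the argument names are assumptions for illustration. It is a forward simulation only and does not implement the likelihood computation with the stated O(N+hM^2) running time.

```python
import random


def simulate_family_size(n0, branch_length, dup_rate, loss_rate, gain_rate, rng):
    """Simulate a gene family's size at the end of a branch (Gillespie method).

    Assumed dynamics (not taken from the paper's text):
      n -> n + 1 at rate gain_rate + dup_rate * n   (transfer or duplication)
      n -> n - 1 at rate loss_rate * n              (loss)
    """
    n, t = n0, 0.0
    while True:
        up = gain_rate + dup_rate * n
        down = loss_rate * n
        total = up + down
        if total == 0.0:
            return n  # no event is possible, the size is frozen
        t += rng.expovariate(total)  # waiting time to the next event
        if t > branch_length:
            return n  # the branch ended before the next event
        if down == 0.0 or rng.random() * total < up:
            n += 1
        else:
            n -= 1
```

Calling the function repeatedly with a seeded `random.Random` gives an empirical distribution of family sizes for one branch, which can be compared against any analytical transition probabilities one derives for the same rates.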

    CandidaDB: A genome database for Candida albicans pathogenomics

    CandidaDB is a database dedicated to the genome of the most prevalent systemic fungal pathogen of humans, Candida albicans. CandidaDB is based on an annotation of the Stanford Genome Technology Center C. albicans genome sequence data by the European Galar Fungail Consortium. CandidaDB Release 2.0 (June 2004) contains information pertaining to Assembly 19 of the genome of C. albicans strain SC5314. The current release contains 6244 annotated entries corresponding to 130 tRNA genes and 5917 protein-coding genes. For these, it provides tentative functional assignments along with numerous pre-run analyses that can assist the researcher in the evaluation of gene function for the purpose of specific or large-scale analysis. CandidaDB is based on GenoList, a generic relational data schema and a World Wide Web interface that has been adapted to the handling of eukaryotic genomes. The interface allows users to browse easily through genome data and retrieve information. CandidaDB also provides more elaborate tools, such as pattern searching, that are tightly connected to the overall browsing system. As the C. albicans genome is diploid and still incompletely assembled, CandidaDB provides tools to browse the genome by individual supercontigs and to examine information about allelic sequences obtained from complementary contigs. CandidaDB is accessible at http://genolist.pasteur.fr/CandidaDB. Sequence data from C. albicans were obtained from the Stanford Genome Technology Center (http://www.sequence.stanford.edu/group/candida). Sequencing of C. albicans was accomplished with the support of the NIDR and the Burroughs Wellcome Fund. This work was supported by grants from the European Commission (QLK2-2000-00795; MCRTN-CT-2003-504148; ‘Galar Fungail Consortium’) to A.J.P.B., C.E., A.D., J.E., C.G., B.H., F.M.K., J.P.M. and R.S. and the Ministère de la Recherche et de la Technologie (PRFMMIP ‘Réseau Infections Fongiques’) to C.E. and C.G. F.T. was supported by the Institut Pasteur Strategic Horizontal Program on Anopheles gambiae. N.M. was supported by a fellowship of the Junta de Castilla y León and by grants DGCYT (PM-98-0317 and BIO 2002-02124) to A.D. R.S. was supported in part by grants from the Spanish Ministerio de Ciencia y Tecnologia (BMC2003-01023) and Agencia Valenciana de Ciencia i Tecnologia de la Generalitat Valenciana (Grupos 03/187)

    A novel series of compositionally biased substitution matrices for comparing Plasmodium proteins

    Background: The most common substitution matrices currently used (BLOSUM and PAM) are based on protein sequences with average amino acid distributions; thus they do not represent a fully accurate substitution model for proteins characterized by a biased amino acid composition. This problem has been addressed recently by adjusting existing matrices; however, to date, no empirical approach has been taken to build matrices which offer a substitution model for comparing proteins sharing an amino acid compositional bias. Here, we present a novel procedure to construct series of symmetrical substitution matrices to align proteins from similarly biased Plasmodium proteomes. Results: We generated substitution matrices by selecting from the BLOCKS database those multiple alignments with a compositional bias similar to that of P. falciparum and P. yoelii proteins. A novel 'fuzzy' clustering method was adopted to group sequences within these alignments, showing that this method retains more complete information on the amino acid substitutions when compared to hierarchical clustering. We assessed the performance against the BLOSUM62 series and showed that the usage of our matrices results in an improvement in the performance of BLAST database searches, greatly reducing the number of false positive hits. We then demonstrated applications of the use of novel matrices to improve the annotation of homologs between the two Plasmodium species and to classify members of the P. falciparum RIFIN/STEVOR family. Conclusion: We confirmed that in the case of compositionally biased proteins, standard BLOSUM matrices are not suited for optimal alignments, and specific substitution matrices are required. In addition, we showed that the usage of these matrices leads to a reduction of false positive hits, facilitating the automatic annotation process.
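
For readers unfamiliar with how a symmetric substitution matrix is derived from alignment blocks, here is a minimal sketch of the standard BLOSUM-style log-odds calculation. It is not the procedure of this paper: it omits the compositional selection of BLOCKS alignments and the 'fuzzy' sequence clustering, and treats every sequence with equal weight. The function name and input layout are assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import log2


def log_odds_matrix(blocks):
    """Half-bit log-odds scores from ungapped alignment blocks.

    blocks: list of alignments, each a list of equal-length sequences.
    Returns {(a, b): score} for every observed residue pair, symmetric in a, b.
    """
    pair_counts = defaultdict(int)
    for block in blocks:
        for column in zip(*block):
            # Count every unordered residue pair within the column.
            for a, b in combinations(column, 2):
                pair_counts[tuple(sorted((a, b)))] += 1
    total = sum(pair_counts.values())
    q = {pair: n / total for pair, n in pair_counts.items()}  # observed pair frequencies
    p = defaultdict(float)  # marginal residue frequencies implied by the pairs
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2
    scores = {}
    for (a, b), freq in q.items():
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        score = round(2 * log2(freq / expected))
        scores[(a, b)] = scores[(b, a)] = score
    return scores
```

Because the expected frequencies come from the same blocks as the observed ones, feeding in only compositionally biased alignments shifts the background toward that bias, which is the general idea behind composition-specific matrices.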

    Phylogeny of Prokaryotes and Chloroplasts Revealed by a Simple Composition Approach on All Protein Sequences from Complete Genomes Without Sequence Alignment

    The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped to resolve the evolution of this organelle in photosynthetic eukaryotes. In this paper we propose an alternative method of phylogenetic analysis using compositional statistics for all protein sequences from complete genomes. This new method is conceptually simpler than and computationally as fast as the one proposed by Qi et al. (2004b) and Chu et al. (2004). The same data sets used in Qi et al. (2004b) and Chu et al. (2004) are analyzed using the new method. Our distance-based phylogenetic tree of the 109 prokaryotes and eukaryotes agrees with the biologists' tree of life based on 16S rRNA comparison in a predominant majority of basic branching and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding of chloroplast evolution
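
An alignment-free, composition-based distance of the general kind described here can be sketched as follows. This is a generic illustration, not the statistic proposed in the paper (the abstract does not spell it out): it counts overlapping k-strings over all proteins of a genome and compares two genomes through the cosine of their frequency vectors. The choice of k = 3 and both function names are assumptions.

```python
from collections import Counter
from math import sqrt


def composition_vector(proteins, k=3):
    """Relative frequencies of overlapping k-strings over all given proteins."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items()}


def composition_distance(u, v):
    """Distance (1 - cosine similarity) / 2 between two sparse frequency vectors."""
    dot = sum(x * v.get(kmer, 0.0) for kmer, x in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("empty composition vector")
    return (1.0 - dot / (norm_u * norm_v)) / 2.0
```

Computing this distance for every pair of genomes yields a distance matrix that a standard tree-building method such as neighbor joining can turn into a phylogeny, with no sequence alignment involved.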

    Natural History, Microbes and Sequences: Shouldn't We Look Back Again to Organisms?

    The discussion on the existence of prokaryotic species is reviewed. The demonstration that several different mechanisms of genetic exchange and recombination exist has led some to a radical rejection of the possibility of bacterial species and, in general, the applicability of traditional classification categories to the prokaryotic domains. However, in spite of intense gene traffic, prokaryotic groups are not continuously variable but form discrete clusters of phenotypically coherent, well-defined, diagnosable groups of individual organisms. Molecularization of life sciences has led to biased approaches to the issue of the origins of biodiversity, which has resulted in an increasingly widespread tendency to emphasize genes and sequences and not give proper attention to organismal biology. As argued here, molecular and organismal approaches should be seen as complementary and not opposed views of biology

    Unresolved orthology and peculiar coding sequence properties of lamprey genes: the KCNA gene family as test case

    Background: In understanding the evolutionary process of vertebrates, cyclostomes (hagfishes and lampreys) occupy crucial positions. Resolving molecular phylogenetic relationships of cyclostome genes with gnathostome (jawed vertebrate) genes is indispensable in deciphering both the species tree and gene trees. However, molecular phylogenetic analyses, especially those including lamprey genes, have produced highly discordant results between gene families. To efficiently scrutinize this problem using partial genome assemblies of early vertebrates, we focused on the potassium voltage-gated channel, shaker-related (KCNA) family, whose members are mostly single-exon. Results: Seven sea lamprey KCNA genes as well as six elephant shark genes were identified, and their orthologies to bony vertebrate subgroups were assessed. In contrast to robustly supported orthology of the elephant shark genes to gnathostome subgroups, clear orthology of any sea lamprey gene could not be established. Notably, sea lamprey KCNA sequences displayed a unique codon usage pattern and amino acid composition, probably associated with exceptionally high GC-content in their coding regions. This lamprey-specific property of coding sequences was also observed generally for genes outside this gene family. Conclusions: Our results suggest that secondary modifications of sequence properties unique to the lamprey lineage may be one of the factors preventing robust orthology assessments of lamprey genes, which deserves further genome-wide validation. The lamprey lineage-specific alteration of protein-coding sequence properties needs to be taken into consideration in tackling the key questions about early vertebrate evolution