95 research outputs found

    Unassigned MURF1 of kinetoplastids codes for NADH dehydrogenase subunit 2

    Background: In a previous study, we conducted a large-scale similarity-free function prediction of mitochondrion-encoded hypothetical proteins, by which the hypothetical gene murf1 (maxicircle unidentified reading frame 1) was assigned as nad2, encoding subunit 2 of NADH dehydrogenase (Complex I of the respiratory chain). This hypothetical gene occurs in the mitochondrial genome of kinetoplastids, a group of unicellular eukaryotes including the causative agents of African sleeping sickness and leishmaniasis. In the present study, we test this assignment by using bioinformatics methods that are highly sensitive in identifying remote homologs and confront the prediction with available biological knowledge. Results: Comparison of MURF1 profile Hidden Markov Model (HMM) against function-known profile HMMs in Pfam, Panther and TIGR shows that MURF1 is a Complex I protein, but without specifying the exact subunit. Therefore, we constructed profile HMMs for each individual subunit, using all available sequences clustered at various identity thresholds. HMM-HMM comparison of these individual NADH subunits against MURF1 clearly identifies this hypothetical protein as NAD2. Further, we collected the relevant experimental information about kinetoplastids, which provides additional evidence in support of this prediction. Conclusion: Our in silico analyses provide convincing evidence for MURF1 being a highly divergent member of NAD2.

    Methodology for Constructing Problem Definitions in Bioinformatics

    Motivation: A recurrent criticism is that certain bioinformatics tools do not account for crucial biology and therefore fail to answer the targeted biological question. We posit that the single most important reason for such shortcomings is an inaccurate formulation of the computational problem. Results: Our paper describes how to define a bioinformatics problem so that it captures both the underlying biology and the computational constraints for a particular problem. The proposed model delineates comprehensively the biological problem and conducts an item-by-item bioinformatics transformation resulting in a germane computational problem. This methodology not only facilitates interdisciplinary information flow but also accommodates emerging knowledge and technologies.

    Evolution of C2H2-zinc finger genes and subfamilies in mammals: Species-specific duplication and loss of clusters, genes and effector domains

    Background: C2H2 zinc finger genes (C2H2-ZNF) constitute the largest class of transcription factors in humans and one of the largest gene families in mammals. Often arranged in clusters in the genome, these genes are thought to have undergone a massive expansion in vertebrates, primarily by tandem duplication. However, this view is based on limited datasets restricted to a single chromosome or a specific subset of genes belonging to the large KRAB domain-containing C2H2-ZNF subfamily. Results: Here, we present the first comprehensive study of the evolution of the C2H2-ZNF family in mammals. We assembled the complete repertoire of human C2H2-ZNF genes (718 in total), about 70% of which are organized into 81 clusters across all chromosomes. Based on an analysis of their N-terminal effector domains, we identified two new C2H2-ZNF subfamilies encoding genes with a SET or a HOMEO domain. We searched for the syntenic counterparts of the human clusters in other mammals for which complete gene data are available: chimpanzee, mouse, rat and dog. Cross-species comparisons show a large variation in the numbers of C2H2-ZNF genes within homologous mammalian clusters, suggesting differential patterns of evolution. Phylogenetic analysis of selected clusters reveals that the disparity in C2H2-ZNF gene repertoires across mammals not only originates from differential gene duplication but also from gene loss. Further, we discovered variations among orthologs in the number of zinc finger motifs and association of the effector domains, the latter often undergoing sequence degeneration. Combined with phylogenetic studies, physical maps and an analysis of the exon-intron organization of genes from the SCAN and KRAB domains-containing subfamilies, this result suggests that the SCAN subfamily emerged first, followed by the SCAN-KRAB and finally by the KRAB subfamily. Conclusion: Our results are in agreement with the "birth and death hypothesis" for the evolution of C2H2-ZNF genes, but also show that this hypothesis alone cannot explain the considerable evolutionary variation within the subfamilies of these genes in mammals. We, therefore, propose a new model involving the interdependent evolution of C2H2-ZNF gene subfamilies.

    AnaBench: a Web/CORBA-based workbench for biomolecular sequence analysis

    Affiliation: Département de biochimie, Faculté de médecine, Université de Montréal. BACKGROUND: Sequence data analyses such as gene identification, structure modeling or phylogenetic tree inference involve a variety of bioinformatics software tools. Due to the heterogeneity of bioinformatics tools in usage and data requirements, scientists spend much effort on technical issues including data format, storage and management of input and output, and memorization of numerous parameters and multi-step analysis procedures. RESULTS: In this paper, we present the design and implementation of AnaBench, an interactive, Web-based bioinformatics Analysis workBench allowing streamlined data analysis. Our philosophy was to minimize the technical effort not only for the scientist who uses this environment to analyze data, but also for the administrator who manages and maintains the workbench. With new bioinformatics tools published daily, AnaBench permits easy incorporation of additional tools. This flexibility is achieved by employing a three-tier distributed architecture and recent technologies including CORBA middleware, Java, JDBC, and JSP. A CORBA server permits transparent access to a workbench management database, which stores information about the users, their data, as well as the description of all bioinformatics applications that can be launched from the workbench. CONCLUSION: AnaBench is an efficient and intuitive interactive bioinformatics environment, which offers scientists application-driven, data-driven and protocol-driven analysis approaches. The prototype of AnaBench, managed by a team at the Université de Montréal, is accessible on-line at: http://malawimonas.bcm.umontreal.ca:8091/anabench. Please contact the authors for details about setting up a local-network AnaBench site elsewhere.

    The Rhodomonas salina mitochondrial genome: bacteria-like operons, compact gene arrangement and complex repeat region

    To gain insight into the mitochondrial genome structure and gene content of a putatively ancestral group of eukaryotes, the cryptophytes, we sequenced the complete mitochondrial DNA of Rhodomonas salina. The 48 063 bp circular-mapping molecule codes for 2 rRNAs, 27 tRNAs and 40 proteins including 23 components of oxidative phosphorylation, 15 ribosomal proteins and two subunits of tat translocase. One potential protein (ORF161) is without assigned function. Only two introns occur in the genome; both are present within cox1, belong to group II and contain RT open reading frames. Primitive genome features include bacteria-like rRNAs and tRNAs, ribosomal protein genes organized in large clusters resembling bacterial operons and the presence of otherwise rare genes such as rps1 and tatA. The highly compact gene organization contrasts with the presence of a 4.7 kb long, repeat-containing intergenic region. Repeat motifs ∼40–700 bp long occur up to 31 times, forming a complex repeat structure. Tandem repeats are the major arrangement but the region also includes a large, ∼3 kb, inverted repeat and several potentially stable ∼40–80 bp long hairpin structures. We provide evidence that the large repeat region is involved in replication and transcription initiation, predict a promoter motif that occurs in three locations and discuss two likely scenarios of how this highly structured repeat region might have evolved.

    Earliest holozoan expansion of phosphotyrosine signaling

    This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. Phosphotyrosine (pTyr) signaling is involved in development and maintenance of metazoans' multicellular body through cell-to-cell communication. Tyrosine kinases (TKs), tyrosine phosphatases, and other proteins relaying the signal compose the cascade. Domain architectures of the pTyr signaling proteins are diverse in metazoans, reflecting their complex intercellular communication. Previous studies had shown that the metazoan-type TKs, as well as other pTyr signaling proteins, were already diversified in the common ancestor of metazoans, choanoflagellates, and filastereans (which are together included in the clade Holozoa) whereas they are absent in fungi and other nonholozoan lineages. However, the earliest-branching holozoans Ichthyosporea and Corallochytrea, as well as the two fungi-related amoebae Fonticula and Nuclearia, have not been studied. Here, we analyze the complete genome sequences of two ichthyosporeans and Fonticula, and RNAseq data of three additional ichthyosporeans, one corallochytrean, and Nuclearia. Both the ichthyosporean and corallochytrean genomes encode a large variety of receptor TKs (RTKs) and cytoplasmic TKs (CTKs), as well as other pTyr signaling components showing highly complex domain architectures. However, Nuclearia and Fonticula have no TK, and show much less diversity in other pTyr signaling components. The CTK repertoires of both Ichthyosporea and Corallochytrea are similar to those of Metazoa, Choanoflagellida, and Filasterea, but the RTK sets are totally different from each other. 
The complex pTyr signaling equipped with positive/negative feedback mechanism likely emerged already at an early stage of holozoan evolution, yet keeping a high evolutionary plasticity in extracellular signal reception until the co-option of the system for cell-to-cell communication in metazoans. © 2013 The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. This work was supported by the European Research Council Starting grant ERC-2007-StG-206883 to I.R.-T.; Ministerio de Ciencia e Innovación grant BFU2008-02839/BMC to I.R.-T.; and the Marie Curie Intra-European Fellowship (MMEMA) within the 7th European Community Framework Programme to H.S. Peer Reviewed

    TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

    The TBestDB database contains ∼370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequence is masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated by using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact [email protected]. The database can be queried at

    Systematically fragmented genes in a multipartite mitochondrial genome

    Arguably, the most bizarre mitochondrial DNA (mtDNA) is that of the euglenozoan eukaryote Diplonema papillatum. The genome consists of numerous small circular chromosomes, none of which appears to encode a complete gene. For instance, the cox1 coding sequence is spread out over nine different chromosomes in non-overlapping pieces (modules), which are transcribed separately and joined to a contiguous mRNA by trans-splicing. Here, we examine how many genes are encoded by Diplonema mtDNA and whether all are fragmented and their transcripts trans-spliced. Module identification is challenging due to the sequence divergence of Diplonema mitochondrial genes. By employing the most sensitive protein profile search algorithms and comparing genomic with cDNA sequence, we recognize a total of 11 typical mitochondrial genes. The 10 protein-coding genes are systematically chopped up into three to 12 modules of 60–350 bp length. The corresponding mRNAs are all trans-spliced. Identification of ribosomal RNAs is most difficult. So far, we only detect the 3′-module of the large subunit ribosomal RNA (rRNA); it does not trans-splice with other pieces. The small subunit rRNA gene remains elusive. Our results raise intriguing new questions about the biochemistry and evolution of mitochondrial trans-splicing in Diplonema.

    Extensive molecular tinkering in the evolution of the membrane attachment mode of the Rheb GTPase

    Rheb is a conserved and widespread Ras-like GTPase involved in cell growth regulation mediated by the (m)TORC1 kinase complex and implicated in tumourigenesis in humans. Rheb function depends on its association with membranes via a prenylated C-terminus, a mechanism shared with many other eukaryotic GTPases. Strikingly, our analysis of a phylogenetically rich sample of Rheb sequences revealed that in multiple lineages this canonical and ancestral membrane attachment mode has been variously altered. The modifications include: (1) accretion to the N-terminus of two different phosphatidylinositol 3-phosphate-binding domains, PX in Cryptista (the fusion being the first proposed synapomorphy of this clade), and FYVE in Euglenozoa and the related undescribed flagellate SRT308; (2) acquisition of lipidic modifications of the N-terminal region, namely myristoylation and/or S-palmitoylation in seven different protist lineages; (3) acquisition of S-palmitoylation in the hypervariable C-terminal region of Rheb in apusomonads, convergently to some other Ras family proteins; (4) replacement of the C-terminal prenylation motif with four transmembrane segments in a novel Rheb paralog in the SAR clade; (5) loss of an evident C-terminal membrane attachment mechanism in Tremellomycetes and some Rheb paralogs of Euglenozoa. Rheb evolution is thus surprisingly dynamic and presents a spectacular example of molecular tinkering.