30 research outputs found

    Families of Proteins Forming Transmembrane Channels


    Learning to Find Relevant Biological Articles Without Negative Training Examples

    Classifiers are traditionally learned using sets of positive and negative training examples. Often, however, a classifier is required when only an incomplete set of positive examples and a set of unlabeled examples are available for training. This is the situation, for example, with the Transport Classification Database (TCDB, www.tcdb.org), a repository of information about proteins involved in transmembrane transport. This paper presents and evaluates a method for learning to rank the likely relevance to TCDB of newly published scientific articles, using the articles currently referenced in TCDB as positive training examples. The new method has succeeded in identifying 964 new articles relevant to TCDB in fewer than six months, which is a major practical success. From a general data mining perspective, the contributions of this paper are (i) devising and evaluating two novel approaches that solve the positive-only problem effectively, (ii) applying support vector machines in a state-of-the-art way for recognizing and ranking relevance, and (iii) deploying a system to update a widely used, real-world biomedical database. Supplementary information, including all data sets, is publicly available at www.cs.ucsd.edu/users/knoto/pub/ajcai08.

    Supplementary Material for: Bioinformatic Analyses of Transmembrane Transport: Novel Software for Deducing Protein Phylogeny, Topology, and Evolution

    During the past decade, we have experienced a revolution in the biological sciences resulting from the flux of information generated by genome-sequencing efforts. Our understanding of living organisms, the metabolic processes they catalyze, the genetic systems encoding cellular protein and stable RNA constituents, and the pathological conditions caused by some of these organisms has greatly benefited from the availability of complete genomic sequences and the establishment of comprehensive databases. Many research institutes around the world are now devoting their efforts largely to genome sequencing, data collection and data analysis. In this review, we summarize tools that are in routine use in our laboratory for characterizing transmembrane transport systems. Applications of these tools to specific transporter families are presented. Many of the computational approaches described should be applicable to virtually all classes of proteins and RNA molecules.

    Supplementary Material for: Analysis of 58 Families of Holins Using a Novel Program, PhyST

    We have designed a freely accessible program, PhyST, which allows the automated characterization of any family of homologous proteins within the Transporter Classification Database. The program performs an NCBI-PSI-BLAST search and reports (1) the average protein sequence length with standard deviations, (2) the average predicted number of transmembrane segments, (3) the total number of homologues retrieved, (4) a quantitative list of all source phyla, and (5) potential fusion proteins of sizes considerably exceeding the average size of the proteins retrieved. We have applied this program to 58 families of holins, and the results are presented. The results show that holins are very rarely fused to other protein domains, suggesting that holins form transmembrane pores as homooligomers without the participation of other proteins or protein domains.
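
    The five quantities listed in this abstract can be illustrated with a minimal Python sketch. This is not the PhyST program itself: the `Homologue` record, the `summarize_family` function, and the `fusion_sd_cutoff` parameter are hypothetical names, and the rule "longer than the mean plus two standard deviations" is only an assumed reading of "considerably exceeding the average size". The sketch also assumes that the search hits, their predicted transmembrane segment counts, and their source phyla have already been collected elsewhere.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Homologue:
    """Hypothetical record for one retrieved homologue."""
    accession: str
    length: int      # protein sequence length in residues
    tms_count: int   # predicted number of transmembrane segments
    phylum: str      # source phylum


def summarize_family(hits, fusion_sd_cutoff=2.0):
    """Summarize a family of homologues (illustrative only).

    fusion_sd_cutoff is an assumed threshold: a protein is flagged as a
    possible fusion when its length exceeds the mean length by more than
    this many sample standard deviations.
    """
    lengths = [h.length for h in hits]
    avg_len = mean(lengths)
    # The sample standard deviation needs at least two data points.
    sd_len = stdev(lengths) if len(lengths) > 1 else 0.0
    threshold = avg_len + fusion_sd_cutoff * sd_len
    return {
        "total": len(hits),                                    # item (3)
        "mean_length": avg_len,                                # item (1)
        "sd_length": sd_len,                                   # item (1)
        "mean_tms": mean(h.tms_count for h in hits),           # item (2)
        "phyla": Counter(h.phylum for h in hits).most_common(),  # item (4)
        "possible_fusions": [h.accession for h in hits
                             if h.length > threshold],         # item (5)
    }
```

    Calling `summarize_family` on a list of `Homologue` records returns a dictionary with one entry per reported quantity. Lowering `fusion_sd_cutoff` flags more proteins as possible fusions.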

    Supplementary Material for: The Membrane Attack Complex/Perforin Superfamily

    The membrane attack complex/perforin (MACPF) superfamily consists of a diverse group of proteins involved in bacterial pathogenesis and sporulation as well as eukaryotic immunity, embryonic development, neural migration and fruiting body formation. The present work shows that the evolutionary relationships between the members of the superfamily, previously suggested by comparison of their tertiary structures, can also be supported by analyses of their primary structures. The superfamily includes the MACPF family (TC 1.C.39), the cholesterol-dependent cytolysin (CDC) family (TC 1.C.12.1 and 1.C.12.2) and the pleurotolysin pore-forming (pleurotolysin B) family (TC 1.C.97.1), as revealed by expansion of each family by comparison against a large protein database, and by the comparisons of their hidden Markov models. Clustering analyses demonstrated grouping of the CDC homologues separately from the 12 MACPF subfamilies, which also grouped separately from the pleurotolysin B family. Members of the MACPF superfamily revealed a remarkably diverse range of proteins spanning eukaryotic, bacterial, and archaeal taxonomic domains, with notable variations in protein domain architectures. Our strategy should also be helpful in putting together other highly divergent protein families.

    Supplementary Material for: Comparative Analyses of Transport Proteins Encoded within the Genomes of Bdellovibrio bacteriovorus HD100 and Bdellovibrio exovorus JSS

    Bdellovibrio, δ-proteobacteria, including B. bacteriovorus (Bba) and B. exovorus (Bex), are obligate predators of other Gram-negative bacteria. While Bba grows in the periplasm of the prey cell, Bex grows externally. We have analyzed and compared the transport proteins of these 2 organisms based on the current contents of the Transporter Classification Database (TCDB; www.tcdb.org). Bba has 103 more transporters than Bex, 50% more secondary carriers, and 3 times as many MFS carriers. Bba has far more metabolite transporters than Bex as expected from its larger genome, but there are 2 times more carbohydrate uptake and drug efflux systems, and 3 times more lipid transporters. Bba also has polyamine and carboxylate transporters lacking in Bex. Bba has more than twice as many members of the Mot-Exb family of energizers, but both may have energizers for gliding motility. They use entirely different types of systems for iron acquisition. Both contain unexpectedly large numbers of pseudogenes and incomplete systems, suggesting that they are undergoing genome size reduction. Interestingly, all 5 outer-membrane receptors in Bba are lacking in Bex. The 2 organisms have similar numbers and types of peptide and amino acid uptake systems as well as protein and carbohydrate secretion systems. The differences observed correlate with and may account, in part, for the different lifestyles of these 2 bacterial predators.

    Supplementary Material for: The Amino Acid-Polyamine-Organocation Superfamily

    The amino acid-polyamine-organocation (APC) superfamily has been shown to include five recognized families, four of which are specific for amino acids and their derivatives. Recent high-resolution X-ray crystallographic data have shown that four additional transporter families (BCCT, TC No. 2.A.15; SSS, 2.A.21; NSS, 2.A.22; and NCS1, 2.A.39), transporting a wide range of solutes, exhibit sufficiently similar folds to suggest a common evolutionary origin. We have used established statistical methods, based on sequence similarity, to show that these families are, in fact, members of the APC superfamily. We also identify two additional families (NCS2, 2.A.40; SulP, 2.A.53) as being members of this superfamily. Repeat sequences, each having five transmembrane α-helical segments and arising via ancient intragenic duplications, are demonstrated for all of these families, further strengthening the conclusion of homology. The APC superfamily appears to be the second largest superfamily of secondary carriers, the largest being the major facilitator superfamily (MFS). Although the topology of the members of the APC superfamily differs from that of the MFS, both superfamilies appear to have arisen from a common ancestral 2-TMS hairpin structure that underwent intragenic triplication, followed by loss of a TMS in the APC superfamily, to give the repeat units that are characteristic of these two superfamilies.