    An Interactive Practical at the Interface of Web-based and Conventional Publishing

    In recent years, the World Wide Web has provided new opportunities for innovation in teaching. Web-based approaches allow students to gather information from different corners of the globe, literally at the click of a mouse button. This process attracts mounting interest when different web pages offer added extras, such as animations, or tools with which to commune with data interactively. A great advantage of learning on the web is that, depending on the design of the teaching material, students may be guided as much, or as little, as a particular course demands. Thus material may be used simply to supplement lecture courses, or may be completely self-contained. In either scenario, armed only with a URL, study may continue away from formal classes. Students may thus explore independently, in a self-paced setting, and compare notes when once again back in the laboratory.

    Combining algorithms to predict bacterial protein sub-cellular location: Parallel versus concurrent implementations

    We describe a novel and potentially important tool for candidate subunit vaccine selection through in silico reverse-vaccinology. A set of Bayesian networks able to make individual predictions for specific subcellular locations is implemented in three pipelines with different architectures: a parallel implementation with a confidence level-based decision engine and two serial implementations with a hierarchical decision structure, one initially rooted by prediction between membrane types and another rooted by soluble versus membrane prediction. The parallel pipeline outperformed the serial pipeline, but took twice as long to execute. The soluble-rooted serial pipeline outperformed the membrane-rooted predictor. Assessment using genomic test sets was more equivocal, as many more predictions are made by the parallel pipeline, yet the serial pipeline identifies 22 more of the 74 proteins of known location.
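The two decision structures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Predictor` type, the function names, the 0.5 threshold, and the toy predictors are all assumptions made for illustration. Each per-location predictor is modelled as a function returning a confidence in [0, 1]; the parallel engine queries every predictor and keeps the most confident answer, while the soluble-rooted serial engine first decides soluble versus membrane and then consults only the predictors in that branch.

```python
from typing import Callable, Dict, Optional

# Hypothetical type: a binary predictor maps a sequence to a confidence in [0, 1].
Predictor = Callable[[str], float]


def parallel_decision(
    seq: str, predictors: Dict[str, Predictor], threshold: float = 0.5
) -> Optional[str]:
    """Run every location predictor and keep the most confident one.

    Returns None when no predictor reaches the (assumed) confidence threshold.
    """
    scores = {location: predict(seq) for location, predict in predictors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


def serial_decision(
    seq: str,
    is_membrane: Predictor,
    membrane_predictors: Dict[str, Predictor],
    soluble_predictors: Dict[str, Predictor],
    threshold: float = 0.5,
) -> Optional[str]:
    """Soluble-versus-membrane root, then a decision within the chosen branch."""
    if is_membrane(seq) >= threshold:
        branch = membrane_predictors
    else:
        branch = soluble_predictors
    return parallel_decision(seq, branch, threshold)


# Toy predictors with fixed confidences, for demonstration only.
soluble = {"cytoplasmic": lambda s: 0.2, "periplasmic": lambda s: 0.9}
membrane = {"inner membrane": lambda s: 0.7, "outer membrane": lambda s: 0.3}

print(parallel_decision("MKKLLAAAGG", {**soluble, **membrane}))
print(serial_decision("MKKLLAAAGG", lambda s: 0.1, membrane, soluble))
```

In this sketch the serial engine evaluates fewer predictors per sequence, which is one plausible reason a parallel design could take longer to run; the abstract itself reports only the timing difference, not its cause.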

    Toward bacterial protein sub-cellular location prediction: single-class discrimminant models for all gram- and gram+ compartments

    Based on Bayesian Networks, methods were created that address protein sequence-based bacterial subcellular location prediction. Distinct predictive algorithms for the eight bacterial subcellular locations were created. Several variant methods were explored. These variations included differences in the number of residues considered within the query sequence (which ranged from the N-terminal 10 residues to the whole sequence) and residue representation (which took the form of amino acid composition, percentage amino acid composition, or normalised amino acid composition). The accuracies of the best performing networks were then compared to PSORTB. All individual location methods outperform PSORTB except for the Gram+ cytoplasmic protein predictor, for which accuracies were essentially equal, and for outer membrane protein prediction, where PSORTB outperforms the binary predictor. The method described here is an important new approach to method development for subcellular location prediction. It is also a new, potentially valuable tool for candidate subunit vaccine selection.
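As a rough illustration of the residue representations mentioned above, the following sketch computes raw and percentage amino acid composition over either the whole sequence or an N-terminal window. The function names and the `n_terminal` parameter are hypothetical, and the abstract does not define its "normalised" variant, so that representation is deliberately left out rather than guessed.

```python
from collections import Counter
from typing import Dict, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq: str, n_terminal: Optional[int] = None) -> Dict[str, int]:
    """Count each standard residue in the first n_terminal residues.

    With n_terminal=None the whole sequence is used.
    """
    window = seq if n_terminal is None else seq[:n_terminal]
    counts = Counter(window.upper())
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


def percentage_composition(
    seq: str, n_terminal: Optional[int] = None
) -> Dict[str, float]:
    """Express the counts as percentages of the standard residues in the window."""
    counts = composition(seq, n_terminal)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: 100.0 * count / total for aa, count in counts.items()}


features = percentage_composition("MKKLLAAAGG", n_terminal=4)
print(features["K"])
```

Either dictionary could serve as a fixed-length, 20-value feature vector for a classifier, regardless of the length of the input sequence.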

    Multi-class subcellular location prediction for bacterial proteins

    Two algorithms, based on Bayesian Networks (BNs), for bacterial subcellular location prediction, are explored in this paper: one predicts all locations for Gram+ bacteria and the other all locations for Gram- bacteria. Methods were evaluated using different numbers of residues (from the N-terminal 10 residues to the whole sequence) and residue representation (amino acid-composition, percentage amino acid-composition or normalised amino acid-composition). The accuracy of the best resulting BN was compared to PSORTB. The accuracy of this multi-location BN was roughly comparable to PSORTB; the difference in predictions is low, often less than 2%. The BN method thus represents both an important new avenue of methodological development for subcellular location prediction and a potentially valuable new tool of true utilitarian value for candidate subunit vaccine selection.

    Learning to extract relations for protein annotation

    Motivation: Protein annotation is a task that describes protein X in terms of topic Y. Usually, this is constructed using information from the biomedical literature. Until now, most literature-based protein annotation work has been done manually by human annotators. However, as the number of biomedical papers grows ever more rapidly, manual annotation becomes more difficult, and there is increasing need to automate the process. Recently, information extraction (IE) has been used to address this problem. Typically, IE requires pre-defined relations and hand-crafted IE rules or annotated corpora, and these requirements are difficult to satisfy in real-world scenarios such as in the biomedical domain. In this article, we describe an IE system that requires only sentences labelled by domain experts as relevant or irrelevant to a given topic. Results: We applied our system to meet the annotation needs of a well-known protein family database; the results show that our IE system can annotate proteins with a set of extracted relations by learning relations and IE rules for disease, function and structure from only relevant and irrelevant sentences.

    EMBER: a European Multimedia Bioinformatics Educational Resource

    Bioinformatics has taken centre stage in the post-genomic era. The data overload arising from the many now-fruitful genome projects has created an insatiable demand for suitably qualified people to build and maintain databases, to design more incisive analysis software, to use disparate databases and software tools, and to understand both the statistical and biological significance of results generated in silico. It is rare to find individuals with such a range of skills, yet such scientists are now needed urgently in sequencing centres, research/academic institutes, pharmaceutical/agrochemical companies, software houses and start-up companies. But the rate of growth of this field, and its cross-disciplinary nature, has created a problem: while there are many trained biologists and computer scientists, there are few computer-literate biologists or biology-literate computer scientists. Consequently, there is a dearth of skilled staff in bioinformatics. This is especially problematic for universities, which are less able than large multinational companies to compete for the small numbers of trained individuals emerging from current MSc, MRes or PhD courses. In an attempt to address the current European skills shortage in bioinformatics, the European Commission has recently funded an innovative new educational project that aims to develop a suite of multimedia bioinformatics educational tools (collectively termed EMBER). EMBER will provide teaching materials for undergraduate and early postgraduate studies; it will comprise a self-contained, interactive web tutorial in bioinformatics, the equivalent stand-alone course on CD-ROM, and an accompanying introductory textbook. The use of conventional text, coupled with web- and CD-based media, will ensure that students for whom Internet access is not optimal also have access to the same fundamental level of bioinformatics education

    LIPPRED: A web server for accurate prediction of lipoprotein signal sequences and cleavage sites

    Bacterial lipoproteins have many important functions and represent a class of possible vaccine candidates. The prediction of lipoproteins from sequence is thus an important task for computational vaccinology. Naïve-Bayesian networks were trained to identify SpaseII cleavage sites and their preceding signal sequences using a set of 199 distinct lipoprotein sequences. A comprehensive range of sequence models was used to identify the best model for lipoprotein signal sequences. The best performing sequence model was found to be 10 residues in length, including the conserved cysteine lipid attachment site and the nine residues prior to it. The sensitivity of prediction for LipPred was 0.979, while the specificity was 0.742. Here, we describe LipPred, a web server for lipoprotein prediction, available at the URL: http://www.jenner.ac.uk/LipPred/. LipPred is the most accurate method available for the detection of SpaseII-cleaved lipoprotein signal sequences and the prediction of their cleavage sites.
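The 10-residue sequence model described above (a cysteine plus the nine residues before it) can be illustrated by a small window-extraction sketch. This is not LipPred's code: the function name `candidate_windows`, the 0-based positions, and the example sequence are assumptions for illustration, and the step that scores each window with a trained network is not shown.

```python
from typing import List, Tuple


def candidate_windows(seq: str, length: int = 10) -> List[Tuple[int, str]]:
    """Return (cysteine_index, window) pairs for every cysteine that has
    at least length - 1 residues before it.

    Each window ends at the cysteine, so with length=10 it holds the
    cysteine and the nine residues preceding it. Indices are 0-based.
    """
    seq = seq.upper()
    windows = []
    for i, residue in enumerate(seq):
        if residue == "C" and i >= length - 1:
            windows.append((i, seq[i - length + 1 : i + 1]))
    return windows


# Made-up example sequence with a single cysteine at index 13.
print(candidate_windows("MKKTLLALSLLAGCSSN"))
```

Each extracted window would then be passed to a classifier that decides whether it resembles a lipoprotein signal sequence ending at a cleavage site.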

    Alpha helical trans-membrane proteins: Enhanced prediction using a Bayesian approach

    Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination; thus analysis by prediction and modelling makes an important contribution to their on-going study. Membrane proteins form two main classes: alpha helical and beta barrel trans-membrane proteins. By using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we addressed α-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful, and complementary, tool for the analysis of membrane proteins for a range of applications.

    A predictor of membrane class: Discriminating α-helical and β-barrel membrane proteins from non-membranous proteins

    Accurate protein structure prediction remains an active objective of research in bioinformatics. Membrane proteins comprise approximately 20% of most genomes. They are, however, poorly tractable targets of experimental structure determination. Their analysis using bioinformatics thus makes an important contribution to their on-going study. Using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we have addressed the alignment-free discrimination of membrane from non-membrane proteins. The method successfully identifies prokaryotic and eukaryotic α-helical membrane proteins at 94.4% accuracy, β-barrel proteins at 72.4% accuracy, and distinguishes assorted non-membranous proteins with 85.9% accuracy. The method described here is an important potential advance in the computational analysis of membrane protein structure. It represents a useful tool for the characterisation of membrane proteins with a wide variety of potential applications.