
    Progress and challenges in predicting protein-protein interaction sites

    The identification of protein-protein interaction sites is an essential intermediate step for mutant design and the prediction of protein networks. In recent years, a significant number of methods have been developed to predict these interface residues, and here we review the current status of the field. Progress in this area requires a clear view of the methodology applied, the data sets used for training and testing the systems, and the evaluation procedures. We have analysed the impact of a representative set of features and algorithms and highlighted the problems inherent in generating reliable protein data sets and in the posterior analysis of the results. Although it is clear that there have been some improvements in methods for predicting interacting sites, several major bottlenecks remain. Proteins in complexes are still under-represented in the structural databases, and in particular many proteins involved in transient complexes are still to be crystallized. We provide suggestions for effective feature selection, and make it clear that community standards for testing, training and performance measures are necessary for progress in the field.

    Modern genome annotation: The BioSapiens network

    In order to maximise the understanding of biology and evolution gained from the large-scale sequencing projects of the current era, it is necessary to be able to assign detailed biochemical, cellular and developmental functions to as many protein sequences as possible. More than five million distinct proteins can be found in the major public repositories, i.e., UniProt & RefSeq (Pruitt et al. 2007; UniProt Consortium 2007), but detailed laboratory investigations have only been carried out for a tiny fraction. For instance, only ~25,000 proteins have solved structures in the international protein structure repository, the worldwide Protein Data Bank (wwPDB, Berman et al. 2003).

    Alternative splicing in the ENCODE protein complement

    Research from the BioSapiens Network of Excellence presents an accurate description of current scientific developments in bioinformatics and its computational implementation. Bioinformatics is essential for annotating the structure and function of genes and proteins, for analysing complete genomes, and for molecular biology and biochemistry. Included are an overview of bioinformatics and the full spectrum of genome annotation approaches: genome analysis and gene prediction, gene regulation analysis and expression, genome variation and QTL analysis, large-scale protein annotation of function and structure, annotation and prediction of protein interactions, and the organization and annotation of molecular networks and biochemical pathways. Also covered are a technical framework to organize and represent genome data using the DAS technology, and work on the annotation of two large genomic sets: HIV/HCV viral genomes and splicing alternatives potentially encoded in 1% of the human genome.
