4 research outputs found

    TESE: generating specific protein structure test set ensembles

    Get PDF
    Abstract Summary: TESE is a web server for the generation of test sets of protein sequences and structures fulfilling a number of different criteria. At least three different use cases can be envisaged: (i) benchmarking of novel methods; (ii) test sets tailored for special needs and (iii) extending available datasets. The CATH structure classification is used to control structural/sequence redundancy and a variety of structural quality parameters can be used to interactively select protein subsets with specific characteristics, e.g. all X-ray structures of α-helical repeat proteins with more than 120 residues and resolution <2.0 Å. The output includes FASTA-formatted sequences, PDB files and a clickable HTML index file containing images of the selected proteins. Multiple subsets for cross-validation are also supported. Availability: The TESE server is available for non-commercial use at URL: http://protein.bio.unipd.it/tese/. Contact: [email protected]

    RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures.

    Get PDF
    Abstract Motivation: Repeat proteins form a distinct class of structures where folding is greatly simplified. Several classes have been defined, with solenoid repeats of periodicity between ca. 5 and 40 being the most challenging to detect. Such proteins evolve quickly and their periodicity may be rapidly hidden at sequence level. From a structural point of view, finding solenoids may be complicated by the presence of insertions or multiple domains. To the best of our knowledge, no automated methods are available to characterize solenoid repeats from structure. Results: Here we introduce RAPHAEL, a novel method for the detection of solenoids in protein structures. It reliably solves three problems of increasing difficulty: (1) recognition of solenoid domains, (2) determination of their periodicity and (3) assignment of insertions. RAPHAEL uses a geometric approach mimicking manual classification, producing several numeric parameters that are optimized for maximum performance. The resulting method is very accurate, with 89.5% of solenoid proteins and 97.2% of non-solenoid proteins correctly classified. RAPHAEL periodicities have a Spearman correlation coefficient of 0.877 against the manually established ones. A baseline algorithm for insertion detection in identified solenoids has a Q2 value of 79.8%, suggesting room for further improvement. RAPHAEL finds 1931 highly confident repeat structures not previously annotated as solenoids in the Protein Data Bank records. Availability: The RAPHAEL web server is available with additional data at http://protein.bio.unipd.it/raphael/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics onlin

    RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures

    No full text
    Abstract Motivation: Repeat proteins form a distinct class of structures where folding is greatly simplified. Several classes have been defined, with solenoid repeats of periodicity between ca. 5 and 40 being the most challenging to detect. Such proteins evolve quickly and their periodicity may be rapidly hidden at sequence level. From a structural point of view, finding solenoids may be complicated by the presence of insertions or multiple domains. To the best of our knowledge, no automated methods are available to characterize solenoid repeats from structure. Results: Here we introduce RAPHAEL, a novel method for the detection of solenoids in protein structures. It reliably solves three problems of increasing difficulty: (1) recognition of solenoid domains, (2) determination of their periodicity and (3) assignment of insertions. RAPHAEL uses a geometric approach mimicking manual classification, producing several numeric parameters that are optimized for maximum performance. The resulting method is very accurate, with 89.5% of solenoid proteins and 97.2% of non-solenoid proteins correctly classified. RAPHAEL periodicities have a Spearman correlation coefficient of 0.877 against the manually established ones. A baseline algorithm for insertion detection in identified solenoids has a Q2 value of 79.8%, suggesting room for further improvement. RAPHAEL finds 1931 highly confident repeat structures not previously annotated as solenoids in the Protein Data Bank records. Availability: The RAPHAEL web server is available with additional data at http://protein.bio.unipd.it/raphael/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics onlin
    corecore