
    Structuprint: a scalable and extensible tool for two-dimensional representation of protein surfaces

    © 2016 Kontopoulos et al. Background: The term molecular cartography encompasses a family of computational methods for two-dimensional transformation of protein structures and analysis of their physicochemical properties. The underlying algorithms comprise multiple manual steps, whereas the few existing implementations typically restrict the user to a very limited set of molecular descriptors. Results: We present Structuprint, a free standalone software that fully automates the rendering of protein surface maps, given - at the very least - a directory with a PDB file and an amino acid property. The tool comes with a default database of 328 descriptors, which can be extended or substituted by user-provided ones. The core algorithm comprises the generation of a mould of the protein surface, which is subsequently converted to a sphere and mapped to two dimensions using the Miller cylindrical projection. Structuprint is partly optimized for multicore computers, making the rendering of animations of entire molecular dynamics simulations feasible. Conclusions: Structuprint is an efficient application, implementing a molecular cartography algorithm for protein surfaces. According to the results of a benchmark, its memory requirements and execution time are reasonable, allowing it to run even on low-end personal computers. We believe that it will be of use - primarily but not exclusively - to structural biologists and computational biochemists.
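
    As an illustration of the sphere-to-plane step described above, the following Python sketch applies the Miller cylindrical projection to points on a sphere. It is not Structuprint's implementation; the function names (miller_projection, sphere_points_to_map) and the use of NumPy are assumptions made for this example.

```python
import numpy as np

def miller_projection(lon, lat):
    """Miller cylindrical projection of longitude/latitude (radians):
    x = lon, y = 1.25 * ln(tan(pi/4 + 0.4 * lat))."""
    x = lon
    y = 1.25 * np.log(np.tan(np.pi / 4 + 0.4 * lat))
    return x, y

def sphere_points_to_map(points):
    """Convert 3D points on (or near) a sphere centred at the origin to
    longitude/latitude and then to 2D map coordinates."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    lat = np.arcsin(np.clip(z / r, -1.0, 1.0))   # latitude in [-pi/2, pi/2]
    lon = np.arctan2(y, x)                       # longitude in (-pi, pi]
    return miller_projection(lon, lat)

# Example: three points on (or close to) the unit sphere
pts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.1, 0.1, 0.99)]
map_x, map_y = sphere_points_to_map(pts)
print(np.column_stack([map_x, map_y]))
```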

    Protein signatures using electrostatic molecular surfaces in harmonic space

    We developed a novel method based on the Fourier analysis of protein molecular surfaces to speed up the analysis of the vast structural data generated in the post-genomic era. This method computes the power spectrum of surfaces of the molecular electrostatic potential, whose three-dimensional coordinates have been either experimentally or theoretically determined. We thus achieve a reduction of the initial three-dimensional information on the molecular surface to one-dimensional information on pairs of points a fixed scale apart. Consequently, the similarity search in our method is computationally less demanding and significantly faster than shape-comparison methods. As proof of principle, we applied our method to a training set of viral proteins that are involved in major diseases such as Hepatitis C, Dengue fever, Yellow fever, Bovine viral diarrhea and West Nile fever. The training set contains proteins of four different protein families, as well as a representative mammalian enzyme. We found that the power spectrum successfully assigns a unique signature to each protein included in our training set, thus providing a direct probe of functional similarity among proteins. The results agree with established biological data from conventional structural biochemistry analyses. Comment: 9 pages, 10 figures. Published in PeerJ (2013), https://peerj.com/articles/185
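
    The sketch below illustrates, in very reduced form, the general idea of turning sampled electrostatic-potential values into a one-dimensional power-spectrum signature that can be compared between proteins. It is not the authors' harmonic-space construction (which works on pairs of surface points a fixed scale apart); the sampling order, the FFT-based spectrum and the helper names power_spectrum_signature and signature_distance are assumptions for illustration only.

```python
import numpy as np

def power_spectrum_signature(potential_samples, n_bins=64):
    """Toy 1D signature: FFT power spectrum of an ordered sequence of
    surface electrostatic-potential samples, truncated and normalised."""
    v = np.asarray(potential_samples, dtype=float)
    v = v - v.mean()                       # remove the DC component
    spectrum = np.abs(np.fft.rfft(v)) ** 2
    spectrum = spectrum[:n_bins]
    total = spectrum.sum()
    return spectrum / total if total > 0 else spectrum

def signature_distance(sig_a, sig_b):
    """Euclidean distance between two equal-length spectral signatures."""
    return float(np.linalg.norm(np.asarray(sig_a) - np.asarray(sig_b)))

# Example with synthetic potential values for two hypothetical surfaces
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 8 * np.pi, 512))
surf_a = base + 0.1 * rng.standard_normal(512)
surf_b = base + 0.1 * rng.standard_normal(512)
print(signature_distance(power_spectrum_signature(surf_a),
                         power_spectrum_signature(surf_b)))
```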

    3D structural analysis of proteins using electrostatic surfaces based on image segmentation

    Herein, we present a novel strategy to analyse and characterize proteins using protein molecular electrostatic surfaces. Our approach starts by calculating a series of distinct molecular surfaces for each protein, which are subsequently flattened out, thus reducing 3D information noise. The resulting RGB images are appropriately scaled by means of standard image processing techniques whilst retaining the weight information of each protein’s molecular electrostatic surface. Homogeneous areas of the protein surface are then estimated through unsupervised clustering of the 3D images while performing similarity searches. This is a computationally fast approach, which efficiently highlights interesting structural areas among a group of proteins. Multiple protein electrostatic surfaces can be combined and, in conjunction with their processed images, provide the starting material for protein structural similarity and molecular docking experiments.
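
    A minimal sketch of the unsupervised-clustering step on a flattened surface image is given below, using k-means on RGB pixel values. The paper does not specify this particular clustering algorithm; the choice of scikit-learn's KMeans, the number of clusters and the function name segment_surface_image are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_surface_image(rgb_image, n_clusters=4, random_state=0):
    """Cluster the pixels of a flattened surface image into homogeneous
    regions with k-means on RGB values; returns a 2D label map."""
    h, w, c = rgb_image.shape
    pixels = rgb_image.reshape(-1, c).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = km.fit_predict(pixels)
    return labels.reshape(h, w)

# Example with a small synthetic "surface map"
rng = np.random.default_rng(1)
img = rng.random((32, 64, 3))
label_map = segment_surface_image(img, n_clusters=3)
print(np.unique(label_map, return_counts=True))
```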

    Outcome prediction based on microarray analysis: a critical perspective on methods

    Background: Information extraction from microarrays has not yet been widely used in diagnostic or prognostic decision-support systems, due to the diversity of results produced by the available techniques, their instability on different data sets and the inability to relate statistical significance with biological relevance. Thus, there is an urgent need to address the statistical framework of microarray analysis and identify its drawbacks and limitations, which will enable us to thoroughly compare methodologies under the same experimental set-up and associate results with confidence intervals meaningful to clinicians. In this study we consider gene-selection algorithms with the aim to reveal inefficiencies in performance evaluation and address aspects that can reduce uncertainty in algorithmic validation. Results: A computational study is performed on the performance of several gene-selection methodologies on publicly available microarray data. Three basic types of experimental scenarios are evaluated, i.e. the independent test-set and the 10-fold cross-validation (CV) using maximum and average performance measures. Feature selection methods behave differently under different validation strategies. The performance results from CV do not match well with those from the independent test-set, except for the support vector machines (SVM) and the least squares SVM methods. However, these wrapper methods achieve variable (often low) performance, whereas the hybrid methods attain consistently higher accuracies. The use of an independent test-set within CV is important for the evaluation of the predictive power of algorithms. The optimal size of the selected gene-set also appears to depend on the evaluation scheme. The consistency of selected genes over variation of the training-set is another aspect important in reducing uncertainty in the evaluation of the derived gene signature. In all cases the presence of outlier samples can seriously affect algorithmic performance. Conclusion: Multiple parameters can influence the selection of a gene signature and its predictive power, thus possible biases in validation methods must always be accounted for. This paper illustrates that independent test-set evaluation reduces the bias of CV, and that case-specific measures reveal stability characteristics of the gene signature over changes of the training set. Moreover, frequency measures on gene selection address the algorithmic consistency in selecting the same gene signature under different training conditions. These issues contribute to the development of an objective evaluation framework and aid the derivation of statistically consistent gene signatures that could eventually be correlated with biological relevance. The benefits of the proposed framework are supported by the evaluation results and methodological comparisons performed for several gene-selection algorithms on three publicly available datasets.
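
    To make the validation issue concrete, the sketch below contrasts 10-fold CV (with gene selection refitted inside each fold) against an independent test-set, using a linear SVM on synthetic "microarray-like" data. It does not reproduce the paper's experiments or datasets; the synthetic data, the SelectKBest filter and all parameter values are assumptions chosen only to illustrate the evaluation set-up.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for a microarray data set: many genes, few samples.
X, y = make_classification(n_samples=100, n_features=2000, n_informative=20,
                           random_state=0)

# Hold out an independent test set before any gene selection takes place.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Gene selection is wrapped inside the pipeline, so during 10-fold CV it is
# refitted on each training fold rather than on the full data set.
model = make_pipeline(SelectKBest(f_classif, k=50), SVC(kernel="linear"))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv)

model.fit(X_train, y_train)
print("10-fold CV accuracy:           %.3f +/- %.3f"
      % (cv_scores.mean(), cv_scores.std()))
print("Independent test-set accuracy: %.3f" % model.score(X_test, y_test))
```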

    On a meaningful integration of web services in data-intensive biomedical environments: The DICODE approach

    This paper reports on an innovative approach that aims to reduce information management costs in data-intensive and cognitively-complex biomedical environments. Recognizing that prominent high-performance computing paradigms, large-scale data processing technologies and collaboration support systems can remedy data-intensive issues, it adopts a hybrid approach that builds on the synergy of these technologies. The proposed approach provides innovative Web-based workbenches that integrate and orchestrate a set of interoperable services that reduce the data-intensiveness and complexity overload at critical decision points to a manageable level, thus permitting stakeholders to be more productive and concentrate on creative activities.

    Deliverable Report D4.6: Tools for generating QMRF and QPRF reports

    Scientific reports carry significant importance for the straightforward and effective transfer of knowledge, results and ideas. Good practice dictates that reports should be well-structured and concise. This deliverable describes the reporting services for models, predictions and validation tasks that have been integrated within the eNanoMapper (eNM) modelling infrastructure. Validation services have been added to the Jaqpot Quattro (JQ) modelling platform and the nano-lazar read-across framework developed within WP4 to support eNM modelling activities. Moreover, we have developed reporting services for predictions and models, namely QPRF and QMRF reports respectively. In this deliverable we first describe in detail the three validation schemes created, namely training-set split, cross-validation and external validation, and demonstrate their functionality at both the API and UI levels. We then describe the read-across functionality and, finally, present the QPRF and QMRF reporting services.
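
    For orientation, the following sketch mirrors the three validation schemes named above (training-set split, cross-validation and external validation) generically in Python with scikit-learn on synthetic data. It is not the eNanoMapper/Jaqpot Quattro API or its report format; the data set, model and metric are assumptions used only to illustrate the three schemes.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Synthetic stand-in for a descriptor data set; the last 30 rows act as an
# "external" data set that the model never sees during development.
X_all, y_all = make_regression(n_samples=150, n_features=10, noise=5.0,
                               random_state=0)
X, y = X_all[:120], y_all[:120]
X_ext, y_ext = X_all[120:], y_all[120:]

model = LinearRegression()

# 1. Training-set split validation
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
split_r2 = r2_score(y_te, model.fit(X_tr, y_tr).predict(X_te))

# 2. Cross-validation
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_r2 = cross_val_score(model, X, y, cv=cv, scoring="r2")

# 3. External validation on the held-back data set
ext_r2 = r2_score(y_ext, model.fit(X, y).predict(X_ext))

print("split R2 = %.3f, CV R2 = %.3f, external R2 = %.3f"
      % (split_r2, cv_r2.mean(), ext_r2))
```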