43 research outputs found

    Extraction of consensus protein patterns in regions containing non-proline cis peptide bonds and their functional assessment

    BACKGROUND: In peptides and proteins, only a small fraction of peptide bonds adopt the cis configuration. Especially in the case of amide peptide bonds, the number of cis conformations is quite limited, which until recently hampered systematic studies. Lately, however, the growing number of databases holding 3D protein structures has produced a considerable number of sequences containing non-proline cis formations (cis-nonPro). RESULTS: In our work, we extract regular expression-type patterns that describe the regions surrounding cis-nonPro formations. For this purpose, three types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, and iii) pattern discovery using a structural equivalency set. Afterwards, using each pattern as a predicate, we search the Eukaryotic Linear Motif (ELM) resource to identify potential functional implications of regions with cis-nonPro peptide bonds. The patterns extracted from each type of pattern discovery are then used to build a pattern-based classifier that discriminates between cis-nonPro and trans-nonPro formations. CONCLUSIONS: In terms of functional implications, we observe a significant association of cis-nonPro peptide bonds with ligand/binding functionalities. As for the pattern-based classification scheme, the best results were obtained using the structural equivalency set, which yielded 70% accuracy, 77% sensitivity and 63% specificity.
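The three figures reported for the pattern-based classifier follow the usual confusion-matrix definitions. The sketch below is only an illustration of that arithmetic: the function name and the counts (100 cis and 100 trans test cases) are hypothetical, chosen so that the numbers reproduce the reported percentages, and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: cis-nonPro cases classified correctly / incorrectly
    tn/fp: trans-nonPro cases classified correctly / incorrectly
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # share of cis-nonPro cases recovered
    specificity = tn / (tn + fp)   # share of trans-nonPro cases recovered
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only (not taken from the paper).
acc, sens, spec = classification_metrics(tp=77, fn=23, tn=63, fp=37)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With equally sized classes, as assumed here, accuracy is simply the mean of sensitivity and specificity.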

    Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation

    BACKGROUND: Polypeptides are composed of amino acids covalently linked by peptide bonds. The majority of peptide bonds in proteins occur in the trans conformation. Despite their infrequent occurrence, cis peptide bonds play a key role in protein structure and function, as well as in many significant biological processes. RESULTS: We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose, an efficient pattern discovery algorithm is employed that discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set, and iv) pattern discovery using certain physicochemical properties of amino acids. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. a log-probability estimate) indicative of their specificity. The score threshold is 0.90 for the first three types of pattern discovery and 0.80 for the last type. Regarding the significance measure, all patterns yielded values between -9 and -31, which indicates that the derived patterns are highly unlikely to have emerged by chance. Most of the highest-scoring patterns are consistent with previous investigations concerning the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database in order to gain insight into the functional implications of cis prolyl bonds. CONCLUSION: Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However, a considerable propensity was also observed for targeting signals, active and phosphorylation sites, as well as domain signatures.
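One way to picture pattern discovery with an equivalency set is to expand each residue in a pattern into a regular-expression character class and count how many sequences the pattern matches. The sketch below rests on loud assumptions: the residue groups, the helper names `to_regex` and `support`, and the toy sequences are all hypothetical, and this is not the discovery algorithm or the equivalency sets used in the study.

```python
import re

# Hypothetical chemical equivalency groups (illustrative only).
EQUIV_GROUPS = ["AG", "DE", "KR", "ST", "FWY", "ILMV", "NQ"]
_GROUP_OF = {aa: group for group in EQUIV_GROUPS for aa in group}


def to_regex(pattern: str) -> str:
    """Expand a pattern into a regex string.

    '.' stays a wildcard; a residue that belongs to a group matches any
    member of that group; any other residue matches only itself.
    """
    parts = []
    for ch in pattern:
        if ch == ".":
            parts.append(".")
        elif ch in _GROUP_OF:
            parts.append(f"[{_GROUP_OF[ch]}]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def support(pattern: str, sequences) -> int:
    """Count the sequences that contain at least one match of the pattern."""
    rx = re.compile(to_regex(pattern))
    return sum(1 for seq in sequences if rx.search(seq))


# Toy sequences, not real data.
toy = ["ARAPG", "GKLPA", "AAAA"]
print(to_regex("K.P"), support("K.P", toy))
```

In this toy setting, a pattern would count as overrepresented when its support in the cis set is high relative to a background set; the scoring function and significance measure mentioned in the abstract are not reproduced here.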

    Addressing the clinical unmet needs in primary Sjögren's Syndrome through the sharing, harmonization and federated analysis of 21 European cohorts

    For many decades, the clinical unmet needs of primary Sjögren's Syndrome (pSS) have remained unresolved due to the rarity of the disease and the complexity of the underlying pathogenic mechanisms, including the pSS-associated lymphomagenesis process. Here, we present the HarmonicSS cloud-computing exemplar, which offers data analytics services beyond the state of the art to address the pSS clinical unmet needs, including the development of lymphoma classification models and the identification of biomarkers for lymphomagenesis. The users of the platform have been able to successfully interlink, curate, and harmonize 21 regional, national, and international European cohorts of 7,551 pSS patients with respect to the ethical and legal issues for data sharing. Federated AI algorithms were trained across the harmonized databases, with reduced execution time complexity, yielding robust lymphoma classification models with 85% accuracy, 81.25% sensitivity, and 85.4% specificity, along with 5 biomarkers for lymphoma development. To our knowledge, this is the first GDPR-compliant platform that provides federated AI services to address the pSS clinical unmet needs. © 2022 The Author(s).

    Three-dimensional reconstruction of coronary arteries and plaque morphology using CT angiography – comparison and registration with IVUS

    BACKGROUND: The aim of this study is to present a new methodology for three-dimensional (3D) reconstruction of coronary arteries and plaque morphology using Computed Tomography Angiography (CTA). METHODS: The methodology is summarized in six stages: 1) pre-processing of the initial raw images, 2) rough estimation of the lumen and outer vessel wall borders and approximation of the vessel’s centerline, 3) manual adaptation of plaque parameters, 4) accurate extraction of the luminal centerline, 5) detection of the lumen - outer vessel wall borders and calcium plaque region, and 6) 3D surface construction. RESULTS: The methodology was compared to the estimations of a recently presented Intravascular Ultrasound (IVUS) plaque characterization method. The correlation coefficients for calcium volume, surface area, length and angle vessel were 0.79, 0.86, 0.95 and 0.88, respectively. Additionally, when comparing the inner and outer vessel wall volumes of the reconstructed arteries produced by IVUS and CTA, the observed correlations were 0.87 and 0.83, respectively. CONCLUSIONS: The results indicate that the proposed methodology is fast and accurate, and it is therefore likely to find future applications in research and in the clinical arena.
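The abstract does not say which correlation coefficient was used. Assuming the common Pearson coefficient, the agreement between paired IVUS and CTA measurements could be computed as in the sketch below; the function name and the paired values are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical paired measurements (e.g. a volume per vessel), not study data.
ivus = [1.0, 2.0, 3.0, 4.0]
cta = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(ivus, cta))
```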

    Machine learning applications in cancer prognosis and prediction

    Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high- or low-risk groups has led many research teams, from the biomedical and bioinformatics fields, to study the application of machine learning (ML) methods. These techniques have therefore been used to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features in complex datasets underlines their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed before these methods can be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques to model cancer risk or patient outcomes.

    CAN ANTS PREDICT BANKRUPTCY? A COMPARISON OF ANT COLONY SYSTEMS TO OTHER STATE-OF-THE-ART COMPUTATIONAL METHODS

    In the current work, we consider the applicability of Ant Colony Systems (ACS) to the bankruptcy prediction problem. ACS are nature-based algorithms that mimic the functions of live organisms to find the best performing solution. In our work, ACS are used for the extraction of classification rules for bankruptcy prediction. An experimental study was conducted in order to evaluate the performance of the system and identify well-performing parameters. Results were compared to the performance obtained by state-of-the-art classification methods, namely Artificial Neural Networks, Support Vector Machines, Partial Decision Trees and Fuzzy Lattice Reasoning. The comparison indicates the high performance of ACS, which is further supported by their ability to extract classification rules, thus offering interpretation of the prediction results. The latter is of great importance in the field of corporate distress, where no unified theory of distress prediction exists. Most studies on distress prediction have focused on increasing the accuracy of the model and have not always paid attention to model interpretation. Keywords: ant colony systems, rule extraction, support vector machines, neural networks, decision trees, fuzzy lattice reasoning, bankruptcy prediction

    In Silico Structural Analysis Predicting the Pathogenicity of PLP1 Mutations in Multiple Sclerosis

    The X chromosome gene PLP1 encodes myelin proteolipid protein (PLP), the most prevalent protein in the myelin sheath surrounding the central nervous system. X-linked dysmyelinating disorders such as Pelizaeus–Merzbacher disease (PMD) or spastic paraplegia type 2 (SPG2) are typically caused by point mutations in PLP1. Nevertheless, numerous case reports have described individuals with PLP1 missense point mutations who also presented clinical symptoms and indications consistent with the diagnostic criteria of multiple sclerosis (MS), a disabling disease of the brain and spinal cord with no current cure. Computational structural biology methods were used to assess the impact of these mutations on the stability and flexibility of the PLP structure in order to determine the role of PLP1 mutations in MS pathogenicity. The analysis showed that most of the variants can alter the functionality of the protein structure, such as the R137W variant, which results in loss of a helix, and H140Y, which alters the ordered protein interface. In silico genomic methods were also applied to predict the significance of these mutations associated with impairments in protein functionality, and could suggest a better definition of therapeutic strategies and clinical application in MS patients.