24 research outputs found

    Extraction of consensus protein patterns in regions containing non-proline cis peptide bonds and their functional assessment

    Background: In peptides and proteins, only a small percentage of peptide bonds adopt the cis configuration. For amide peptide bonds in particular, cis conformations are so rare that systematic studies were hampered until recently. The growing number of 3D protein structures in public databases has now produced a considerable number of sequences containing non-proline cis formations (cis-nonPro).
    Results: We extract regular expression-type patterns that describe the regions surrounding cis-nonPro formations. Three types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, and iii) pattern discovery using a structural equivalency set. Using each pattern as a predicate, we then search the Eukaryotic Linear Motif (ELM) resource to identify potential functional implications of regions with cis-nonPro peptide bonds. The patterns extracted from each type of pattern discovery are further used to formulate a pattern-based classifier that discriminates between cis-nonPro and trans-nonPro formations.
    Conclusions: In terms of functional implications, we observe a significant association of cis-nonPro peptide bonds with ligand/binding functionalities. For the pattern-based classification scheme, the best results were obtained using the structural equivalency set, which yielded 70% accuracy, 77% sensitivity and 63% specificity.
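As a worked illustration of how the three reported figures relate, the sketch below computes accuracy, sensitivity and specificity from confusion-matrix counts. The counts are hypothetical: they assume a balanced test set of 100 cis-nonPro and 100 trans-nonPro regions, which the abstract does not state. Under that assumption, accuracy is the mean of sensitivity and specificity, which matches the reported 70%, 77% and 63%.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of cis-nonPro regions recognised as cis
    specificity = tn / (tn + fp)   # fraction of trans-nonPro regions recognised as trans
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical balanced test set (assumption, not taken from the paper):
# 100 cis-nonPro regions (positives) and 100 trans-nonPro regions (negatives).
acc, sens, spec = classification_metrics(tp=77, fn=23, tn=63, fp=37)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

With an unbalanced test set the same sensitivity and specificity would give a different accuracy, so the class ratio would need to be checked against the full paper.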

    Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation

    Background: Polypeptides are composed of amino acids covalently linked by peptide bonds. The majority of peptide bonds in proteins occur in the trans conformation. Despite their infrequent occurrence, cis peptide bonds play a key role in protein structure and function, as well as in many significant biological processes.
    Results: We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose, an efficient pattern discovery algorithm is employed that discovers regular expression-type patterns that are overrepresented (i.e. frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain physicochemical properties of amino acids. The extracted patterns are validated using a specially implemented scoring function and a significance measure (a log-probability estimate) indicative of their specificity. The score threshold is 0.90 for the first three types of pattern discovery and 0.80 for the last. Regarding the significance measure, all patterns yielded values in the range [-9, -31], which indicates that the derived patterns are highly unlikely to have emerged by chance. Most of the highest-scoring patterns are consistent with previous investigations of the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database in order to gain insight into the functional implications of cis prolyl bonds.
    Conclusion: Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However, a considerable propensity was also observed for targeting signals, active and phosphorylation sites, and domain signatures.
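To make the difference between exact pattern discovery and discovery over an equivalency set concrete, here is a minimal sketch that generalizes a pattern with residue groups and counts how many sequence windows it matches. The groups in `EQUIVALENCY_GROUPS`, the helper names and the example windows are all hypothetical and chosen only for illustration; they are not the equivalency sets, algorithm or data used in the paper.

```python
import re

# Hypothetical chemical equivalency groups (illustrative only).
EQUIVALENCY_GROUPS = ["AG", "DE", "KRH", "NQ", "ST", "ILVM", "FWY", "C", "P"]


def group_of(residue):
    """Return the equivalency group containing the residue, or the residue itself."""
    for group in EQUIVALENCY_GROUPS:
        if residue in group:
            return group
    return residue


def to_regex(pattern):
    """Generalize a pattern: '.' stays a wildcard, each residue becomes its group."""
    parts = []
    for symbol in pattern:
        parts.append("." if symbol == "." else "[" + group_of(symbol) + "]")
    return re.compile("".join(parts))


def support(pattern, sequences):
    """Count the sequences that contain at least one match of the generalized pattern."""
    regex = to_regex(pattern)
    return sum(1 for seq in sequences if regex.search(seq))


# Hypothetical sequence windows around a proline.
windows = ["GSYPAL", "KTFPEV", "ALWPDG", "GSAPAL"]

exact_support = sum(1 for seq in windows if "YP" in seq)  # exact match only
group_support = support("YP", windows)                    # Y generalized to [FWY]
print(exact_support, group_support)  # prints "1 3"
```

The exact pattern `YP` occurs in one window, while the generalized form `[FWY][P]` also covers `FP` and `WP`, which is how an equivalency set can expose overrepresentation that exact matching misses.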

    Bayesian Algorithm Implementation in a Real Time Exposure Assessment Model on Benzene with Calculation of Associated Cancer Risks

    The objective of the current study was the development of a reliable modeling platform to calculate in real time the personal exposure and the associated health risk for filling station employees by evaluating current environmental parameters (traffic, meteorological conditions and amount of fuel traded) measured by an appropriate sensor network. A set of Artificial Neural Networks (ANNs) was developed to predict the benzene exposure pattern of the filling station employees. Furthermore, a Physiology Based Pharmaco-Kinetic (PBPK) risk assessment model, fed by data from the ANN model, was developed to calculate the lifetime probability distribution of leukemia for the employees. A Bayesian algorithm was involved at crucial points of both model subcompartments. The application was evaluated in two filling stations (one urban and one rural). Among several algorithms available for the development of the ANN exposure model, Bayesian regularization provided the best results and appears to be a promising technique for predicting the exposure pattern of this occupational population group. For the estimated leukemia risk, where the aim was a distribution curve based on exposure levels and the differing susceptibility of the population, the Bayesian algorithm was a prerequisite of the Monte Carlo approach integrated in the PBPK-based risk model. In conclusion, the modeling system described herein is capable of exploiting the information collected by the environmental sensors to estimate in real time the personal exposure and the resulting health risk for employees of gasoline filling stations.

    Analysis of Protein Interaction Networks for the Detection of Candidate Hepatitis B and C Biomarkers

    Hepatitis B virus (HBV) and hepatitis C virus (HCV) infections are the major causes of chronic liver disease, cirrhosis and hepatocellular carcinoma (HCC). The resolution or chronicity of acute infection depends on a complex interplay between the virus and innate/adaptive immunity. The mechanisms that lead a significant proportion of patients to more severe liver disease are not clearly defined and involve virus-induced host gene/protein alterations. The use of protein interaction networks (PINs) is expected to identify novel aspects of the disease concerning the patients’ immune response to the virus, as well as the main pathways involved in the development of fibrosis and HCC. In this study, we designed several PINs for HBV and HCV and employed topological, modular, and functional analysis techniques to determine significant network nodes that correspond to prominent candidate biomarkers. The networks were built using data from various interaction databases. When the overall PINs of HBV and HCV were compared, 48 nodes were found in common. A statistical ranking procedure indicated that three of them are of higher importance.
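The comparison of two networks for shared nodes can be sketched as a set intersection followed by a ranking step. The edge lists and node names (`P1` to `P7`) below are hypothetical placeholders, and ranking by combined degree is only a simple stand-in: the abstract does not specify the statistical ranking procedure that was actually used.

```python
def nodes(edges):
    """Collect the set of nodes appearing in a list of undirected edges."""
    return {node for edge in edges for node in edge}


def degree(edges):
    """Count how many edges touch each node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Hypothetical toy networks (placeholders, not data from the study).
hbv_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P1", "P5")]
hcv_edges = [("P1", "P3"), ("P3", "P6"), ("P2", "P3"), ("P1", "P7")]

# Nodes present in both networks.
common = nodes(hbv_edges) & nodes(hcv_edges)

# Stand-in ranking: combined degree across both networks, ties broken by name.
hbv_deg, hcv_deg = degree(hbv_edges), degree(hcv_edges)
ranked = sorted(common, key=lambda n: (-(hbv_deg[n] + hcv_deg[n]), n))
print(ranked)  # prints ['P1', 'P3', 'P2']
```

Degree is only one of several topological measures; a real analysis would likely combine it with others, such as betweenness or module membership, before ranking.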