5 research outputs found

    Predicting MoRFs in protein sequences using HMM profiles

    Background: Intrinsically Disordered Proteins (IDPs) lack an ordered three-dimensional structure and are enriched in various biological processes. The Molecular Recognition Features (MoRFs) are functional regions within IDPs that undergo a disorder-to-order transition on binding to a partner protein. Identifying MoRFs in IDPs using computational methods is a challenging task. Methods: In this study, we introduce hidden Markov model (HMM) profiles to accurately identify the location of MoRFs in disordered protein sequences. Using a windowing technique, HMM profiles are utilised to extract features from protein sequences and support vector machines (SVM) are used to calculate a propensity score for each residue. Two different SVM kernels with high noise tolerance are evaluated with a varying window size and the scores of the SVM models are combined to generate the final propensity score to predict MoRF residues. The SVM models are designed to extract maximal information between MoRF residues, their neighboring regions (Flanks) and the remainder of the sequence (Others). Results: To evaluate the proposed method, its performance was compared to that of two other MoRF predictors: MoRFpred and ANCHOR. The results show that the proposed method outperforms these two predictors. Conclusions: Using HMM profiles as a source of features, the proposed method shows improvement in predicting MoRFs in disordered protein sequences.
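
The windowing step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `window_features` and `combine_scores`, the zero-padding at the sequence ends, and the plain averaging of the two model scores are all assumptions made for illustration. The abstract states only that a window is used and that the scores of the SVM models are combined.

```python
from typing import List


def window_features(profile: List[List[float]], half_width: int) -> List[List[float]]:
    """Build one feature vector per residue from a per-residue profile.

    `profile` holds one row of profile values per residue. For residue i, the
    rows i - half_width .. i + half_width are concatenated into a single
    vector. Positions outside the sequence are zero-padded (an assumption).
    """
    n_cols = len(profile[0]) if profile else 0
    pad = [0.0] * n_cols
    features = []
    for i in range(len(profile)):
        row: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            row.extend(profile[j] if 0 <= j < len(profile) else pad)
        features.append(row)
    return features


def combine_scores(scores_a: List[float], scores_b: List[float]) -> List[float]:
    """Combine two per-residue score lists by simple averaging (an assumption)."""
    return [(a + b) / 2.0 for a, b in zip(scores_a, scores_b)]
```

Each vector returned by `window_features` would then be passed to a classifier that outputs a per-residue score. A larger `half_width` gives the classifier more flanking context at the cost of a longer feature vector.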

    Brain wave classification using long short-term memory based OPTICAL predictor

    Brain-computer interface (BCI) systems that can classify brain waves with greater accuracy are highly desirable. To this end, a number of techniques have been proposed that aim to classify brain waves with high accuracy. However, the ability to classify brain waves and its implementation in real-time is still limited. In this study, we introduce a novel scheme for classifying motor imagery (MI) tasks using electroencephalography (EEG) signals that can be implemented in real-time with high classification accuracy between different MI tasks. We propose a new predictor, OPTICAL, that uses a combination of common spatial pattern (CSP) and long short-term memory (LSTM) network for obtaining improved MI EEG signal classification. A sliding window approach is proposed to obtain the time-series input from the spatially filtered data, which becomes input to the LSTM network. Moreover, instead of using LSTM directly for classification, we use the regression-based output of the LSTM network as one of the features for classification. In addition, linear discriminant analysis (LDA) is used to reduce the dimensionality of the CSP variance-based features. The features in the reduced dimensional plane after performing LDA are used as input to the support vector machine (SVM) classifier together with the regression-based feature obtained from the LSTM network. The regression-based feature further boosts the performance of the proposed OPTICAL predictor. OPTICAL showed significant improvement in the ability to accurately classify left and right-hand MI tasks on two publicly available datasets. The improvements in the average misclassification rates are 3.09% and 2.07% for BCI Competition IV Dataset I and GigaDB dataset, respectively. The Matlab code is available at https://github.com/ShiuKumar/OPTICAL
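
Two of the steps named in the abstract above, the sliding window over spatially filtered data and the variance-based features, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' Matlab code: the helper names, the window `width` and `step` parameters, and the normalised log-variance form of the features (a common choice for CSP-filtered signals) are all assumptions. The CSP filtering itself, the LSTM, LDA and the SVM are outside the sketch.

```python
import math
from typing import List, Sequence


def sliding_windows(signal: Sequence[float], width: int, step: int) -> List[List[float]]:
    """Cut a 1-D signal into overlapping segments of length `width`.

    Consecutive segments start `step` samples apart; a trailing remainder
    shorter than `width` is dropped.
    """
    return [list(signal[s:s + width]) for s in range(0, len(signal) - width + 1, step)]


def variance(samples: Sequence[float]) -> float:
    """Population variance of one channel."""
    mean = sum(samples) / len(samples)
    return sum((x - mean) ** 2 for x in samples) / len(samples)


def log_variance_features(channels: Sequence[Sequence[float]]) -> List[float]:
    """Normalised log-variance of each (already spatially filtered) channel.

    Computes log(var_i / sum_k var_k); assumes at least one channel has
    non-zero variance and none has exactly zero variance.
    """
    variances = [variance(c) for c in channels]
    total = sum(variances)
    return [math.log(v / total) for v in variances]
```

In a pipeline of the kind the abstract describes, the windowed segments would form the time-series input to a sequence model, while the variance features would go through dimensionality reduction before classification.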

    The functional analysis of XhLEA3-2 - a LEA_4 from the resurrection plant, Xerophyta humilis

    Climate change is a pressing reality in the current era. Changing environmental conditions and limited water availability are associated with the loss of arable land in areas where farming has traditionally thrived. Thus, linked to climate change is the risk of a global food shortage. Resurrection plants are phenomenal in that they are able to survive extended periods of drought in a state of anhydrobiosis and then resume full metabolism upon rehydration. These plants serve as models to scientists and genetic engineers who hope to replicate, to a degree, the 'resurrection phenomenon' in drought-sensitive crop species. The ability of resurrection plants to survive drought needs to be studied on a molecular level if it is to be implemented in transgenic crops. Currently, the molecular mechanisms of desiccation tolerance are only partly understood, and considerable investigation is still required. Xerophyta humilis is a monocotyledonous resurrection plant in which one of the responses to extreme water loss is the upregulation of several Late Embryogenesis Abundant (LEA) genes. The protein products of these genes, called LEA proteins, are known to be correlated with abiotic stress tolerance in plants, invertebrates and microorganisms. However, the precise molecular mode(s) of action of LEA proteins are still poorly understood. In this study, a LEA_4 group LEA protein, which we have termed XhLEA3-2 and which was shown to be transcriptionally upregulated during desiccation of the resurrection plant X. humilis, has been characterized. A bioinformatic, predictive analysis was performed to detect any LEA-like characteristics of XhLEA3-2. Recombinant XhLEA3-2 was produced in Escherichia coli, purified, and used to generate XhLEA3-2 specific antibodies for expression analyses. The ability of XhLEA3-2 to function as a molecular chaperone was assessed using a lactate dehydrogenase (LDH) enzyme stability assay. Transgenic expression of XhLEA3-2 in E. coli and tobacco was also investigated. In summary, this thesis demonstrates that XhLEA3-2: has typical LEA protein properties according to bioinformatic analyses, has two close homologs in X. viscosa, is present in dry X. humilis leaf tissue, has homologs present in dry X. viscosa leaf tissue, has some molecular chaperone activity, can protect E. coli from desiccation but not from osmotic stress, and can be transiently expressed in tobacco.

    Computational Analysis and Prediction of Intrinsic Disorder and Intrinsic Disorder Functions in Proteins

    COMPUTATIONAL ANALYSIS AND PREDICTION OF INTRINSIC DISORDER AND INTRINSIC DISORDER FUNCTIONS IN PROTEINS. By Akila Imesha Katuwawala. A dissertation submitted in partial fulfillment of the requirements for the degree of Engineering, Doctor of Philosophy with a concentration in Computer Science at Virginia Commonwealth University. Virginia Commonwealth University, 2021. Director: Lukasz Kurgan, Professor, Department of Computer Science. Proteins, as a fundamental class of biomolecules, have been studied from various perspectives over the past two centuries. The traditional notion is that proteins require fixed and stable three-dimensional structures to carry out biological functions. However, there is mounting evidence regarding a “special” class of proteins, named intrinsically disordered proteins, which do not have fixed three-dimensional structures though they perform a number of important biological functions. Computational approaches have been a vital component in the study of these intrinsically disordered proteins over the past few decades. Prediction of the intrinsic disorder and functions of intrinsic disorder from protein sequences is one such important computational approach that has recently gained attention, particularly with the advent of modern machine learning techniques. This dissertation runs along two basic themes, namely, prediction of the intrinsic disorder and prediction of the intrinsic disorder functions. The work related to the prediction of intrinsic disorder covers a novel approach to evaluate the predictive performance of the current computational disorder predictors. This approach evaluates the intrinsic disorder predictors at the individual protein level, in contrast to the traditional studies that evaluate them over large protein datasets. We address several interesting aspects concerning the differences in the protein-level vs. dataset-level predictive quality, complementarity and predictive performance of the current predictors. Based on the findings from this assessment we have conceptualized, developed, tested and deployed an innovative platform called DISOselect that recommends the most suitable computational disorder predictors for a given protein, with an underlying goal to maximize the predictive performance. DISOselect provides advice on whether a given disorder predictor would provide an accurate prediction for a given protein of the user’s interest, and recommends the most suitable disorder predictor together with an estimate of its expected predictive quality. The second theme, prediction of the intrinsic disorder functions, includes a first-of-its-kind evaluation of the current computational disorder predictors on two functional sub-classes of the intrinsically disordered proteins. This study introduces several novel evaluation strategies to assess predictive performance of disorder prediction methods and focuses on the evaluation for disorder functions associated with interactions with partner molecules. Results of this analysis motivated us to conceptualize, design, test and deploy a new and accurate machine learning-based predictor of the disordered lipid-binding residues, DisoLipPred. We empirically show that the strong predictive performance of DisoLipPred stems from several innovative design features and that its predictions complement results produced by current disorder predictors, disorder function predictors and predictors of transmembrane regions. We deploy DisoLipPred as a convenient webserver and discuss its predictions on the yeast proteome.