812 research outputs found

    Detection of changes in the characteristics of oceanographic time-series using changepoint analysis.

    Changepoint analysis is used to detect changes in variability within GOMOS hindcast time-series for significant wave heights of storm peak events across the Gulf of Mexico for the period 1900–2005. To detect a change in variance, the two-step procedure consists of (1) validating model assumptions per geographic location, followed by (2) application of a penalized likelihood changepoint algorithm. Results suggest that the most important changes in time-series variance occur in 1916 and 1933 at small clusters of boundary locations at which, in general, the variance reduces. No post-war changepoints are detected. The changepoint procedure can be readily applied to other environmental time-series
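As a rough illustration of step (2), the sketch below locates a single change in variance by penalized likelihood. It is not the authors' algorithm: it assumes zero-mean Gaussian data, at most one changepoint, and a penalty of 2 log n, and the function names and the `min_seg` parameter are hypothetical.

```python
import math


def segment_cost(x, start, end):
    """Twice the negative Gaussian log-likelihood (up to constants) of
    x[start:end], assuming a known zero mean and a variance estimated
    from the segment itself."""
    n = end - start
    var = sum(v * v for v in x[start:end]) / n
    var = max(var, 1e-12)  # guard against log(0) on constant segments
    return n * math.log(var)


def single_variance_changepoint(x, penalty=None, min_seg=5):
    """Return the index tau that best splits x into two segments of
    different variance, or None if no split beats the penalty."""
    n = len(x)
    if penalty is None:
        penalty = 2.0 * math.log(n)  # assumed penalty, not from the source
    best_tau, best_cost = None, segment_cost(x, 0, n)
    for tau in range(min_seg, n - min_seg + 1):
        cost = segment_cost(x, 0, tau) + segment_cost(x, tau, n) + penalty
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau


# Toy series: variance 9 for the first 100 points, variance 1 afterwards.
series = [3.0, -3.0] * 50 + [1.0, -1.0] * 50
print(single_variance_changepoint(series))  # prints 100
```

Step (1), validating the model assumptions per location, is not covered here and would have to precede any real application.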

    A Time-Series-Based Feature Extraction Approach for Prediction of Protein Structural Class

    This paper presents a novel feature vector based on physicochemical properties of amino acids for the prediction of protein structural classes. The proposed method is divided into three stages. First, protein sequences are given a discrete time-series representation using a physicochemical scale. Next, a wavelet-based time-series technique extracts features from the mapped amino acid sequence, and a fixed-length feature vector is constructed for classification. The proposed feature space summarizes the variance information of ten different biological properties of amino acids. Finally, an optimized support vector machine model is constructed for prediction of each protein structural class. The proposed approach is evaluated using leave-one-out cross-validation tests on two standard datasets. Comparison with existing approaches shows that the overall accuracy achieved by our approach is better than that of existing methods.
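The first two stages can be sketched as follows, under stated assumptions: a single scale (Kyte-Doolittle hydropathy) stands in for the ten properties mentioned in the abstract, the wavelet is Haar, and the feature is the variance of the coefficients at each level. The function names and the `levels` parameter are hypothetical, and the paper's actual wavelet and feature definition may differ.

```python
import math

# Kyte-Doolittle hydropathy values, used here as one example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a protein sequence to a discrete numeric series."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence]


def haar_step(signal):
    """One level of the orthonormal Haar DWT (a trailing odd sample is dropped)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def variance(values):
    """Population variance; defined as 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def wavelet_variance_features(sequence, levels=3):
    """Fixed-length vector: detail variance per level plus final approximation variance."""
    approx = encode(sequence)
    features = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(variance(detail))
    features.append(variance(approx))
    return features
```

The vector has `levels + 1` entries regardless of sequence length, which is what makes it usable as classifier input; repeating the procedure over several scales and concatenating the results would give one vector per protein.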

    Conditions for propagation and block of excitation in an asymptotic model of atrial tissue

    Detailed ionic models of cardiac cells are difficult for numerical simulations because they consist of a large number of equations and contain small parameters. The presence of small parameters, however, may be used for asymptotic reduction of the models. Earlier results have shown that the asymptotics of cardiac equations are non-standard. Here we apply such a novel asymptotic method to an ionic model of human atrial tissue in order to obtain a reduced but accurate model for the description of excitation fronts. Numerical simulations of spiral waves in atrial tissue show that wave fronts of propagating action potentials break up and self-terminate. Our model, in particular, yields a simple analytical criterion of propagation block, which is similar in purpose but completely different in nature to the 'Maxwell rule' in FitzHugh-Nagumo type models. Our new criterion agrees with direct numerical simulations of break-up of re-entrant waves. Comment: Revised manuscript submitted to Biophysical Journal (30 pages incl. 10 figures).

    Wavelet feature extraction and genetic algorithm for biomarker detection in colorectal cancer data

    Biomarkers which predict patient survival can play an important role in medical diagnosis and treatment. Selecting the significant biomarkers from hundreds of protein markers is a key step in survival analysis. In this paper a novel method is proposed to detect the prognostic biomarkers of survival in colorectal cancer patients using wavelet analysis, a genetic algorithm, and a Bayes classifier. The one-dimensional discrete wavelet transform (DWT) is normally used to reduce the dimensionality of biomedical data. In this study the one-dimensional continuous wavelet transform (CWT) was proposed to extract the features of colorectal cancer data. The one-dimensional CWT cannot reduce the dimensionality of the data, but it captures features that the DWT misses and is complementary to the DWT. A genetic algorithm was run on the extracted wavelet coefficients to select the optimized features, using a Bayes classifier to build its fitness function. The corresponding protein markers were located based on the position of the optimized features. Kaplan-Meier curve and Cox regression model 2 were used to evaluate the performance of selected biomarkers. Experiments were conducted on a colorectal cancer dataset and several significant biomarkers were detected. A new protein biomarker, CD46, was found to be significantly associated with survival time.

    Inter-residue distances derived from fold contact propensities correlate with evolutionary substitution costs

    BACKGROUND: The wealth of information on protein structure has led to a variety of statistical analyses of the role played by individual amino acid types in the protein fold. In particular, the contact propensities between the various amino acids can be converted into folding energies that have proved useful in structure prediction. The present study addresses the relationship of protein folding propensities to the evolutionary relationship between residues. RESULTS: The contact preferences of residue types observed in a representative sample of protein structures are converted into a residue similarity matrix or inter-residue distance matrix. Remarkably, these distances correlate excellently with evolutionary substitution costs. Residue vectors are derived from the distance matrix. The residue vectors give a concrete picture of the grouping of residues into families sharing properties crucial for protein folding. CONCLUSIONS: Inter-residue distances have proved useful in showing the explicit relationship between contact preferences and evolutionary substitution rates. It is proposed that the distance matrix derived from structural analysis may be useful in aligning proteins where remote homologs share structural features. Residue vectors derived from the distance matrix illustrate the spatial arrangement of residues and point to ways in which they can be grouped

    Analysis of protein secondary structure via the discrete wavelet transform

    This project develops a secondary structure prediction approach that uses the discrete wavelet transform. In order to use the wavelet technique, we convert the primary amino acid sequence of the protein to a numerical signal using the hydrophobic tendencies associated with the amino acids. The data used in this project consist of both α+β and α/β proteins from the Structural Classification of Proteins (SCOP) database. These data provide both protein primary sequences and secondary structure locations. In total, 13,435 individual proteins and nearly 15,511 unique protein subunits are analyzed. We use three different experimentally determined hydrophobicity scales for comparison. A control data set is formed by creating 200 realizations of each protein, each realization being a random permutation of the protein's amino acid sequence. The realizations are subjected to the same analysis as the parent protein. Our analysis involves examining the correlation between locations of significant hydrophobicity fluctuations and secondary structure, where significance is determined by comparison to the control data set. Our focus is on using the first and second scales of the wavelet detail, but we also construct a scale-scale measure that combines these scales to detect secondary structure. Using standard performance measures, such as the Matthews correlation coefficient (MCC) and the accuracy (Q), we find that our method shows promise as a tool for predicting the locations of secondary structures in a protein given just the amino acid sequence.
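The two performance measures named above can be computed from per-residue binary labels as in the sketch below. It assumes a two-state labelling (1 for a residue inside a secondary structure element, 0 otherwise) and takes Q to be the fraction of residues labelled correctly; the function names are hypothetical, and returning 0.0 for an undefined MCC is a convention chosen here, not something stated in the abstract.

```python
import math


def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for binary label lists."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, tn, fp, fn


def matthews_cc(truth, predicted):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(truth, predicted)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy_q(truth, predicted):
    """Fraction of residues whose label is predicted correctly."""
    tp, tn, fp, fn = confusion_counts(truth, predicted)
    return (tp + tn) / len(truth)
```

For example, with `truth = [1, 1, 0, 0]` and `predicted = [1, 0, 0, 0]`, `accuracy_q` gives 0.75 and `matthews_cc` gives 2/√12, roughly 0.577.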

    An efficient visualization tool for the analysis of protein mutation matrices

    BACKGROUND: It is useful to develop a tool that would effectively describe protein mutation matrices, specifically geared towards the identification of mutations that produce either wanted or unwanted effects, such as an increase or decrease in affinity, or a predisposition towards misfolding. Here, we describe a tool where such mutations are efficiently identified, categorized and visualized. To categorize the mutations, amino acids in a mutation matrix are arranged according to one of three sets of physicochemical characteristics, namely hydrophilicity, size and polarizability, and charge and polarity. The magnitude and frequencies of mutations for an alignment are subsequently described using color information and scaling factors. RESULTS: To illustrate the capabilities of our approach, the technique is used to visualize and to compare mutation patterns in evolving sequences with diametrically opposite characteristics. Results show the emergence of distinct patterns not immediately discernible from the raw matrices. CONCLUSION: Our technique enables effective categorization and visualization of mutations by using specifically arranged mutation matrices. This tool has a number of possible applications in protein engineering, notably in simplifying the identification of mutations and/or mutation trends that are associated with specific engineered protein characteristics and behavior.

    MODULATION OF THE RECEPTOR GATING MECHANISM AND ALLOSTERIC COMMUNICATION IN IONOTROPIC GLUTAMATE RECEPTORS

    Ionotropic glutamate receptors (iGluRs) found in the mammalian brain are primarily known to mediate excitatory synaptic transmission crucial for learning and memory formation. The family of iGluRs consists of AMPA receptors, NMDA receptors and kainate receptors, with each member having a distinct physiological role. In recent years, significant progress has been made in understanding the biophysical and functional properties of iGluRs. The development of cryo-EM and X-ray crystallography techniques has further facilitated the structural understanding of these receptors. However, the multidomain nature and large size of the protein, the complex gating mechanism, and inadequate knowledge of the conformational dynamics of the receptors during channel gating have been some of the limiting factors in elucidating the structure-function relation of iGluRs. Thus, to understand the conformational dynamics of the iGluR family and correlate them with functional behavior, I have utilized single-molecule Förster Resonance Energy Transfer (smFRET) and molecular dynamics (MD) simulation, and specifically investigated the factors influencing the gating mechanism and allosteric communication in the heteromeric kainate receptor GluK2/K5 and the NMDA receptor GluN1/N2A. The major findings of this dissertation include the subunit arrangement of GluK2/K5 and its dynamics under resting and desensitized conditions. For the first time, we have identified the conformational changes induced at the GluK2 and GluK5 subunits in a GluK2/K5 heteromer when bound to different agonists. Using MD simulations of GluN1/N2A NMDA receptors, we have identified the structural pathway underlying negative cooperativity and how mutation in the receptor leads to abnormal functional behavior. These findings will allow us to understand the conformational control of receptor function and will serve as a basis for developing subunit- and conformation-specific therapeutic drugs that can potentially control the abnormal activity of the receptors linked to several neurological diseases.

    Characterisation and Classification of Protein Sequences by Using Enhanced Amino Acid Indices and Signal Processing-Based Methods

    Due to copyright reasons, the author's published papers have been removed from this copy of the thesis. Protein sequencing has produced an overwhelming number of protein sequences, especially in the last decade. Nevertheless, the majority of the proteins' functional and structural classes are still unknown, and the experimental methods currently used to determine these properties are very expensive, laborious and time consuming. Therefore, automated computational methods are urgently required to accurately and reliably predict the functional and structural classes of proteins. Several bioinformatics methods have been developed to determine such properties directly from sequence information. Methods that involve signal processing have recently become popular in bioinformatics; they have been investigated for the analysis of DNA and protein sequences and shown to be useful and to generally help better characterise the sequences. However, there are various technical issues that need to be addressed in order to overcome problems associated with signal processing methods for the analysis of protein sequences. Amino acid indices that are used to transform protein sequences into signals have various applications and can represent diverse features of the protein sequences and amino acids. As the majority of indices have similar features, this project proposes a new set of computationally derived indices that better represent the original group of indices. A study is also carried out that resulted in finding a unique and universal set of best discriminating amino acid indices for the characterisation of allergenic proteins. This analysis extracts features directly from the protein sequences by using the Discrete Fourier Transform (DFT) to build a classification model based on Support Vector Machines (SVM) for the allergenic proteins. The proposed predictive model yields a higher and more reliable accuracy than those of the existing methods. A new method is proposed for performing a multiple sequence alignment. For this method, a DFT-based method is used to construct a new distance matrix in combination with multiple amino acid indices that were used to encode protein sequences into numerical sequences. Additionally, a new type of substitution matrix is proposed in which the physicochemical similarities between any given amino acids are calculated. These similarities were calculated based on the 25 selected amino acid indices, each of which represents a unique biological protein feature. The proposed multiple sequence alignment method yields a better and more reliable alignment than the existing methods. In order to evaluate the complex information that is generated as a result of the DFT, Complex Informational Spectrum Analysis (CISA) is developed and presented. As the results show, when protein classes present similarities or differences according to the Common Frequency Peak (CFP) in specific amino acid indices, then it is probable that these classes are related to the protein feature that the specific amino acid represents. Using only the absolute spectrum in informational spectrum analysis of protein sequences is shown to be insufficient, as biologically related features can appear individually in either the real or the imaginary spectrum. This is successfully demonstrated over the analysis of influenza neuraminidase protein sequences. Upon identification of a new protein, it is important to single out the amino acids responsible for the structural and functional classification of the protein, as well as the amino acids contributing to the protein's specific biological characterisation. In this work, a novel approach is presented to identify and quantify the relationship between individual amino acids and the protein. This is successfully demonstrated over the analysis of influenza neuraminidase protein sequences. The characterisation and identification problem of the Influenza A virus protein sequences is tackled through a Subgroup Discovery (SD) algorithm, which can provide ancillary knowledge to the experts. The main objective of the case study was to derive interpretable knowledge for the influenza A virus problem and consequently to better describe the relationships between subtypes of this virus. Finally, by using DFT-based sequence-driven features, a Support Vector Machine (SVM)-based classification model was built and tested that yields higher predictive accuracy than that of SD. The methods developed and presented in this study yield promising results and can be easily applied to proteomic fields.
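The remark that the absolute spectrum alone can hide information is easy to see with a plain DFT. The sketch below is only an illustration of that point, not the CISA method itself: it assumes the sequence has already been encoded into numbers by some amino acid index, removes the mean before transforming, and uses hypothetical function names.

```python
import cmath
import math


def dft_half(signal):
    """DFT coefficients for frequencies 0..N/2 of the mean-centred signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [v - mean for v in signal]
    return [
        sum(centred[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def spectra(signal):
    """Return the real, imaginary and absolute spectra as separate lists."""
    coeffs = dft_half(signal)
    return {
        "real": [c.real for c in coeffs],
        "imag": [c.imag for c in coeffs],
        "abs": [abs(c) for c in coeffs],
    }


# Two toy encoded signals with the same periodicity but shifted by one position.
a = spectra([1.0, 0.0, -1.0, 0.0])
b = spectra([0.0, 1.0, 0.0, -1.0])
```

Both toy signals have the same absolute spectrum, with a peak of 2 at frequency index 1, but the peak sits entirely in the real part for the first signal and entirely in the imaginary part for the second, so only the separate real and imaginary spectra tell them apart.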

    Mining the Drosophila gustatory receptor family for new thermosensitive proteins : basic science and tool development

    Extrinsic control of neural activity is a powerful paradigm for understanding how neural circuits operate and regulate behavior. Traditionally, optogenetic tools are used to activate or inhibit neuronal activity with light. However, using visible light as the stimulus has some limitations, such as limited penetration in opaque tissue and overlap of absorption spectra when using multiple probes. A complementary approach is to use temperature as a stimulus, and thermosensitive TRP channels as the thermogenetic probes. These channels also have some limitations, particularly in their temperature sensitivity range. A new and exciting candidate for developing new thermogenetic tools has recently been identified: Gr28bD, a member of the Drosophila gustatory receptor family, normally involved in high-temperature avoidance behavior. My work on Gr28bD started with a characterization of its biophysical properties, particularly temperature sensitivity and ionic selectivity (Chapter 1). Then, to expand the pool of potential candidates for thermogenetic tools, I examined the orthologs of Gr28bD in other species of Drosophila, and I found five other receptors that have distinct thermosensitive properties (Chapter 2). To better understand the mechanism of thermosensitivity, our team successfully modeled the molecular structure of Gr28bD, obtaining preliminary evidence of its homotetrameric organization. To obtain further information on the structural and functional elements of this channel, I tested a series of Gr28bD mutants (Chapter 3). Finally, I participated in writing a book chapter on new computational methods for testing ion channel kinetic mechanisms (Chapter 4). Includes bibliographical references.