8 research outputs found

    Computational investigations of folded self-avoiding walks related to protein folding

    Various subsets of self-avoiding walks naturally appear when investigating existing methods designed to predict the 3D conformation of a protein of interest. Two such subsets, namely the folded and the unfoldable self-avoiding walks, are studied computationally in this article. We show that these two sets are equal and correspond to the whole set of $n$-step self-avoiding walks for $n \leqslant 14$, but that they are different for numerous $n \geqslant 108$, which are common protein lengths. Concrete counterexamples are provided and the computational methods used to discover them are completely detailed. A tool for studying these subsets of walks related to both pivot moves and protein conformations is finally presented. Comment: Not yet submitted
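As a rough illustration of the objects this abstract studies, the following minimal sketch counts $n$-step self-avoiding walks by backtracking. It assumes the 2D square lattice (the abstract does not name a lattice), and the function name `count_saws` is hypothetical. It enumerates all walks only; it does not test whether a walk is folded or unfoldable, and it is not the computational method of the article.

```python
def count_saws(n):
    """Count n-step self-avoiding walks from the origin.

    Assumption: walks live on the 2D square lattice.
    """
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(pos, visited, remaining):
        # A walk with no steps left to place is one complete walk.
        if remaining == 0:
            return 1
        total = 0
        x, y = pos
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in visited:  # self-avoidance: never revisit a site
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack
        return total

    return extend((0, 0), {(0, 0)}, n)


if __name__ == "__main__":
    for n in range(6):
        print(n, count_saws(n))
```

The running time grows exponentially with $n$, so plain enumeration of this kind is only practical for short walks.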

    The dynamics of complex systems. Studies and applications in computer science and biology

    Our research has focused on the study of complex dynamics and on their use in both information security and bioinformatics. Our first work has been on chaotic discrete dynamical systems, and links have been established between these dynamics on the one hand, and either random or complex behaviors on the other. Applications to information security concern pseudorandom number generation, hash functions, information hiding, and security aspects of wireless sensor networks. On the bioinformatics level, we have applied our studies of complex systems to the evolution of genomes and to protein folding.

    Studies of protein designability using reduced models

    One of the most important problems in computational structural biology is protein designability, that is, why protein sequences are not random strings of amino acids but instead show regular patterns that encode protein structures. Many previous studies that have attempted to solve the problem have relied upon reduced models of proteins. In particular, the 2D square and the 3D cubic lattices together with reduced amino acid alphabets have been examined extensively and have led to interesting results that shed some light on evolutionary relationships among proteins. Here, in addition to the 2D square lattice, we study the 2D triangular and 3D face centered cubic (fcc) lattices, we perform designability studies using different shapes embedded in the 2D square lattice, and we use machine learning algorithms to classify binary sequences folding to highly- or poorly-designable conformations. In the first part of the thesis we extend the transfer matrix method to the 2D triangular lattice. The transfer matrix method is a highly efficient method of enumerating all conformations within a compact lattice area that has earlier been developed for the 2D square and 3D cubic lattices. In addition, we enumerated all compact conformations within simple geometries on the 2D triangular and 3D face centered cubic (fcc) lattices using a standard backtracking algorithm. In the second part of the thesis we describe protein designability studies on various shapes in the 2D square lattice using a reduced hydrophobic-polar (HP) amino acid alphabet. We used a simple energy function that counted the number of H-H, H-P and P-P interactions within a restricted set of protein shapes that have the same number of residues and non-bonded contacts. We found a difference in the designabilities of different protein shapes. Finally, in the third part of the thesis we used standard machine learning algorithms to classify two classes of protein sequences. We first performed a designability study for two shapes, using a binary HP alphabet, on the 2D triangular lattice and separated highly- and poorly-designable conformations. Highly-designable conformations had many sequences folding to them with the lowest energy and poorly-designable conformations had few or no sequences folding to them. Sequences were classified as highly- or poorly-designable depending on whether they folded to highly- or poorly-designable structures. Using several machine learning algorithms such as Decision Tree, Naive Bayes, and Support Vector Machine, we were able to classify highly- and poorly-designable sequences with high accuracy.
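The contact counting behind an HP energy function of the kind described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the conformation is given as a list of 2D square-lattice coordinates, a contact is a pair of lattice neighbours that are not consecutive along the chain, and the name `contact_counts` is hypothetical. The weights that turn these counts into an energy are not given in the abstract, so none are assumed here.

```python
def contact_counts(sequence, coords):
    """Count non-bonded H-H, H-P and P-P contacts of a lattice conformation.

    sequence: string over {"H", "P"}, one letter per residue.
    coords:   list of (x, y) square-lattice sites, one per residue, in chain order.
    """
    index_of = {site: i for i, site in enumerate(coords)}
    if len(coords) != len(sequence) or len(index_of) != len(coords):
        raise ValueError("need one distinct lattice site per residue")

    counts = {"HH": 0, "HP": 0, "PP": 0}
    for i, (x, y) in enumerate(coords):
        # Look only right and up so that each neighbouring pair is seen once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_of.get(neighbour)
            # Skip empty sites and residues bonded along the chain.
            if j is not None and abs(i - j) > 1:
                key = "".join(sorted((sequence[i], sequence[j])))
                counts[key] += 1
    return counts


if __name__ == "__main__":
    # A four-residue chain folded around a unit square: the two ends touch.
    print(contact_counts("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
```

In this example the only non-bonded neighbouring pair is formed by the first and last residues, both hydrophobic, so the sketch reports a single H-H contact.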

    Sequence Determinants of the Individual and Collective Behaviour of Intrinsically Disordered Proteins

    Intrinsically disordered proteins and protein regions (IDPs) represent around thirty percent of the eukaryotic proteome. IDPs do not fold into a set three-dimensional structure, but instead exist in an ensemble of inter-converting states. Despite being disordered, IDPs are decidedly not random; well-defined - albeit transient - local and long-range interactions give rise to an ensemble with distinct statistical biases over many length-scales. Among a variety of cellular roles, IDPs drive and modulate the formation of phase separated intracellular condensates, non-stoichiometric assemblies of protein and nucleic acid that serve many functions. In this work, we have explored how the amino acid sequence of IDPs determines their conformational behaviour, and how sequence and single chain behaviour influence their collective behaviour in the context of phase separation. In part I, in a series of studies, we used simulation, theory, and statistical analysis coupled with a wide range of experimental approaches to uncover novel rules that further explore how primary sequence and local structure influence the global and local behaviour of disordered proteins, with direct implications for protein function and evolution. We found that amino acid sidechains counteract the intrinsic collapse of the peptide backbone, priming the backbone for interaction and providing a fully reconciliatory explanation for the mechanism of action associated with the denaturants urea and GdmCl. We discovered that proline can engender a conformational buffering effect in IDPs to counteract standard electrostatic effects, and that the patterning of those proline residues can be a crucial determinant of the conformational ensemble. We developed a series of tools for analysing primary sequences on a proteome-wide scale and used them to discover that different organisms can have substantially different average sequence properties. Finally, we determined that for the normally folded protein NTL9, the unfolded state under folding conditions is relatively expanded but has well defined native and non-native structural preferences. In part II, we identified a novel mode of phase separation in biology, and explored how this could be tuned through sequence design. We discovered that phase separated liquids can be many orders of magnitude more dilute than simple mean-field theories would predict, and developed an analytic framework to explain and understand this phenomenon. Finally, we designed, developed and implemented a novel lattice-based simulation engine (PIMMS) to provide sequence-specific insight into the determinants of conformational behaviour and phase separation. PIMMS allows us to accurately and rapidly generate sequence-specific conformational ensembles and run simulations of hundreds of polymers with the goal of allowing us to systematically elucidate the link between primary sequence and phase separation.

    The study of unfoldable self-avoiding walks. Application to protein structure prediction software

    Self-avoiding walks (SAWs) are the source of very difficult problems in probability and enumerative combinatorics. They are of great interest as, for example, they are the basis of protein structure prediction (PSP) in bioinformatics. The authors of this paper have previously shown that, depending on the prediction algorithm, the sets of obtained walk conformations differ: for example, all the SAWs can be generated using stretching-based algorithms whereas only the unfoldable SAWs can be obtained with methods that iteratively fold the straight line. A deeper study of (non-)unfoldable SAWs is presented in this paper. The contribution is first a survey of what is currently known about these sets. In particular, we provide clear definitions of various subsets of SAWs related to pivot moves (unfoldable and non-unfoldable SAWs, etc.) and the first results that we have obtained, theoretically or computationally, on these sets. Then a new theorem on the number of non-unfoldable SAWs is demonstrated. Finally, a list of open questions is provided and the consequences for the PSP problem are discussed.

    XXIII Fungal Genetics Conference

    Program and abstracts from the 23rd Fungal Genetics Conference and Poster Abstracts at Asilomar, March 15-20, 200