4 research outputs found

    On the Computational Complexity of Sequence Design Problems

    Inverse protein folding concerns the identification of an amino acid sequence that folds to a given structure. Sequence design problems attempt to avoid the apparent difficulty of inverse protein folding by defining an energy that can be minimized to find protein-like sequences. We evaluate the practical relevance of two sequence design problems by analyzing their computational complexity. We show that the canonical method of sequence design is intractable, and describe approximation algorithms for this problem. We also describe an efficient algorithm that exactly solves the grand canonical method. Our analysis shows how sequence design problems can fail to reduce the difficulty of the inverse protein folding problem, and highlights the need to analyze these problems to evaluate their practical relevance.
    1 Introduction
    The goal of the inverse protein folding problem (IPF) is to design a polymer sequence that folds to a given target conformation. Three criteria have been proposed for..

    The inverse protein folding problem on 2D and 3D lattices

    Abstract: In this paper we investigate the inverse protein folding (IPF) problem under the Canonical model on 3D and 2D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128–136; E.I. Shakhnovich, A.M. Gutin, Engineering of stable and fast-folding sequences of model proteins, Proc. Natl. Acad. Sci. 90 (1993) 7195–7199]. In this problem, we are given a contact graph G=(V,E) of a protein sequence that is embeddable in a 3D (respectively, 2D) lattice and an integer 1⩽K⩽|V|. The goal is to find an induced subgraph of G of at most K vertices with the maximum number of edges. In this paper, we prove the following results:
    • An earlier proof of NP-completeness of the IPF problem on 3D lattices [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128–136] is based on the NP-completeness of the IPF problem on 2D lattices. However, the reduction was not correct, and we show that the IPF problem for 2D lattices can be solved in O(K|V|) time. We nevertheless show that the IPF problem on 3D lattices is indeed NP-complete by providing a different reduction from a different NP-complete problem.
    • We design a polynomial-time approximation scheme for the IPF problem on 3D lattices using the shifted slice-and-dice approach in [P. Berman, B. DasGupta, S. Muthukrishnan, Approximation algorithms for MAX-MIN tiling, J. Algorithms 47(2) (2003) 122–134; D. Hochbaum, Approximation Algorithms for NP-hard Problems, PWS Publishing Company, MA, 1997; D.S. Hochbaum, W. Maass, Approximation schemes for covering and packing problems in image processing and VLSI, J. ACM 32(1) (1985) 130–136], thereby improving the previous best polynomial-time approximation algorithm which had a performance ratio of 12 [W.E. Hart, On the computational complexity of sequence design problems, Proceedings of the First Annual International Conference on Computational Molecular Biology 1997, pp. 128–136].
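
The optimization objective stated in this abstract (choose at most K vertices of the contact graph so that the induced subgraph has as many edges as possible) can be made concrete with a small brute-force sketch. This is not the paper's O(K|V|) algorithm or its approximation scheme; it simply enumerates subsets, so it is only usable on tiny inputs. The function names and the use of a full rectangular 2D grid as a stand-in contact graph are assumptions made for illustration.

```python
from itertools import combinations


def grid_contact_graph(width, height):
    """Hypothetical stand-in contact graph: every cell of a width x height
    2D lattice is a vertex, and lattice neighbours are joined by an edge."""
    vertices = [(x, y) for x in range(width) for y in range(height)]
    edges = []
    for x, y in vertices:
        if x + 1 < width:
            edges.append(((x, y), (x + 1, y)))
        if y + 1 < height:
            edges.append(((x, y), (x, y + 1)))
    return vertices, edges


def max_induced_edges_bruteforce(vertices, edges, k):
    """Try every vertex subset of size at most k and return the pair
    (edge_count, subset) whose induced subgraph has the most edges.
    Exponential time: illustrates the objective only."""
    vertices = list(vertices)
    best_count, best_subset = 0, frozenset()
    for size in range(1, min(k, len(vertices)) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            # Count edges with both endpoints inside the chosen subset.
            count = sum(1 for u, v in edges if u in chosen and v in chosen)
            if count > best_count:
                best_count, best_subset = count, frozenset(chosen)
    return best_count, best_subset


if __name__ == "__main__":
    v, e = grid_contact_graph(3, 3)
    count, subset = max_induced_edges_bruteforce(v, e, 4)
    print(count, sorted(subset))
```

On a 3 x 3 grid with K = 4, the best choice is any 2 x 2 block of cells, whose induced subgraph is a 4-cycle with four edges.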