1,707 research outputs found
Long Proteins with Unique Optimal Foldings in the H-P Model
It is widely accepted that (1) the natural or folded state of proteins is a
global energy minimum, and (2) in most cases proteins fold to a unique state
determined by their amino acid sequence. The H-P (hydrophobic-hydrophilic)
model is a simple combinatorial model designed to answer qualitative questions
about the protein folding process. In this paper we consider a problem
suggested by Brian Hayes in 1998: what proteins in the two-dimensional H-P
model have unique optimal (minimum energy) foldings? In particular, we prove
that there are closed chains of monomers (amino acids) with this property for
all (even) lengths; and that there are open monomer chains with this property
for all lengths divisible by four.Comment: 22 pages, 18 figure
The Parallelism Motifs of Genomic Data Analysis
Genomic data sets are growing dramatically as the cost of sequencing
continues to decline and small sequencing devices become available. Enormous
community databases store and share this data with the research community, but
some of these genomic data analysis problems require large scale computational
platforms to meet both the memory and computational requirements. These
applications differ from scientific simulations that dominate the workload on
high end parallel systems today and place different requirements on programming
support, software libraries, and parallel architectural design. For example,
they involve irregular communication patterns such as asynchronous updates to
shared data structures. We consider several problems in high performance
genomics analysis, including alignment, profiling, clustering, and assembly for
both single genomes and metagenomes. We identify some of the common
computational patterns or motifs that help inform parallelization strategies
and compare our motifs to some of the established lists, arguing that at least
two key patterns, sorting and hashing, are missing
A Hybrid Monte Carlo Ant Colony Optimization Approach for Protein Structure Prediction in the HP Model
The hydrophobic-polar (HP) model has been widely studied in the field of
protein structure prediction (PSP) both for theoretical purposes and as a
benchmark for new optimization strategies. In this work we introduce a new
heuristics based on Ant Colony Optimization (ACO) and Markov Chain Monte Carlo
(MCMC) that we called Hybrid Monte Carlo Ant Colony Optimization (HMCACO). We
describe this method and compare results obtained on well known HP instances in
the 3 dimensional cubic lattice to those obtained with standard ACO and
Simulated Annealing (SA). All methods were implemented using an unconstrained
neighborhood and a modified objective function to prevent the creation of
overlapping walks. Results show that our methods perform better than the other
heuristics in all benchmark instances.Comment: In Proceedings Wivace 2013, arXiv:1309.712
On the role of metaheuristic optimization in bioinformatics
Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, the protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics
Computational Molecular Biology
Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography
When Can You Fold a Map?
We explore the following problem: given a collection of creases on a piece of
paper, each assigned a folding direction of mountain or valley, is there a flat
folding by a sequence of simple folds? There are several models of simple
folds; the simplest one-layer simple fold rotates a portion of paper about a
crease in the paper by +-180 degrees. We first consider the analogous questions
in one dimension lower -- bending a segment into a flat object -- which lead to
interesting problems on strings. We develop efficient algorithms for the
recognition of simply foldable 1D crease patterns, and reconstruction of a
sequence of simple folds. Indeed, we prove that a 1D crease pattern is
flat-foldable by any means precisely if it is by a sequence of one-layer simple
folds.
Next we explore simple foldability in two dimensions, and find a surprising
contrast: ``map'' folding and variants are polynomial, but slight
generalizations are NP-complete. Specifically, we develop a linear-time
algorithm for deciding foldability of an orthogonal crease pattern on a
rectangular piece of paper, and prove that it is (weakly) NP-complete to decide
foldability of (1) an orthogonal crease pattern on a orthogonal piece of paper,
(2) a crease pattern of axis-parallel and diagonal (45-degree) creases on a
square piece of paper, and (3) crease patterns without a mountain/valley
assignment.Comment: 24 pages, 19 figures. Version 3 includes several improvements thanks
to referees, including formal definitions of simple folds, more figures,
table summarizing results, new open problems, and additional reference
Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences
Questions in computational molecular biology generate various discrete
optimization problems, such as DNA sequence alignment and RNA secondary
structure prediction. However, the optimal solutions are fundamentally
dependent on the parameters used in the objective functions. The goal of a
parametric analysis is to elucidate such dependencies, especially as they
pertain to the accuracy and robustness of the optimal solutions. Techniques
from geometric combinatorics, including polytopes and their normal fans, have
been used previously to give parametric analyses of simple models for DNA
sequence alignment and RNA branching configurations. Here, we present a new
computational framework, and proof-of-principle results, which give the first
complete parametric analysis of the branching portion of the nearest neighbor
thermodynamic model for secondary structure prediction for real RNA sequences.Comment: 17 pages, 8 figure
- …