A combinatorial optimization approach for diverse motif finding applications
BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied. RESULTS: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem. CONCLUSION: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications
An optimal-basis identification technique for interior-point linear programming algorithms
This work concerns a method for identifying an optimal basis for linear programming problems in the setting of interior-point methods. To each iterate $x^k$ generated by a primal interior-point algorithm, say, we associate an indicator vector $q^k$ with the property that if $x^k$ converges to a nondegenerate vertex $x^*$, then $q^k$ converges to the 0–1 vector $\mathrm{sign}(x^*)$. More interestingly, we show that the convergence of $q^k$ is quadratically faster than that of $x^k$ in the sense that $\|q^k - q^*\| = O(\|x^k - x^*\|^2)$. This clear-cut separation and rapid convergence allow one to infer at an intermediate stage of the iterative process which variables will be zero at optimality and which will not. We also show that under suitable assumptions this method is applicable to dual as well as primal-dual algorithms and can be extended to handle certain types of degeneracy. Numerical examples are included to corroborate the convergence properties of the indicators. The practical limitations of the indicator technique are also discussed.
Interiors of completely positive cones
A symmetric matrix $A$ is completely positive (CP) if there exists an
entrywise nonnegative matrix $B$ such that $A = BB^T$. We characterize the
interior of the CP cone. A semidefinite algorithm is proposed for checking
whether a matrix lies in the interior of the CP cone, and its properties are
studied. A CP-decomposition of a matrix in Dickinson's form can be obtained if
the matrix lies in the interior of the CP cone. Some computational experiments
are also presented.
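To make the definition concrete, here is a minimal sketch that builds a completely positive matrix from an entrywise nonnegative factor, using the standard definition $A = BB^T$. The factor `B` and the helper names are hypothetical and chosen only for illustration. The sketch does not implement the semidefinite algorithm mentioned in the abstract and does not test interior membership.

```python
def gram(B):
    """Return A = B B^T for a matrix B given as a list of rows."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in B] for ri in B]


def is_symmetric(M):
    """True if M equals its transpose."""
    n = len(M)
    return all(M[i][j] == M[j][i] for i in range(n) for j in range(n))


def is_entrywise_nonnegative(M):
    """True if every entry of M is >= 0."""
    return all(x >= 0 for row in M for x in row)


# Hypothetical entrywise nonnegative factor (3 x 2), chosen for illustration.
B = [[1, 0],
     [1, 1],
     [0, 2]]

# A is completely positive by construction, since A = B B^T with B >= 0.
A = gram(B)

if __name__ == "__main__":
    print(A)
    print(is_symmetric(A), is_entrywise_nonnegative(A))
```

Symmetry and entrywise nonnegativity are necessary for complete positivity but not sufficient in general, which is why deciding membership in the CP cone, or in its interior, calls for a dedicated algorithm.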
A primal-simplex based Tardos' algorithm
In the mid-eighties Tardos proposed a strongly polynomial algorithm for
solving linear programming problems for which the size of the coefficient
matrix is polynomially bounded by the dimension. Combining Orlin's primal-based
modification and Mizuno's use of the simplex method, we introduce a
modification of Tardos' algorithm considering only the primal problem and using
the simplex method to solve the auxiliary problems. The proposed algorithm is
strongly polynomial if the coefficient matrix is totally unimodular and the
auxiliary problems are non-degenerate. Comment: 7 pages
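The strong-polynomiality claim depends on the coefficient matrix being totally unimodular, meaning every square submatrix has determinant -1, 0, or 1. The sketch below checks that property by brute force on small integer matrices. It is an illustration of the definition only, runs in exponential time, and is not part of the algorithm described in the abstract. The function names and example matrices are assumptions.

```python
from itertools import combinations


def det(M):
    """Integer determinant by Laplace expansion along the first row.

    Only suitable for tiny matrices; the cost grows factorially.
    """
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def is_totally_unimodular(A):
    """Check every square submatrix of A for a determinant in {-1, 0, 1}."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


# Node-arc incidence matrix of the directed graph with arcs 0->1, 1->2, 0->2.
# Incidence matrices of directed graphs are totally unimodular.
incidence = [[1, 0, 1],
             [-1, 1, 0],
             [0, -1, -1]]

# A matrix with entries in {-1, 0, 1} that is not totally unimodular (det = 2).
not_tu = [[1, 1],
          [-1, 1]]

if __name__ == "__main__":
    print(is_totally_unimodular(incidence))
    print(is_totally_unimodular(not_tu))
```

The second matrix shows that having all entries in {-1, 0, 1} is not enough: the full 2 x 2 determinant is 2.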
An improved multi-parametric programming algorithm for flux balance analysis of metabolic networks
Flux balance analysis has proven an effective tool for analyzing metabolic
networks. In flux balance analysis, reaction rates and optimal pathways are
ascertained by solving a linear program, in which the growth rate is maximized
subject to mass-balance constraints. A variety of cell functions in response to
environmental stimuli can be quantified using flux balance analysis by
parameterizing the linear program with respect to extracellular conditions.
However, for most large, genome-scale metabolic networks of practical interest,
the resulting parametric problem has multiple and highly degenerate optimal
solutions, which are computationally challenging to handle. An improved
multi-parametric programming algorithm based on active-set methods is
introduced in this paper to overcome these computational difficulties.
Degeneracy and multiplicity are handled, respectively, by introducing
generalized inverses and auxiliary objective functions into the formulation of
the optimality conditions. These improvements are especially effective for
metabolic networks because their stoichiometry matrices are generally sparse;
thus, fast and efficient algorithms from sparse linear algebra can be leveraged
to compute generalized inverses and null-space bases. We illustrate the
application of our algorithm to flux balance analysis of metabolic networks by
studying a reduced metabolic model of Corynebacterium glutamicum and a
genome-scale model of Escherichia coli. We then demonstrate how the critical
regions resulting from these studies can be associated with optimal metabolic
modes and discuss the physical relevance of optimal pathways arising from
various auxiliary objective functions. Achieving a more than five-fold
improvement in computational speed over existing multi-parametric programming
tools, the proposed algorithm proves promising in handling genome-scale
metabolic models. Comment: Accepted in J. Optim. Theory Appl. First draft was submitted on
August 4th, 201
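The linear program behind flux balance analysis can be written in a standard textbook form. The notation below is an assumption for illustration and is not taken from the paper: $S$ is the stoichiometry matrix, $v$ the vector of reaction fluxes, $c$ a vector selecting the growth reaction, and $\theta$ the extracellular parameters, assumed here to enter through the flux bounds.

$$
\begin{aligned}
z^*(\theta) = \max_{v} \quad & c^T v \\
\text{s.t.} \quad & S v = 0 \quad \text{(mass balance at steady state)} \\
& l(\theta) \le v \le u(\theta)
\end{aligned}
$$

In multi-parametric programming, the parameter space is partitioned into critical regions, each a set of $\theta$ values over which the same set of constraints is active at the optimum. When the optimal $v$ is not unique, an auxiliary objective of the kind mentioned in the abstract is one way to select among the optimal solutions.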