Search CORE

63 research outputs found

On the role of metaheuristic optimization in bioinformatics

Author: Benito Sergio
Calvet Laura
Juan Angel A
Prados Ferran
Publication venue: 'Royal College of Obstetricians & Gynaecologists (RCOG)'
Publication date: 01/01/2022

Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail molecular docking, protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From this analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics.

A comprehensive comparison of metaheuristics for the repetition-free longest common subsequence problem

Author: Blesa Aguilera Maria Josep
Blum Christian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

This paper deals with an NP-hard string problem from the bio-informatics field: the repetition-free longest common subsequence problem. This problem has enjoyed increasing interest in recent years, which has resulted in the application of several pure as well as hybrid metaheuristics. However, the literature lacks a comprehensive comparison between those approaches. Moreover, it has been shown that general purpose integer linear programming solvers are very efficient for solving many of the problem instances that were used so far in the literature. Therefore, in this work we extend the available benchmark set, adding larger instances to which integer linear programming solvers can no longer be applied. Moreover, we provide a comprehensive comparison of the approaches found in the literature. Based on the results we propose a hybrid between two of the best methods which turns out to inherit the complementary strengths of both methods.
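As a rough illustration of the problem named in this abstract (not of any method compared in the paper), the sketch below assumes the usual definition: a repetition-free common subsequence is a common subsequence of all input strings in which no symbol occurs more than once. The function names are hypothetical, and the exhaustive search is only usable on tiny alphabets.

```python
from itertools import permutations


def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def is_repetition_free_common_subsequence(candidate, strings):
    """Check both conditions: no repeated symbol, and a subsequence of every string."""
    no_repeats = len(set(candidate)) == len(candidate)
    return no_repeats and all(is_subsequence(candidate, s) for s in strings)


def brute_force_rflcs(strings):
    """Exhaustive search, longest candidates first (exponential; illustration only)."""
    # Only symbols present in every input string can appear in a solution.
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    for k in range(len(alphabet), 0, -1):
        for perm in permutations(alphabet, k):
            if all(is_subsequence(perm, s) for s in strings):
                return "".join(perm)
    return ""


print(brute_force_rflcs(["ABCABC", "CBACBA"]))  # prints "ACB"
```

The exponential cost of this enumeration is consistent with the abstract's point that exact approaches stop being applicable on larger instances, which is where metaheuristics come in.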

Multiple Biological Sequence Alignment: Scoring Functions, Algorithms, and Evaluations

Author: Nguyen Ken D
Publication venue: ScholarWorks @ Georgia State University
Publication date: 14/12/2011

Aligning multiple biological sequences such as protein sequences or DNA/RNA sequences is a fundamental task in bioinformatics and sequence analysis. These alignments may contain invaluable information that scientists need to predict the sequences' structures, determine the evolutionary relationships between them, or discover drug-like compounds that can bind to the sequences. Unfortunately, multiple sequence alignment (MSA) is NP-Complete. In addition, the lack of a reliable scoring method makes it very hard to align the sequences reliably and to evaluate the alignment outcomes. In this dissertation, we have designed a new scoring method for use in multiple sequence alignment. Our scoring method encapsulates stereo-chemical properties of sequence residues and their substitution probabilities into a tree-structure scoring scheme. This new technique provides a reliable scoring scheme with low computational complexity. In addition to the new scoring scheme, we have designed an overlapping sequence clustering algorithm to use in our three new multiple sequence alignment algorithms. One of our alignment algorithms uses a dynamic weighted guidance tree to perform multiple sequence alignment in progressive fashion. The use of a dynamic weighted tree allows errors in the early alignment stages to be corrected in the subsequent stages. The other two algorithms utilize sequence knowledge-bases and sequence consistency to produce biologically meaningful sequence alignments. To improve the speed of the multiple sequence alignment, we have developed a parallel algorithm that can be deployed on reconfigurable computer models. Analytically, our parallel algorithm is the fastest progressive multiple sequence alignment algorithm.

Towards a better solution to the shortest common supersequence problem: the deposition and reduction algorithm

Author: D Gusfield
D Sankoff
DE Foulser
EA Hubbell
G Nicosia
Hon Wai Leong
J Branke
JA Storer
K Ning
Kang Ning
P Barone
R Michels
RW Irving
S Kasif
T Jiang
TH Cormen
TK Sellis
VG Timkovsky
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The problem of finding a Shortest Common Supersequence (SCS) of a set of sequences is an important problem with applications in many areas. It is a key problem in biological sequence analysis. The SCS problem is well-known to be NP-complete. Many heuristic algorithms have been proposed. Some heuristics work well on a few long sequences (as in sequence comparison applications); others work well on many short sequences (as in oligo-array synthesis). Unfortunately, most do not work well on large SCS instances where there are many long sequences. RESULTS: In this paper, we present a Deposition and Reduction (DR) algorithm for solving large SCS instances of biological sequences. There are two processes in our DR algorithm: a deposition process and a reduction process. The deposition process is responsible for generating a small set of common supersequences, and the reduction process shortens these common supersequences by removing some characters while preserving the common supersequence property. Our evaluation on simulated data and real DNA and protein sequences shows that our algorithm consistently produces the best results compared to many well-known heuristic algorithms, especially on large instances. CONCLUSION: Our DR algorithm provides a partial answer to the open problem of designing an efficient heuristic algorithm for the SCS problem on many long sequences. Our algorithm has a bounded approximation ratio. The algorithm is efficient in both running time and space complexity, and our evaluation shows that it is practical even for SCS problems on many long sequences.
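The reduction idea described in this abstract (delete characters as long as the result remains a common supersequence) can be sketched as follows. This is a minimal greedy illustration under stated assumptions, not the authors' DR algorithm: the left-to-right deletion order, the use of plain concatenation as the starting supersequence, and all function names are assumptions made for the example.

```python
def is_subsequence(s, t):
    """Return True if s can be obtained from t by deleting characters."""
    it = iter(t)
    return all(ch in it for ch in s)


def is_common_supersequence(candidate, sequences):
    """A common supersequence contains every input sequence as a subsequence."""
    return all(is_subsequence(s, candidate) for s in sequences)


def reduce_supersequence(candidate, sequences):
    """Greedily drop characters, left to right, while the property still holds."""
    result = list(candidate)
    i = 0
    while i < len(result):
        trial = result[:i] + result[i + 1:]
        if is_common_supersequence(trial, sequences):
            result = trial  # character i was redundant; retry the same index
        else:
            i += 1  # character i is needed; move on
    return "".join(result)


sequences = ["ACG", "CGT"]
start = "".join(sequences)  # trivial common supersequence (assumed starting point)
print(reduce_supersequence(start, sequences))  # prints "ACGT"
```

A greedy pass like this gives no optimality guarantee in general; it only ensures that the output is never longer than the input and is still a common supersequence.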

Springer - Publisher Connector

Directory of Open Access Journals

Preventing premature convergence and proving the optimality in evolutionary algorithms

Author: Alliot Jean-Marc
Durand Nicolas
Gotteland Jean-Baptiste
Vanaret Charlie
Publication venue: HAL CCSD
Publication date: 01/01/2013

http://ea2013.inria.fr//proceedings.pdf

Evolutionary Algorithms (EA) usually carry out an efficient exploration of the search-space, but often get trapped in local minima and do not prove the optimality of the solution. Interval-based techniques, on the other hand, yield a numerical proof of optimality of the solution. However, they may fail to converge within a reasonable time due to their inability to quickly compute a good approximation of the global minimum and their exponential complexity. The contribution of this paper is a hybrid algorithm called Charibde in which a particular EA, Differential Evolution, cooperates with a Branch and Bound algorithm endowed with interval propagation techniques. It prevents premature convergence toward local optima and outperforms both deterministic and stochastic existing approaches. We demonstrate its efficiency on a benchmark of highly multimodal problems, for which we provide previously unknown global minima and certification of optimality.
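For readers unfamiliar with the evolutionary component mentioned in this abstract, the following is a minimal sketch of a standard Differential Evolution loop (the common rand/1/bin variant). It is not Charibde: the interval-based Branch and Bound part and the cooperation between the two methods are not reproduced, and the parameter values, function name, and test function are assumptions chosen only for illustration.

```python
import random


def differential_evolution(f, bounds, pop_size=20, F=0.7, CR=0.9,
                           generations=100, seed=0):
    """Minimize f over a box; returns (best_point, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fitness = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # forces at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip back into the box
                else:
                    v = pop[i][d]
                trial.append(v)
            f_trial = f(trial)
            # Greedy selection: an individual is never replaced by a worse trial.
            if f_trial <= fitness[i]:
                pop[i], fitness[i] = trial, f_trial
    best = min(range(pop_size), key=fitness.__getitem__)
    return pop[best], fitness[best]


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


point, value = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(point, value)
```

Because selection is greedy, the best value in the population never gets worse, but nothing here certifies that the result is a global minimum. Supplying that certificate is the role the abstract assigns to the interval-based Branch and Bound.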

CiteSeerX

Naive Bayes ant colony optimization for designing high dimensional experiments

Author: Baragona
Berni
Bickel
Blum
Blum
Blum
Blum
Borrotti
Branden
Caschera
D. De Lucrezia
Damborsky
De Jong
Dorigo
Dorigo
Dorigo
Dorigo
Ferrari
Forlin
G. Minervini
Gambardella
Garlapati
Goldberg
Holland
I. Poli
Ji
Jones
Lindsay
Longhi
M. Borrotti
Minervini
Mitchell
Mohsen
Montemanni
Nestl
Pellegrini
Rish
Rosen
Rubinstein
Sahami
Sambo
Shyu
Slanzi
Stützle
Tang
Ullah
Ullah
Yang
Yousef
Zanghellini
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

In a large number of experimental problems, the high dimensionality of the search space and economic constraints can severely limit the number of experimental points that can be tested. Within these constraints, classical optimization techniques perform poorly, in particular when little a priori knowledge is available. In this work we investigate the possibility of combining approaches from statistical modeling and bio-inspired algorithms to effectively explore a huge search space while sampling only a limited number of experimental points. To this purpose, we introduce a novel approach that combines ant colony optimization (ACO) and a naive Bayes classifier (NBC): the naive Bayes ant colony optimization (NACO) procedure. We compare NACO with other similar approaches in a simulation study. We then derive the NACO procedure with the goal of designing artificial enzymes with no sequence homology to the extant one. Our final aim is to mimic the natural fold of the 200-amino-acid 1AGY serine esterase from Fusarium solani.

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari