Short adjacent repeat identification based on chemical reaction optimization

Abstract

The IEEE World Congress on Computational Intelligence (WCCI 2012), Brisbane, Australia, 10-15 June 2012, hosted three conferences: the 2012 International Joint Conference on Neural Networks (IJCNN 2012), the 2012 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2012), and the 2012 IEEE Congress on Evolutionary Computation (IEEE CEC 2012).

The analysis of short tandem repeats (STRs) in DNA sequences has become an attractive method for determining the genetic profile of an individual. Here we focus on a more general and practical problem, the short adjacent repeats identification problem (SARIP), which extends STR analysis by allowing short gaps between neighboring units. At present, the best available solution to SARIP is BASARD, which uses Markov chain Monte Carlo algorithms to determine the posterior estimate. However, its computational complexity and its tendency to get stuck in a local mode lower the efficiency of BASARD and impede its wide application. In this paper, we prove that SARIP is NP-hard, and we solve it with Chemical Reaction Optimization (CRO), a recently developed metaheuristic approach. CRO mimics the interactions of molecules in a chemical reaction and can explore the solution space efficiently to find the optimal or near-optimal solution(s). We test the CRO algorithm on both synthetic and real data and compare its performance in mode searching with that of BASARD. Simulation results show that the computational time of CRO is dozens of times, or even a hundred times, shorter than that of BASARD. The results also show that CRO obtains the global optimum most of the time. Moreover, CRO is more stable across different runs, which is of great importance in practical use. Thus, CRO is by far the best method for SARIP. © 2012 IEEE.
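To make the difference between tandem and adjacent repeats concrete, the following is a minimal Python sketch, not the method of the paper. It assumes the repeat unit is already known and matches it exactly, whereas SARIP has to identify the repeats themselves. The function name `find_adjacent_repeats` and the parameters `max_gap` and `min_copies` are hypothetical names introduced only for this illustration.

```python
def find_adjacent_repeats(seq, unit, max_gap, min_copies=2):
    """Group exact copies of `unit` whose neighbours are at most `max_gap` bases apart.

    Illustrative assumption: the unit is known and copies match exactly.
    Returns a list of runs; each run is a list of copy start positions.
    """
    w = len(unit)

    # Collect start positions of non-overlapping exact copies, left to right.
    starts = []
    i = 0
    while i <= len(seq) - w:
        if seq[i:i + w] == unit:
            starts.append(i)
            i += w
        else:
            i += 1

    # Chain neighbouring copies while the gap between them stays small enough.
    runs, current = [], []
    for s in starts:
        if current and s - (current[-1] + w) > max_gap:
            if len(current) >= min_copies:
                runs.append(current)
            current = []
        current.append(s)
    if len(current) >= min_copies:
        runs.append(current)
    return runs


# Toy sequence: two back-to-back copies, a 1-base gap, a third copy,
# a 6-base gap, and a fourth copy.
toy = "TT" + "GATA" + "GATA" + "C" + "GATA" + "CCCCCC" + "GATA" + "TT"
strict_tandem = find_adjacent_repeats(toy, "GATA", max_gap=0)
short_gaps = find_adjacent_repeats(toy, "GATA", max_gap=2)
```

With `max_gap=0` only the two back-to-back copies form a run, which corresponds to a strict tandem repeat. With `max_gap=2` the third copy joins the run because it is separated by a single base, while the fourth copy stays excluded because its gap is six bases long.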
