
    The Right Mutation Strength for Multi-Valued Decision Variables

    The most common representation in evolutionary computation is the bit string. This is ideal for modeling binary decision variables, but less useful for variables taking more values. Since very little theoretical work exists on how to use evolutionary algorithms for such optimization problems, we study the run time of simple evolutionary algorithms on some OneMax-like functions defined over $\Omega = \{0, 1, \dots, r-1\}^n$. More precisely, we regard a variety of problem classes requesting the component-wise minimization of the distance to an unknown target vector $z \in \Omega$. For such problems we see a crucial difference in how we extend the standard-bit mutation operator to these multi-valued domains. While it is natural to select each position of the solution vector to be changed independently with probability $1/n$, there are various ways to then change such a position. If we change each selected position to a random value different from the original one, we obtain an expected run time of $\Theta(nr \log n)$. If we change each selected position by either $+1$ or $-1$ (random choice), the optimization time reduces to $\Theta(nr + n \log n)$. If we use a random mutation strength $i \in \{0, 1, \dots, r-1\}^n$ with probability inversely proportional to $i$ and change the selected position by either $+i$ or $-i$ (random choice), then the optimization time becomes $\Theta(n \log(r)(\log(n) + \log(r)))$, bringing down the dependence on $r$ from linear to polylogarithmic. One of our results depends on a new variant of the lower bounding multiplicative drift theorem.
    Comment: an extended abstract of this work is to appear at GECCO 201
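    The three operators compared above are easy to state concretely. The following is a minimal Python sketch, not code from the paper: the function name and the clamping of out-of-range values to the domain boundary are illustrative assumptions (the paper's exact boundary handling may differ), and the harmonic variant draws its step size from {1, ..., r-1}, since a probability proportional to 1/i is undefined for i = 0.

```python
import random

def mutate(x, r, mode="harmonic"):
    """Sketch of the three mutation operators for vectors over {0, ..., r-1}.

    Each position is selected independently with probability 1/n; each
    selected position is then changed according to `mode`.
    """
    n = len(x)
    y = list(x)
    for j in range(n):
        if random.random() < 1.0 / n:
            if mode == "uniform":
                # change to a random value different from the current one
                # -> expected run time Theta(n r log n) on OneMax-like problems
                y[j] = random.choice([v for v in range(r) if v != x[j]])
            elif mode == "unit":
                # change by +1 or -1 (random choice), clamped to the domain
                # -> Theta(n r + n log n)
                y[j] = min(r - 1, max(0, x[j] + random.choice([-1, 1])))
            elif mode == "harmonic":
                # draw step size i with probability proportional to 1/i,
                # then move by +i or -i -> Theta(n log(r) (log(n) + log(r)))
                steps = range(1, r)
                i = random.choices(steps, weights=[1.0 / s for s in steps])[0]
                y[j] = min(r - 1, max(0, x[j] + random.choice([-i, i])))
    return y
```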

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that search scales in time with metric entropy (the number of covering hyperspheres) if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (the randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology.
    Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo
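    The time bound stems from a coarse-to-fine search strategy: cover the dataset with hyperspheres of some radius, compare the query against cluster representatives only, and open a cluster only when the triangle inequality says it may contain a hit. A minimal sketch of that idea follows; all names are illustrative, and the real tools (Ammolite, MICA, esFragBag) implement domain-specific versions of this scheme.

```python
def coarse_fine_search(query, centers, clusters, dist, radius, cluster_radius):
    """Two-stage similarity search over a covering of the dataset.

    centers[i] is a representative; clusters[i] holds the points within
    cluster_radius of centers[i].  By the triangle inequality, a cluster
    can contain a point within `radius` of the query only if its center
    lies within radius + cluster_radius of the query.
    """
    hits = []
    for center, members in zip(centers, clusters):
        if dist(query, center) <= radius + cluster_radius:  # coarse stage
            hits.extend(p for p in members if dist(query, p) <= radius)  # fine stage
    return hits

# toy usage on 1-D points with absolute-difference distance
centers = [0.0, 10.0]
clusters = [[-0.5, 0.3, 1.1], [9.2, 10.4]]
print(coarse_fine_search(0.2, centers, clusters,
                         dist=lambda a, b: abs(a - b),
                         radius=0.5, cluster_radius=1.5))  # -> [0.3]
```

    The coarse stage touches only metric-entropy-many representatives, and when the fractal dimension is low only a few clusters are opened in the fine stage, which is where the claimed time scaling comes from.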

    Algorithmic complexity for psychology: A user-friendly implementation of the coding theorem method

    Kolmogorov-Chaitin complexity has long been believed to be impossible to approximate when it comes to short sequences (e.g., of length 5-50). However, with the newly developed coding theorem method, the complexity of strings of length 2-11 can now be numerically estimated. We present the theoretical basis of algorithmic complexity for short strings (ACSS) and describe an R package providing functions based on ACSS that cover psychologists' needs and improve upon previous methods in three ways: (1) ACSS is now available not only for binary strings, but for strings based on up to 9 different symbols; (2) ACSS no longer requires time-consuming computing; and (3) a new approach based on ACSS gives access to an estimation of the complexity of strings of any length. Finally, three illustrative examples show how these tools can be applied to psychology.
    Comment: to appear in "Behavioral Research Methods", 14 pages in journal format, R package at http://cran.r-project.org/web/packages/acss/index.htm
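    The coding theorem method rests on the relation K(s) ≈ -log2 D(s), where D(s) is the algorithmic probability of s, estimated as the frequency with which an enumeration of small Turing machines outputs s. A minimal sketch of the lookup step follows; the function name and the toy frequency values are invented for illustration, whereas the actual R package ships a precomputed distribution.

```python
import math

def acss_estimate(s, freq):
    """Coding-theorem estimate of algorithmic complexity:
    K(s) ~ -log2 D(s), where D(s) is the output frequency of s over a
    large enumeration of small Turing machines (precomputed in `freq`)."""
    if s not in freq:
        raise KeyError("string not covered by the precomputed distribution")
    return -math.log2(freq[s])

# toy, made-up distribution over three strings
freq = {"0000": 0.060, "0101": 0.020, "0110": 0.015}
print(acss_estimate("0000", freq))  # frequent (regular) string -> ~4.06 bits
print(acss_estimate("0110", freq))  # rarer string -> ~6.06 bits
```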

    Reverse-Safe Data Structures for Text Indexing

    We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O(n) and answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method into data analysis applications gives insignificant or no data utility loss. Finally, we show how our technique can be extended to support applications under a realistic adversary model.
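    The definition is easy to make concrete for pattern matching queries. Below is a brute-force checker, a sketch for intuition only: it enumerates all texts of the same length, so it is exponential and bears no relation to the paper's O(n^ω log d)-time construction; the names and the two-letter alphabet are illustrative.

```python
from itertools import product

def pattern_counts(text, d):
    """The answers stored by the structure: occurrence counts of every
    pattern of length at most d."""
    counts = {}
    for m in range(1, d + 1):
        for i in range(len(text) - m + 1):
            p = text[i:i + m]
            counts[p] = counts.get(p, 0) + 1
    return counts

def is_z_reverse_safe(text, d, z, alphabet="ab"):
    """Check whether at least z same-length texts (including `text`)
    produce exactly the same answers."""
    target = pattern_counts(text, d)
    consistent = sum(1 for t in product(alphabet, repeat=len(text))
                     if pattern_counts("".join(t), d) == target)
    return consistent >= z

print(is_z_reverse_safe("abab", d=1, z=6))  # True: 6 texts share the letter counts
print(is_z_reverse_safe("abab", d=2, z=2))  # False: length-2 answers pin down the text
```

    This makes the trade-off visible: raising the maximum query length d shrinks the set of consistent datasets, which is why the algorithm picks the largest d for which z-reverse-safety still holds.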

    Dynastic Potential Crossover Operator

    An optimal recombination operator for two parent solutions provides the best solution among those that take the value for each variable from one of the parents (the gene transmission property). If the solutions are bit strings, the offspring of an optimal recombination operator is optimal in the smallest hyperplane containing the two parent solutions. Exploring this hyperplane is computationally costly in general, requiring exponential time in the worst case. However, when the variable interaction graph of the objective function is sparse, the exploration can be done in polynomial time. In this paper, we present a recombination operator, called Dynastic Potential Crossover (DPX), that runs in polynomial time and behaves like an optimal recombination operator for low-epistasis combinatorial problems. We compare this operator, both theoretically and experimentally, with traditional crossover operators, like uniform crossover and network crossover, and with two recently defined efficient recombination operators: partition crossover and articulation points partition crossover. The empirical comparison uses NKQ Landscapes and MAX-SAT instances. DPX outperforms the other crossover operators in terms of the quality of the offspring and provides better results when included in a trajectory-based and a population-based metaheuristic, but it requires more time and memory to compute the offspring.
    This research is partially funded by the Universidad de Málaga, Consejería de Economía y Conocimiento de la Junta de Andalucía and FEDER under grant number UMA18-FEDERJA-003 (PRECOG); under grant PID 2020-116727RB-I00 (HUmove) funded by MCIN/AEI/10.13039/501100011033; and TAILOR ICT-48 Network (No 952215) funded by the EU Horizon 2020 research and innovation programme. The work is also partially supported in Brazil by the São Paulo Research Foundation (FAPESP), under grants 2021/09720-2 and 2019/07665-4, and the National Council for Scientific and Technological Development (CNPq), under grant 305755/2018-8.
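    The gene transmission property pins down the search space of optimal recombination: every offspring variable must come from one of the parents, so only the differing positions are free. The brute-force version below makes this explicit; it is a sketch for illustration (exponential in the number of differing bits) and not DPX itself, which obtains the same offspring in polynomial time by exploiting a sparse variable interaction graph.

```python
from itertools import product

def optimal_recombination(parent1, parent2, fitness):
    """Brute-force optimal recombination with gene transmission: return
    the best offspring taking each bit from one of the two parents,
    i.e. the optimum of the smallest hyperplane containing both parents.
    Exponential in the number of differing positions."""
    diff = [i for i, (a, b) in enumerate(zip(parent1, parent2)) if a != b]
    best, best_fit = None, float("-inf")
    for choice in product([False, True], repeat=len(diff)):
        child = list(parent1)
        for pos, take_p2 in zip(diff, choice):
            if take_p2:
                child[pos] = parent2[pos]
        f = fitness(child)
        if f > best_fit:
            best, best_fit = child, f
    return best

# toy usage: maximize the number of ones (OneMax)
p1, p2 = [1, 0, 1, 0], [0, 1, 1, 1]
print(optimal_recombination(p1, p2, fitness=sum))  # -> [1, 1, 1, 1]
```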