Adaptive multiresolution search: How to beat brute force?
Multiresolution and wavelet-based search methods are suited to problems for which acceptable solutions lie in regions of high average local fitness. In this paper, two different approaches are presented. In the Markov-based approach, the sampling resolution is chosen adaptively depending on the fitness of the last sample(s). The advantage of this method, besides its simplicity, is that it allows the discovery probability of a target sample to be computed for quite large search spaces. This makes it possible to "reverse-engineer" search-and-optimization problems: starting from some prototypical examples of fitness functions, the discovery rate can be computed as a function of the free parameters. The second approach is a wavelet-based multiresolution search that uses a memory to store local average values of the fitness function. The sampling probability density is chosen, by design, proportional to a low-resolution approximation of the fitness function. High average fitness regions are sampled more often, and at a higher resolution, than low average fitness regions. If splines are used as scaling mother functions, a fuzzy description of the search strategy can be given within the framework of the Takagi–Sugeno model.
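A minimal sketch of the Markov-based idea, in which the sampling resolution is adapted to the fitness of the last sample. The step-halving/doubling rule, the parameter values, and the function names below are illustrative assumptions, not the paper's actual scheme:

```python
import random

def adaptive_search(fitness, lo, hi, n_samples=2000, seed=0):
    """Illustrative Markov-based adaptive multiresolution search:
    the sampling resolution (step size) is refined after a successful
    sample and coarsened after an unsuccessful one."""
    rng = random.Random(seed)
    x = rng.uniform(lo, hi)
    f_x = fitness(x)
    best_x, best_f = x, f_x
    step = (hi - lo) / 2                       # coarsest resolution
    min_step = (hi - lo) / 2**16               # finest resolution
    for _ in range(n_samples):
        cand = min(hi, max(lo, x + rng.uniform(-step, step)))
        f = fitness(cand)
        if f >= f_x:                           # good sample: move and refine
            x, f_x = cand, f
            step = max(step / 2, min_step)
        else:                                  # bad sample: coarsen
            step = min(step * 2, (hi - lo) / 2)
        if f > best_f:
            best_x, best_f = cand, f
    return best_x, best_f

# A fitness landscape whose optimum lies inside a broad high-fitness region,
# the setting in which such methods are expected to beat brute force.
best_x, best_f = adaptive_search(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)
```

Because the step size shrinks only on improvements, the search spends most of its samples, at fine resolution, inside high-fitness regions.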
Phylogenetic Applications of the Minimum Contradiction Approach on Continuous Characters
We describe the conditions under which a set of continuous variables or characters can be described as an X-tree or a split network. A distance matrix corresponds exactly to a split network or a valued X-tree if, after ordering of the taxa, the variable values can be embedded into a function with at most one local maximum and one local minimum, crossing any horizontal line at most twice. In real applications, the order of the taxa best satisfying the above conditions can be obtained using the Minimum Contradiction method. This approach is applied to two sets of continuous characters. The first set corresponds to craniofacial landmarks in Hominids. The contradiction matrix is used to identify possible tree structures and some alternatives when they exist. We explain how to discover the main structuring characters in a tree. The second set consists of a sample of 100 galaxies. In this second example we show how to discretize the continuous variables describing physical properties of the galaxies without disrupting the underlying tree structure.
Comment: To appear in Evolutionary Bioinformatics.
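The embedding condition can be checked directly on a sequence of character values taken in a candidate taxa order. The sketch below is ours (the function name and the midpoint-based crossing test are assumptions), covering both parts of the condition: at most one local maximum and one local minimum, and at most two crossings of any horizontal line:

```python
def satisfies_tree_condition(values):
    """Check whether an ordered sequence of character values can be
    embedded into a function with at most one local maximum and one
    local minimum that crosses any horizontal line at most twice."""
    # At most one max + one min means at most 2 changes of direction.
    diffs = [b - a for a, b in zip(values, values[1:]) if b != a]
    changes = sum(1 for a, b in zip(diffs, diffs[1:]) if a * b < 0)
    if changes > 2:
        return False
    # Horizontal-line condition: test a midpoint between each pair of
    # consecutive distinct levels and count strict crossings.
    levels = sorted(set(values))
    for lo, hi in zip(levels, levels[1:]):
        v = (lo + hi) / 2
        crossings = sum(1 for a, b in zip(values, values[1:])
                        if (a - v) * (b - v) < 0)
        if crossings > 2:
            return False
    return True
```

A unimodal sequence such as 1, 3, 5, 4, 2 passes, while 1, 5, 2, 6 fails: it has only two direction changes but crosses a horizontal line near the middle three times.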
Minimum Contradiction Matrices in Whole Genome Phylogenies
Minimum contradiction matrices are a useful complement to distance-based phylogenies. A minimum contradiction matrix represents phylogenetic information in the form of an ordered distance matrix Y^n_{i,j}. A matrix element corresponds to the distance from a reference vertex n to the path (i, j). For an X-tree or a split network, the minimum contradiction matrix is a Robinson matrix. It therefore fulfills all the inequalities defining perfect order: Y^n_{i,j} ≥ Y^n_{i,k} and Y^n_{k,j} ≥ Y^n_{k,i} for i ≤ j ≤ k < n. In real phylogenetic data, some taxa may contradict the inequalities for perfect order. Contradictions to perfect order correspond to deviations from a tree or from a split network topology. Efficient algorithms that search for the best order are presented and tested on whole genome phylogenies with 184 taxa including many Bacteria, Archaea and Eukaryota. After optimization, taxa are classified in their correct domain and phyla. Several significant deviations from perfect order correspond to well-documented evolutionary events.
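Deviations from perfect order can be quantified by summing violations of the two inequalities over all ordered triples. The squared-violation measure and naming below are a sketch under our own assumptions, not the paper's exact objective function:

```python
def contradiction(Y):
    """Sum of squared violations of the perfect-order inequalities
    Y[i][j] >= Y[i][k] and Y[k][j] >= Y[k][i] for i <= j <= k, where Y
    is the ordered matrix for a fixed reference vertex. Zero means the
    matrix is perfectly ordered (Robinson)."""
    n = len(Y)
    total = 0.0
    for i in range(n):
        for j in range(i, n):
            for k in range(j, n):
                total += min(0.0, Y[i][j] - Y[i][k]) ** 2
                total += min(0.0, Y[k][j] - Y[k][i]) ** 2
    return total

# A Robinson matrix (values decrease away from the diagonal) gives zero;
# permuting taxa out of their tree order generally introduces violations.
robinson = [[3, 2, 1], [2, 3, 2], [1, 2, 3]]
swapped  = [[3, 1, 2], [1, 3, 2], [2, 2, 3]]
```

Searching for the taxa order minimizing such a measure is the optimization step the abstract refers to.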
Multivariate Approaches to Classification in Extragalactic Astronomy
Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is no exception and is now facing a deluge of data. For galaxies, the century-old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications, most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We emphasize the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.
Comment: Open Access paper, doi:10.3389/fspas.2015.00003.
Emergence of Collective Intelligence in Stochastic Local Search-and-Optimization Systems
Two new stochastic search methods are introduced as prototypic examples showing how collective intelligence may emerge in a system of locally interacting units. They share the property of being theoretically understandable and computationally tractable, a quite "rare" feature. The first search method, based on multiresolution search algorithms, can typically be implemented in the form of search agents. The method is appropriate if the target element(s) is located in high-average fitness regions of the search space. The search may be improved by introducing some interaction between the search agents. As the agents search preferentially in high-average fitness regions, there is a correlation between the number of agents in a region of the search space and the local average fitness in that region. It is therefore natural to introduce some extra sampling when several agents are in the same neighborhood. The theoretical framework of multiresolution analysis and wavelet theory makes it possible to give a precise description of the above strategy and to define simple conditions guaranteeing that the search with interacting agents is better than a search with a single agent. The second example shows how a satisfiability problem (3-SAT) can be solved by an ensemble of small computing units working in parallel. The search uses a number of noisy integrate-and-fire neurons as local optimizers. The satisfiability problem is coded so that if a solution does exist then the integrate-and-fire system is in a ground state. The resulting algorithm is new, easily scalable, and better than existing stochastic algorithms for random nonstationary 3-SAT problems. Under some particular conditions, the algorithm reduces to RWalkSAT, a local stochastic search algorithm whose properties in conjunctio..
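For reference, RWalkSAT itself is a very short algorithm: pick a random unsatisfied clause and flip a random variable occurring in it. The sketch below is a minimal textbook version under our own naming and encoding assumptions, not the integrate-and-fire system of the paper:

```python
import random

def rwalksat(clauses, n_vars, max_flips=100_000, seed=1):
    """Minimal RWalkSAT: repeatedly pick a random unsatisfied clause
    and flip a random variable appearing in it. Clauses are lists of
    nonzero ints in DIMACS style: literal v means variable |v| must be
    True if v > 0 and False if v < 0."""
    rng = random.Random(seed)
    assign = [rng.choice([False, True]) for _ in range(n_vars + 1)]
    sat = lambda lit: assign[abs(lit)] == (lit > 0)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign[1:]           # satisfying assignment found
        lit = rng.choice(rng.choice(unsat))
        assign[abs(lit)] = not assign[abs(lit)]
    return None                         # flip budget exhausted

# Tiny satisfiable 3-SAT instance:
# (x1 v x2 v -x3) & (-x1 v x3 v x2) & (-x2 v x3 v x1)
clauses = [[1, 2, -3], [-1, 3, 2], [-2, 3, 1]]
model = rwalksat(clauses, 3)
```

Each flip only needs local clause information, which is why such schemes map naturally onto ensembles of small units working in parallel.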