
    How Fitch-Margoliash Algorithm can Benefit from Multi Dimensional Scaling

    Whatever the phylogenetic method, genetic sequences are often described as strings of characters, so molecular sequences can be viewed as elements of a multi-dimensional space. As a consequence, studying motion in this space (i.e., the evolutionary process) must deal with the surprising features of high-dimensional spaces, such as the concentration of measure phenomenon.
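
    Viewing sequences as points in a high-dimensional space can be made concrete with multidimensional scaling. As a hedged sketch (the toy sequences and the Hamming metric below are illustrative, not taken from the paper), pairwise sequence distances can be embedded into low-dimensional coordinates with scikit-learn's metric MDS:

```python
import numpy as np
from sklearn.manifold import MDS

# Illustrative toy sequences (not from the paper).
seqs = ["ACGTACGT", "ACGTACGA", "TCGTACGA", "TTGTACGA"]

def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

# Pairwise dissimilarity matrix between sequences.
n = len(seqs)
D = np.array([[hamming(seqs[i], seqs[j]) for j in range(n)]
              for i in range(n)], dtype=float)

# Metric MDS on the precomputed distances yields coordinates that a
# distance-based method such as Fitch-Margoliash could work with.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)
print(coords.shape)  # (4, 2)
```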

    Towards explainable metaheuristics: PCA for trajectory mining in evolutionary algorithms.

    The generation of explanations regarding decisions made by population-based metaheuristics is often a difficult task due to the nature of the mechanisms employed by these approaches. With the increasing use of these methods for optimisation in industries that require end-user confirmation, the need for explanations has also grown. We present a novel approach to the extraction of features capable of supporting an explanation through the use of trajectory mining: extracting key features from the populations of NDAs. We apply Principal Component Analysis techniques to identify new methods of tracking population diversity post-runtime, after projection into a lower-dimensional space. These methods are applied to a set of benchmark problems solved by a Genetic Algorithm and a Univariate Estimation of Distribution Algorithm. We show that the new sub-space derived metrics can capture key learning steps in the algorithm run, and how solution variable patterns that explain the fitness function may be captured in the principal component coefficients.
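
    The post-runtime trajectory-mining idea can be sketched with a toy run: log one population per generation, fit PCA on all logged individuals, and track diversity in the projected sub-space. The converging populations below are synthetic stand-ins, not output of the paper's algorithms:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for logged populations: 50 individuals with 10
# variables per generation, converging as the run progresses.
generations = [
    rng.normal(loc=1.0 - 0.1 * g, scale=1.0 / (g + 1), size=(50, 10))
    for g in range(10)
]

# Fit PCA once, post-runtime, on every logged individual.
pca = PCA(n_components=2).fit(np.vstack(generations))

# Per-generation diversity: mean spread along the principal components.
diversity = [pca.transform(g).std(axis=0).mean() for g in generations]

# Diversity shrinks as the population converges.
print(diversity[0] > diversity[-1])  # True
```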

    MEDVIR: 3D visual interface applied to gene profile analysis.

    The origins of this work lie in the increasing need for biologists and doctors to obtain tools for the visual analysis of data. When dealing with multidimensional data, such as medical data, traditional data mining techniques can be tedious and complex, even for medical experts. It is therefore necessary to develop useful visualization techniques that complement the expert’s criterion and, at the same time, visually stimulate and ease the process of obtaining knowledge from a dataset. In this way, the process of interpreting and understanding the data can be greatly enriched. Multidimensionality is inherent to medical data, and a time-consuming effort is usually required to obtain a clinically useful outcome; unfortunately, neither clinicians nor biologists are trained in managing more than four dimensions. Our specific aim was to design a 3D visual interface for gene profile analysis that is easy to use for both medical and biological experts. To this end, a new analysis method is proposed: MedVir. This is a simple and intuitive analysis mechanism based on the visualization of any multidimensional medical data in a three-dimensional space, which allows experts to interact with, collaborate on, and enrich the representation. In other words, MedVir performs a powerful reduction in data dimensionality in order to represent the original information in a three-dimensional environment. Experts can then interact with the data and draw conclusions in a visual and quick way.

    A methodology to compare dimensionality reduction algorithms in terms of loss of quality

    Dimensionality Reduction (DR) is attracting more attention these days as a result of the increasing need to handle huge amounts of data effectively. DR methods allow the number of initial features to be reduced considerably until a set is found that preserves the original properties of the data. However, their use entails an inherent loss of quality that is likely to affect the understanding of the data for analysis purposes. This loss of quality can be decisive when selecting a DR method, because of the nature of each method. In this paper, we propose a methodology that allows different DR methods to be analyzed and compared with regard to the loss of quality they produce. This methodology uses the concept of preservation of geometry (quality assessment criteria) to assess the loss of quality. Experiments have been carried out using the most well-known DR algorithms and quality assessment criteria from the literature, applied to 12 real-world datasets. The results obtained so far show that it is possible to establish a method for selecting the most appropriate DR technique in terms of minimum loss of quality. The experiments have also highlighted some interesting relationships between the quality assessment criteria. Finally, the methodology allows the appropriate target dimensionality for reducing data to be established, whilst giving rise to a minimum loss of quality.
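
    One widely available quality criterion of this kind is neighbourhood trustworthiness. As an illustrative sketch (not the paper's exact methodology), scikit-learn's `trustworthiness` can score how much local structure a PCA reduction preserves at different target dimensionalities:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

X = load_iris().data  # 4-dimensional example dataset

# Score each target dimensionality on a 0-1 scale: 1 means local
# neighbourhoods are fully preserved after the reduction.
scores = {}
for k in (1, 2, 3):
    emb = PCA(n_components=k).fit_transform(X)
    scores[k] = trustworthiness(X, emb, n_neighbors=10)

# More retained components typically lose less neighbourhood quality.
print(scores)
```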

    An Interactive Visualisation System for Engineering Design using Evolutionary Computing

    This thesis describes a system designed to promote collaboration between the human and computer during engineering design tasks. Evolutionary algorithms (in particular the genetic algorithm) can find good solutions to engineering design problems in a small number of iterations, but a review of the interactive evolutionary computing literature reveals that users would benefit from understanding the design space and having the freedom to direct the search. The main objective of this research is to fulfil a dual requirement: the computer should generate data and analyse the design space to identify high performing regions in terms of the quality and robustness of solutions, while at the same time the user should be allowed to interact with the data and use their experience and the information provided to guide the search inside and outside regions already found. To achieve these goals a flexible user interface was developed that links and clarifies the research fields of evolutionary computing, interactive engineering design and multivariate visualisation. A number of accessible visualisation techniques were incorporated into the system. An innovative algorithm based on univariate kernel density estimation is introduced that quickly identifies the relevant clusters in the data from the point of view of the original design variables or a natural coordinate system such as the principal or independent components. The robustness of solutions inside a region can be investigated by novel use of 'negative' genetic algorithm search to find the worst case scenario. New high performance regions can be discovered in further runs of the evolutionary algorithm; penalty functions are used to avoid previously found regions. The clustering procedure was also successfully applied to multiobjective problems and used to force the genetic algorithm to find desired solutions in the trade-off between objectives. 
The system was evaluated by a small number of users who were asked to solve simulated engineering design scenarios by finding and comparing robust regions in artificial test functions. Empirical comparison with benchmark algorithms was inconclusive, but it was shown that even a dedicated hybrid algorithm needs help to solve a design task. A critical analysis of the feedback and results suggested modifications to the clustering algorithm and a more practical way to evaluate the robustness of solutions. The system was also shown to experienced engineers working on their real-world problems; new solutions were found in pertinent regions of objective space, and links to the artefact aided comparison of results. It was confirmed that in practice a lot of design knowledge is encoded into design problems, but experienced engineers use subjective knowledge of the problem to make decisions and evaluate the robustness of solutions. The full potential of the system was therefore seen in its ability to support decision making by supplying a diverse range of alternative design options, thereby enabling knowledge discovery in a wide range of applications.
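
    The univariate kernel-density clustering step can be sketched as follows; the two-cluster data is synthetic and the peak-finding is a simplified stand-in for the thesis's algorithm:

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Synthetic 1-D projection of solutions: two high-performing regions.
x = np.concatenate([rng.normal(-2.0, 0.3, 200),
                    rng.normal(2.0, 0.3, 200)])

# Univariate KDE; peaks of the estimated density mark candidate
# cluster centres (a simplified stand-in for the thesis's procedure).
kde = gaussian_kde(x)
grid = np.linspace(x.min() - 1.0, x.max() + 1.0, 500)
density = kde(grid)
peaks, _ = find_peaks(density)
centres = grid[peaks]
print(len(centres))
```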

    Single- and Multi-Distribution Dimensionality Reduction Approaches for a Better Data Structure Capturing

    In recent years, the huge expansion of digital technologies has vastly increased the volume of data to be explored, such that reducing the dimensionality of data is an essential step in data exploration. The integrity of a dimensionality reduction technique relates to how well it maintains the data structure. Techniques such as Principal Component Analysis (PCA) and Multidimensional Scaling (MDS) preserve the global distance ranking at the expense of neglecting small-distance preservation. Conversely, the structure capturing of other methods, such as Isomap, Locally Linear Embedding (LLE), Laplacian Eigenmaps, t-Stochastic Neighbour Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and TriMap, relies on the number of neighbours considered. This paper presents a dimensionality reduction technique, Same Degree Distribution (SDD), that does not rely on the number of neighbours, thanks to using degree-distributions in both the high- and low-dimensional spaces. The degree-distribution is similar to the Student-t distribution and is less expensive than the Gaussian distribution. As such, it enables better global data preservation in less computational time. Moreover, to improve the data structure capturing, SDD has been extended to Multi-SDDs (MSDD), which employs various degree-distributions on top of SDD. The proposed approach and its extension demonstrated greater performance than eight other benchmark methods, tested on several popular synthetic and real datasets such as Iris, Breast Cancer, Swiss Roll, MNIST, and Make Blob, evaluated by the co-ranking matrix and Kendall’s Tau coefficient. For further work, we aim to approximate the number of distributions and their degrees in relation to the given dataset. Reducing the computational complexity is another objective for further work.
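
    The Kendall's Tau evaluation mentioned above can be sketched independently of SDD itself (which is not implemented here): rank-correlate the pairwise distances before and after a reduction, with a generic PCA standing in for the reducer:

```python
from scipy.spatial.distance import pdist
from scipy.stats import kendalltau
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Generic stand-in reducer; SDD itself is not implemented here.
X = load_iris().data
Y = PCA(n_components=2).fit_transform(X)

# Rank correlation between pairwise distances in the high- and
# low-dimensional spaces; closer to 1 means better preservation.
tau, _ = kendalltau(pdist(X), pdist(Y))
print(round(tau, 3))
```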

    Projection-Based Clustering through Self-Organization and Swarm Intelligence

    This work covers aspects of unsupervised machine learning used for knowledge discovery in data science and introduces a data-driven approach to cluster analysis, the Databionic swarm (DBS). DBS consists of the 3D landscape visualization and clustering of data. The 3D landscape enables 3D printing of high-dimensional data structures. The clustering and the number of clusters, or the absence of cluster structure, can be verified by the 3D landscape at a glance. DBS is the first swarm-based technique that shows emergent properties while exploiting concepts of swarm intelligence, self-organization, and the Nash equilibrium concept from game theory, resulting in the elimination of a global objective function and of parameter settings. Via the downloadable R package, DBS can be applied to data drawn from diverse research fields and used even by non-professionals in the field of data mining.

    A Visualization Technique for Accessing Solution Pool in Interactive Methods of Multiobjective Optimization

    Interactive methods of multiobjective optimization repetitively derive Pareto optimal solutions based on the decision maker's preference information and present the obtained solutions for his/her consideration. Some interactive methods save the obtained solutions into a solution pool and, at each iteration, allow the decision maker to consider any of the solutions obtained earlier. This feature contributes to the flexibility of exploring the Pareto optimal set and learning about the optimization problem. However, in the case of many objective functions, the accumulation of derived solutions makes accessing the solution pool cognitively difficult for the decision maker. We propose to enhance interactive methods with a visualization of the set of solution outcomes using dimensionality reduction, together with interactive mechanisms for exploring the solution pool. We describe the proposed visualization technique and demonstrate its usage with an example problem solved using the interactive method NIMBUS.
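
    A minimal version of the visualization step, with a randomly generated solution pool standing in for NIMBUS output, projects the accumulated objective vectors into two dimensions for browsing:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# Hypothetical solution pool: objective vectors of 30 solutions to a
# five-objective problem, accumulated over interactive iterations.
pool = rng.random((30, 5))

# Project to 2-D so the whole pool fits on one scatter plot that the
# decision maker can explore.
coords = PCA(n_components=2).fit_transform(pool)
print(coords.shape)  # (30, 2)
```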