8 research outputs found

    Configurable Pattern-based Evolutionary Biclustering of Gene Expression Data

    BACKGROUND: Biclustering algorithms for microarray data aim to discover functionally related gene sets under different subsets of experimental conditions. Because of the complexity of the problem and the characteristics of microarray datasets, heuristic searches are usually used instead of exhaustive algorithms. Comparing different techniques also remains a challenge: the results vary in relevant features such as the number of genes or conditions, which makes a fair comparison difficult. Moreover, existing approaches do not allow the user to specify any preferences on these properties. RESULTS: Here, we present the first biclustering algorithm in which several bicluster features can be customized in terms of different objectives. This can be done by tuning the specified features in the algorithm or by incorporating new objectives into the search. Furthermore, our approach bases bicluster evaluation on expression patterns and can recognize shifting and scaling patterns, either simultaneously or separately. Evolutionary computation has been chosen as the search strategy, hence the name of our proposal, Evo-Bexpa (Evolutionary Biclustering based in Expression Patterns). CONCLUSIONS: We have conducted experiments on both synthetic and real datasets that demonstrate Evo-Bexpa's ability to obtain meaningful biclusters. The synthetic experiments were designed to compare Evo-Bexpa's performance with that of other approaches when looking for perfect patterns. Experiments with four different real datasets also confirm the proper performance of our algorithm, whose results have been biologically validated through Gene Ontology.
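
The shifting and scaling patterns mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used formulation in which each gene's row is a shared base pattern multiplied by a gene-specific scale and offset by a gene-specific shift. The function names are hypothetical, and the standardization check shown here is only an illustration, not Evo-Bexpa's evaluation measure.

```python
from statistics import mean, pstdev

def make_bicluster(pattern, scales, shifts):
    """Build one row per gene of the form scale * pattern[j] + shift."""
    return [[a * p + b for p in pattern] for a, b in zip(scales, shifts)]

def standardize(row):
    """Remove a row's shift (its mean) and scale (its standard deviation).

    Assumes the row is not constant, otherwise the deviation is zero.
    """
    m, s = mean(row), pstdev(row)
    return [(v - m) / s for v in row]

def follows_pattern(rows, tol=1e-9):
    """True if all standardized rows coincide (positive scales assumed)."""
    ref = standardize(rows[0])
    return all(
        abs(x - y) <= tol
        for row in rows[1:]
        for x, y in zip(standardize(row), ref)
    )

# Hypothetical base pattern over four conditions and three genes.
pattern = [1.0, 4.0, 2.0, 8.0]
rows = make_bicluster(pattern, scales=[1.0, 2.0, 0.5], shifts=[0.0, 3.0, -1.0])
print(follows_pattern(rows))
```

Setting every scale to 1 gives a pure shifting pattern, and setting every shift to 0 gives a pure scaling pattern, so the same check covers both cases.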

    Knowledge-Based Fast Evaluation for Evolutionary Learning

    The increasing amount of available information is encouraging the search for efficient techniques to improve data mining methods, especially those that consume large computational resources, such as evolutionary computation. Efficacy and efficiency are two critical aspects of knowledge-based techniques. Incorporating knowledge into evolutionary algorithms (EAs) should provide either better solutions (efficacy) or equivalent solutions in less time (efficiency), compared with the same evolutionary algorithm without such knowledge. In this paper, we categorize and summarize some of the techniques for incorporating knowledge into evolutionary algorithms and present a novel data structure, called efficient evaluation structure (EES), which helps the evolutionary algorithm to provide decision rules using fewer computational resources. The EES-based EA is tested and compared with another EA system, and the experimental results show the quality of our approach, which reduces the computational cost by about 50% while maintaining the global accuracy of the final set of decision rules. CICYT TIN2004-0015

    Separation Surfaces through Genetic Programming

    The aim of this paper is to describe a study on obtaining, in symbolic form, the separation surfaces between clusters of a labelled database. A separation surface is an equation of the form φ(x) = 0, where φ is a function R^n → R. The calculation of the function φ starts from the development of parametric regression by means of Genetic Programming. Symbolic regression consists of approximating the equation of an unknown function from the coordinates of certain points and the values that the function takes at those points. This possibility was proposed in [Koza92a], and its advantage over classical statistical regression is that the form of the function does not need to be known in advance. Once this surface is found, a classifier for the database can be obtained. The technique has been applied to different examples and the results have been very satisfactory.
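
How a separation surface φ(x) = 0 yields a classifier can be sketched as follows: a point is labelled by the sign of φ at that point. The particular φ used here (a unit circle), the labels, and the function names are assumptions made for illustration. In the study, φ would be the expression found by Genetic Programming.

```python
def phi(x):
    """Hypothetical separation function: unit circle x1^2 + x2^2 - 1 = 0."""
    return x[0] ** 2 + x[1] ** 2 - 1.0

def classify(x, surface=phi):
    """Label a point by the side of the surface on which it falls."""
    return "inside" if surface(x) < 0 else "outside"

def accuracy(points, labels, surface=phi):
    """Fraction of labelled points that the surface classifies correctly."""
    hits = sum(classify(p, surface) == y for p, y in zip(points, labels))
    return hits / len(points)

# Hypothetical labelled points from two clusters.
points = [(0.1, 0.2), (-0.3, 0.4), (2.0, 0.0), (1.5, -1.5)]
labels = ["inside", "inside", "outside", "outside"]
print(accuracy(points, labels))
```

A measure such as `accuracy` is one plausible way to score a candidate surface, though the abstract does not say which fitness function the authors used.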

    Statistical Test-Based Evolutionary Segmentation of Yeast Genome

    Segmentation algorithms emerge from observing fluctuations of DNA sequences in alternative homogeneous domains, which are named segments [1]. The key idea is that two genes that are controlled by a single regulatory system should have similar expression patterns in any data set. In this work, we present a new approach based on Evolutionary Algorithms (EAs) that differentiates segments of genes, which are represented by their level of meiotic recombination. We have tested the algorithm with the yeast genome [2][3] because this organism is very interesting for the research community: it preserves many biological properties of more complex organisms and is simple enough to run experiments on. We have a file with about 6100 genes, divided into sixteen yeast chromosomes (N). Each gene is a row of the file. Each column of the file represents a genomic characteristic under specific conditions (in this case, only the activity of meiotic recombination). The goal is to group consecutive genes so that each group is properly differentiated from its adjacent segments. Each group will be a segment of genes, as it maintains the physical location within the genome. The Mann–Whitney statistical test has been used to measure the relevance of segments.
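
The idea of comparing adjacent segments with the Mann–Whitney test can be sketched with the U statistic computed directly from its pairwise definition. The function names, the sample values, and the single `boundary` split are hypothetical. The abstract does not describe how the authors turn the test into a fitness value or which significance threshold they apply.

```python
def mann_whitney_u(a, b):
    """U statistic of sample a against sample b.

    Counts the pairs (x, y), x from a and y from b, in which x > y.
    A tie contributes 0.5.
    """
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

def segment_separation(values, boundary):
    """Smaller of the two U statistics for the segments split at `boundary`.

    A value near 0 means the two adjacent segments barely overlap.
    The maximum possible value is len(left) * len(right) / 2.
    """
    left, right = values[:boundary], values[boundary:]
    u_left = mann_whitney_u(left, right)
    u_right = len(left) * len(right) - u_left
    return min(u_left, u_right)

# Hypothetical recombination levels for six consecutive genes.
levels = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
print(segment_separation(levels, boundary=3))
```

The pairwise count costs time proportional to the product of the two segment lengths. A rank-based computation would be preferable for long segments.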

    Improvements in the Efficiency and Efficacy of Evolutionary Algorithms for Supervised Learning (Mejoras en eficiencia y eficacia de algoritmos evolutivos para aprendizaje supervisado)

    Evolutionary algorithms form one of the most important families of computational models applied in the field of machine learning, and their validity and effectiveness have been widely studied in the literature. Set within the area of supervised learning, this doctoral thesis has as its main goal the development of several algorithmic methods aimed at improving this type of technique for generating decision rules. The aim is to reduce the computational cost associated with the critical aspects of evolutionary algorithms, and to increase the quality of the results through a more efficient and effective search for solutions. Premio Extraordinario de Doctorado U

    Estimation and Decision Making through Data Mining (Estimación y toma de decisiones mediante Minería de Datos)

    This chapter describes several Data Mining techniques and shows their usefulness for extracting knowledge in software development projects. Specifically, it describes the three central phases of the process: data preprocessing, through the selection of the most relevant attributes; Data Mining, applying an evolutionary learning tool; and visualization of the results, which interactively helps to guide the search and greatly simplifies both the analysis of the data and the interpretation of the results. The experiments carried out reflect the validity of the applied methods, showing that data on software projects can be classified with a reduced number of attributes.

    An efficient data structure for decision rules discovery

    The increasing amount of available information is encouraging the search for efficient techniques to improve data mining methods, especially those that consume large computational resources. We present a novel structure, called EES, which helps data mining algorithms that generate decision rules to reduce this cost. Given that decision rules establish conditions on database attributes, EES stores the information in such a way that the search can be carried out by attributes instead of by examples. EES could be useful for any method that generates decision rules, and it is of particular interest when the search for the solution involves a large number of hypothetical solutions. Thus, this structure is designed to speed up the rule-evaluation process in methods based on Evolutionary Algorithms. The traditional structure, based on vectors of examples (in which the database is stored), is evaluated and compared with EES, including the costs for a stratified set of cases. Finally, the experimental results demonstrate the quality of our proposal, which reduces the computational cost by approximately 50%.
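
The idea of searching by attributes instead of by examples can be sketched as below. This is not the published EES layout, which the abstract does not specify. It is a minimal attribute-wise index, with hypothetical class and method names, that assumes numeric attributes and rules made of interval conditions.

```python
from bisect import bisect_left, bisect_right

class AttributeIndex:
    """Hypothetical attribute-wise index; not the published EES layout."""

    def __init__(self, examples):
        # examples: non-empty list of equal-length tuples of numeric values.
        self.n_examples = len(examples)
        self.columns = []
        for j in range(len(examples[0])):
            # Sort each attribute once, remembering which example owns each value.
            pairs = sorted((row[j], i) for i, row in enumerate(examples))
            values = [v for v, _ in pairs]
            ids = [i for _, i in pairs]
            self.columns.append((values, ids))

    def matching(self, attr, low, high):
        """Ids of examples whose attribute `attr` lies in [low, high]."""
        values, ids = self.columns[attr]
        return set(ids[bisect_left(values, low):bisect_right(values, high)])

    def covered(self, rule):
        """Ids covered by a rule given as {attr: (low, high)}."""
        result = set(range(self.n_examples))
        for attr, (low, high) in rule.items():
            result &= self.matching(attr, low, high)
        return result

# Hypothetical database with four examples and two attributes.
examples = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
index = AttributeIndex(examples)
print(sorted(index.covered({0: (2.0, 4.0), 1: (3.0, 5.0)})))
```

Each condition is resolved with two binary searches on a presorted column rather than a scan over every example, which suits methods that evaluate many candidate rules against the same database.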