166 research outputs found

    Evolutionary Biclustering based on Expression Patterns

    Most biclustering approaches for microarray data analysis use the Mean Squared Residue (MSR) as the main evaluation measure for guiding the heuristic. MSR has been shown to be ineffective at recognizing several kinds of interesting bicluster patterns. Transposed Virtual Error (VEt) has recently been found to overcome the drawbacks of MSR, as it is able to recognize shifting and/or scaling patterns. In this work we propose a parallel evolutionary biclustering algorithm that uses VEt as the main part of the fitness function, which also includes volume and overlapping as further objectives to optimize. The resulting algorithm has been tested on both synthetic and benchmark real data, producing satisfactory results. These results have been compared with those of the most popular biclustering algorithm, developed by Cheng and Church and based on MSR. Ministerio de Ciencia y Tecnología TIN2007-68084-C02-0
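
As a rough illustration of the MSR limitation mentioned in this abstract, the sketch below computes the Cheng and Church mean squared residue for two small synthetic biclusters. The function name `mean_squared_residue` and the toy data are assumptions made for this example and are not taken from the paper. A pure shifting pattern scores essentially zero, while a pure scaling pattern does not, even though both are perfectly coherent.

```python
import numpy as np

def mean_squared_residue(bicluster):
    """Cheng and Church MSR: mean of (a_ij - rowmean_i - colmean_j + mean)^2."""
    a = np.asarray(bicluster, dtype=float)
    row_means = a.mean(axis=1, keepdims=True)   # one mean per gene (row)
    col_means = a.mean(axis=0, keepdims=True)   # one mean per condition (column)
    overall = a.mean()
    residue = a - row_means - col_means + overall
    return float((residue ** 2).mean())

# Hypothetical base expression profile across four conditions.
base = np.array([1.0, 2.0, 4.0, 3.0])

# Shifting pattern: every gene is the base profile plus a constant.
shifting = np.array([base + b for b in (0.0, 5.0, -2.0)])

# Scaling pattern: every gene is the base profile times a constant.
scaling = np.array([base * s for s in (1.0, 2.0, 3.0)])

msr_shifting = mean_squared_residue(shifting)
msr_scaling = mean_squared_residue(scaling)

print("MSR, shifting pattern:", msr_shifting)
print("MSR, scaling pattern:", msr_scaling)
```

For the scaling case the residue factorizes as (s_i - mean(s)) * (base_j - mean(base)), so the MSR grows with the spread of the scaling factors rather than reflecting any lack of coherence.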

    Unsupervised Algorithms for Microarray Sample Stratification

    The amount of data made available by microarrays gives researchers the opportunity to delve into the complexity of biological systems. However, the noisy and extremely high-dimensional nature of this kind of data poses significant challenges. Microarrays allow for the parallel measurement of thousands of molecular objects spanning different layers of interactions. To discover hidden patterns, a wide variety of analytical techniques has been proposed. Here, we describe the basic methodologies for analyzing microarray datasets, focusing on the task of (sub)group discovery. Peer reviewed

    Configurable Pattern-based Evolutionary Biclustering of Gene Expression Data

    BACKGROUND: Biclustering algorithms for microarray data aim at discovering functionally related gene sets under different subsets of experimental conditions. Due to the complexity of the problem and the characteristics of microarray datasets, heuristic searches are usually used instead of exhaustive algorithms. In addition, comparing different techniques is still a challenge. The results obtained vary in relevant features such as the number of genes or conditions, which makes a fair comparison difficult. Moreover, existing approaches do not allow the user to specify any preferences regarding these properties. RESULTS: Here, we present the first biclustering algorithm in which several bicluster features can be customized in terms of different objectives. This can be done by tuning the specified features in the algorithm or by incorporating new objectives into the search. Furthermore, our approach bases bicluster evaluation on expression patterns and is able to recognize shifting and scaling patterns, whether or not they occur simultaneously. Evolutionary computation has been chosen as the search strategy, hence the name of our proposal, Evo-Bexpa (Evolutionary Biclustering based in Expression Patterns). CONCLUSIONS: We have conducted experiments on both synthetic and real datasets, demonstrating Evo-Bexpa's ability to obtain meaningful biclusters. The synthetic experiments were designed to compare the performance of Evo-Bexpa with that of other approaches when looking for perfect patterns. Experiments with four different real datasets also confirm the proper performance of our algorithm, whose results have been biologically validated through Gene Ontology.

    Multi-objective clustering of gene expression data with evolutionary algorithms: a query gene approach


    Evolutionary Search of Biclusters by Minimal Intrafluctuation

    Biclustering techniques aim at extracting significant subsets of genes and conditions from microarray gene expression data. Algorithms of this kind are mainly based on two key aspects: the way in which they deal with gene similarity across the experimental conditions, which determines the quality of biclusters; and the heuristic or search strategy used for exploring the search space. A measure that is often adopted for establishing the quality of biclusters is the mean squared residue. This measure has been successfully used in many approaches. However, it has recently been proven that the mean squared residue fails to recognize some kinds of biclusters as quality biclusters, mainly due to the difficulty of detecting scaling patterns in data. In this work, we propose a novel measure to overcome this drawback. This measure is based on the area between two curves, which are built from the maximum and minimum standardized expression values exhibited for each experimental condition. To test the proposed measure, we have incorporated it into a multiobjective evolutionary algorithm. Experimental results confirm the effectiveness of our approach. Combining the proposed measure with the mean squared residue yields results that would not have been obtained using the mean squared residue alone. Comisión Interministerial de Ciencia y Tecnología (CICYT) TIN2004-0015
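
The abstract describes the measure only in outline, so the following is a minimal sketch of one possible reading, not the authors' formula. It assumes each gene is standardized to zero mean and unit standard deviation, that the two curves are the per-condition maximum and minimum of the standardized values, and that the area is approximated with the trapezoidal rule over unit-spaced conditions. The function name `area_between_envelopes` and the toy data are hypothetical.

```python
import numpy as np

def area_between_envelopes(bicluster):
    """Area between the per-condition max and min of row-standardized values.

    Assumption for this sketch: rows are genes, columns are conditions,
    and conditions are treated as unit-spaced points on the x axis.
    """
    a = np.asarray(bicluster, dtype=float)
    means = a.mean(axis=1, keepdims=True)
    stds = a.std(axis=1, keepdims=True)
    stds[stds == 0] = 1.0                  # avoid dividing by zero for flat genes
    z = (a - means) / stds                 # standardize each gene
    upper = z.max(axis=0)                  # curve of maxima, one value per condition
    lower = z.min(axis=0)                  # curve of minima, one value per condition
    gap = upper - lower
    # Trapezoidal rule with unit spacing between consecutive conditions.
    return float(((gap[:-1] + gap[1:]) / 2.0).sum())

base = np.array([1.0, 2.0, 4.0, 3.0])

# Genes that follow the base profile with both scaling and shifting applied.
coherent = np.array([base * s + b for s, b in ((1, 0), (2, 5), (3, -2))])

# Same genes, but the last one follows a different profile.
incoherent = coherent.copy()
incoherent[2] = np.array([3.0, 4.0, 2.0, 1.0])

area_coherent = area_between_envelopes(coherent)
area_incoherent = area_between_envelopes(incoherent)

print("area, coherent bicluster:", area_coherent)
print("area, incoherent bicluster:", area_incoherent)
```

Under these assumptions, standardization removes positive scaling and shifting, so genes that share a profile collapse onto a single curve and the area is close to zero, whereas a gene with a different profile widens the gap between the two curves.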

    TriGen: A genetic algorithm to mine triclusters in temporal gene expression data

    Analyzing microarray data represents a computational challenge due to the characteristics of these data. Clustering techniques are widely applied to create groups of genes that exhibit similar behavior under the conditions tested. Biclustering emerges as an improvement over classical clustering, since it relaxes the grouping constraints so that genes are evaluated only under a subset of the conditions rather than under all of them. However, this technique is not appropriate for the analysis of longitudinal experiments, in which the genes are evaluated under certain conditions at several time points. We present the TriGen algorithm, a genetic algorithm that finds triclusters of gene expression that take into account the experimental conditions and the time points simultaneously. We have used TriGen to mine synthetic data as well as datasets from yeast (Saccharomyces cerevisiae) cell cycle and human inflammation and host response to injury experiments. TriGen has proved capable of extracting groups of genes with similar patterns in subsets of conditions and times, and these groups have been shown to be related in terms of their functional annotations extracted from the Gene Ontology. Ministerio de Ciencia y Tecnología TIN2011-28956-C00; Ministerio de Ciencia y Tecnología TIN2009-13950; Junta de Andalucía TIC-752