Evolutionary Biclustering based on Expression Patterns
The majority of the biclustering approaches for
microarray data analysis use the Mean Squared Residue (MSR)
as the main evaluation measure for guiding the heuristic.
MSR has been proven ineffective at recognizing several
kinds of interesting patterns for biclusters. Transposed Virtual
Error (VEt) has recently been shown to overcome MSR
drawbacks, being able to recognize shifting and/or scaling
patterns. In this work we propose a parallel evolutionary
biclustering algorithm which uses VEt as the main part of
the fitness function, which has been designed using the volume
and overlapping as other objectives to optimize. The resulting
algorithm has been tested on both synthetic and benchmark
real data, producing satisfactory results. These results have been
compared to those of the most popular biclustering algorithm,
developed by Cheng and Church and based on the use of MSR.
Ministerio de Ciencia y Tecnología TIN2007-68084-C02-0
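The limitation of MSR mentioned in this abstract can be illustrated with a small sketch. It assumes the usual Cheng and Church definition of the mean squared residue (each entry minus its row mean and column mean, plus the overall mean, squared and averaged); the function name and the toy matrices are hypothetical, and VEt itself is not implemented here.

```python
import numpy as np

def mean_squared_residue(bicluster):
    """Mean squared residue of a genes x conditions submatrix (assumed definition)."""
    a = np.asarray(bicluster, dtype=float)
    row_means = a.mean(axis=1, keepdims=True)   # mean of each gene
    col_means = a.mean(axis=0, keepdims=True)   # mean of each condition
    overall = a.mean()                          # mean of the whole bicluster
    residue = a - row_means - col_means + overall
    return float(np.mean(residue ** 2))

# Toy base profile over four conditions (illustrative values only).
base = np.array([1.0, 2.0, 4.0, 3.0])

# Shifting pattern: every gene is the base profile plus a constant.
shifting = np.array([base + s for s in (0.0, 1.5, 3.0)])

# Scaling pattern: every gene is the base profile times a constant.
scaling = np.array([base * f for f in (1.0, 2.0, 3.0)])

msr_shifting = mean_squared_residue(shifting)
msr_scaling = mean_squared_residue(scaling)
print(msr_shifting, msr_scaling)
```

Under these assumptions the shifting bicluster scores essentially zero, while the equally coherent scaling bicluster receives a clearly positive residue, which is the kind of pattern MSR penalizes.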
Unsupervised Algorithms for Microarray Sample Stratification
The amount of data made available by microarrays gives researchers the opportunity to delve into the complexity of biological systems. However, the noisy and extremely high-dimensional nature of this kind of data poses significant challenges. Microarrays allow for the parallel measurement of thousands of molecular objects spanning different layers of interactions. To discover hidden patterns, a wide range of analytical techniques has been proposed. Here, we describe the basic methodologies for analyzing microarray datasets, focusing on the task of (sub)group discovery.
Peer reviewed
Configurable Pattern-based Evolutionary Biclustering of Gene Expression Data
BACKGROUND: Biclustering algorithms for microarray data aim at discovering functionally related gene sets under different subsets of experimental conditions. Due to the complexity of the problem and the characteristics of microarray datasets, heuristic searches are usually used instead of exhaustive algorithms. Comparing different techniques also remains a challenge: the results obtained vary in relevant features such as the number of genes or conditions, which makes a fair comparison difficult. Moreover, existing approaches do not allow the user to specify any preferences on these properties. RESULTS: Here, we present the first biclustering algorithm in which several bicluster features can be customized in terms of different objectives. This can be done by tuning the specified features in the algorithm or by incorporating new objectives into the search. Furthermore, our approach bases the bicluster evaluation on the use of expression patterns, being able to recognize both shifting and scaling patterns, either simultaneously or separately. Evolutionary computation has been chosen as the search strategy, hence the name of our proposal, Evo-Bexpa (Evolutionary Biclustering based in Expression Patterns). CONCLUSIONS: We have conducted experiments on both synthetic and real datasets demonstrating Evo-Bexpa's ability to obtain meaningful biclusters. Synthetic experiments have been designed to compare Evo-Bexpa's performance with other approaches when looking for perfect patterns. Experiments with four different real datasets also confirm the proper performance of our algorithm, whose results have been biologically validated through Gene Ontology.
Evolutionary Search of Biclusters by Minimal Intrafluctuation
Biclustering techniques aim at extracting significant
subsets of genes and conditions from microarray gene
expression data. Algorithms of this kind are mainly based on two
key aspects: the way in which they deal with gene similarity
across the experimental conditions, which determines the quality
of biclusters; and the heuristic or search strategy used for
exploring the search space. A measure that is often adopted
for establishing the quality of biclusters is the mean squared
residue. This measure has been successfully used in many
approaches. However, it has recently been proven that the
mean squared residue fails to recognize some kinds of biclusters
as quality biclusters, mainly due to the difficulty of detecting
scaling patterns in data. In this work, we propose a novel
measure to overcome this drawback. This measure
is based on the area between two curves. Such curves are
built from the maximum and minimum standardized expression
values exhibited for each experimental condition. In order
to test the proposed measure, we have incorporated it into
a multiobjective evolutionary algorithm. Experimental results
confirm the effectiveness of our approach. The combination of
the measure we propose with the mean squared residue yields
results that would not have been obtained if only the mean
squared residue had been used.
Comisión Interministerial de Ciencia y Tecnología (CICYT) TIN2004-0015
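The idea of a measure based on the area between two curves can be sketched as follows. This is only one plausible reading of the abstract, not the authors' definition: each gene is assumed to be standardized to zero mean and unit standard deviation across conditions, the two curves are taken as the per-condition maximum and minimum of the standardized values, and the area is approximated with the trapezoidal rule using unit spacing between conditions. All function names are hypothetical.

```python
import numpy as np

def standardize_rows(bicluster):
    """Standardize each gene (row) to zero mean and unit standard deviation."""
    a = np.asarray(bicluster, dtype=float)
    mean = a.mean(axis=1, keepdims=True)
    std = a.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes stay at zero after centering
    return (a - mean) / std

def area_between_envelopes(bicluster):
    """Area between the max and min standardized curves (assumed reading)."""
    z = standardize_rows(bicluster)
    upper = z.max(axis=0)   # maximum standardized value per condition
    lower = z.min(axis=0)   # minimum standardized value per condition
    gap = upper - lower
    # Trapezoidal rule with unit spacing between consecutive conditions.
    return float(np.sum((gap[:-1] + gap[1:]) / 2.0))

base = np.array([1.0, 2.0, 4.0, 3.0])

# Genes that combine scaling and shifting of the same base profile.
coherent = np.array([2.0 * base + 1.0, 0.5 * base, 3.0 * base - 2.0])

# Genes with unrelated profiles (illustrative values only).
incoherent = np.array([[1.0, 2.0, 4.0, 3.0],
                       [4.0, 1.0, 2.0, 3.0],
                       [2.0, 4.0, 1.0, 3.0]])

area_coherent = area_between_envelopes(coherent)
area_incoherent = area_between_envelopes(incoherent)
print(area_coherent, area_incoherent)
```

With this reading, genes that follow the same profile up to positive scaling and shifting collapse onto a single standardized curve, so the area is close to zero, whereas unrelated profiles leave a visible gap between the two envelopes.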
TriGen: A genetic algorithm to mine triclusters in temporal gene expression data
Analyzing microarray data represents a computational challenge due to the characteristics of these data. Clustering
techniques are widely applied to create groups of genes that exhibit a similar behavior under the conditions tested.
Biclustering emerges as an improvement on classical clustering since it relaxes the grouping constraints, allowing genes to
be evaluated only under a subset of the conditions and not under all of them. However, this technique is not
appropriate for the analysis of longitudinal experiments in which the genes are evaluated under certain conditions at
several time points. We present the TriGen algorithm, a genetic algorithm that finds triclusters of gene expression that
take into account the experimental conditions and the time points simultaneously. We have used TriGen to mine
synthetic data as well as datasets from yeast (Saccharomyces cerevisiae) cell cycle and human inflammation and host
response to injury experiments. TriGen has proved to be capable of extracting groups of genes with similar patterns in
subsets of conditions and times, and these groups have been shown to be related in terms of their functional annotations
extracted from the Gene Ontology.
Ministerio de Ciencia y Tecnología TIN2011-28956-C00
Ministerio de Ciencia y Tecnología TIN2009-13950
Junta de Andalucía TIC-752
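To make the notion of a tricluster concrete, the following minimal sketch extracts one from a three-dimensional expression array. The genes x conditions x time points axis order, the function name, and the random data are assumptions made for illustration; this shows only how a candidate tricluster could be represented, not TriGen's genetic search itself.

```python
import numpy as np

def extract_tricluster(data, genes, conditions, times):
    """Return the sub-cube for the selected genes, conditions and time points."""
    # np.ix_ builds an open mesh so each index list applies to its own axis.
    return data[np.ix_(genes, conditions, times)]

# Synthetic data: 6 genes, 4 conditions, 5 time points (illustrative only).
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 4, 5))

# A candidate tricluster: subsets of genes, conditions and time points.
tri = extract_tricluster(data, genes=[0, 2, 5], conditions=[1, 3], times=[0, 1, 4])
print(tri.shape)
```

A bicluster corresponds to the special case in which the time axis is absent or fixed to a single point.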