ArrayMining: a modular web-application for microarray analysis combining ensemble and consensus methods with cross-study normalization

By Enrico Glaab, Jon Garibaldi and Natalio Krasnogor


Background: Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks.

Results: We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining.

Conclusion: ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.

Publisher: BioMed Central Ltd
Year: 2009
OAI identifier: oai:eprints.nottingham.ac.uk:1271
Provided by: Nottingham ePrints

