A Regularized Method for Selecting Nested Groups of Relevant Genes from Microarray Data

Authors: De Mol Christine, Mosci Sofia, Traskine Magali, Verri Alessandro
Publication date: 10/09/2008

Gene expression analysis aims at identifying the genes able to accurately predict biological parameters such as disease subtype or progression. While accurate prediction can be achieved by means of many different techniques, gene identification, due to gene correlation and the limited number of available samples, is a much more elusive problem. Small changes in the expression values often produce different gene lists, and solutions which are both sparse and stable are difficult to obtain. We propose a two-stage regularization method able to learn linear models characterized by a high prediction performance. By varying a suitable parameter, these linear models make it possible to trade sparsity for the inclusion of correlated genes and to produce gene lists which are almost perfectly nested. Experimental results on synthetic and microarray data confirm the interesting properties of the proposed method and its potential as a starting point for further biological investigations.

Comment: 17 pages, 8 PostScript figures

arXiv.org e-Print Archive

DI-fusion

Random model trees: an effective and scalable regression method

Author: Pfahringer Bernhard
Publication venue: University of Waikato, Department of Computer Science
Publication date: 01/06/2010

We present and investigate ensembles of randomized model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivaling the state of the art in numeric prediction. An extensive empirical investigation shows that Random Model Trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scale better than Gaussian Processes Regression to larger datasets, and enjoy a constant advantage over Additive Groves of one to two orders of magnitude.

Research Commons@Waikato