Differential gene expression graphs: A data structure for classification in DNA microarrays

Authors: Benso Alfredo; Di Carlo Stefano; Politano Gianfranco Michele Maria; Sterpone Luca
Publication venue: IEEE Press
Publication date: 01/01/2008

This paper proposes an innovative data structure to be used as a backbone in designing microarray phenotype sample classifiers. The data structure is based on graphs and it is built from a differential analysis of the expression levels of healthy and diseased tissue samples in a microarray dataset. The proposed data structure is built in such a way that, by construction, it shows a number of properties that are perfectly suited to address several problems like feature extraction, clustering, and classificatio

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Recommended from our members

The robust selection of predictive genes via a simple classifier

Authors: Kellum P; Liu X; Tucker A; Vinciotti V
Publication venue: Adis International
Publication date: 01/01/2006

Identifying genes that direct the mechanism of a disease from expression data is extremely useful in understanding how that mechanism works. This in turn may lead to better diagnoses and potentially can lead to a cure for that disease. This task becomes extremely challenging when the data are characterised by only a small number of samples and a high number of dimensions, as is often the case with gene expression data. Motivated by this challenge, we present a general framework that focuses on simplicity and data perturbation. These are the keys for the robust identification of the most predictive features in such data. Within this framework, we propose a simple selective naïve Bayes classifier discovered using a global search technique, and combine it with data perturbation to increase its robustness to small sample sizes. An extensive validation of the method was carried out using two applied datasets from the field of microarrays and a simulated dataset, all confounded by small sample sizes and high dimensionality. The method has been shown capable of identifying genes previously confirmed or associated with prostate cancer and viral infections

Brunel University Research Archive

Inference of the genetic network regulating lateral root initiation in Arabidopsis thaliana

Authors: Bennett M.; Byrne H. M.; de Smet I.; Hodgman C.; King J. R.; Muraro D.; Voß U.; Wilson M.
Publication date: 01/01/2012

Regulation of gene expression is crucial for organism growth, and it is one of the challenges in Systems Biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyse two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants to infer their regulatory network. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale-free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation

Oxford University Research Archive

Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models

Authors: Liu Han; Roeder Kathryn; Wasserman Larry
Publication date: 01/01/2010

A challenging problem in estimating high-dimensional graphical models is to choose the regularization parameter in a data-dependent way. The standard techniques include K-fold cross-validation (K-CV), Akaike information criterion (AIC), and Bayesian information criterion (BIC). Though these methods work well for low-dimensional problems, they are not suitable in high dimensional settings. In this paper, we present StARS: a new stability-based method for choosing the regularization parameter in high dimensional inference for undirected graphs. The method has a clear interpretation: we use the least amount of regularization that simultaneously makes a graph sparse and replicable under random sampling. This interpretation requires essentially no conditions. Under mild conditions, we show that StARS is partially sparsistent in terms of graph estimation: i.e. with high probability, all the true edges will be included in the selected model even when the graph size diverges with the sample size. Empirically, the performance of StARS is compared with the state-of-the-art model selection procedures, including K-CV, AIC, and BIC, on both synthetic data and a real microarray dataset. StARS outperforms all these competing procedures

arXiv.org e-Print Archive

CiteSeerX
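
Illustrative sketch for the StARS entry above. The abstract describes choosing "the least amount of regularization that simultaneously makes a graph sparse and replicable under random sampling". The Python below is one rough way that idea could be coded, not the authors' implementation. Several things are assumptions that the abstract does not state: the graph estimator is a simple thresholded-correlation stand-in (a larger threshold plays the role of more regularization), the instability score is the average of 2*theta*(1 - theta) over variable pairs, where theta is the fraction of subsamples containing an edge, and the cutoff beta, the subsample size and all function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def estimate_edges(rows, threshold):
    """Stand-in estimator (assumption): edge (i, j) iff |corr| > threshold."""
    cols = list(zip(*rows))
    return {(i, j) for i, j in combinations(range(len(cols)), 2)
            if abs(pearson(cols[i], cols[j])) > threshold}


def instability(rows, threshold, n_subsamples, subsample_size, rng):
    """Average edge disagreement across random subsamples, in [0, 0.5]."""
    p = len(rows[0])
    counts = {}
    for _ in range(n_subsamples):
        sub = rng.sample(rows, subsample_size)  # subsample without replacement
        for edge in estimate_edges(sub, threshold):
            counts[edge] = counts.get(edge, 0) + 1
    total = 0.0
    for c in counts.values():
        theta = c / n_subsamples          # how often this edge was selected
        total += 2 * theta * (1 - theta)  # 0 when always or never selected
    return total / (p * (p - 1) // 2)


def stars_select(rows, thresholds, beta=0.05, n_subsamples=20,
                 subsample_size=None, seed=0):
    """Return the smallest threshold (least regularization) whose running
    maximum instability stays at or below beta, or None if none qualifies."""
    rng = random.Random(seed)
    b = subsample_size or max(2, len(rows) // 2)  # hypothetical default
    chosen, running_max = None, 0.0
    for t in sorted(thresholds, reverse=True):    # most regularized first
        running_max = max(running_max,
                          instability(rows, t, n_subsamples, b, rng))
        if running_max > beta:
            break
        chosen = t
    return chosen
```

Calling stars_select(rows, [0.9, 0.7, 0.5, 0.3, 0.1]) on a list of equal-length sample tuples returns one of the candidate thresholds, or None when even the sparsest setting is too unstable. Taking the running maximum over the more regularized settings keeps the selection from being fooled by a stretch of low instability that only arises because the estimated graph has become nearly complete.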