Search CORE

6,628 research outputs found

Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition

Author: Chen Yun-Mei
Tian Li
Wu Wen-Ming
Xu Shuang
Yang Li-Jun
Yang Xiao-Hui
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Sparse representation based classification (SRC) methods have achieved remarkable results. SRC, however, still suffer from requiring enough training samples, insufficient use of test samples and instability of representation. In this paper, a stable inverse projection representation based classification (IPRC) is presented to tackle these problems by effectively using test samples. An IPR is firstly proposed and its feasibility and stability are analyzed. A classification criterion named category contribution rate is constructed to match the IPR and complete classification. Moreover, a statistical measure is introduced to quantify the stability of representation-based classification methods. Based on the IPRC technique, a robust tumor recognition framework is presented by interpreting microarray gene expression data, where a two-stage hybrid gene selection method is introduced to select informative genes. Finally, the functional analysis of candidate's pathogenicity-related genes is given. Extensive experiments on six public tumor microarray gene expression datasets demonstrate the proposed technique is competitive with state-of-the-art methods.Comment: 14 pages, 19 figures, 10 table

arXiv.org e-Print Archive

Crossref

Random Forest as a tumour genetic marker extractor

Author: Pérez Arnal Raquel Leandra
Publication venue: Universitat Politécnica de Catalunya
Publication date: 01/01/2019
Field of study

Identifying tumour genetic markers is an essential task for biomedicine. In this thesis, we analyse a dataset of chromosomal rearrangements of cancer samples and present a methodology for extracting genetic markers from this dataset by using a Random Forest as a feature selection tool

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Sparse reduced-rank regression for imaging genetics studies: models and applications

Author: Vounou Maria
Vounou Maria
Publication venue: Mathematics, Imperial College London
Publication date: 01/02/2012
Field of study

We present a novel statistical technique; the sparse reduced rank regression (sRRR) model which is a strategy for multivariate modelling of high-dimensional imaging responses and genetic predictors. By adopting penalisation techniques, the model is able to enforce sparsity in the regression coefficients, identifying subsets of genetic markers that best explain the variability observed in subsets of the phenotypes. To properly exploit the rich structure present in each of the imaging and genetics domains, we additionally propose the use of several structured penalties within the sRRR model. Using simulation procedures that accurately reflect realistic imaging genetics data, we present detailed evaluations of the sRRR method in comparison with the more traditional univariate linear modelling approach. In all settings considered, we show that sRRR possesses better power to detect the deleterious genetic variants. Moreover, using a simple genetic model, we demonstrate the potential benefits, in terms of statistical power, of carrying out voxel-wise searches as opposed to extracting averages over regions of interest in the brain. Since this entails the use of phenotypic vectors of enormous dimensionality, we suggest the use of a sparse classification model as a de-noising step, prior to the imaging genetics study. Finally, we present the application of a data re-sampling technique within the sRRR model for model selection. Using this approach we are able to rank the genetic markers in order of importance of association to the phenotypes, and similarly rank the phenotypes in order of importance to the genetic markers. In the very end, we illustrate the application perspective of the proposed statistical models in three real imaging genetics datasets and highlight some potential associations

Spiral - Imperial College Digital Repository

Discovering study-specific gene regulatory networks

Author: A Lysenko
AL Barabási
Alberto de la Fuente
Allan Tucker
Artem Lysenko
B Grigorova
B Zhang
D Baek
D Heckerman
DA Samac
DJ Spiegelhalter
E Baalmann
E Segal
E Steele
E Wientjes
F Alakwaa
F Llorente
H Parkinson
J Choi
J Friedman
J Hartigan
J Zhang
JA Ihalainen
JT Damkjær
K Ando
L Marri
L Marri
M Ashburner
Mansoor Saqi
N Friedman
N Meinshausen
N Mochizuki
O Thimm
P Erdős
P Kirk
P Langfelder
PE Jensen
R Srinivasan
RA Irizarry
S Anvar
S Dash
S Infanger
S Madeira
S Swift
S Zhang
Stephen Swift
T Obayashi
Tanya Curtis
U Andersson
U Sengupta
Valeria Bo
WY Bang
Y Kluger
Y Kwon
YJ Kim
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014
Field of study

This article has been made available through the Brunel Open Access Publishing Fund.Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Due to their structure, typically containing a high amount of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have been recently used that combine multiple microarray studies in order to find networks that are more robust. The purpose of this paper, however, is to combine multiple microarray studies to automatically identify subnetworks that are distinctive to specific experimental conditions rather than common to them all. To better understand key regulatory mechanisms and how they change under different conditions, we derive unique networks from multiple independent networks built using glasso which goes beyond standard correlations. This involves calculating cluster prediction accuracies to detect the most predictive genes for a specific set of conditions. We differentiate between accuracies calculated using cross-validation within a selected cluster of studies (the intra prediction accuracy) and those calculated on a set of independent studies belonging to different study clusters (inter prediction accuracy). Finally, we compare our method's results to related state-of-the art techniques. We explore how the proposed pipeline performs on both synthetic data and real data (wheat and Fusarium). Our results show that subnetworks can be identified reliably that are specific to subsets of studies and that these networks reflect key mechanisms that are fundamental to the experimental conditions in each of those subsets

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Rothamsted Repository

Brunel University Research Archive

Probabilistic estimation of microarray data reliability and underlying gene expression

Author: Bilke S.
Breslin T.
Sigvardsson M.
Publication venue
Publication date: 01/01/2003
Field of study

Background: The availability of high throughput methods for measurement of mRNA concentrations makes the reliability of conclusions drawn from the data and global quality control of samples and hybridization important issues. We address these issues by an information theoretic approach, applied to discretized expression values in replicated gene expression data. Results: Our approach yields a quantitative measure of two important parameter classes: First, the probability

P(\sigma | S)

that a gene is in the biological state

\sigma

in a certain variety, given its observed expression

S

in the samples of that variety. Second, sample specific error probabilities which serve as consistency indicators of the measured samples of each variety. The method and its limitations are tested on gene expression data for developing murine B-cells and a

t

-test is used as reference. On a set of known genes it performs better than the

t

-test despite the crude discretization into only two expression levels. The consistency indicators, i.e. the error probabilities, correlate well with variations in the biological material and thus prove efficient. Conclusions: The proposed method is effective in determining differential gene expression and sample reliability in replicated microarray data. Already at two discrete expression levels in each sample, it gives a good explanation of the data and is comparable to standard techniques.Comment: 11 pages, 4 figure

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Lund University Publications

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digitala Vetenskapliga Arkivet - Academic Archive On-line