Sparse integrative clustering of multiple omics data sets

Author: Mo Qianxing
Shen Ronglai
Wang Sijian
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 13/02/2012

High-resolution microarrays and second-generation sequencing platforms are powerful tools to investigate genome-wide alterations in DNA copy number, methylation and gene expression associated with a disease. An integrated genomic profiling approach measures multiple omics data types simultaneously in the same set of biological samples. Such an approach renders an integrated data resolution that would not be available with any single data type. In this study, we use penalized latent variable regression methods for joint modeling of multiple omics data types to identify common latent variables that can be used to cluster patient samples into biologically and clinically relevant disease subtypes. We consider lasso [J. Roy. Statist. Soc. Ser. B 58 (1996) 267-288], elastic net [J. R. Stat. Soc. Ser. B Stat. Methodol. 67 (2005) 301-320] and fused lasso [J. R. Stat. Soc. Ser. B Stat. Methodol. 67 (2005) 91-108] methods to induce sparsity in the coefficient vectors, revealing important genomic features that have significant contributions to the latent variables. An iterative ridge regression is used to compute the sparse coefficient vectors. In model selection, a uniform design [Monographs on Statistics and Applied Probability (1994) Chapman & Hall] is used to seek "experimental" points that are scattered uniformly across the search domain for efficient sampling of tuning parameter combinations. We compared our method to sparse singular value decomposition (SVD) and penalized Gaussian mixture model (GMM) using both real and simulated data sets. The proposed method is applied to integrate genomic, epigenomic and transcriptomic data for subtype analysis in breast and lung cancer data sets.

Comment: Published at http://dx.doi.org/10.1214/12-AOAS578 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
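The abstract's remark that an iterative ridge regression computes the sparse coefficient vectors can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain least-squares lasso objective, 0.5*||y - Xb||^2 + lam*sum|b_j|, rather than the paper's penalized latent variable model, and it uses the standard local quadratic approximation of the absolute value so that each step is a weighted ridge solve. The function name `lasso_iterative_ridge` and the parameters `n_iter` and `eps` are hypothetical.

```python
import numpy as np

def lasso_iterative_ridge(X, y, lam, n_iter=200, eps=1e-8):
    """Approximate a lasso fit by repeated ridge solves (illustrative only).

    Each pass replaces |b_j| with b_j**2 / (2 * |b_j_old|), which turns the
    lasso penalty into a ridge penalty with per-coefficient weights.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    gram = X.T @ X
    xty = X.T @ y
    # Start from a lightly regularised ridge fit so no coefficient is exactly zero.
    beta = np.linalg.solve(gram + 1e-6 * np.eye(p), xty)
    for _ in range(n_iter):
        # Small coefficients get large ridge weights and are pushed toward zero.
        weights = lam / (np.abs(beta) + eps)
        beta = np.linalg.solve(gram + np.diag(weights), xty)
    return beta

# Toy check with an identity design, where the lasso solution is the
# soft-thresholded response.
beta_hat = lasso_iterative_ridge(np.eye(3), np.array([3.0, 0.5, -2.0]), lam=1.0)
print(np.round(beta_hat, 4))
```

Coefficients driven toward zero end up tiny rather than exactly zero under this scheme, so a practical version would threshold them; the cutoff would be a further assumption.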

arXiv.org e-Print Archive

PubMed Central

Collection Of Biostatistics Research Archive

Evaluation of Modified Categorical Data Fuzzy Clustering Algorithm on the Wisconsin Breast Cancer Dataset

Publication venue: 'Hindawi Limited'
Publication date: 01/01/2016

Crossref

Broad and thematic remodeling of the surfaceome and glycoproteome on isogenic cells transformed with driving proliferative oncogenes.

Author: Coon Joshua
Kirkemo Lisa
Leung Kevin
Riley Nicholas
Wells James
Wilson Gary
Publication venue: eScholarship, University of California
Publication date: 07/04/2020

The cell surface proteome, the surfaceome, is the interface for engaging the extracellular space in normal and cancer cells. Here we apply quantitative proteomics of N-linked glycoproteins to reveal how a collection of some 700 surface proteins is dramatically remodeled in an isogenic breast epithelial cell line stably expressing any of six of the most prominent proliferative oncogenes, including the receptor tyrosine kinases, EGFR and HER2, and downstream signaling partners such as KRAS, BRAF, MEK, and AKT. We find that each oncogene has somewhat different surfaceomes, but the functions of these proteins are harmonized by common biological themes including up-regulation of nutrient transporters, down-regulation of adhesion molecules and tumor suppressing phosphatases, and alteration in immune modulators. Addition of a potent MEK inhibitor that blocks MAPK signaling brings each oncogene-induced surfaceome back to a common state, reflecting the strong dependence of the oncogene on the MAPK pathway to propagate signaling. Cell surface protein capture is mediated by covalent tagging of surface glycans, yet current methods do not afford sequencing of intact glycopeptides. Thus, we complement the surfaceome data with whole cell glycoproteomics enabled by a recently developed technique called activated ion electron transfer dissociation (AI-ETD). We found massive oncogene-induced changes to the glycoproteome and differential increases in complex hybrid glycans, especially for KRAS and HER2 oncogenes. Overall, these studies provide a broad systems-level view of how specific driver oncogenes remodel the surfaceome and the glycoproteome in a cell autologous fashion, and suggest possible surface targets, and combinations thereof, for drug and biomarker discovery.

eScholarship - University of California

Ants constructing rule-based classifiers.

Author: Baesens Bart
De Backer Manu
Haesen Raf
Holvoet Tom
Martens David

Classifiers; Data; Data mining; Studies;

Research Papers in Economics