
    Analysis and Computational Dissection of Molecular Signature Multiplicity

    Molecular signatures are computational or mathematical models created to diagnose disease and other phenotypes and to predict clinical outcomes and response to treatment. It is widely recognized that molecular signatures constitute one of the most important translational and basic science developments enabled by recent high-throughput molecular assays. A perplexing phenomenon that characterizes high-throughput data analysis is the ubiquitous multiplicity of molecular signatures. Multiplicity is a special form of data analysis instability in which different analysis methods used on the same data, or different samples from the same population, lead to different but apparently maximally predictive signatures. This phenomenon has far-reaching implications for biological discovery and for the development of next-generation patient diagnostics and personalized treatments. Currently, the causes and interpretation of signature multiplicity are unknown, and several, often contradictory, conjectures have been made to explain it. We present a formal characterization of signature multiplicity and a new efficient algorithm that offers theoretical guarantees for extracting the set of maximally predictive and non-redundant signatures independent of distribution. The new algorithm identifies exactly the set of optimal signatures in controlled experiments and yields signatures with significantly better predictivity and reproducibility than previous algorithms in human microarray gene expression datasets. Our results shed light on the causes of signature multiplicity, provide computational tools for studying it empirically, and introduce a framework for in silico bioequivalence of this important new class of diagnostic and personalized medicine modalities.

    Robust Sparse Hyperplane Classifiers: Application to Uncertain Molecular Profiling Data

    Molecular profiling studies can generate abundance measurements for thousands of transcripts, proteins, metabolites, or other species in, for example, normal and tumor tissue samples. Treating such measurements as features and the samples as labeled data points, sparse hyperplanes provide a statistical methodology for classifying data points into one of two categories (classification and prediction) and defining a small subset of discriminatory features (relevant feature identification). However, this and other extant classification methods address only implicitly the issue of observed data being a combination of underlying signals and noise. Recently, robust optimization has emerged as a powerful framework for handling uncertain data explicitly. Here, ideas from this field are exploited to develop robust sparse hyperplanes, i.e., classification and relevant feature identification algorithms that are resilient to variation in the data. Specifically, each data point is associated with an explicit data uncertainty model in the form of an ellipsoid parameterized by a center and covariance matrix. The task of learning a robust sparse hyperplane from such data is formulated as a second-order cone program (SOCP). Gaussian and distribution-free data uncertainty models are shown to yield SOCPs that are equivalent to the SOCP based on ellipsoidal uncertainty. The real-world utility of robust sparse hyperplanes is demonstrated via retrospective analysis of breast cancer related transcript profiles. Data-dependent heuristics are used to compute the parameters of each ellipsoidal data uncertainty model. The generalization performance of a specific implementation, designated "robust LIKNON," is better than its nominal counterpart. Finally, the strengths and limitations of robust sparse hyperplanes are discussed.
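
The ellipsoidal uncertainty model described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each data point may lie anywhere in the ellipsoid {center + cov^(1/2) u : ||u||_2 <= 1} and computes the smallest signed margin a hyperplane (w, b) achieves over that set, which is label * (w . center + b) - sqrt(w^T cov w). Requiring this quantity to exceed a threshold is a second-order cone constraint. The function name `worst_case_margin` and its parameters are hypothetical.

```python
import math


def worst_case_margin(w, b, center, cov, label):
    """Smallest signed margin of hyperplane (w, b) over an ellipsoid.

    Assumed (hypothetical) uncertainty set:
        {center + cov^(1/2) u : ||u||_2 <= 1}
    where cov is a symmetric positive semidefinite matrix (list of lists)
    and label is +1 or -1.
    """
    n = len(w)
    # Margin at the ellipsoid center (the nominal, noise-free data point).
    nominal = label * (sum(w[i] * center[i] for i in range(n)) + b)
    # Quadratic form w^T cov w; its square root is the worst-case loss
    # of margin caused by moving the point within the ellipsoid.
    quad = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return nominal - math.sqrt(max(quad, 0.0))


# A point centered at (2, 0) with variance 1 along the first axis.
margin = worst_case_margin([1.0, 0.0], 0.0, [2.0, 0.0],
                           [[1.0, 0.0], [0.0, 4.0]], +1)
print(margin)
```

A point is classified robustly when this worst-case margin stays positive. With a zero covariance matrix the value reduces to the ordinary (nominal) margin.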

    A human breast cell model of preinvasive to invasive transition

    A crucial step in human breast cancer progression is the acquisition of invasiveness. There is a distinct lack of human cell culture models to study the transition from preinvasive to invasive phenotype as it may occur "spontaneously" in vivo. To delineate molecular alterations important for this transition, we isolated human breast epithelial cell lines that showed partial loss of tissue polarity in three-dimensional reconstituted basement membrane cultures. These cells remained noninvasive; however, unlike their nonmalignant counterparts, they exhibited a high propensity to acquire invasiveness through basement membrane in culture. The genomic aberrations and gene expression profiles of the cells in this model showed a high degree of similarity to primary breast tumor profiles. The xenograft tumors formed by the cell lines in three different microenvironments in nude mice displayed metaplastic phenotypes, including squamous and basal characteristics, with invasive cells exhibiting features of higher-grade tumors. To find functionally significant changes in the transition from preinvasive to invasive phenotype, we performed attribute profile clustering analysis on the list of genes differentially expressed between preinvasive and invasive cells. We found integral membrane proteins, transcription factors, kinases, transport molecules, and chemokines to be highly represented. In addition, expression of matrix metalloproteinases MMP9, MMP13, MMP15, and MMP17 was up-regulated in the invasive cells. Using small interfering RNA-based approaches, we found these MMPs to be required for the invasive phenotype. This model provides a new tool for dissection of mechanisms by which preinvasive breast cells could acquire invasiveness in a metaplastic context.

    Die Stoffwechselkrankheiten und ihre Behandlung.
