12 research outputs found

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery

    BACKGROUND: Molecular profiling generates abundance measurements for thousands of gene transcripts in biological samples such as normal and tumor tissues (data points). Given such two-class high-dimensional data, many methods have been proposed for classifying data points into one of the two classes. However, finding very small sets of features able to correctly classify the data is problematic because the underlying mathematical problem is hard. Existing methods can find "small" feature sets, but give no hint of how close these are to the true minimum size. Without fundamental mathematical advances, finding true minimum-size sets will remain elusive, and, more importantly for the microarray community, there will be no methods for finding them. RESULTS: We use the brute-force approach of exhaustive search through all genes and gene pairs (and, for some data sets, gene triples). Each unique gene combination is analyzed with a few-parameter linear-hyperplane classification method, looking for combinations that form classifiers with zero training error. All 10 published data sets studied are found to contain predictive small feature sets. Four contain thousands of gene pairs and six have single genes that perfectly discriminate. CONCLUSION: This technique discovered small sets of genes (3 or fewer) in published data that form accurate classifiers, yet were not reported in the prior publications. This could be a common characteristic of microarray data, which would make looking for such sets worth the computational cost. Such small gene sets could indicate biomarkers and portend simple medical diagnostic tests. We recommend checking for small gene sets routinely. We find 4 gene pairs and many gene triples in the large hepatocellular carcinoma (HCC, liver cancer) data set of Chen et al. The key component of these is the "placental gene of unknown function", PLAC8. Our HMM modeling indicates PLAC8 might have a domain like part of the crystal structure 1P59 (a non-covalent endonuclease III-DNA complex). The previously identified HCC biomarker gene, glypican 3 (GPC3), is part of an accurate gene triple involving MT1E and ARHE. We also find small gene sets that distinguish leukemia subtypes in the large pediatric acute lymphoblastic leukemia data set of Yeoh et al.
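
The exhaustive search described in the abstract can be illustrated with a minimal sketch. The abstract does not name its "few-parameter linear-hyperplane classification method", so the code below swaps in a plain perceptron with an epoch cap as a stand-in; this is an assumption, not the authors' method. The function names, the `expr` dictionary layout, and the toy gene names `g1` to `g3` are all hypothetical. A perceptron that finishes a full pass with no mistakes has found a training-error-free hyperplane; one that hits the cap returns `None`, which can also happen for separable data with a very small margin.

```python
from itertools import combinations


def perceptron_separates(points, labels, max_epochs=1000):
    """Look for a hyperplane w.x + b that classifies every sample correctly.

    points: list of tuples (one tuple of feature values per sample)
    labels: list of +1 / -1 class labels
    Returns (w, b) after the first mistake-free pass, else None.
    """
    dim = len(points[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # wrong side of, or exactly on, the plane
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b
    return None


def exhaustive_search(expr, labels, max_size=2):
    """Test every gene combination up to max_size for zero training error.

    expr: dict mapping gene name -> list of expression values per sample
    (hypothetical layout chosen for this sketch).
    """
    hits = []
    genes = sorted(expr)
    for size in range(1, max_size + 1):
        for combo in combinations(genes, size):
            # One point per sample, with one coordinate per gene in combo.
            points = list(zip(*(expr[g] for g in combo)))
            if perceptron_separates(points, labels) is not None:
                hits.append(combo)
    return hits


# Toy data (invented for illustration): two samples per class.
labels = [1, 1, -1, -1]
expr = {
    "g1": [5, 6, 1, 2],
    "g2": [0, 2, 1, 3],
    "g3": [2, 4, 0, 3],
}
hits = exhaustive_search(expr, labels, max_size=2)
print(hits)
```

In this toy data, `g1` separates the classes on its own, while `g2` and `g3` fail individually and succeed only as a pair. The number of combinations grows as n choose k, so on real arrays with thousands of genes the pair and triple searches are the expensive part, and the perceptron's sensitivity to feature scale would need attention.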

    A human breast cell model of pre-invasive to invasive transition

    A crucial step in human breast cancer progression is the acquisition of invasiveness. There is a distinct lack of human cell culture models to study the transition from pre-invasive to invasive phenotype as it may occur 'spontaneously' in vivo. To delineate molecular alterations important for this transition, we isolated human breast epithelial cell lines that showed partial loss of tissue polarity in three-dimensional reconstituted basement membrane cultures. These cells remained non-invasive; however, unlike their non-malignant counterparts, they exhibited a high propensity to acquire invasiveness through basement membrane in culture. The genomic aberrations and gene expression profiles of the cells in this model showed a high degree of similarity to primary breast tumor profiles. The xenograft tumors formed by the cell lines in three different microenvironments in nude mice displayed metaplastic phenotypes, including squamous and basal characteristics, with invasive cells exhibiting features of higher-grade tumors. To find functionally significant changes in the transition from pre-invasive to invasive phenotype, we performed attribute profile clustering analysis on the list of genes differentially expressed between pre-invasive and invasive cells. We found integral membrane proteins, transcription factors, kinases, transport molecules, and chemokines to be highly represented. In addition, expression of the matrix metalloproteinases MMP-9, -13, -15, and -17 was upregulated in the invasive cells. Using siRNA-based approaches, we found these MMPs to be required for the invasive phenotype. This model provides a new tool for dissecting the mechanisms by which pre-invasive breast cells could acquire invasiveness in a metaplastic context.

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-6

    Copyright information: Taken from "Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery". BMC Bioinformatics 2005;6:97. Published online 13 Apr 2005. PMCID: PMC1090559. Copyright © 2005 Grate; licensee BioMed Central Ltd.
    ... A subtype samples

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-1

    ... with PLAC8

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-4


    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-0

    ... between the two classes. In this example there is a large separation between the two classes: perfect separation is achieved and no data point is close to the plane.

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-2

    ... contains an interactive plot

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-5

    ... albeit with a small margin. Plus signs are the T subtype samples

    Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery-7

    ... combined with RPS6 can perfectly separate the data