60 research outputs found

    Data Mining for Gene Networks Relevant to Poor Prognosis in Lung Cancer Via Backward-Chaining Rule Induction

    We use Backward Chaining Rule Induction (BCRI), a novel data mining method for hypothesizing causative mechanisms, to mine lung cancer gene expression array data for mechanisms that could impact survival. Initially, a supervised learning system is used to generate a prediction model in the form of “IF <conditions> THEN <outcome>” style rules. Next, each antecedent (i.e. an IF condition) of a previously discovered rule becomes the outcome class for subsequent application of supervised rule induction. This step is repeated until a termination condition is satisfied. “Chains” of rules are created by working backward from an initial condition (e.g. survival status). Through this iterative process of “backward chaining,” BCRI searches for rules that describe plausible gene interactions for subsequent validation. Thus, BCRI is a semi-supervised approach that constrains the search through the vast space of plausible causal mechanisms by using a top-level outcome to kick-start the process. We demonstrate the general BCRI task sequence, how to implement it, the validation process, and how BCRI-rules discovered from lung cancer microarray data can be combined with prior knowledge to generate hypotheses about functional genomics.
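The backward-chaining loop described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the rule learner, so a deliberately simple single-condition learner is swapped in, the toy data and the names `best_rule`, `backward_chain`, `geneA`, `geneB`, `geneC`, and `poor_survival` are hypothetical, and only one chain is followed rather than every antecedent of every rule.

```python
from typing import Dict, List, Optional, Set, Tuple

Row = Dict[str, int]                        # variable name -> 0 (low) or 1 (high)
Condition = Tuple[str, int]                 # (variable, value)
Rule = Tuple[Condition, Condition, float]   # (IF condition, THEN condition, confidence)


def best_rule(rows: List[Row], target: Condition, exclude: Set[str],
              min_conf: float) -> Optional[Tuple[str, int, float]]:
    """Stand-in rule learner (an assumption, not the learner used in the paper).

    Returns the single-condition rule 'IF var == val THEN target' with the
    highest confidence, ties broken by coverage, or None if no rule reaches
    min_conf.
    """
    t_var, t_val = target
    best = None  # (confidence, coverage, var, val)
    for var in sorted(v for v in rows[0] if v not in exclude):
        for val in (0, 1):
            covered = [r for r in rows if r[var] == val]
            if not covered:
                continue
            conf = sum(r[t_var] == t_val for r in covered) / len(covered)
            cand = (conf, len(covered), var, val)
            if conf >= min_conf and (best is None or cand[:2] > best[:2]):
                best = cand
    return None if best is None else (best[2], best[3], best[0])


def backward_chain(rows: List[Row], outcome: Condition,
                   max_depth: int = 3, min_conf: float = 0.7) -> List[Rule]:
    """Work backward from the top-level outcome: the antecedent of each
    discovered rule becomes the outcome class for the next induction step."""
    chain: List[Rule] = []
    used = {outcome[0]}          # variables already in the chain are not reused
    target = outcome
    for _ in range(max_depth):   # termination condition 1: depth limit
        found = best_rule(rows, target, used, min_conf)
        if found is None:        # termination condition 2: no rule is good enough
            break
        var, val, conf = found
        chain.append(((var, val), target, conf))
        used.add(var)
        target = (var, val)
    return chain


# Hypothetical toy data: three binarized genes and a survival outcome.
rows = [
    {"geneA": 1, "geneB": 1, "geneC": 0, "poor_survival": 1},
    {"geneA": 1, "geneB": 1, "geneC": 1, "poor_survival": 1},
    {"geneA": 1, "geneB": 1, "geneC": 0, "poor_survival": 1},
    {"geneA": 0, "geneB": 0, "geneC": 1, "poor_survival": 0},
    {"geneA": 0, "geneB": 0, "geneC": 0, "poor_survival": 0},
    {"geneA": 0, "geneB": 1, "geneC": 1, "poor_survival": 0},
]

chain = backward_chain(rows, ("poor_survival", 1))
for (if_var, if_val), (then_var, then_val), conf in chain:
    print(f"IF {if_var} == {if_val} THEN {then_var} == {then_val} (confidence {conf:.2f})")
```

On this toy data the sketch links the outcome to `geneA` and then `geneA` to `geneB`, and stops when no remaining variable predicts `geneB` with enough confidence. A fuller implementation would presumably induce multi-condition rules and branch on every antecedent.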

    Transforming growth factor beta-regulated gene expression in a mouse mammary gland epithelial cell line

    BACKGROUND: Transforming growth factor beta (TGF-β) plays an essential role in a wide array of cellular processes. The best-studied TGF-β response in normal epithelial cells is growth inhibition. In some cell types, TGF-β induces an epithelial to mesenchymal transition (EMT). NMuMG is a nontransformed mouse mammary gland epithelial cell line that exhibits both a growth inhibitory response and an EMT response to TGF-β, rendering NMuMG cells a good model system for studying these TGF-β effects. METHOD: A National Institute on Aging mouse 15,000 cDNA microarray was used to profile the gene expression of NMuMG cells treated with TGF-β1 for 1, 6, or 24 hours. Data analyses were performed using GenePixPro and GeneSpring software. Selected microarray results were verified by northern analyses. RESULTS: Of the 15,000 genes examined by microarray, 939 were upregulated or downregulated by TGF-β. This represents approximately 10% of the genes examined, minus redundancy. Seven genes previously not known to be regulated by TGF-β at the transcriptional level (Akt and RhoB) or at all (IQGAP1, mCalpain, actinin α3, Ikki, PP2A-PR53) were identified and their regulation by TGF-β verified by northern blotting. Cell cycle pathway examination demonstrated downregulation of cyclin D2, c-myc, Id2, p107, E2F5, cyclin A, cyclin B, and cyclin H. Examination of cell adhesion-related genes revealed upregulation of c-Jun, α-actinin, actin, myosin light chain, p120cas catenin (Catns), α-integrin, integrin β5, fibronectin, IQGAP1, and mCalpain. CONCLUSION: Using a cDNA microarray to examine TGF-β-regulated gene expression in NMuMG cells, we have shown regulation of multiple genes that play important roles in cell cycle control and EMT. In addition, we have identified several novel TGF-β-regulated genes that may mediate previously unknown TGF-β functions.

    The tissue microarray data exchange specification: A community-based, open source tool for sharing tissue microarray data

    BACKGROUND: Tissue Microarrays (TMAs) allow researchers to examine hundreds of small tissue samples on a single glass slide. The information held in a single TMA slide may easily involve gigabytes of data. To benefit from TMA technology, the scientific community needs an open source TMA data exchange specification that will convey all of the data in a TMA experiment in a format that is understandable to both humans and computers. A data exchange specification for TMAs allows researchers to submit their data to journals and to public data repositories and to share or merge data from different laboratories. In May 2001, the Association for Pathology Informatics (API) hosted the first in a series of four workshops, co-sponsored by the National Cancer Institute, to develop an open, community-supported TMA data exchange specification. METHODS: A draft tissue microarray data exchange specification was developed through workshop meetings. The first workshop confirmed community support for the effort and urged the creation of an open XML-based specification. This was to evolve in steps, with approval for each step coming from the stakeholders in the user community during open workshops. By the fourth workshop, held in October 2002, a set of Common Data Elements (CDEs) was established as well as a basic strategy for organizing TMA data in self-describing XML documents. RESULTS: The TMA data exchange specification is a well-formed XML document with four required sections: 1) Header, containing the specification Dublin Core identifiers, 2) Block, describing the paraffin-embedded array of tissues, 3) Slide, describing the glass slides produced from the Block, and 4) Core, containing all data related to the individual tissue samples contained in the array. Eighty CDEs, conforming to the ISO-11179 specification for data elements, constitute XML tags used in the TMA data exchange specification. A set of six simple semantic rules describes the complete data exchange specification. Anyone using the data exchange specification can validate their TMA files using a software implementation written in Perl and distributed as a supplemental file with this publication. CONCLUSION: The TMA data exchange specification is now available in a draft form with community-approved Common Data Elements and a community-approved general file format and data structure. The specification can be freely used by the scientific community. Efforts sponsored by the Association for Pathology Informatics to refine the draft TMA data exchange specification are expected to continue for at least two more years. The interested public is invited to participate in these open efforts. Information on future workshops will be posted at (API web site).
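To make the four-section layout concrete, here is a hedged sketch that builds a tiny XML document with Header, Block, Slide, and Core sections and checks that all four are present. Only the four section names come from the abstract. The root tag `tma`, the flat nesting, the child tags such as `block_identifier`, and the function names are hypothetical and are not taken from the specification's eighty CDEs. The check is also far weaker than the Perl validator mentioned above, which is not reproduced here.

```python
import xml.etree.ElementTree as ET

# The four required sections named in the abstract (lower-cased here by assumption).
REQUIRED_SECTIONS = ("header", "block", "slide", "core")


def build_example_tma() -> str:
    """Build a toy TMA-style document; all tags other than the four
    section names are hypothetical placeholders."""
    root = ET.Element("tma")  # hypothetical root tag
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "title").text = "Example TMA experiment"
    block = ET.SubElement(root, "block")
    ET.SubElement(block, "block_identifier").text = "BLK-001"
    slide = ET.SubElement(root, "slide")
    ET.SubElement(slide, "slide_identifier").text = "SLD-001"
    core = ET.SubElement(root, "core")
    ET.SubElement(core, "core_identifier").text = "CORE-001"
    return ET.tostring(root, encoding="unicode")


def missing_sections(xml_text: str) -> list:
    """Return the required sections absent from the document.

    ET.fromstring raises ParseError if the text is not well-formed XML,
    which covers the 'well-formed' requirement in a minimal way.
    """
    root = ET.fromstring(xml_text)
    return [name for name in REQUIRED_SECTIONS
            if root.find(f".//{name}") is None]


example_xml = build_example_tma()
print(missing_sections(example_xml))
print(missing_sections("<tma><header/></tma>"))
```

A real validator would also check each tag against the CDE list and apply the six semantic rules, neither of which is spelled out in the abstract.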

    Methodologies for in vitro and in vivo evaluation of efficacy of antifungal and antibiofilm agents and surface coatings against fungal biofilms

    KT acknowledges receipt of a mandate of Industrial Research Fund (IOFm/05/022). JB acknowledges funding from the European Research Council Advanced Award 3400867/RAPLODAPT and the Israel Science Foundation grant # 314/13 (www.isf.il). NG acknowledges the Wellcome Trust and MRC for funding. CD acknowledges funding from the Agence Nationale de Recherche (ANR-10-LABX-62-IBEID). CJN acknowledges funding from the National Institutes of Health R35GM124594 and R21AI125801. AW is supported by the Wellcome Trust Strategic Award (grant 097377) and the MRC Centre for Medical Mycology (grant MR/N006364/1) at the University of Aberdeen. MaCA: outside this study, MaCA has received personal speaker’s honoraria in the past five years from Astellas, Basilea, Gilead, MSD, Pfizer, T2Candida, and Novartis. She has received research grants and contract work paid to the Statens Serum Institute from Astellas, Basilea, Gilead, MSD, NovaBiotics, Pfizer, T2Biosystems, F2G, Cidara, and Amplyx. CAM acknowledges the Wellcome Trust and the MRC MR/N006364/1. PVD, TC and KT acknowledge the FWO research community: Biology and ecology of bacterial and fungal biofilms in humans (FWO WO.009.16N). AAB acknowledges the Deutsche Forschungsgemeinschaft – CRC FungiNet. Peer reviewed.

    Multiplicity: an organizing principle for cancers and somatic mutations

    BACKGROUND: With the advent of whole-genome analysis for profiling tumor tissue, a pressing need has emerged for principled methods of organizing the large amounts of resulting genomic information. We propose the concept of multiplicity measures on cancer and gene networks to organize the information in a clinically meaningful manner. Multiplicity applied in this context extends Fearon and Vogelstein's multi-hit genetic model of colorectal carcinoma across multiple cancers. METHODS: Using the Catalogue of Somatic Mutations in Cancer (COSMIC), we construct networks of interacting cancers and genes. Multiplicity is calculated by evaluating the number of cancers and genes linked by the measurement of a somatic mutation. The Kamada-Kawai algorithm is used to find a two-dimensional minimum energy solution with multiplicity as an input similarity measure. Cancers and genes are positioned in two dimensions according to this similarity. A third dimension is added to the network by assigning a maximal multiplicity to each cancer or gene. Hierarchical clustering within this three-dimensional network is used to identify similar clusters in somatic mutation patterns across cancer types. RESULTS: The clustering of genes in a three-dimensional network reveals a similarity in acquired mutations across different cancer types. Surprisingly, the clusters separate known causal mutations. The multiplicity clustering technique identifies a set of causal genes with an area under the ROC curve of 0.84 versus 0.57 when clustering on gene mutation rate alone. The cluster multiplicity value and number of causal genes are positively correlated via Spearman's Rank Order correlation (r_s(8) = 0.894, Spearman's t = 17.48, p < 0.05). A clustering analysis of cancer types segregates different types of cancer. All blood tumors cluster together, and the cluster multiplicity values differ significantly (Kruskal-Wallis, H = 16.98, df = 2, p < 0.05). CONCLUSION: We demonstrate the principle of multiplicity for organizing somatic mutations and cancers in clinically relevant clusters. These clusters of cancers and mutations provide representations that identify segregations of cancer and genes driving cancer progression.

    Gene expression meta-analysis supports existence of molecular apocrine breast cancer with a role for androgen receptor and implies interactions with ErbB family

    BACKGROUND: Pathway discovery from gene expression data can provide important insight into the relationship between signaling networks and cancer biology. Oncogenic signaling pathways are commonly inferred by comparison with signatures derived from cell lines. We use the Molecular Apocrine subtype of breast cancer to demonstrate our ability to infer pathways directly from patients' gene expression data with pattern analysis algorithms. METHODS: We combine data from two studies that propose the existence of the Molecular Apocrine phenotype. We use quantile normalization and XPN to minimize institutional bias in the data. We use hierarchical clustering, principal components analysis, and comparison of gene signatures derived from Significance Analysis of Microarrays to establish the existence of the Molecular Apocrine subtype and the equivalence of its molecular phenotype across both institutions. Statistical significance was computed using the Fasano & Franceschini test for separation of principal components and the hypergeometric probability formula for significance of overlap in gene signatures. We perform pathway analysis using LeFEminer and Backward Chaining Rule Induction to identify a signaling network that differentiates the subset. We identify a larger cohort of samples in the public domain, and use Gene Shaving and Robust Bayesian Network Analysis to detect pathways that interact with the defining signal. RESULTS: We demonstrate that the two separately introduced ER-negative breast cancer subsets represent the same tumor type, called Molecular Apocrine breast cancer. LeFEminer and Backward Chaining Rule Induction support a role for AR signaling as a pathway that differentiates this subset from others. Gene Shaving and Robust Bayesian Network Analysis detect interactions between the AR pathway, EGFR trafficking signals, and ErbB2. CONCLUSION: We propose criteria for meta-analysis that are able to demonstrate statistical significance in establishing molecular equivalence of subsets across institutions. Data mining strategies used here provide an alternative method to comparison with cell lines for discovering seminal pathways and interactions between signaling networks. Analysis of Molecular Apocrine breast cancer implies that therapies targeting AR might be hampered if interactions with ErbB family members are not addressed.
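The abstract above mentions a hypergeometric probability for the overlap between gene signatures. A minimal sketch of one common way to compute it follows. It is an assumption that an upper-tail probability is what was used, since the abstract does not say whether the point probability or the tail was reported, and the function name `overlap_p_value`, its parameter names, and the example numbers are hypothetical.

```python
from math import comb


def overlap_p_value(n_genes: int, size_a: int, size_b: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    n_genes: total genes measured on the platform
    size_a:  genes in signature A
    size_b:  genes in signature B
    overlap: genes shared by the two signatures

    Under the null hypothesis, signature B is a random draw of size_b genes
    from n_genes, of which size_a belong to signature A.
    """
    total = comb(n_genes, size_b)
    upper = min(size_a, size_b)
    tail = sum(comb(size_a, i) * comb(n_genes - size_a, size_b - i)
               for i in range(overlap, upper + 1))
    return tail / total


# Hypothetical numbers: two 5-gene signatures from a 10-gene platform
# that overlap completely.
p = overlap_p_value(n_genes=10, size_a=5, size_b=5, overlap=5)
print(p)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large gene counts. Only the final division produces a float.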

    Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification within and across Tissues of Origin

    Recent genomic analyses of pathologically-defined tumor types identify “within-a-tissue” disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head & neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multi-platform classification, while correlated with tissue-of-origin, provides independent information for predicting clinical outcomes. All datasets are available for data-mining from a unified resource to support further biological discoveries and insights into novel therapeutic strategies.

    Dedication: Chiara Silvestrini

    The Cancer Informatics 2007 dedication is to Chiara Silvestrini.