51 research outputs found

    Determination of strongly overlapping signaling activity from microarray data

    BACKGROUND: As numerous diseases involve errors in signal transduction, modern therapeutics often target proteins involved in cellular signaling. Interpretation of the activity of signaling pathways during disease development or therapeutic intervention would assist in drug development, design of therapy, and target identification. Microarrays provide a global measure of cellular response; however, linking these responses to signaling pathways requires an analytic approach tuned to the underlying biology. An ongoing issue in pattern recognition in microarrays has been how to determine the number of patterns (or clusters) to use for data interpretation. This is critical because measures of statistical significance in gene ontology or pathways rely on proper separation of genes into groups. RESULTS: Here we introduce a method relying on gene annotation coupled to decompositional analysis of global gene expression data that allows us to estimate specific activity on strongly coupled signaling pathways and, in some cases, activity of specific signaling proteins. We demonstrate the technique using the Rosetta yeast deletion mutant data set, decompositional analysis by Bayesian Decomposition, and annotation analysis using ClutrFree. We determined from measurements of gene persistence in patterns across multiple potential dimensionalities that 15 basis vectors provide the correct dimensionality for interpreting the data. Using gene ontology and data on gene regulation in the Saccharomyces Genome Database, we identified the transcriptional signatures of several cellular processes in yeast, including cell wall creation, ribosomal disruption, chemical blocking of protein synthesis, and, critically, individual signatures of the strongly coupled mating and filamentation pathways. CONCLUSION: This work demonstrates that microarray data can provide downstream indicators of pathway activity either through use of gene ontology or transcription factor databases. This can be used to investigate the specificity and success of targeted therapeutics as well as to elucidate signaling activity in normal and disease processes.

    High-Resolution Comparative Genomic Hybridization of Inflammatory Breast Cancer and Identification of Candidate Genes

    BACKGROUND: Inflammatory breast cancer (IBC) is an aggressive form of BC poorly defined at the molecular level. We compared the molecular portraits of 63 IBC and 134 non-IBC (nIBC) clinical samples. METHODOLOGY/FINDINGS: Genomic imbalances of 49 IBCs and 124 nIBCs were determined using high-resolution array-comparative genomic hybridization, and mRNA expression profiles of 197 samples using whole-genome microarrays. Genomic profiles of IBCs were as heterogeneous as those of nIBCs, and globally relatively close. However, IBCs showed more frequent "complex" patterns and a higher percentage of genes with CNAs per sample. The number of altered regions was similar in both types, although some regions were altered more frequently and/or with higher amplitude in IBCs. Many genes were similarly altered in both types; however, more genes displayed recurrent amplifications in IBCs. The percentage of genes whose mRNA expression correlated with CNAs was similar in both types for the gained genes, but ∼7-fold lower in IBCs for the lost genes. Integrated analysis identified 24 potential candidate IBC-specific genes. Their combined expression accurately distinguished IBCs and nIBCs in an independent validation set, and retained an independent prognostic value in a series of 1,781 nIBCs, reinforcing the hypothesis of a link with IBC aggressiveness. Consistent with the hyperproliferative and invasive phenotype of IBC, these genes are notably involved in protein translation, cell cycle, RNA processing and transcription, metabolism, and cell migration. CONCLUSIONS: Our results suggest a higher genomic instability of IBC. We established the first repertory of DNA copy number alterations in this tumor, and provided a list of genes that may contribute to its aggressiveness and represent novel therapeutic targets.

    Characterization of unknown adult stem cell samples by large scale data integration and artificial neural networks.

    Stem cells represent not only a potential source of treatment for degenerative diseases but can also shed light on developmental biology and cancer. It is believed that stem cell differentiation and fate are triggered by a common genetic program that endows those cells with the ability to differentiate into specialized progenitors and fully differentiated cells. To extract the stemness signature of several cell types at the transcription level, we integrated heterogeneous datasets (microarray experiments) performed in different adult and embryonic tissues (liver, blood, bone, prostate and stomach in Homo sapiens and Mus musculus). Data were integrated by generalization of the hematopoietic stem cell hierarchy and by homology between mouse and human. The variation-filtered and integrated gene expression dataset was fed to a single-layered neural network to create a classifier to (i) extract the stemness signature and (ii) characterize unknown stem cell tissue samples by attribution of a stem cell differentiation stage. We were able to characterize mouse stomach progenitor and human prostate progenitor samples and isolate gene signatures playing a fundamental role at every level of the generalized stem cell hierarchy.

    Analysis of phylogenetic profiles and gene expression levels by Bayesian Decomposition (Analyse de profils phylogéniques et de niveaux d'expression génétique par Décomposition Bayésienne)

    We present a new data mining technique, Bayesian Decomposition, and its application to the analysis of biological data: gene expression microarrays and phylogenomic profiles. Bayesian Decomposition combines a Bayesian model with a Markov Chain Monte Carlo (MCMC) sampler. This makes it possible to infer a model that takes the form of two matrices that, multiplied together, reconstruct the data. Unlike classical approaches such as hierarchical clustering, where each gene is assigned to a single group, Bayesian Decomposition yields a physiologically meaningful model in which genes can belong to multiple functional groups. Applying the approach to a phylogenomic dataset, a similarity matrix of about a thousand genes for 31 bacteria, allowed the separation of genes specific to particular bacterial lineages. This approach has the potential to help the discovery of gene targets for new antibiotics and to tackle bacterial resistance. In microarray analysis, we grouped genes coherently in a complex dataset (the Rosetta Inpharmatics Compendium). Analysis of this dataset by Bayesian Decomposition allowed the retrieval of the genes involved in the mating pathway.
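
The two-matrix model described in this abstract can be illustrated with a small sketch. This is not Bayesian Decomposition itself: the MCMC sampler is replaced here by a simpler stand-in, non-negative matrix factorization with multiplicative updates, chosen only because it also produces two matrices whose product approximates the data. The function name `factorize`, the variable names, and the toy data are all hypothetical.

```python
import numpy as np

def factorize(data, n_patterns, n_iter=500, seed=0, eps=1e-9):
    """Approximate data (genes x conditions) as amplitudes @ patterns.

    Illustrative stand-in only: multiplicative-update NMF, NOT the MCMC
    sampler used by Bayesian Decomposition.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_conditions = data.shape
    # Random non-negative starting point for both factors.
    amplitudes = rng.random((n_genes, n_patterns)) + eps
    patterns = rng.random((n_patterns, n_conditions)) + eps
    for _ in range(n_iter):
        # Alternate updates; each keeps its factor non-negative.
        patterns *= (amplitudes.T @ data) / (amplitudes.T @ amplitudes @ patterns + eps)
        amplitudes *= (data @ patterns.T) / (amplitudes @ patterns @ patterns.T + eps)
    return amplitudes, patterns

# Toy data (hypothetical): 20 genes, 8 conditions, 3 underlying patterns.
rng = np.random.default_rng(1)
data = rng.random((20, 3)) @ rng.random((3, 8))

amplitudes, patterns = factorize(data, n_patterns=3)
relative_error = np.linalg.norm(data - amplitudes @ patterns) / np.linalg.norm(data)
print(f"relative reconstruction error: {relative_error:.4f}")
```

Each row of `amplitudes` can carry non-zero weight in several patterns at once, which corresponds to the point above that a gene may belong to multiple functional groups rather than a single cluster.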

    Analysis of Phylogenetic Profiles Using Bayesian Decomposition

    Antibiotic resistance, together with the side effects of broad-spectrum antibacterials, makes the development of targeted antibiotics of great interest. To address the problem of identifying potential targets specific to some genera, a dataset comprising a series of phylogenetic profiles was built for a series of pathogenic bacteria of interest. The profiles are the highest BLAST scores for genes compared to selected genes of E. coli and M. tuberculosis. The dataset reflects the past evolution of those genes due to adaptation to specific niches, marked by lateral gene transfer, duplication and mutation of existing genes, or merging of existing genes. Genes that function together will be constrained to evolve together, to maintain viability in the organism. However, a given gene may have a role in multiple functional groups through the evolutionary process. Analysis using Bayesian Decomposition helps to recover those relationships by retrieving fundamental patterns related to the evolutionarily retained functions.
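
As a rough illustration of how a profile of "highest BLAST scores" might be assembled, the sketch below keeps the best score per reference gene and organism from a list of hits. The input layout (tuples of gene, organism, score), the name `build_profiles`, the toy values, and the choice of 0.0 for organisms with no hit are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def build_profiles(hits, organisms):
    """Return {gene: [best score per organism]} from (gene, organism, score) hits.

    Assumption: an organism with no hit for a gene gets a score of 0.0.
    """
    best = defaultdict(dict)
    for gene, organism, score in hits:
        # Keep only the highest score seen for this gene/organism pair.
        if score > best[gene].get(organism, 0.0):
            best[gene][organism] = score
    return {
        gene: [scores.get(organism, 0.0) for organism in organisms]
        for gene, scores in best.items()
    }

# Toy hits (hypothetical values, not real BLAST output).
hits = [
    ("geneA", "org1", 120.0),
    ("geneA", "org1", 310.5),
    ("geneA", "org2", 45.0),
    ("geneB", "org2", 200.0),
]
organisms = ["org1", "org2", "org3"]
profiles = build_profiles(hits, organisms)
print(profiles)
```

Stacking the resulting rows gives a genes-by-organisms matrix of the kind that a decomposition method could take as input.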

    Interactome-Transcriptome integration for predicting distant metastasis in breast cancer

    Motivation: High-throughput gene expression profiling yields genomic signatures that allow the prediction of clinical conditions including patient outcome. However, these signatures have limitations, such as dependency on the training set and, worse, lack of generalization. Results: We propose a novel algorithm called ITI (interactome-transcriptome integration) to extract a genomic signature predicting distant metastasis in breast cancer by superimposition of large-scale protein-protein interaction data over a compendium of several gene expression datasets. Training on two different compendia showed that the estrogen receptor-specific signatures obtained are more stable (11-35% stability), can be generalized on independent data and perform better than previously published methods (53-74% accuracy). Availability: The ITI algorithm source code and analysis are available under CeCILL from the ITI companion website: http://mv.ezproxy.com.proxy.insermbiblio.inist.fr/iti. Supplementary information: Supplementary data are available at Bioinformatics online.