242 research outputs found

    Microarray Analysis in Drug Discovery and Biomarker Identification

    Functional assessment of time course microarray data

    Motivation: Time-course microarray experiments study the progress of gene expression over time across one or several experimental conditions. Most existing analysis methods focus on clustering or differential expression analysis of genes and do not integrate functional information. Assessing the functional aspects of time-course transcriptomics data requires approaches that exploit the activation dynamics of the functional categories to which genes are annotated. Methods: We present three novel methodologies for the functional assessment of time-course microarray data. (i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories that have a significant model. (ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. (iii) ASCA-functional uses the ASCA model to rank genes according to their correlation with principal time expression patterns and assesses functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches. Results were compared to alternative methodologies. Results: Synthetic and experimental data showed that the different methods capture different, biologically meaningful aspects of the relationship between genes, functions and co-expression. The methods should not be regarded as competing; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study.
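The PCA-then-regression idea described for PCA-maSigFun can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_time_fit`, the polynomial time model, and the use of R² in place of a formal significance test are all assumptions made for illustration.

```python
import numpy as np

def pca_time_fit(expr, time, n_components=2, degree=2):
    """Illustrative sketch (hypothetical helper, not the published method).

    expr: genes x samples matrix for the genes of ONE functional class.
    time: 1-D array with the time value of each sample.
    Returns a list of (explained_variance_ratio, r_squared), one per component.
    """
    expr = np.asarray(expr, dtype=float)
    time = np.asarray(time, dtype=float)
    # Center each gene so the SVD captures patterns of change, not baselines.
    centered = expr - expr.mean(axis=1, keepdims=True)
    # Rows of vt are orthogonal expression patterns across samples.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var_ratio = s**2 / np.sum(s**2)
    # Polynomial design matrix in time: columns 1, t, t^2, ...
    design = np.vander(time, degree + 1, increasing=True)
    results = []
    for k in range(n_components):
        scores = s[k] * vt[k]  # component profile over the samples
        coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
        fitted = design @ coef
        ss_res = np.sum((scores - fitted) ** 2)
        ss_tot = np.sum((scores - scores.mean()) ** 2)
        results.append((var_ratio[k], 1.0 - ss_res / ss_tot))
    return results

# Synthetic class of 20 genes whose expression rises linearly with time.
rng = np.random.default_rng(0)
time = np.repeat([0.0, 1.0, 2.0, 3.0, 4.0], 3)
loadings = np.linspace(0.5, 2.0, 20)
expr = np.outer(loadings, time) + rng.normal(scale=0.1, size=(20, 15))
for ratio, r2 in pca_time_fit(expr, time):
    print(f"explained variance {ratio:.3f}, R^2 of time model {r2:.3f}")
```

A functional class would be flagged when a component that explains a large share of the variance is also well described by the time model; a real analysis would replace the R² shortcut with a proper test and multiple-testing correction.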

    Mining microarray data to predict the histological grade of a breast cancer

    BACKGROUND: The aim of this study was to develop an original method to extract sets of relevant molecular biomarkers (gene sequences) that can be used for class prediction and can be included as prognostic and predictive tools. MATERIALS AND METHODS: The method is based on sequential patterns used as features for class prediction. We applied it to classify breast cancer tumors according to their histological grade. RESULTS: We obtained very good recall and precision for grades 1 and 3 tumors, but, like other authors, our results were less satisfactory for grade 2 tumors. CONCLUSIONS: We demonstrated the interest of sequential patterns for class prediction of microarrays and we now have the material to use them for prognostic and predictive applications

    Dynamic metabolomic data analysis: a tutorial review

    In metabolomics, time-resolved, dynamic or temporal data are increasingly collected. The number of methods to analyze such data, however, is very limited, and in most cases the dynamic nature of the data is not even taken into account. This paper reviews current methods in use for analyzing dynamic metabolomic data. Moreover, some methods from other fields of science that may be of use in analyzing such dynamic metabolomics data are described in some detail. The methods are put in a general framework after providing a formal definition of what constitutes a ‘dynamic’ method. Some of the methods are illustrated with real-life metabolomics examples.

    ARSyN: a method for the identification and removal of systematic noise in multifactorial time-course microarray experiments

    Transcriptomic profiling experiments that aim to identify responsive genes in specific biological conditions are commonly set up under defined experimental designs that try to assess the effects of factors and their interactions on gene expression. Data from these controlled experiments, however, may also contain sources of unwanted noise that can distort the signal under study, affect the residuals of applied statistical models, and hamper data analysis. Commonly, normalization methods are applied to transcriptomics data to remove technical artifacts, but these are normally based on general assumptions of transcript distribution and largely ignore both the characteristics of the experiment under consideration and the coordinative nature of gene expression. In this paper, we propose a novel methodology, ARSyN, for the preprocessing of microarray data that takes these last two aspects into account. By combining analysis of variance (ANOVA) modeling of gene expression values and multivariate analysis of estimated effects, the method identifies the nonstructured part of the signal associated with the experimental factors (the noise within the signal) and the structured variation of the ANOVA errors (the signal of the noise). By removing these noise fractions from the original data, we create a filtered data set that is rich in the information of interest and includes only the random noise required for inferential analysis. In this work, we focus on multifactorial time course microarray (MTCM) experiments with two factors: one quantitative, such as time or dosage, and the other qualitative, such as tissue, strain, or treatment. However, the method can be used in other situations, such as experiments with only one factor or more complex designs with more than two factors. The filtered data obtained after applying ARSyN can be further analyzed with the appropriate statistical technique to obtain the biological information required.
To evaluate the performance of the filtering strategy, we applied different statistical approaches for MTCM analysis to several real and simulated data sets, also studying the efficiency of these techniques. By comparing the results obtained with the original and ARSyN-filtered data, and also with other filtering techniques, we conclude that the proposed method increases the statistical power to detect biological signals, especially in cases where there are high levels of structural noise. Software for ARSyN is freely available at http://www.ua.es/personal/mj.nueda
Spanish MICINN Project (BIO2008-04368-E and DPI2008-06880-C03-03/DPI).
Nueda, MJ.; Ferrer Riquelme, AJ.; Conesa, A. (2011). ARSyN: a method for the identification and removal of systematic noise in multifactorial time-course microarray experiments. Biostatistics. 13(3):553-566. doi:10.1093/biostatistics/kxr042
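The part of the ARSyN abstract that concerns structured variation in the ANOVA errors can be illustrated with a simplified sketch. It is not the published ARSyN algorithm: it assumes a single qualitative factor, uses group means as the ANOVA fit, and removes a fixed, user-chosen number of principal components from the residuals. The function name `remove_structured_residual_noise` is hypothetical.

```python
import numpy as np

def remove_structured_residual_noise(expr, groups, n_noise_components=1):
    """Simplified illustration (assumption: one-factor design).

    expr: genes x samples matrix.
    groups: per-sample condition labels.
    Returns expr with the leading principal components of the
    ANOVA residuals subtracted.
    """
    expr = np.asarray(expr, dtype=float)
    groups = np.asarray(groups)
    # One-way ANOVA fit: each sample is predicted by its group mean.
    fitted = np.empty_like(expr)
    for g in np.unique(groups):
        mask = groups == g
        fitted[:, mask] = expr[:, mask].mean(axis=1, keepdims=True)
    residuals = expr - fitted
    # Structure in the residuals shows up as dominant singular vectors.
    u, s, vt = np.linalg.svd(residuals, full_matrices=False)
    k = n_noise_components
    structured = (u[:, :k] * s[:k]) @ vt[:k]
    # Subtract only the structured part; the remaining residual is kept.
    return expr - structured

# Synthetic example: 3 conditions x 4 replicates with an alternating batch effect.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1, 2], 4)
signal = rng.normal(size=(30, 3))[:, groups]
batch = np.tile([1.0, -1.0, 1.0, -1.0], 3)
expr = signal + np.outer(rng.normal(size=30), batch)
expr += rng.normal(scale=0.05, size=expr.shape)
filtered = remove_structured_residual_noise(expr, groups)
print(np.abs(expr - signal).mean(), np.abs(filtered - signal).mean())
```

Because the residuals sum to zero within each group, the subtracted component leaves the group means untouched, so the estimated factor effects are preserved while the residual variance shrinks. Choosing how many components count as structured noise is the hard part and is only fixed by hand here.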

    The Transcriptional Landscape of Hematopoietic Stem and Progenitor Cells during Acute Inflammatory Stress

    Hematopoietic stem cells (HSCs) are critical components of the hematopoietic system and are responsible for renewing all blood cell lineages throughout life. These cells are quiescent and reside in niches in the bone marrow (BM). Over the past decade, our group and others have discovered that inflammatory stress impacts quiescent HSCs in vivo, leading to their activation. However, the dynamics, heterogeneity, and mechanisms underlying stress-induced activation of HSCs remain unclear. In this thesis, I unraveled the mechanisms regulating HSC proliferation and recovery in response to acute treatment with the proinflammatory cytokine interferon alpha (IFNα) by initially determining three time points representing the sensing, proliferation, and recovery phases of the HSC proliferative response to acute IFNα treatment. Using time series bulk RNA sequencing (RNAseq), I identified distinct molecular patterns and changes in the activation and repression of various biological categories in HSCs. Surprisingly, even after returning to a quiescent state 72 hours (h) post-treatment, HSCs remained metabolically active and underwent a significant metabolic shift towards oxidative phosphorylation (OXPHOS). In addition, the tricarboxylic acid cycle (TCA), pentose phosphate pathway (PPP), fatty acid, and purine metabolism were reduced, and HSCs showed decreased myeloid priming and bias. Thus far, little is known about the dynamics and heterogeneity of these stress responses across the whole hematopoietic stem and progenitor cell (HSPC) compartment. Inflammation-induced marker changes in the HSPC compartment make it challenging to investigate the heterogeneity of the inflammatory response in HSPCs. Thus, I employed a single-cell (Sc) time series RNAseq experiment to study the heterogeneous and dynamic impacts of IFNα on HSPCs. The results showed heterogeneity in the response of HSPCs to IFNα, with HSCs being the strongest responders based on their gene expression changes.
In collaboration with Brigitte Bouman and Dr. Laleh Haghverdi at the MDC in Berlin, we developed and used a response-pseudotime inference approach to analyze the scRNAseq data and identified global and cell type-specific inflammation signatures, revealing unique molecular patterns of gene expression and biological processes in response to IFNα. Interestingly, we were able to associate reduced myeloid differentiation programs in HSPCs with a reduced abundance of myeloid progenitors and differentiated cells following IFNα treatment. Taken together, the single-cell time series analyses have allowed us to study the heterogeneous and dynamic impact of IFNα on HSPCs in an unbiased manner. In addition to investigating the dynamics and heterogeneity of the response of HSCs to IFNα, I compared the immediate transcriptional response of HSCs to various other proinflammatory cytokines. This analysis showed that IFNs, TNFα, ILs, and mimetics of viral and bacterial infections induced unique gene alterations in HSCs, underscoring the diversity of cytokine responses in these cells. Finally, I investigated how the baseline levels of these proinflammatory cytokines regulate hematopoiesis. Analysis of the hematopoietic system in Ifnar-/-Ifngr-/- (2KO) and Ifnar-/-Ifngr-/-Tnfrsf1dKOIl1r-/- (5KO) mice under homeostatic conditions revealed a decrease in HSCs and LSKs compared with wild-type (WT) mice. Furthermore, HSCs from these cytokine receptor knockout (KO) mice showed impaired colony-forming capacity and early competitive advantage. Interestingly, 5KO mice also showed a delayed recovery of HSC cycling following 5-FU treatment. In addition, bulk RNA sequencing of 5KO HSCs revealed altered cell cycle pathways. Overall, these results underscore the essential role of proinflammatory cytokines in regulating HSC function during homeostasis.
In conclusion, this thesis comprehensively describes the transcriptional changes within the HSPC population in response to proinflammatory cytokines, focusing on IFNα.

    Generalized genetical genomics : advanced methods and applications

    Generalized genetical genomics (GGG) is a systems genetics approach that combines the analysis of genetic variation with population-wide assessment of variation in molecular traits in multiple environments to identify genotype-by-environment interactions. This thesis starts by introducing the generalized genetical genomics strategy (Chapter 1). Then, we present newly developed software, designGG, for designing optimal GGG experiments (Chapter 2). Next, two important statistical issues relevant to GGG studies are addressed: we discuss critical concerns about causal inference with genetic data, and we examine the permutation method used for determining the significance of quantitative trait loci (QTL) hotspots in linkage and association studies (Chapters 3−4). Furthermore, we applied the GGG strategy to three pilot studies. In the first of these, we showed that heritable differences in the plastic responses of gene expression are largely regulated in “trans”. In the second pilot study, we demonstrated that heritable differences in transcript abundance are highly sensitive to cellular differentiation stage. In the third study, we found that the alternative splicing machinery exhibits a general genetic robustness in C. elegans and that only a minor fraction of genes shows heritable variation in splicing forms and relative abundance (Chapters 5−7). Finally, we conclude by discussing various fundamental issues involved in data preprocessing, QTL mapping, result interpretation and network reconstruction, and by suggesting future directions yet to be explored in order to expand the reach of systems genetics (Chapter 8).