
    arrayQualityMetrics—a bioconductor package for quality assessment of microarray data

    Summary: The assessment of data quality is a major concern in microarray analysis. arrayQualityMetrics is a Bioconductor package that provides a report with diagnostic plots for one- or two-colour microarray data. The quality metrics assess reproducibility, identify apparent outlier arrays and compute measures of signal-to-noise ratio. The tool handles most current microarray technologies and is amenable to use in automated analysis pipelines or for automatic report generation, as well as for use by individuals. The diagnosis of quality remains, in principle, a context-dependent judgement, but our tool provides powerful, automated, objective and comprehensive instruments on which to base a decision.
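
    The idea of identifying "apparent outlier arrays" can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the method used by arrayQualityMetrics: it scores each array by its summed mean absolute distance to all other arrays and flags scores above the usual boxplot threshold (third quartile plus 1.5 times the interquartile range). The function names `mean_abs_distance` and `flag_outlier_arrays` are hypothetical.

```python
from statistics import mean, quantiles


def mean_abs_distance(a, b):
    """Mean absolute difference between two arrays of probe intensities."""
    return mean(abs(x - y) for x, y in zip(a, b))


def flag_outlier_arrays(arrays):
    """Return indices of arrays whose summed distance to all others is unusually large.

    Assumption: a boxplot-style rule (Q3 + 1.5 * IQR) on the per-array scores.
    """
    scores = [
        sum(mean_abs_distance(a, b) for j, b in enumerate(arrays) if j != i)
        for i, a in enumerate(arrays)
    ]
    q1, _, q3 = quantiles(scores, n=4)
    threshold = q3 + 1.5 * (q3 - q1)
    return [i for i, score in enumerate(scores) if score > threshold]


# Seven similar arrays plus one that clearly deviates (made-up values).
similar = [[v + 0.01 * i for v in (1.0, 2.0, 3.0, 4.0)] for i in range(7)]
deviant = [4.0, 1.0, 6.0, 0.0]
print(flag_outlier_arrays(similar + [deviant]))
```

    As the abstract stresses, a flag like this only supports a judgement that remains context-dependent; with very few arrays the quartiles themselves become unreliable.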

    RGG: A general GUI Framework for R scripts

    Background: R is the leading open source statistics software with a vast number of biostatistical and bioinformatical analysis packages. To exploit the advantages of R, extensive scripting/programming skills are required.
    Results: We have developed a software tool called R GUI Generator (RGG) which enables the easy generation of Graphical User Interfaces (GUIs) for the programming language R by adding a few Extensible Markup Language (XML) tags. RGG consists of an XML-based GUI definition language and a Java-based GUI engine. GUIs are generated at runtime from defined GUI tags that are embedded into the R script. User input from the GUI is returned to the R code and replaces the XML tags. RGG files can be developed using any text editor. The current version of RGG is available as stand-alone software (RGGRunner) and as a plug-in for JGR.
    Conclusion: RGG is a general GUI framework for R that has the potential to introduce R statistics (R packages, built-in functions and scripts) to users with limited programming skills and helps to bridge the gap between R developers and GUI-dependent users. RGG aims to abstract the GUI development from individual GUI toolkits by using an XML-based GUI definition language. Thus RGG can be easily integrated in any software. The RGG project further includes the development of a web-based repository for RGG-GUIs. RGG is an open source project licensed under the Lesser General Public License (LGPL) and can be downloaded freely at http://rgg.r-forge.r-project.org

    Importing ArrayExpress datasets into R/Bioconductor

    Summary: ArrayExpress is one of the largest public repositories of microarray datasets. R/Bioconductor provides a comprehensive suite of microarray analysis and integrative bioinformatics software. However, easy ways for importing datasets from ArrayExpress into R/Bioconductor have been lacking. Here, we present such a tool that is suitable for both interactive and automated use.

    An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

    Background: The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG dinucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. In particular, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate.
    Results: We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures.
    Conclusion: T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will substantially impact the reliability of the downstream analysis.
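
    To make the comparison concrete, the following is a minimal sketch of plain quantile normalization, one of the methods the abstract lists. It is not the T-quantile variant the authors recommend and not their code; the function name `quantile_normalize` and the sample values are assumptions for illustration, and ties are not averaged. The sketch shows why the method forces all arrays onto one shared distribution, which is the behaviour the abstract warns can blur enriched and un-enriched signals.

```python
from statistics import mean


def quantile_normalize(columns):
    """Give every array the same empirical distribution.

    columns: list of equal-length lists, one list of probe intensities per array.
    Ties are broken by position rather than averaged (a simplification).
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(values) for values in zip(*sorted_cols)]

    normalized = []
    for col in columns:
        # Probe indices ordered from smallest to largest intensity.
        order = sorted(range(n), key=col.__getitem__)
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized


# Three arrays with four probes each (made-up intensities).
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
for row in quantile_normalize(arrays):
    print(row)
```

    After normalization every array contains exactly the same set of values, only in a different probe order, so any genuine distributional difference between arrays is removed.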

    EMA - A R package for Easy Microarray data analysis

    Background: The increasing number of methodologies and tools currently available to analyse gene expression microarray data can be confusing for non-specialist users.
    Findings: Based on the experience of biostatisticians of Institut Curie, we propose both a clear analysis strategy and a selection of tools to investigate microarray gene expression data. The most usual and relevant existing R functions were discussed, validated and gathered in an easy-to-use R package (EMA) devoted to gene expression microarray analysis. These functions were improved for ease of use, enhanced visualisation and better interpretation of results.
    Conclusions: The strategy and tools proposed in the EMA R package could provide a useful starting point for many microarray users. EMA is part of the Comprehensive R Archive Network and is freely available at http://bioinfo.curie.fr/projects/ema/.

    BeadArray Expression Analysis Using Bioconductor

    Illumina whole-genome expression BeadArrays are a popular choice in gene profiling studies. Aside from the vendor-provided software tools for analyzing BeadArray expression data (GenomeStudio/BeadStudio), there exists a comprehensive set of open-source analysis tools in the Bioconductor project, many of which have been tailored to exploit the unique properties of this platform. In this article, we explore a number of these software packages and demonstrate how to perform a complete analysis of BeadArray data in various formats. The key steps of importing data, performing quality assessments, preprocessing, and annotation in the common setting of assessing differential expression in designed experiments will be covered.

    Functional characterization and annotation of trait-associated genomic regions by transcriptome analysis

    In this work, two novel implementations are presented which could assist in the design and data analysis of high-throughput genomic experiments. An efficient and flexible tiling probe selection pipeline utilizing the penalized uniqueness score has been implemented, which could be employed in the design of genome tiling tasks of various types and scales. A novel hidden semi-Markov model (HSMM) implementation is made available within the Bioconductor project, which provides a unified interface for segmenting genomic data in a wide range of research subjects.

    Genomic analysis of macrophage gene signatures during idiopathic pulmonary fibrosis development

    Idiopathic Pulmonary Fibrosis (IPF) is a chronic, progressive, irreversible lung disease. After diagnosis, life expectancy with this interstitial condition is commonly 3-5 years if untreated. Despite their limited capacity to recapitulate IPF, animal models have been useful for identifying related pathways relevant for drug discovery and the development of diagnostic tools. Using these techniques, several immune-related mechanisms have been implicated in IPF. For instance, subpopulations of macrophages and monocyte-derived cells are recognized as centrally active in pulmonary immunological processes. One of the most used technologies is high-throughput gene expression analysis, which has been available for almost two decades now. The "omics" revolution has had a major impact on macrophage and pulmonary fibrosis research. The present study aims to investigate macrophage dynamics within the context of IPF at the transcriptomic level. Using publicly available gene-expression data, we applied modern data science approaches to (1) understand longitudinal profiles within IPF models; (2) investigate the correlation between macrophage genomic dynamics and IPF development; and (3) apply longitudinal profiles uncovered through multivariate data analysis to the development of new sets of predictors able to classify IPF and control samples accordingly. Principal Component Analysis and Hierarchical Clustering showed that our pipeline was able to construct a complex set of biomarker candidates that together outperformed gene expression alone in separating treatment groups in an IPF animal model dataset. We further assessed the predictive performance of our candidates on publicly available gene expression data from IPF patients. Once again, the constructed biomarker candidates were significantly differentiated between IPF and control samples. The data presented in this work strongly suggest that longitudinal data analysis holds major unappreciated potential for translational medicine research.

    EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

    Background: Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management.
    Results: EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real-time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy-to-use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools, have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms.
    Conclusion: EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume.