
    Removing batch effects for prediction problems with frozen surrogate variable analysis

    Batch effects are responsible for the failure of promising genomic prognostic signatures, major ambiguities in published genomic results, and retractions of widely publicized findings. Batch effect corrections have been developed to remove these artifacts, but they are designed for population studies. However, genomic technologies are beginning to be used in clinical applications where samples are analyzed one at a time for diagnostic, prognostic, and predictive applications. There are currently no batch correction methods developed specifically for prediction. In this paper, we propose a new method called frozen surrogate variable analysis (fSVA) that borrows strength from a training set for individual-sample batch correction. We show that fSVA improves prediction accuracy in simulations and in public genomic studies. fSVA is available as part of the sva Bioconductor package.
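    The surrogate-variable idea that fSVA builds on can be illustrated in outline: residualize the expression matrix against the modeled covariates, then take the top singular vectors of the residuals as estimates of hidden batch factors. A minimal NumPy sketch of that idea follows; it is not the sva package's actual algorithm, and the function name and toy data are illustrative:

```python
import numpy as np

def estimate_surrogate_variables(expr, design, n_sv=1):
    """Rough sketch of the surrogate-variable idea.
    expr:   genes x samples expression matrix
    design: samples x covariates matrix of modeled variables
    Returns n_sv estimated surrogate variables (samples x n_sv)."""
    # Residualize expression against the modeled covariates
    beta, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    resid = expr.T - design @ beta          # samples x genes
    # Top left-singular vectors of the residuals approximate hidden factors
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    return u[:, :n_sv]

# Toy data: 100 genes, 20 samples, hidden batch shift in the second half
rng = np.random.default_rng(0)
expr = rng.normal(size=(100, 20))
expr[:, 10:] += 2.0                         # simulated batch effect
design = np.ones((20, 1))                   # intercept-only model
sv = estimate_surrogate_variables(expr, design, n_sv=1)
# The estimated surrogate variable separates the two batches
# (the sign of a singular vector is arbitrary)
print(sv[:10].mean(), sv[10:].mean())
```

    The "frozen" part of fSVA, per the abstract, is that this estimation borrows strength from a training set so that new samples can be corrected one at a time rather than re-estimating everything per cohort.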

    Transcriptomics in Toxicogenomics, Part II : Preprocessing and Differential Expression Analysis for High Quality Data

    Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allows the employment of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. A careful selection of the right tools must be carried out, since it will affect the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality check, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single-cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics. Peer reviewed.
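    Of the preprocessing steps listed above, normalization is one where the choice of method visibly shapes downstream analysis. A widely used option is quantile normalization, which forces every sample to share the same empirical distribution. A minimal NumPy sketch, assuming a genes x samples matrix (ties are broken arbitrarily here; production implementations average tied ranks):

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize a genes x samples matrix so that every
    sample (column) shares the same empirical distribution."""
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)   # per-column ranks
    mean_quantiles = np.sort(x, axis=0).mean(axis=1)    # reference distribution
    return mean_quantiles[ranks]                        # map ranks back to values

x = np.array([[5.0, 2.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 4.0, 6.0],
              [4.0, 2.0, 8.0]])
xn = quantile_normalize(x)
# After normalization, every column contains the same set of values,
# ordered according to each sample's original ranks
print(np.sort(xn, axis=0))
```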

    An evaluation of processing methods for HumanMethylation450 BeadChip data

    Background: Illumina's HumanMethylation450 arrays provide the most cost-effective means of high-throughput DNA methylation analysis. As with other types of microarray platforms, technical artifacts are a concern, including background fluorescence, dye bias from the use of two color channels, bias caused by the type I/II probe design, and batch effects. Several approaches and pipelines have been developed, either targeting a single issue or designed to address multiple biases through a combination of methods. We evaluate the effect of combining separate approaches to improve signal processing. Results: In this study, nine processing methods, including both within- and between-array methods, are applied and compared in four datasets. For technical replicates, we found both within- and between-array methods did a comparable job of reducing variance across replicates. For evaluating biological differences, within-array processing always improved differential DNA methylation signal detection over no processing, and always benefited from performing background correction first. Combinations of within-array procedures were always among the best-performing methods, with a slight advantage for the between-array method Funnorm when batch effects explained more variation in the data than the methylation alterations between cases and controls. However, when this occurred, RUVm, a new batch-correction method, noticeably improved reproducibility of differential methylation results over any of the signal-processing methods alone. Conclusions: The comparisons in our study provide valuable insights into preprocessing HumanMethylation450 BeadChip data. We found the within-array combination of Noob + BMIQ always improved signal sensitivity and, when combined with the RUVm batch-correction method, outperformed all other approaches for differential DNA methylation analysis. The effect of the data processing method, in any given dataset, was a function of both the signal and the noise.
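    The processing methods compared here all operate on the two intensity channels the 450K array reports per probe, which are conventionally summarized as beta-values and M-values. A short sketch using the standard formulas; the offsets are the commonly used defaults, not values from this study, and the intensities are made up:

```python
import numpy as np

def beta_values(meth, unmeth, offset=100):
    """Beta-value per probe: methylated fraction of total signal,
    bounded in [0, 1]. The offset (conventionally 100) stabilizes
    estimates for low-intensity probes."""
    return meth / (meth + unmeth + offset)

def m_values(meth, unmeth, alpha=1):
    """M-value: log2 ratio of methylated to unmethylated signal,
    often preferred for statistical testing."""
    return np.log2((meth + alpha) / (unmeth + alpha))

# Toy intensities: mostly methylated, mostly unmethylated, hemimethylated
meth = np.array([9000.0, 100.0, 4000.0])
unmeth = np.array([900.0, 9000.0, 4000.0])
betas = beta_values(meth, unmeth)
mvals = m_values(meth, unmeth)
print(betas)   # near 1, near 0, near 0.5
print(mvals)   # positive, negative, zero
```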

    Microarray Data Preprocessing: From Experimental Design to Differential Analysis

    DNA microarray data preprocessing is of utmost importance in the analytical path starting from the experimental design and leading to a reliable biological interpretation. In fact, when all relevant aspects of the experimental plan have been considered, the subsequent steps, from data quality check to differential analysis, will lead to robust, trustworthy results. In this chapter, all the relevant aspects of and considerations about microarray preprocessing are discussed. Preprocessing steps are organized in an orderly manner, from experimental design to quality check and batch effect removal, including the most common visualization methods. Furthermore, we discuss data representation and differential testing methods, with a focus on the most common microarray technologies, such as gene expression and DNA methylation. Peer reviewed.
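    Differential testing on microarrays is performed gene by gene, so the resulting p-values require multiple-testing correction; the Benjamini-Hochberg false-discovery-rate procedure is the usual choice. A minimal sketch of the step-up procedure, with illustrative p-values:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: return a boolean mask of
    hypotheses rejected at false-discovery rate alpha."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    # Compare sorted p-values against the BH thresholds i * alpha / m
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = below.nonzero()[0].max()        # largest i with p_(i) <= i*alpha/m
        reject[order[:k + 1]] = True        # reject all smaller p-values too
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.5, 0.9]
rejected = benjamini_hochberg(pvals, alpha=0.05)
print(rejected)   # only the first two p-values survive at FDR 0.05
```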

    V-SVA: an R Shiny application for detecting and annotating hidden sources of variation in single-cell RNA-seq data.

    SUMMARY: Single-cell RNA-sequencing (scRNA-seq) technology enables studying gene expression programs from individual cells. However, these data are subject to diverse sources of variation, including 'unwanted' variation that needs to be removed in downstream analyses (e.g. batch effects) and 'wanted' or biological sources of variation (e.g. variation associated with a cell type) that need to be precisely described. Surrogate variable analysis (SVA)-based algorithms are commonly used for batch correction and, more recently, for studying 'wanted' variation in scRNA-seq data. However, interpreting whether these variables are biologically meaningful or stem from technical artifacts remains a challenge. To facilitate the interpretation of surrogate variables detected by algorithms including IA-SVA, SVA or ZINB-WaVE, we developed an R Shiny application, Visual Surrogate Variable Analysis (V-SVA), that provides a web-browser interface for the identification and annotation of hidden sources of variation in scRNA-seq data. This interactive framework includes tools for discovery of genes associated with detected sources of variation, gene annotation using publicly available databases and gene sets, and data visualization using dimension-reduction methods. AVAILABILITY AND IMPLEMENTATION: The V-SVA Shiny application is publicly hosted at https://vsva.jax.org/ and the source code is freely available at https://github.com/nlawlor/V-SVA. CONTACT: [email protected] or [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.