5,251 research outputs found

Application of Volcano Plots in Analyses of mRNA Differential Expressions with Microarrays

Author: Alvord W. G.
Auer P. L.
Chen Y.
Chen Z.
Cohen J.
Fechner G. T.
Guyon I.
Göhlmann H.
Lee J.
Li C.
Schwender H.
Smyth G. K.
Snedecor G. W.
Trevino V.
Vandesompele J.
Welsh B. L.
Wentian Li
Zhao C.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 28/08/2013
Field of study

A volcano plot displays an unstandardized signal (e.g. log-fold-change) against a noise-adjusted/standardized signal (e.g. t-statistic or -log10(p-value) from the t test). We review the basic and an interactive use of the volcano plot, and its crucial role in understanding the regularized t-statistic. The joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines for the "double filtering" criterion. This review attempts to provide a unifying framework for discussions on alternative measures of differential expression, improved methods for estimating variance, and visual display of a microarray analysis result. We also discuss the possibility of applying volcano plots to other fields beyond microarrays. Comment: 8 figures
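
To make the two selection criteria in this abstract concrete, here is a minimal sketch of how one gene's volcano-plot coordinates could be computed and filtered. It is an illustration under stated assumptions, not the paper's code: the function names, the pooled-variance t-statistic, and the constant `s0` added to the standard error (one common form of regularization) are all choices made for this example.

```python
import math
from statistics import mean, variance


def volcano_point(group_a, group_b, s0=0.0):
    """Return (log fold change, t-like statistic) for one gene.

    group_a, group_b: log2 expression values in the two conditions.
    s0: hypothetical constant added to the standard error; s0=0 gives
        the ordinary pooled-variance t-statistic, s0>0 a regularized one.
    """
    na, nb = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)  # x-axis: unstandardized signal
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return diff, diff / (se + s0)  # y-axis: standardized signal


def double_filter(diff, t, min_diff, min_t):
    """Two perpendicular cut-offs: one on fold change, one on t."""
    return abs(diff) >= min_diff and abs(t) >= min_t


def joint_filter(t_reg, min_t_reg):
    """Single cut-off on the regularized statistic (a curve in the plot)."""
    return abs(t_reg) >= min_t_reg


# Gene 1: large change. Gene 2: tiny change with very low variance.
gene1 = ([8.0, 8.2, 7.8], [6.0, 6.1, 5.9])
gene2 = ([5.10, 5.11, 5.09], [5.00, 5.01, 4.99])

for name, (a, b) in {"gene1": gene1, "gene2": gene2}.items():
    d, t = volcano_point(a, b)
    _, t_reg = volcano_point(a, b, s0=0.5)
    print(name, round(d, 3), round(t, 2), round(t_reg, 2),
          double_filter(d, t, 1.0, 4.0), joint_filter(t_reg, 3.0))
```

In this toy data the second gene has a large ordinary t-statistic only because its variance is tiny; adding `s0` pulls its regularized statistic toward zero, which is the behaviour the curved discriminant line captures in a single threshold.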

arXiv.org e-Print Archive

Crossref

Bayesian variable selection and data integration for biological regulatory networks

Author: Chen Guang
Jensen Shane T.
Stoeckert Jr, Christian J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006
Field of study

A substantial focus of research in molecular biology is gene regulatory networks: the set of transcription factors and target genes which control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network, which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (yeast), for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results from our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection. Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Application of new probabilistic graphical models in the genetic regulatory networks studies

Author: Anderson
Bar-Joseph
Chiang
Chickering
Cox
de la Fuente
Edwards
Friedman
Futcher
Geiger
Hartemink
Jan Delabie
Jong
Junbai Wang
Kikuchi
Lee
Leo Wang-Kit Cheung
Li
Meek
Qian
Rangel
Roberts
Rung
Segal
Somogyi
Spirtes
Spirtes
Spirtes
Steffen
Toh
Troyanskaya
Wang
Wu
Yeung
Yu
Yu
Zhang
Zhou
Publication venue: 'Elsevier BV'
Publication date: 31/12/2005
Field of study

This paper introduces two new probabilistic graphical models for reconstruction of genetic regulatory networks using DNA microarray data. One is an Independence Graph (IG) model with either a forward or a backward search algorithm, and the other is a Gaussian Network (GN) model with a novel greedy search method. The performance of both models was evaluated on four MAPK pathways in yeast and three simulated data sets. Generally, an IG model provides a sparse graph, but a GN model produces a dense graph in which more information about gene-gene interactions is preserved. Additionally, we found two key limitations in the prediction of genetic regulatory networks using DNA microarray data: the first is the sufficiency of sample size, and the second is that the complexity of network structures may not be captured without additional data at the protein level. These limitations are present in all prediction methods that use only DNA microarray data. Comment: 38 pages, 3 figures

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Crossref

A temporal switch model for estimating transcriptional activity in gene expression

Author: Finkenstädt Bärbel
Jenkins Dafyd J.
Rand D. A. (David A.)
Publication venue: Oxford University Press
Publication date: 11/03/2013
Field of study

Motivation: The analysis and mechanistic modelling of time series gene expression data provided by techniques such as microarrays, NanoString, reverse transcription–polymerase chain reaction and advanced sequencing are invaluable for developing an understanding of the variation in key biological processes. We address this by proposing the estimation of a flexible dynamic model, which decouples temporal synthesis and degradation of mRNA and, hence, allows for transcriptional activity to switch between different states. Results: The model is flexible enough to capture a variety of observed transcriptional dynamics, including oscillatory behaviour, in a way that is compatible with the demands imposed by the quality, time-resolution and quantity of the data. We show that the timing and number of switch events in transcriptional activity can be estimated alongside individual gene mRNA stability with the help of a Bayesian reversible jump Markov chain Monte Carlo algorithm. To demonstrate the methodology, we focus on modelling the wild-type behaviour of a selection of 200 circadian genes of the model plant Arabidopsis thaliana. The results support the idea that using a mechanistic model to identify transcriptional switch points is likely to strongly contribute to efforts in elucidating and understanding key biological processes, such as transcription and degradation

Warwick Research Archives Portal Repository

Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets

Author: Akaike
Allan Tucker
Beissbarth
Conlon
Courcelle
DerSimonian
Eisen
Emma Steele
Faith
Friedman
Gasch
Grigull
Hanley
Hartemink
Jarvinen
Khil
Kuo
Matzkevich
Ng
Pearl
Pearl
Pennock
Pe’er
Pe’er
Quillardet
Salgado
Sangurdekar
Smyth
Soinov
Spellman
Stoica
Sutton
Teixeira
Wang
Yauk
Publication venue: 'Elsevier BV'
Publication date: 01/12/2008
Field of study

Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult as experiments are often conducted on different microarray platforms, and in different laboratories leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation
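
The contrast between the two post-learning aggregation ideas in this abstract can be sketched on toy data. This is only an illustration under assumptions: each network is represented as a hypothetical dictionary mapping a directed edge to a confidence in [0, 1], the consensus rule keeps edges that pass a threshold in every dataset, and the confidences are combined by a plain average, which stands in for whatever statistical combination the paper actually uses.

```python
def consensus_edges(networks, threshold=0.5):
    """Edges whose confidence reaches the threshold in every network."""
    edge_sets = [{edge for edge, conf in net.items() if conf >= threshold}
                 for net in networks]
    return set.intersection(*edge_sets) if edge_sets else set()


def mean_confidence_edges(networks, threshold=0.5):
    """Edges whose mean confidence across networks reaches the threshold.

    An edge absent from a network counts as confidence 0 there.
    Averaging is an assumption made for this sketch.
    """
    all_edges = set().union(*networks)  # union of the dictionaries' keys
    kept = set()
    for edge in all_edges:
        avg = sum(net.get(edge, 0.0) for net in networks) / len(networks)
        if avg >= threshold:
            kept.add(edge)
    return kept


# Hypothetical edge confidences learnt from two separate datasets.
net1 = {("A", "B"): 0.9, ("B", "C"): 0.6, ("A", "C"): 0.2}
net2 = {("A", "B"): 0.8, ("B", "C"): 0.3, ("A", "C"): 0.9}

print(sorted(consensus_edges([net1, net2])))
print(sorted(mean_confidence_edges([net1, net2])))
```

The consensus rule is the stricter of the two: an edge supported strongly by one dataset and weakly by another survives averaging but not the intersection.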

Elsevier - Publisher Connector

Crossref

Brunel University Research Archive

Methods to improve gene signal: Application to cDNA microarrays

Author: Gupta Rashi
Publication venue: 'University of Helsinki Libraries'
Publication date: 17/04/2009
Field of study

Microarrays are high-throughput biological assays that allow the screening of thousands of genes for their expression. The main idea behind microarrays is to compute for each gene a unique signal that is directly proportional to the quantity of mRNA that was hybridized on the chip. A large number of steps and errors associated with each step make the generated expression signal noisy. As a result, microarray data need to be carefully pre-processed before their analysis can be assumed to lead to reliable and biologically relevant conclusions. This thesis focuses on developing methods for improving gene signal and further utilizing this improved signal for higher level analysis. To achieve this, first, approaches for designing microarray experiments using various optimality criteria, considering both biological and technical replicates, are described. A carefully designed experiment leads to signal with low noise, as the effect of unwanted variations is minimized and the precision of the estimates of the parameters of interest is maximized. Second, a system for improving the gene signal by using three scans at varying scanner sensitivities is developed. A novel Bayesian latent intensity model is then applied to these three sets of expression values, corresponding to the three scans, to estimate the suitably calibrated true signal of genes. Third, a novel image segmentation approach that segregates the fluorescent signal from the undesired noise is developed using an additional dye, SYBR green RNA II. This technique helps identify signal arising only from the hybridized DNA, so that signal corresponding to dust, scratches, dye spills, and other noise is avoided. Fourth, an integrated statistical model is developed, where signal correction, systematic array effects, dye effects, and differential expression are modelled jointly, as opposed to a sequential application of several methods of analysis.
The methods described here have been tested only for cDNA microarrays, but can also, with some modifications, be applied to other high-throughput technologies. Keywords: High-throughput technology, microarray, cDNA, multiple scans, Bayesian hierarchical models, image analysis, experimental design, MCMC, WinBUGS. We examine methods for improving genetic signals and for making use of the enhanced signal in subsequent analyses

Helsingin yliopiston digitaalinen arkisto

Circular RNAs in Clear Cell Renal Cell Carcinoma: Their Microarray-Based Identification, Analytical Validation, and Potential Use in a Clinico-Genomic Model to Improve Prognostic Accuracy

Author: Erbersdobler Andreas
Fendler Annika
Franz Antonia
Jung Klaus
Jung Monika
Kilic Ergin
Ralla Bernhard
Rochow Hannah
Stephan Carsten
Weickmann Sabine
Publication venue
Publication date: 01/01/2019
Field of study

Circular RNAs (circRNAs) may act as novel cancer biomarkers. However, a genome-wide evaluation of circRNAs in clear cell renal cell carcinoma (ccRCC) has yet to be conducted. Therefore, the objective of this study was to identify and validate circRNAs in ccRCC tissue, with a focus on evaluating their potential as prognostic biomarkers. A genome-wide identification of circRNAs in total RNA extracted from ccRCC tissue samples was performed using microarray analysis. Three relevant differentially expressed circRNAs were selected (circEGLN3, circNOX4, and circRHOBTB3), their circular nature was experimentally confirmed, and their expression (along with that of their linear counterparts) was measured in 99 malignant and 85 adjacent normal tissue samples using specifically established RT-qPCR assays. The capacity of circRNAs to discriminate between malignant and adjacent normal tissue samples and their prognostic potential (with the endpoints cancer-specific, recurrence-free, and overall survival) after surgery were estimated by C-statistics, the Kaplan-Meier method, univariate and multivariate Cox regression analysis, decision curve analysis, and Akaike and Bayesian information criteria. CircEGLN3 discriminated malignant from normal tissue with 97% accuracy. We generated a prognostic signature for the three endpoints by multivariate Cox regression analysis that included circEGLN3, circRHOBTB3 and linRHOBTB3. The predictive outcome accuracy of the clinical models based on clinicopathological factors was improved in combination with this circRNA-based signature. Bootstrapping as well as Akaike and Bayesian information criteria confirmed the statistical significance and robustness of the combined models. Limitations of this study include its retrospective nature and the lack of external validation. The study demonstrated the promising potential of circRNAs as diagnostic and particularly prognostic biomarkers in ccRCC patients

Institutional Repository of the Freie Universität Berlin

MDC Repository