Appropriate Methodology of Statistical Tests According to Prior Probability and Required Objectivity
In contrast to its common definition and calculation, the interpretation of p-values diverges among statisticians. Since the p-value is the basis of various methodologies, this divergence has led to a variety of test methodologies and evaluations of test results, and this chaotic situation has complicated the application of tests and decision processes. Here, the origin of the divergence is traced to the prior probability of the test. The effects of differences in Pr(H0 = true) on the character of p-values are investigated by applying Student's t-tests to real microarray data and to artificial imitations of it. The importance of the prior probability is also discussed in terms of the applicability of Bayesian approaches, and a suitable methodology is identified according to the prior probability and the purpose of the test.
Comment: 16 pages, 3 figures, and 1 table
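As a hedged illustration of the abstract's central point, not the paper's own experiment, the sketch below simulates how the yield of small p-values from Student's t-tests changes with Pr(H0 = true). The gene count, group size, and effect size are assumptions made only for this example.

```python
# Sketch: dependence of t-test p-values on the prior probability Pr(H0 = true),
# using artificial two-group "expression" data in place of real microarray data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_genes, n_per_group = 5000, 5            # assumed sizes, illustration only

def simulate_pvalues(prior_h0_true, effect=2.0):
    """Two-sample t-test p-values when a fraction prior_h0_true of genes are null."""
    null = rng.random(n_genes) < prior_h0_true
    shift = np.where(null, 0.0, effect)
    x = rng.normal(0.0, 1.0, (n_genes, n_per_group))
    y = rng.normal(shift[:, None], 1.0, (n_genes, n_per_group))
    return stats.ttest_ind(x, y, axis=1).pvalue

for prior in (0.99, 0.5):
    p = simulate_pvalues(prior)
    # With Pr(H0=true) near 1, calls at p < 0.05 are dominated by false positives;
    # with Pr(H0=true) = 0.5, the same cutoff mostly flags genuine effects.
    print(f"Pr(H0=true)={prior}: fraction with p < 0.05 = {(p < 0.05).mean():.3f}")
```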
Temporal Bayesian classifiers for modelling muscular dystrophy expression data
The analysis of microarray data from time-series experiments requires specialised algorithms which take the temporal ordering of the data into account. In this paper we explore a new architecture of Bayesian classifier that can be used to understand how biological mechanisms differ with respect to time. We show that this classifier improves the classification of microarray data while, by incorporating time transparently, keeping the models easy for biologists to analyse. Here we focus on data generated to explore different types of muscular dystrophy.
Profiling time course expression of virus genes---an illustration of Bayesian inference under shape restrictions
There have been several studies, based on microarray experiments, of the genome-wide temporal transcriptional program of viruses, which are generally useful in the construction of gene regulation networks. It seems that biological interpretations in these studies are based directly on the normalized data and some crude statistics, which provide rough estimates of limited features of the profile and may incur biases. This paper introduces a hierarchical Bayesian shape-restricted regression method for making inference on the time course expression of virus genes. Estimates of many salient features of the expression profile, such as onset time, inflection point, maximum value, time to maximum value, and area under the curve, can be obtained immediately by this method. Applying this method to a baculovirus microarray time course expression data set, we indicate that many biological questions can be formulated quantitatively, and we are able to offer insights into baculovirus biology.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS258 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
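The following is a rough, non-Bayesian stand-in for the idea described above, offered only to make the feature list concrete: a smooth sigmoidal curve is fitted to one gene's time course and the onset time, inflection point, maximum, time to maximum, and area under the curve are read off the fit. The time grid, toy measurements, and 5% onset threshold are assumptions; the paper's hierarchical shape-restricted model is considerably more elaborate.

```python
# Sketch: derive profile features from a fitted sigmoidal time-course curve.
import numpy as np
from scipy.optimize import curve_fit
from scipy.integrate import trapezoid

def logistic(t, lower, upper, rate, t_mid):
    return lower + (upper - lower) / (1.0 + np.exp(-rate * (t - t_mid)))

t = np.array([0, 3, 6, 9, 12, 18, 24, 36, 48], dtype=float)      # hours (toy)
expr = np.array([0.1, 0.2, 0.5, 1.4, 2.8, 3.6, 3.9, 4.0, 4.0])   # toy profile

params, _ = curve_fit(logistic, t, expr, p0=[0.0, 4.0, 0.3, 10.0])
lower, upper, rate, t_mid = params

grid = np.linspace(t.min(), t.max(), 500)
fit = logistic(grid, *params)
features = {
    "onset_time": grid[np.argmax(fit > lower + 0.05 * (upper - lower))],
    "inflection_point": t_mid,
    "maximum_value": fit.max(),
    "time_to_maximum": grid[np.argmax(fit)],
    "area_under_curve": trapezoid(fit, grid),
}
print(features)
```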
Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle
The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression, as well as the lack of a comprehensive model for integrating information across genes and experiments, has impaired the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model to integrate multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model is used for data integration. To facilitate efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis-Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, greatly exceeding the 10-15% reported in the current literature. This calls for a reconsideration of the periodically expressed gene detection problem.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS300 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
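To make the underlying task concrete (this is not the paper's hierarchical model or its Metropolis-Hastings group move), the toy sketch below scores one gene for periodicity in each of three time courses by harmonic regression at an assumed cell-cycle period, then combines the per-dataset p-values with Fisher's method as a simple stand-in for full Bayesian integration.

```python
# Sketch: per-dataset harmonic-regression periodicity test, combined by Fisher.
import numpy as np
from scipy import stats

def harmonic_pvalue(times, expr, period):
    """F-test of cosine/sine terms at the given period against a flat model."""
    X = np.column_stack([np.ones_like(times),
                         np.cos(2 * np.pi * times / period),
                         np.sin(2 * np.pi * times / period)])
    beta, resid, _, _ = np.linalg.lstsq(X, expr, rcond=None)
    rss = float(resid[0]) if resid.size else float(np.sum((expr - X @ beta) ** 2))
    rss0 = float(np.sum((expr - expr.mean()) ** 2))
    df1, df2 = 2, len(times) - 3
    f = ((rss0 - rss) / df1) / (rss / df2)
    return stats.f.sf(f, df1, df2)

rng = np.random.default_rng(1)
period = 150.0                                   # assumed cell-cycle period (min)
pvals = []
for _ in range(3):                               # three independent experiments
    times = np.arange(0, 300, 15, dtype=float)
    expr = np.cos(2 * np.pi * times / period) + rng.normal(0, 0.5, times.size)
    pvals.append(harmonic_pvalue(times, expr, period))

# Fisher's method: -2 * sum(log p) ~ chi-square with 2k degrees of freedom under H0.
combined_p = stats.chi2.sf(-2 * np.sum(np.log(pvals)), 2 * len(pvals))
print("per-dataset p-values:", pvals, "combined:", combined_p)
```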
Application of new probabilistic graphical models in the genetic regulatory networks studies
This paper introduces two new probabilistic graphical models for the reconstruction of genetic regulatory networks using DNA microarray data. One is an Independence Graph (IG) model with either a forward or a backward search algorithm, and the other is a Gaussian Network (GN) model with a novel greedy search method. The performance of both models was evaluated on four MAPK pathways in yeast and three simulated data sets. Generally, an IG model provides a sparse graph, whereas a GN model produces a dense graph in which more information about gene-gene interactions is preserved. Additionally, we found two key limitations in the prediction of genetic regulatory networks from DNA microarray data: first, the sample size may be insufficient, and second, the complexity of network structures may not be captured without additional data at the protein level. These limitations are present in all prediction methods that use only DNA microarray data.
Comment: 38 pages, 3 figures
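As a hedged sketch of the independence-graph idea only (not the paper's IG or GN search algorithms), the snippet below estimates partial correlations by inverting the sample covariance matrix of simulated expression data and keeps edges above an arbitrary cutoff; note that this simple route already requires more samples than genes, echoing the sample-size limitation noted in the abstract.

```python
# Sketch: an "independence graph" from partial correlations (not the IG/GN search).
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_genes = 60, 8                       # assumed dimensions (toy)
data = rng.normal(size=(n_samples, n_genes))     # rows = arrays, cols = genes

# Partial correlations come from the inverse covariance (precision) matrix:
# rho_ij = -P_ij / sqrt(P_ii * P_jj). This requires n_samples > n_genes.
precision = np.linalg.inv(np.cov(data, rowvar=False))
d = np.sqrt(np.diag(precision))
partial_corr = -precision / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)

cutoff = 0.3                                     # arbitrary illustrative cutoff
edges = [(i, j) for i in range(n_genes) for j in range(i + 1, n_genes)
         if abs(partial_corr[i, j]) > cutoff]
print("retained edges:", edges)
```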
Laplace Approximated EM Microarray Analysis: An Empirical Bayes Approach for Comparative Microarray Experiments
A two-groups mixed-effects model for the comparison of (normalized)
microarray data from two treatment groups is considered. Most competing
parametric methods that have appeared in the literature are obtained as special
cases or by minor modification of the proposed model. Approximate maximum
likelihood fitting is accomplished via a fast and scalable algorithm, which we
call LEMMA (Laplace approximated EM Microarray Analysis). The posterior odds of
treatment × gene interactions, derived from the model, involve shrinkage estimates of both the interactions and the gene-specific error variances.
Genes are classified as being associated with treatment based on the posterior
odds and the local false discovery rate (f.d.r.) with a fixed cutoff. Our
model-based approach also allows one to declare the non-null status of a gene
by controlling the false discovery rate (FDR). It is shown in a detailed
simulation study that the approach outperforms well-known competitors. We also
apply the proposed methodology to two previously analyzed microarray examples.
Extensions of the proposed method to paired treatments and multiple treatments
are also discussed.
Comment: Published at http://dx.doi.org/10.1214/10-STS339 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
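The snippet below is a minimal illustration of the shrinkage flavour mentioned above, not the LEMMA algorithm itself: gene-specific variances are pooled toward a common prior value and moderated t-statistics are computed. The prior degrees of freedom and prior variance are fixed by hand here, whereas empirical Bayes methods estimate such quantities from all genes jointly.

```python
# Sketch: variance shrinkage and moderated t-statistics (not the LEMMA algorithm).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_genes, n_per_group = 1000, 4
ctrl = rng.normal(0.0, 1.0, (n_genes, n_per_group))
trt = rng.normal(0.0, 1.0, (n_genes, n_per_group))
trt[:50] += 1.5                                   # 50 genes truly affected (toy)

diff = trt.mean(axis=1) - ctrl.mean(axis=1)
df = 2 * (n_per_group - 1)
s2 = (ctrl.var(axis=1, ddof=1) + trt.var(axis=1, ddof=1)) / 2.0

d0, s0_sq = 4.0, 1.0                              # assumed prior df and variance
s2_shrunk = (d0 * s0_sq + df * s2) / (d0 + df)    # pool gene variance toward prior
t_mod = diff / np.sqrt(s2_shrunk * (2.0 / n_per_group))
p_mod = 2 * stats.t.sf(np.abs(t_mod), df + d0)    # moderated t gains d0 extra df

print("smallest moderated p-values:", np.sort(p_mod)[:5])
```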
Application of Volcano Plots in Analyses of mRNA Differential Expressions with Microarrays
A volcano plot displays an unstandardized signal (e.g. log fold change) against a noise-adjusted/standardized signal (e.g. the t-statistic or -log10(p-value) from the t-test). We review basic and interactive uses of the volcano plot, and its crucial role in understanding the regularized t-statistic. A joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines of the "double filtering" criterion. This review attempts to provide a unifying framework for discussions of alternative measures of differential expression, improved methods for estimating variance, and the visual display of microarray analysis results. We also discuss the possibility of applying volcano plots to fields beyond microarrays.
Comment: 8 figures
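A small sketch of the display itself, assuming toy data and arbitrary cutoffs: log fold change on the horizontal axis, -log10(p-value) on the vertical axis, with the two perpendicular lines of the "double filtering" criterion drawn in. A criterion based on a regularized t-statistic would instead trace a curved boundary in this plane.

```python
# Sketch: volcano plot with "double filtering" cutoffs (toy data, assumed cutoffs).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(4)
n_genes, n = 2000, 5
a = rng.normal(0.0, 1.0, (n_genes, n))
b = rng.normal(0.0, 1.0, (n_genes, n))
b[:100] += rng.normal(0.0, 1.5, (100, n))          # some truly differential genes

logfc = b.mean(axis=1) - a.mean(axis=1)             # unstandardized signal
pval = stats.ttest_ind(b, a, axis=1).pvalue         # standardized signal source

fig, ax = plt.subplots()
ax.scatter(logfc, -np.log10(pval), s=5, alpha=0.5)
ax.axvline(1.0, color="grey")                        # fold-change cutoffs
ax.axvline(-1.0, color="grey")
ax.axhline(-np.log10(0.05), color="grey")            # p-value cutoff
ax.set_xlabel("log fold change")
ax.set_ylabel("-log10(p-value)")
fig.savefig("volcano.png")
```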
Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets
Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase of publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust, carry greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult, as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges, whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or from an aggregated dataset formed using standard scale-normalisation.
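The toy sketch below, using made-up edge confidences, is meant only to contrast the two aggregation ideas described above: a meta-analysis-style network averages confidences attached to edges across datasets, while a consensus-style network keeps only edges supported in every dataset. The thresholds and scores are assumptions, not taken from the paper.

```python
# Sketch: meta-analysis vs consensus aggregation of edge confidences (toy values).
networks = [
    {("geneA", "geneB"): 0.9, ("geneB", "geneC"): 0.6, ("geneA", "geneC"): 0.2},
    {("geneA", "geneB"): 0.8, ("geneB", "geneC"): 0.4},
    {("geneA", "geneB"): 0.7, ("geneA", "geneC"): 0.5},
]
all_edges = {edge for net in networks for edge in net}

# Meta-analysis flavour: average each edge's confidence over all datasets.
meta_scores = {e: sum(net.get(e, 0.0) for net in networks) / len(networks)
               for e in all_edges}
meta_network = {e for e, score in meta_scores.items() if score >= 0.5}

# Consensus flavour: keep only edges supported in every dataset.
support_cutoff = 0.5
consensus_network = {e for e in all_edges
                     if all(net.get(e, 0.0) >= support_cutoff for net in networks)}

print("meta-analysis edges:", meta_network)
print("consensus edges:", consensus_network)
```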
Microarrays, Empirical Bayes and the Two-Groups Model
The classic frequentist theory of hypothesis testing developed by Neyman,
Pearson and Fisher has a claim to being the twentieth century's most
influential piece of applied mathematics. Something new is happening in the
twenty-first century: high-throughput devices, such as microarrays, routinely
require simultaneous hypothesis tests for thousands of individual cases, not at
all what the classical theory had in mind. In these situations empirical Bayes
information begins to force itself upon frequentists and Bayesians alike. The
two-groups model is a simple Bayesian construction that facilitates empirical
Bayes analysis. This article concerns the interplay of Bayesian and frequentist
ideas in the two-groups setting, with particular attention focused on Benjamini
and Hochberg's False Discovery Rate method. Topics include the choice and
meaning of the null hypothesis in large-scale testing situations, power
considerations, the limitations of permutation methods, significance testing
for groups of cases (such as pathways in microarray studies), correlation
effects, multiple confidence intervals and Bayesian competitors to the
two-groups model.
Comment: This paper is commented on in [arXiv:0808.0582], [arXiv:0808.0593], [arXiv:0808.0597], and [arXiv:0808.0599]; rejoinder in [arXiv:0808.0603]. Published at http://dx.doi.org/10.1214/07-STS236 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
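As a concrete, hedged companion to the two themes most central to this article, the sketch below applies the Benjamini-Hochberg FDR procedure to simulated z-values and then forms a crude empirical two-groups estimate of the local false discovery rate, fdr(z) ≈ p0 f0(z) / f(z); the mixture data, the assumed null proportion p0, and the histogram density estimate are all illustrative choices.

```python
# Sketch: Benjamini-Hochberg FDR and a crude two-groups local fdr estimate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
z = np.concatenate([rng.normal(0.0, 1.0, 9000),     # null cases (f0 = N(0,1))
                    rng.normal(2.5, 1.0, 1000)])    # non-null cases
p = 2 * stats.norm.sf(np.abs(z))

# Benjamini-Hochberg at level q: reject the k smallest p-values, where k is the
# largest index with p_(k) <= k * q / m.
q, m = 0.1, len(p)
ranked = np.sort(p)
passed = ranked <= q * np.arange(1, m + 1) / m
n_reject = passed.nonzero()[0].max() + 1 if passed.any() else 0
print("BH rejections at q = 0.1:", n_reject)

# Two-groups local fdr: fdr(z) ~ p0 * f0(z) / f(z), with f estimated from a
# histogram of all z-values and p0 (the null proportion) assumed to be 0.9.
p0 = 0.9
hist, edges = np.histogram(z, bins=60, density=True)
centers = (edges[:-1] + edges[1:]) / 2.0
local_fdr = np.clip(p0 * stats.norm.pdf(centers) / np.maximum(hist, 1e-12), 0, 1)
print("estimated local fdr near z = 3:", local_fdr[np.argmin(np.abs(centers - 3.0))])
```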
- …