68 research outputs found

Simulating High-Dimensional Multivariate Data using the bigsimr R Package

Author: Bedrick Edward J.
Knudson Alexander D.
Kozubowski Tomasz J.
Nguyen Tin
Panorska Anna K.
Petereit Juli
Piegorsch Walter W.
Schissler A. Grant
Tran Duc
Publication date: 11/11/2021

It is critical to accurately simulate data when employing Monte Carlo techniques and evaluating statistical methodology. Measurements are often correlated and high dimensional in this era of big data, such as data obtained in high-throughput biomedical experiments. Due to the computational complexity and a lack of user-friendly software available to simulate these massive multivariate constructions, researchers resort to simulation designs that posit independence or perform arbitrary data transformations. To close this gap, we developed the Bigsimr Julia package with R and Python interfaces. This paper focuses on the R interface. These packages empower high-dimensional random vector simulation with arbitrary marginal distributions and dependency via a Pearson, Spearman, or Kendall correlation matrix. bigsimr contains high-performance features, including multi-core and graphical-processing-unit-accelerated algorithms to estimate correlation and compute the nearest correlation matrix. Monte Carlo studies quantify the accuracy and scalability of our approach, up to d = 10,000. We describe example workflows and apply to a high-dimensional data set -- RNA-sequencing data obtained from breast cancer tumor samples.

Comment: 22 pages, 10 figures, https://cran.r-project.org/web/packages/bigsimr/index.htm
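The idea of simulating dependent variables with arbitrary marginals and a target rank correlation can be illustrated with a small Gaussian-copula sketch. This is not the bigsimr API and makes no claim about how the package is implemented; it is a two-dimensional, standard-library-only illustration. The function names (`pearson_for_spearman`, `simulate_pair`, `spearman`) and the exponential and logistic marginals are assumptions chosen for the example. It relies on the known identity that a Gaussian copula with Pearson parameter rho has Spearman correlation (6/pi) * asin(rho / 2).

```python
import math
import random
from statistics import NormalDist


def pearson_for_spearman(rho_s):
    """Gaussian-copula Pearson parameter that yields Spearman correlation rho_s."""
    return 2.0 * math.sin(math.pi * rho_s / 6.0)


def simulate_pair(n, rho, inv_cdf_x, inv_cdf_y, rng):
    """Draw n pairs whose dependence comes from a bivariate normal with
    correlation rho and whose marginals are set by the two inverse CDFs."""
    std_normal = NormalDist()
    eps = 1e-12  # keep uniforms strictly inside (0, 1)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # 2x2 Cholesky step: z2 has unit variance and correlation rho with z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        pairs.append((inv_cdf_x(u1), inv_cdf_y(u2)))
    return pairs


def ranks(values):
    """Zero-based ranks; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(pairs):
    """Sample Spearman correlation: Pearson correlation of the ranks."""
    rx = ranks([x for x, _ in pairs])
    ry = ranks([y for _, y in pairs])
    mean = (len(pairs) - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)  # equal for rx and ry without ties
    return cov / var


def exponential_inv(u, rate=2.0):
    """Inverse CDF of an exponential distribution (illustrative marginal)."""
    return -math.log(1.0 - u) / rate


def logistic_inv(u):
    """Inverse CDF of the standard logistic distribution (illustrative marginal)."""
    return math.log(u / (1.0 - u))


if __name__ == "__main__":
    rng = random.Random(1)
    rho = pearson_for_spearman(0.6)
    sample = simulate_pair(5000, rho, exponential_inv, logistic_inv, rng)
    print("sample Spearman correlation:", round(spearman(sample), 3))
```

Because rank correlation is invariant under the monotone inverse-CDF transforms, the target Spearman value is preserved regardless of which marginals are plugged in. A high-dimensional version would replace the 2x2 Cholesky step with a factorization of a full correlation matrix.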

arXiv.org e-Print Archive

Efficient Parallel Statistical Model Checking of Biochemical Networks

Author: A. Pnueli
A. S. Miner
Adnan Aziz
B. Novak
Christel Baier
D. Donaldson R.
D.O. Morgan
D.T. Gillespie
Davide Prandi
E. B. Wilson
Edmund M. Clarke
François Fages
H. A. Hansson
H. Kitano
H. Li
H. Younes
J.-P. Katoen
Jaco van de Pol
Jiří Barnat
L. Dematte
Laurence Calzone
Lawrence D. Brown
Lubos Brim
M. Kwiatkowska
M. Scarpa
Michele Forlin
P. Ballarini
Paolo Ballarini
T. Tian
Thomas Hérault
Tommaso Mazza
Walter W. Piegorsch
William J. Stewart
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2009

We consider the problem of verifying stochastic models of biochemical networks against behavioral properties expressed in temporal logic. Exact probabilistic verification approaches, such as CSL/PCTL model checking, are undermined by a huge computational demand, which rules them out for most real case studies. Less demanding approaches, such as statistical model checking, estimate the likelihood that a property is satisfied by sampling executions of the stochastic model. We propose a methodology for efficiently estimating the likelihood that an LTL property P holds for a stochastic model of a biochemical network. As with other statistical verification techniques, the methodology uses a stochastic simulation algorithm to generate execution samples; however, three key aspects improve its efficiency. First, sample generation is driven by on-the-fly verification of P, which results in optimal overall simulation time. Second, the confidence interval estimation for the probability that P holds is based on an efficient variant of the Wilson method, which ensures faster convergence. Third, the whole methodology is designed in a parallel fashion, and a prototype software tool has been implemented that performs the sampling/verification process in parallel over an HPC architecture.
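The confidence-interval step can be illustrated with a small sketch. The abstract refers to an efficient variant of the Wilson method that is not specified here, so the code below uses the standard Wilson score interval instead, combined with a simple stopping rule on interval width. The names `wilson_interval` and `estimate_probability`, the `sample_once` callback (a stand-in for "simulate one execution and check P on it"), and the width threshold are all assumptions made for illustration, not the authors' tool.

```python
import math
import random
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Standard Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p_hat + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials)
    ) / denom
    return centre - half, centre + half


def estimate_probability(sample_once, max_width=0.05, confidence=0.95,
                         max_runs=100_000):
    """Sample until the Wilson interval is narrower than max_width.

    sample_once() is a hypothetical callback that runs one stochastic
    simulation and returns True if the property held on that execution.
    """
    successes = 0
    for runs in range(1, max_runs + 1):
        successes += bool(sample_once())
        low, high = wilson_interval(successes, runs, confidence)
        if high - low <= max_width:
            break
    return successes / runs, (low, high), runs


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a real simulator: the property holds with probability 0.3.
    estimate, interval, runs = estimate_probability(lambda: rng.random() < 0.3)
    print(estimate, interval, runs)
```

Since each call to `sample_once` is independent, the sampling loop could be distributed across workers and the success counts merged before computing the interval, which is the kind of parallelism the abstract describes.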

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

Stimulus-Response Analysis for Data in the Form of Proportions

Author: Walter W. Piegorsch

INTRODUCTION Dichotomous response models are common in many engineering settings, and they are an important endpoint in quality control and quality testing. Often, they represent the response of some experimental unit to an environmental or chemical stimulus, or of the unit over time, etc. Independent observations on each unit produce a value in the set {0,1} with some probability of binary response, p. A common design involves T populations, treatment groups, dose levels, etc. When some score or other quantification of the stimulus, x_i (i = 1, ..., T), has been recorded along with the observations, an important issue for statistical study is the characterization of the stimulus-response for use in prediction or assessment of the underlying phenomenon. Statistically, the recorded observations at the i-th treatment level are taken as the number of "positive" outcomes, Y_i, among the n_i experimental units examined

CiteSeerX

Tables Of P-Values For t- And Chi-Square Reference Distributions

Author: W. W. Piegorsch
Walter W. Piegorsch

INTRODUCTION An important area of statistical practice involves determination of P-values when performing significance testing. If the null reference distribution is standard normal, then many standard statistical texts provide a table of probabilities that may be used to determine the P-value; examples include Casella and Berger (1990), Hogg and Tanis (1997), Iman (1994), Moore and McCabe (1993), Neter et al. (1996), Snedecor and Cochran (1980), Sokal and Rohlf (1995), and Steel and Torrie (1980), among many others. If the null reference distribution is slightly more complex, however, such as a t-distribution or a χ²-distribution, most standard textbooks give only upper-α critical points rather than actual P-values. With the advent of modern statistical computing power, this is not a major concern; most statistical computing packages can output P-values associated w

CiteSeerX

Sample sizes for improved binomial confidence intervals

Author: Piegorsch Walter W.

Research Papers in Economics

On confidence bands and set estimators for the simple linear model

Author: Piegorsch Walter W.

This paper reviews the duality between confidence bands and (convex) set estimators in a simple linear regression. Applications of this duality are explored. These include the nature of polygonal sets and the development of an algorithm that approximates the coverage probability of smooth confidence band functions.

Keywords: simultaneous inference linear regression linear segment confidence bands coverage probability approximation

Research Papers in Economics