Simulating Correlated Multivariate Pseudorandom Numbers
A modification of the Kaiser and Dickman (1962) procedure for generating multivariate random numbers with specified intercorrelations is proposed. The procedure works with both positive definite and non-positive definite population correlation matrices. A SAS module is also provided to run the procedure.
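The general idea of generating correlated variates from a factored correlation matrix can be sketched in Python. This is only an illustration under stated assumptions, not the SAS module or the modification proposed in the paper: it assumes a principal-component factorization of the target matrix, it assumes that negative eigenvalues of a non-positive definite matrix are simply clipped to zero, and the function name `simulate_correlated` is hypothetical.

```python
import numpy as np

def simulate_correlated(R, n, seed=None):
    """Draw n rows of normal scores whose population correlation approximates R."""
    rng = np.random.default_rng(seed)
    R = np.asarray(R, dtype=float)
    # Eigendecomposition of the symmetric target correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Assumption: negative eigenvalues (non-positive definite R) are clipped to zero.
    eigvals = np.clip(eigvals, 0.0, None)
    # Principal-component factor pattern F; F @ F.T equals R when R is positive semidefinite.
    F = eigvecs * np.sqrt(eigvals)
    # Rescale rows so the implied matrix keeps a unit diagonal after clipping.
    F = F / np.sqrt((F ** 2).sum(axis=1, keepdims=True))
    # Uncorrelated standard normal scores, one column per variable.
    X = rng.standard_normal((n, R.shape[0]))
    # Mixing the uncorrelated scores through F induces the target correlations.
    return X @ F.T
```

For a positive definite target the sample correlations of the returned rows should approach the target as n grows; for a non-positive definite target they approach the clipped and rescaled matrix instead, which is a property of this sketch rather than a claim about the paper's procedure.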
Another Look at what to do with Time-series Cross-section Data
Our study revisits Beck and Katz' (1995) comparison of the Parks and PCSE estimators using time-series, cross-sectional data (TSCS). Our innovation is that we construct simulated statistical environments that are designed to approximate actual TSCS data. We pattern our statistical environments after income and tax data on U.S. states from 1960-1999. While PCSE generally does a better job than Parks in estimating standard errors/confidence intervals, it too can be unreliable, sometimes producing standard errors/confidence intervals that are substantially off the mark. Further, we find that the benefits of PCSE can come at a large cost in estimator efficiency.
Keywords: Panel data; Parks model; PCSE estimator; Monte Carlo methods
Bayesian Estimation Under Informative Sampling
Bayesian analysis is increasingly popular for use in social science and other
application areas where the data are observations from an informative sample.
An informative sampling design leads to inclusion probabilities that are
correlated with the response variable of interest. Model inference performed on
the observed sample taken from the population will be biased for the population
generative model under informative sampling since the balance of information in
the sample data is different from that for the population. Typical approaches
to account for an informative sampling design under Bayesian estimation are
often difficult to implement because they require re-parameterization of the
hypothesized generating model, or focus on design, rather than model-based,
inference. We propose to construct a pseudo-posterior distribution that
utilizes sampling weights based on the marginal inclusion probabilities to
exponentiate the likelihood contribution of each sampled unit, which weights
the information in the sample back to the population. Our approach provides a
nearly automated estimation procedure applicable to any model specified by the
data analyst for the population and retains the population model
parameterization and posterior sampling geometry. We construct conditions on
known marginal and pairwise inclusion probabilities that define a class of
sampling designs where consistency of the pseudo-posterior is
guaranteed. We demonstrate our method on an application concerning the Bureau
of Labor Statistics Job Openings and Labor Turnover Survey.
Comment: 24 pages, 3 figures
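The weighting idea in this abstract, raising each sampled unit's likelihood contribution to the power of its sampling weight, can be sketched for the simplest conjugate case. The following is a minimal illustration and not the authors' implementation: it assumes a normal population model with known standard deviation, a normal prior on the mean, and weights equal to inverse marginal inclusion probabilities rescaled to sum to the sample size. The function name and all parameter names are hypothetical.

```python
import numpy as np

def pseudo_posterior_normal_mean(y, incl_prob, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Pseudo-posterior mean and sd for a normal mean with known sigma."""
    y = np.asarray(y, dtype=float)
    # Sampling weights: inverse marginal inclusion probabilities.
    w = 1.0 / np.asarray(incl_prob, dtype=float)
    # Assumption: weights are rescaled to sum to the observed sample size.
    w = w * len(y) / w.sum()
    # Each unit's normal likelihood raised to the power w_i contributes
    # w_i / sigma^2 to the precision and w_i * y_i / sigma^2 to the weighted sum.
    precision = 1.0 / prior_sd**2 + w.sum() / sigma**2
    mean = (prior_mean / prior_sd**2 + (w * y).sum() / sigma**2) / precision
    return mean, np.sqrt(1.0 / precision)
```

With equal inclusion probabilities every rescaled weight is one and the result reduces to the usual conjugate posterior; with unequal probabilities, units that were less likely to be sampled count for more.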
Another Look At What To Do With Time-Series Cross-Section Data
Our study revisits Beck and Katz’ (1995) comparison of the Parks and PCSE estimators using time-series, cross-sectional data (TSCS). Our innovation is that we construct simulated statistical environments that are designed to closely match “real-world” TSCS data. We pattern our statistical environments after income and tax data on U.S. states from 1960-1999. While PCSE generally does a better job than Parks in estimating standard errors, it too can be unreliable, sometimes producing standard errors that are substantially off the mark. Further, we find that the benefits of PCSE can come at a substantial cost in estimator efficiency. Based on our study, we would give the following advice to researchers using TSCS data: Given a choice between Parks and PCSE, we recommend that researchers use PCSE for hypothesis testing, and Parks if their primary interest is accurate coefficient estimates.
Keywords: Panel Data, Panel Corrected Standard Errors, Monte Carlo analysis
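For readers unfamiliar with the PCSE estimator compared here, the sketch below shows one common way to compute OLS coefficients with panel-corrected standard errors in the spirit of Beck and Katz (1995). It is an illustration under assumptions, not the code used in the study: it assumes a balanced panel with rows stacked unit by unit, no serial-correlation correction, and the function name `ols_with_pcse` is hypothetical.

```python
import numpy as np

def ols_with_pcse(y, X, n_units, n_periods):
    """OLS coefficients with panel-corrected standard errors (balanced panel).

    Rows of y and X must be stacked unit by unit, each unit holding
    n_periods consecutive observations.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    # Residuals arranged as an (units x periods) matrix.
    resid = (y - X @ beta).reshape(n_units, n_periods)
    # Contemporaneous cross-unit error covariance, estimated from the residuals.
    Sigma = resid @ resid.T / n_periods
    # Full error covariance: Sigma across units, identity across time.
    Omega = np.kron(Sigma, np.eye(n_periods))
    # Sandwich covariance of the OLS coefficients.
    cov = XtX_inv @ X.T @ Omega @ X @ XtX_inv
    return beta, np.sqrt(np.diag(cov))
```

The coefficients are plain OLS; only the standard errors change, which is consistent with the abstract's point that PCSE can give up efficiency relative to a feasible GLS estimator such as Parks.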
Issues and Observations on Applications of the Constrained-Path Monte Carlo Method to Many-Fermion Systems
We report several important observations that underscore the distinctions
between the constrained-path Monte Carlo method and the continuum and lattice
versions of the fixed-node method. The main distinctions stem from the
differences in the state space in which the random walk occurs and in the
manner in which the random walkers are constrained. One consequence is that in
the constrained-path method the so-called mixed estimator for the energy is not
an upper bound to the exact energy, as previously claimed. Several ways of
producing an energy upper bound are given, and relevant methodological aspects
are illustrated with simple examples.
Comment: 28 pages, REVTEX, 5 ps figures
Scalable Rejection Sampling for Bayesian Hierarchical Models
Bayesian hierarchical modeling is a popular approach to capturing unobserved
heterogeneity across individual units. However, standard estimation methods
such as Markov chain Monte Carlo (MCMC) can be impracticable for modeling
outcomes from a large number of units. We develop a new method to sample from
posterior distributions of Bayesian models, without using MCMC. Samples are
independent, so they can be collected in parallel, and we do not need to be
concerned with issues like chain convergence and autocorrelation. The algorithm
is scalable under the weak assumption that individual units are conditionally
independent, making it applicable for large datasets. It can also be used to
compute marginal likelihoods.
A Monte Carlo Evaluation of Some Common Panel Data Estimators when Serial Correlation and Cross-sectional Dependence are Both Present
This study employs Monte Carlo experiments to evaluate the performances of a number of common panel data estimators when serial correlation and cross-sectional dependence are both present. It focuses on fixed effects models with fewer than 100 cross-sectional units and between 10 and 25 time periods (such as are commonly employed in empirical growth studies). Estimator performance is compared on two dimensions: (i) root mean square error and (ii) accuracy of estimated confidence intervals. An innovation of our study is that our simulated panel data sets are designed to look like “real-world” panel data. We find large differences in the performances of the respective estimators. Further, estimators that perform well on efficiency grounds may perform poorly when estimating confidence intervals, and vice versa. Our experimental results form the basis for a set of estimator recommendations. These are applied to “out of sample” simulated panel data sets and found to perform well.
Keywords: Panel Data estimation; Monte Carlo analysis; FGLS; PCSE; Groupwise Heteroscedasticity; Serial Correlation; Cross-sectional Dependence; Stata; EViews
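The two performance dimensions named in this abstract can be computed from Monte Carlo replications roughly as follows. This is a generic sketch, not the study's code: it assumes one coefficient estimate and one standard error per replication, normal-approximation intervals with a critical value of 1.96, and a hypothetical function name.

```python
import numpy as np

def evaluate_estimator(estimates, std_errors, true_value, z=1.96):
    """Return (RMSE, confidence-interval coverage rate) across replications."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    # (i) Root mean square error of the estimates around the true coefficient.
    rmse = np.sqrt(np.mean((estimates - true_value) ** 2))
    # (ii) Share of replications whose interval estimate +/- z * se covers the truth.
    covered = np.abs(estimates - true_value) <= z * std_errors
    return rmse, covered.mean()
```

A coverage rate well below the nominal 95 percent indicates standard errors that are too small, which is how an estimator can look good on RMSE yet perform poorly on confidence intervals.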