Simulation Theorems via Pseudorandom Properties
We generalize the deterministic simulation theorem of Raz and McKenzie
[RM99] to any gadget that satisfies a certain hitting property. We prove that
inner-product and gap-Hamming satisfy this property, and as a corollary we
obtain a deterministic simulation theorem for these gadgets, where the gadget's
input size is logarithmic in the input size of the outer function. This answers
an open question posed by Göös, Pitassi and Watson [GPW15]. Our result
also implies the previous results for the Indexing gadget, with better
parameters than were previously known. A preliminary version of the results
obtained in this work appeared in [CKL+17].
Hazard models with varying coefficients for multivariate failure time data
Statistical estimation and inference for marginal hazard models with varying
coefficients for multivariate failure time data are important subjects in
survival analysis. A local pseudo-partial likelihood procedure is proposed for
estimating the unknown coefficient functions. A weighted average estimator is
also proposed in an attempt to improve the efficiency of the estimator. The
consistency and asymptotic normality of the proposed estimators are established
and standard error formulas for the estimated coefficients are derived and
empirically tested. To reduce the computational burden of the maximum local
pseudo-partial likelihood estimator, a simple and useful one-step estimator is
proposed. Statistical properties of the one-step estimator are established and
simulation studies are conducted to compare the performance of the one-step
estimator to that of the maximum local pseudo-partial likelihood estimator. The
results show that the one-step estimator can save computational cost without
compromising performance both asymptotically and empirically and that an
optimal weighted average estimator is more efficient than the maximum local
pseudo-partial likelihood estimator. A data set from the Busselton Population
Health Surveys is analyzed to illustrate our proposed methodology.
Comment: Published at http://dx.doi.org/10.1214/009053606000001145 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error
This article considers estimation of constant and time-varying coefficients
in nonlinear ordinary differential equation (ODE) models where analytic
closed-form solutions are not available. The numerical solution-based nonlinear
least squares (NLS) estimator is investigated in this study. A numerical
algorithm such as the Runge--Kutta method is used to approximate the ODE
solution. The asymptotic properties are established for the proposed estimators
considering both numerical error and measurement error. The B-spline is used to
approximate the time-varying coefficients, and the corresponding asymptotic
theories in this case are investigated under the framework of the sieve
approach. Our results show that if the maximum step size of the $p$-order
numerical algorithm goes to zero at a rate faster than $n^{-1/(p\wedge 4)}$, the
numerical error is negligible compared to the measurement error. This result
provides theoretical guidance for selecting the step size in numerical
evaluations of ODEs. Moreover, we have shown that the numerical solution-based
NLS estimator and the sieve NLS estimator are strongly consistent. The sieve
estimator of constant parameters is asymptotically normal with the same
asymptotic covariance as that of the case where the true ODE solution is
exactly known, while the estimator of the time-varying parameter has the
optimal convergence rate under some regularity conditions. The theoretical
results are also developed for the case when the step size of the ODE numerical
solver does not go to zero fast enough or the numerical error is comparable to
the measurement error. We illustrate our approach with both simulation studies
and clinical data on HIV viral dynamics.
Comment: Published at http://dx.doi.org/10.1214/09-AOS784 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
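
As a rough illustration of why solver step size matters in the abstract above, the sketch below integrates a toy ODE with the classical fourth-order Runge-Kutta method and compares the error at two step sizes. The test equation dx/dt = -x, the function names, and the step counts are assumptions chosen for illustration; they are not the models or solver settings used in the paper.

```python
import math

def rk4_solve(f, x0, t_end, n_steps):
    """Integrate dx/dt = f(t, x) from t = 0 to t_end with classical RK4."""
    h = t_end / n_steps
    t, x = 0.0, x0
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def decay_rhs(t, x):
    # Toy right-hand side (an assumption): exact solution is x(t) = exp(-t).
    return -x

exact = math.exp(-1.0)
err_coarse = abs(rk4_solve(decay_rhs, 1.0, 1.0, 10) - exact)  # h = 0.1
err_fine = abs(rk4_solve(decay_rhs, 1.0, 1.0, 20) - exact)    # h = 0.05

# For a fourth-order method, halving h should shrink the global error
# by a factor of roughly 2**4 = 16.
error_ratio = err_coarse / err_fine
print(err_coarse, err_fine, error_ratio)
```

Because the numerical error falls quickly as the step size shrinks, it can be pushed below the level of the measurement noise, which is the regime in which the abstract says the numerical error is negligible.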
Consistency of Markov chain quasi-Monte Carlo on continuous state spaces
The random numbers driving Markov chain Monte Carlo (MCMC) simulation are
usually modeled as independent U(0,1) random variables. Tribble [Markov chain
Monte Carlo algorithms using completely uniformly distributed driving sequences
(2007) Stanford Univ.] reports substantial improvements when those random
numbers are replaced by carefully balanced inputs from completely uniformly
distributed sequences. The previous theoretical justification for using
anything other than i.i.d. U(0,1) points shows consistency for estimated means,
but only applies for discrete stationary distributions. We extend those results
to some MCMC algorithms for continuous stationary distributions. The main
motivation is the search for quasi-Monte Carlo versions of MCMC. As a side
benefit, the results also establish consistency for the usual method of using
pseudo-random numbers in place of random ones.
Comment: Published at http://dx.doi.org/10.1214/10-AOS831 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
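
The idea of treating the driving numbers as an explicit input can be sketched as follows. This is a minimal random-walk Metropolis sampler, assuming a standard normal target and a uniform proposal; the names `metropolis_from_stream` and `prng_stream` are hypothetical. It does not construct a completely uniformly distributed sequence; it only shows the interface through which one could be substituted, here fed by Python's pseudo-random generator.

```python
import math
import random

def metropolis_from_stream(uniforms, n_steps, step=2.0, x0=0.0):
    """Random-walk Metropolis for a standard normal target.

    `uniforms` is any iterator yielding numbers in [0, 1). Each step
    consumes two of them: one for the proposal, one for the accept test.
    """
    x = x0
    samples = []
    for _ in range(n_steps):
        u_prop = next(uniforms)
        u_acc = next(uniforms)
        # Map the first uniform to a symmetric proposal on [x - step, x + step).
        y = x + step * (2.0 * u_prop - 1.0)
        # Log of the target density ratio pi(y) / pi(x) for N(0, 1).
        log_ratio = 0.5 * (x * x - y * y)
        if u_acc < math.exp(min(0.0, log_ratio)):
            x = y
        samples.append(x)
    return samples

def prng_stream(seed):
    """Pseudo-random driving sequence (Mersenne Twister via `random`)."""
    rng = random.Random(seed)
    while True:
        yield rng.random()

samples = metropolis_from_stream(prng_stream(0), 50_000)
# The second moment of N(0, 1) is 1, so this estimate should be close to 1.
second_moment = sum(s * s for s in samples) / len(samples)
print(second_moment)
```

Any other source of points in [0, 1) can be passed in place of `prng_stream`, which is the kind of substitution the abstract studies.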
Poisson point process models solve the "pseudo-absence problem" for presence-only data in ecology
Presence-only data, point locations where a species has been recorded as
being present, are often used in modeling the distribution of a species as a
function of a set of explanatory variables---whether to map species occurrence,
to understand its association with the environment, or to predict its response
to environmental change. Currently, ecologists most commonly analyze
presence-only data by adding randomly chosen "pseudo-absences" to the data so
that they can be analyzed using logistic regression, an approach which has
weaknesses in model specification, in interpretation, and in implementation. To
address these issues, we propose Poisson point process modeling of the
intensity of presences. We also derive a link between the proposed approach and
logistic regression---specifically, we show that as the number of
pseudo-absences increases (in a regular or uniform random arrangement),
logistic regression slope parameters and their standard errors converge to
those of the corresponding Poisson point process model. We discuss the
practical implications of these results. In particular, point process modeling
offers a framework for choice of the number and location of pseudo-absences,
both of which are currently chosen by ad hoc and sometimes ineffective methods
in ecology, a point which we illustrate by example.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS331 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
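
For orientation, the standard log-likelihood of an inhomogeneous Poisson point process observed over a region $A$ can be written as below. The log-linear form of the intensity and the symbols ($\lambda$, $x(s)$, $\beta$, quadrature points $q_j$ with weights $w_j$) are assumptions made for this sketch rather than notation taken from the paper, and an additive constant not involving $\beta$ is omitted.

```latex
\log \lambda(s) = x(s)^{\top}\beta, \qquad
\ell(\beta) = \sum_{i=1}^{n} \log \lambda(s_i) - \int_{A} \lambda(s)\,ds
\;\approx\; \sum_{i=1}^{n} \log \lambda(s_i) - \sum_{j=1}^{m} w_j\,\lambda(q_j)
```

One way to read the abstract's remark about choosing pseudo-absences is that they serve to approximate the integral term, so their number and placement control the accuracy of that approximation.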