86,456 research outputs found

    Global consensus Monte Carlo

    Get PDF
    To conduct Bayesian inference with large data sets, it is often convenient or necessary to distribute the data across multiple machines. We consider a likelihood function expressed as a product of terms, each associated with a subset of the data. Inspired by global variable consensus optimisation, we introduce an instrumental hierarchical model associating auxiliary statistical parameters with each term, which are conditionally independent given the top-level parameters. One of these top-level parameters controls the unconditional strength of association between the auxiliary parameters. This model leads to a distributed MCMC algorithm on an extended state space yielding approximations of posterior expectations. A trade-off between computational tractability and fidelity to the original model can be controlled by changing the association strength in the instrumental model. We further propose the use of a SMC sampler with a sequence of association strengths, allowing both the automatic determination of appropriate strengths and for a bias correction technique to be applied. In contrast to similar distributed Monte Carlo algorithms, this approach requires few distributional assumptions. The performance of the algorithms is illustrated with a number of simulated examples

    Semiparametric Multivariate Accelerated Failure Time Model with Generalized Estimating Equations

    Full text link
    The semiparametric accelerated failure time model is not as widely used as the Cox relative risk model mainly due to computational difficulties. Recent developments in least squares estimation and induced smoothing estimating equations provide promising tools to make the accelerate failure time models more attractive in practice. For semiparametric multivariate accelerated failure time models, we propose a generalized estimating equation approach to account for the multivariate dependence through working correlation structures. The marginal error distributions can be either identical as in sequential event settings or different as in parallel event settings. Some regression coefficients can be shared across margins as needed. The initial estimator is a rank-based estimator with Gehan's weight, but obtained from an induced smoothing approach with computation ease. The resulting estimator is consistent and asymptotically normal, with a variance estimated through a multiplier resampling method. In a simulation study, our estimator was up to three times as efficient as the initial estimator, especially with stronger multivariate dependence and heavier censoring percentage. Two real examples demonstrate the utility of the proposed method

    Sequential Bayesian inference for implicit hidden Markov models and current limitations

    Full text link
    Hidden Markov models can describe time series arising in various fields of science, by treating the data as noisy measurements of an arbitrarily complex Markov process. Sequential Monte Carlo (SMC) methods have become standard tools to estimate the hidden Markov process given the observations and a fixed parameter value. We review some of the recent developments allowing the inclusion of parameter uncertainty as well as model uncertainty. The shortcomings of the currently available methodology are emphasised from an algorithmic complexity perspective. The statistical objects of interest for time series analysis are illustrated on a toy "Lotka-Volterra" model used in population ecology. Some open challenges are discussed regarding the scalability of the reviewed methodology to longer time series, higher-dimensional state spaces and more flexible models.Comment: Review article written for ESAIM: proceedings and surveys. 25 pages, 10 figure

    Optimal treatment allocations in space and time for on-line control of an emerging infectious disease

    Get PDF
    A key component in controlling the spread of an epidemic is deciding where, whenand to whom to apply an intervention.We develop a framework for using data to informthese decisionsin realtime.We formalize a treatment allocation strategy as a sequence of functions, oneper treatment period, that map up-to-date information on the spread of an infectious diseaseto a subset of locations where treatment should be allocated. An optimal allocation strategyoptimizes some cumulative outcome, e.g. the number of uninfected locations, the geographicfootprint of the disease or the cost of the epidemic. Estimation of an optimal allocation strategyfor an emerging infectious disease is challenging because spatial proximity induces interferencebetween locations, the number of possible allocations is exponential in the number oflocations, and because disease dynamics and intervention effectiveness are unknown at outbreak.We derive a Bayesian on-line estimator of the optimal allocation strategy that combinessimulation–optimization with Thompson sampling.The estimator proposed performs favourablyin simulation experiments. This work is motivated by and illustrated using data on the spread ofwhite nose syndrome, which is a highly fatal infectious disease devastating bat populations inNorth America

    Distributed Constrained Recursive Nonlinear Least-Squares Estimation: Algorithms and Asymptotics

    Full text link
    This paper focuses on the problem of recursive nonlinear least squares parameter estimation in multi-agent networks, in which the individual agents observe sequentially over time an independent and identically distributed (i.i.d.) time-series consisting of a nonlinear function of the true but unknown parameter corrupted by noise. A distributed recursive estimator of the \emph{consensus} + \emph{innovations} type, namely CIWNLS\mathcal{CIWNLS}, is proposed, in which the agents update their parameter estimates at each observation sampling epoch in a collaborative way by simultaneously processing the latest locally sensed information~(\emph{innovations}) and the parameter estimates from other agents~(\emph{consensus}) in the local neighborhood conforming to a pre-specified inter-agent communication topology. Under rather weak conditions on the connectivity of the inter-agent communication and a \emph{global observability} criterion, it is shown that at every network agent, the proposed algorithm leads to consistent parameter estimates. Furthermore, under standard smoothness assumptions on the local observation functions, the distributed estimator is shown to yield order-optimal convergence rates, i.e., as far as the order of pathwise convergence is concerned, the local parameter estimates at each agent are as good as the optimal centralized nonlinear least squares estimator which would require access to all the observations across all the agents at all times. In order to benchmark the performance of the proposed distributed CIWNLS\mathcal{CIWNLS} estimator with that of the centralized nonlinear least squares estimator, the asymptotic normality of the estimate sequence is established and the asymptotic covariance of the distributed estimator is evaluated. Finally, simulation results are presented which illustrate and verify the analytical findings.Comment: 28 pages. Initial Submission: Feb. 2016, Revised: July 2016, Accepted: September 2016, To appear in IEEE Transactions on Signal and Information Processing over Networks: Special Issue on Inference and Learning over Network
    corecore