88 research outputs found

    Beyond first-order asymptotics for Cox regression

    To go beyond standard first-order asymptotics for Cox regression, we develop parametric bootstrap and second-order methods. In general, computation of P-values beyond first order requires more model specification than is required for the likelihood function. It is problematic to specify a censoring mechanism to be taken very seriously in detail, and it appears that conditioning on censoring is not a viable alternative to that. We circumvent this matter by employing a reference censoring model, matching the extent and timing of observed censoring. Our primary proposal is a parametric bootstrap method utilizing this reference censoring model to simulate inferential repetitions of the experiment. It is shown that the most important part of improvement on first-order methods - that pertaining to fitting nuisance parameters - is insensitive to the assumed censoring model. This is supported by numerical comparisons of our proposal to parametric bootstrap methods based on usual random censoring models, which are far less attractive to implement. As an alternative to our primary proposal, we provide a second-order method requiring less computing effort while providing more insight into the nature of improvement on first-order methods. However, the parametric bootstrap method is more transparent, and hence is our primary proposal. Indications are that first-order partial likelihood methods are usually adequate in practice, so we are not advocating routine use of the proposed methods. It is however useful to see how best to check on first-order approximations, or improve on them, when this is expressly desired. Comment: Published at http://dx.doi.org/10.3150/13-BEJ572 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).

    Maximum likelihood estimation based on the Laplace approximation for p2 network regression models

    The class of p2 models is suitable for modeling binary relation data in social network analysis. A p2 model is essentially a regression model for bivariate binary responses, featuring within‐dyad dependence and correlated crossed random effects to represent heterogeneity of actors. Despite some desirable properties, these models are used less frequently in empirical applications than other models for network data. A possible reason for this is due to the limited possibilities for this model for accounting for (and explicitly modeling) structural dependence beyond the dyad as can be done in exponential random graph models. Another motive, however, may lie in the computational difficulties existing to estimate such models by means of the methods proposed in the literature, such as joint maximization methods and Bayesian methods. The aim of this article is to investigate maximum likelihood estimation based on the Laplace approximation approach, that can be refined by importance sampling. Practical implementation of such methods can be performed in an efficient manner, and the article provides details on a software implementation using R. Numerical examples and simulation studies illustrate the methodology
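
The Laplace approximation mentioned in this abstract can be illustrated on a much simpler case than a p2 model. The sketch below is only an assumption-laden toy: it uses a single scalar random intercept in a logistic model rather than the correlated crossed random effects of p2, it is written in Python rather than the R implementation the article describes, and the names `laplace_log_integral` and `cluster_log_marginal` are hypothetical.

```python
import math


def laplace_log_integral(h, dh, d2h, u0=0.0, tol=1e-10, max_iter=100):
    """Laplace approximation to log of the integral of exp(-h(u)) over scalar u.

    h is assumed smooth with a single minimum; dh and d2h are its first and
    second derivatives. Newton's method locates the minimiser u_hat, then
    log-integral ~= -h(u_hat) + 0.5*log(2*pi) - 0.5*log(h''(u_hat)).
    """
    u = u0
    for _ in range(max_iter):
        step = dh(u) / d2h(u)
        u -= step
        if abs(step) < tol:
            break
    return -h(u) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(d2h(u))


def cluster_log_marginal(y, beta, sigma):
    """Approximate log marginal likelihood of one cluster in a toy model:
    y_j ~ Bernoulli(expit(beta + u)), with random intercept u ~ N(0, sigma^2).
    """
    n = len(y)
    s = sum(y)

    def h(u):
        # Negative log of (conditional likelihood times random-effect density).
        eta = beta + u
        loglik = s * eta - n * math.log1p(math.exp(eta))
        logprior = -0.5 * u * u / sigma**2 - 0.5 * math.log(2.0 * math.pi * sigma**2)
        return -(loglik + logprior)

    def dh(u):
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        return -(s - n * p) + u / sigma**2

    def d2h(u):
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        return n * p * (1.0 - p) + 1.0 / sigma**2

    return laplace_log_integral(h, dh, d2h)


if __name__ == "__main__":
    print(cluster_log_marginal([1, 0, 1, 1, 0], beta=0.2, sigma=1.0))
```

For a Gaussian integrand the approximation is exact; for other integrands the error can be reduced by importance sampling around the same mode, which is the kind of refinement the abstract refers to.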

    Adjusted quasi-profile likelihoods from estimating functions

    Higher-order adjustments for a quasi-profile likelihood for a scalar parameter of interest in the presence of nuisance parameters are discussed. Paralleling likelihood asymptotics, these adjustments aim to alleviate some of the problems inherent to the presence of nuisance parameters. Indeed, the estimating equation for the parameter of interest, when the nuisance parameter is substituted with an appropriate estimate, is not unbiased and such a bias can lead to poor inference on the parameter of interest. Following the approach of McCullagh and Tibshirani (1990), here we propose adjustments for the estimating equation for the parameter of interest. Moreover, we discuss two methods for their computation: a bootstrap simulation method, and a first-order asymptotic expression, which can be simplified under an orthogonality assumption. Some examples, in the context of generalized linear models and of robust inference, are provided

    Evaluation of Clinical and Clinical Chemical Parameters in Periparturient Cows

    Certain blood parameters and clinical symptoms have been connected with milk fever and a hypocalcemic condition in the cow. The present study intended to establish a mutual connection between relevant blood parameters and potentially valuable background information about the cow and its observed clinical symptoms at calving. Two veterinarians were summoned within 12 h of parturition of 201 cows, distributed among 41 Danish commercial herds. Cows were at different parity levels (2 to 10) and breeds and management differed broadly among herds. A blood sample was taken from the vena jugularis or the tail vein and was subsequently analyzed in the laboratory. Furthermore, 13 different clinical symptoms were recorded as categorical data. We investigated associations among the data obtained. We assessed an interpretative model for actual blood calcium level with blood parameters and background knowledge of the animals. We established a path analysis using background knowledge, blood parameters, and results of clinical examinations to uncover causal connections among the variables. Twenty-six percent of the animals were diagnosed as having milk fever and subsequent blood analyses revealed a high frequency of hypocalcemia within the general range from 0.69 to 2.73 mmol of Ca per liter. Rectal temperature, inorganic blood phosphate, and potassium were all directly correlated with blood calcium, while glucose, lactate, and magnesium were inversely associated with calcium. Blood osteocalcin was significantly lower in hypocalcemic animals, indicating that de novo synthesis of bone was arrested during hypocalcemia. A mixed effect linear interpretative model explained 75% of the variation in blood calcium. Clinical symptoms like mood, appetite, muscle shivering, rumen motility, and paresis were individually correlated with blood calcium and were thereby predictive of hypocalcemia. The path analysis showed the central role of calcium in affecting the clinical symptoms. However, several other factors contributed to hypocalcemia.

    Feature-based tuning of simulated annealing applied to the curriculum-based course timetabling problem

    We consider the university course timetabling problem, which is one of the most studied problems in educational timetabling. In particular, we focus our attention on the formulation known as the curriculum-based course timetabling problem, which has been tackled by many researchers and for which there are many available benchmarks. The contribution of this paper is twofold. First, we propose an effective and robust single-stage simulated annealing method for solving the problem. Secondly, we design and apply an extensive and statistically-principled methodology for the parameter tuning procedure. The outcome of this analysis is a methodology for modeling the relationship between search method parameters and instance features that allows us to set the parameters for unseen instances on the basis of a simple inspection of the instance itself. Using this methodology, our algorithm, despite its apparent simplicity, has been able to achieve high quality results on a set of popular benchmarks. A final contribution of the paper is a novel set of real-world instances, which could be used as a benchmark for future comparison
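
As a rough illustration of the single-stage simulated annealing idea named in this abstract, here is a minimal generic sketch on a toy timetabling instance. Everything specific in it is an assumption made for illustration: the cost function (`count_conflicts`), the neighbourhood (`move_one_lecture`), the geometric cooling schedule and all parameter values are hypothetical and are not taken from the paper, whose actual cost components, moves and tuned parameters are not given in the abstract.

```python
import math
import random

# Toy instance (hypothetical): 6 lectures, 3 timeslots. Each pair listed here
# belongs to the same curriculum and so must not share a timeslot.
CONFLICTS = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
N_SLOTS = 3


def count_conflicts(assignment):
    """Cost: number of conflicting lecture pairs placed in the same timeslot."""
    return sum(1 for a, b in CONFLICTS if assignment[a] == assignment[b])


def move_one_lecture(assignment, rng):
    """Neighbour: reassign one randomly chosen lecture to a random timeslot."""
    i = rng.randrange(len(assignment))
    new = list(assignment)
    new[i] = rng.randrange(N_SLOTS)
    return tuple(new)


def simulated_annealing(cost, neighbor, x0, t_start=1.0, t_end=1e-3,
                        cooling=0.99, moves_per_temp=20, seed=0):
    """Generic simulated annealing with geometric cooling.

    Improving moves are always accepted; worsening moves are accepted with
    probability exp(-delta / t) (Metropolis criterion). Returns the best
    state seen and its cost.
    """
    rng = random.Random(seed)
    current, current_cost = x0, cost(x0)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            cand = neighbor(current, rng)
            delta = cost(cand) - current_cost
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = cand, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling
    return best, best_cost


if __name__ == "__main__":
    print(simulated_annealing(count_conflicts, move_one_lecture, (0,) * 6))
```

The parameters `t_start`, `t_end`, `cooling` and `moves_per_temp` are the sort of search-method settings that the abstract proposes to predict from instance features; that tuning step is not reproduced here.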

    When Composite Likelihood Meets Stochastic Approximation

    A composite likelihood is an inference function derived by multiplying a set of likelihood components. This approach provides a flexible framework for drawing inference when the likelihood function of a statistical model is computationally intractable. While composite likelihood has computational advantages, it can still be demanding when dealing with numerous likelihood components and a large sample size. This paper tackles this challenge by employing an approximation of the conventional composite likelihood estimator, which is derived from an optimization procedure relying on stochastic gradients. This novel estimator is shown to be asymptotically normally distributed around the true parameter. In particular, based on the relative divergent rate of the sample size and the number of iterations of the optimization, the variance of the limiting distribution is shown to compound for two sources of uncertainty: the sampling variability of the data and the optimization noise, with the latter depending on the sampling distribution used to construct the stochastic gradients. The advantages of the proposed framework are illustrated through simulation studies on two working examples: an Ising model for binary data and a gamma frailty model for count data. Finally, a real-data application is presented, showing its effectiveness in a large-scale mental health survey
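
The two ingredients described in this abstract can be written out in a generic form. The notation below is an assumption introduced for illustration (components $L_k$, weights $w_k$, step sizes $\eta_t$, sampled index set $S_t$ with inclusion probabilities $\pi_k$); the abstract does not give the paper's own notation, sampling scheme or step-size rule.

```latex
% Composite log-likelihood: sum over observations of weighted
% log-likelihood components (the log of a product of components).
\ell_C(\theta) = \sum_{i=1}^{n} \sum_{k=1}^{K} w_k \log L_k(\theta; y_i)

% Stochastic-gradient iteration: at step t, only a random subset S_t of
% components is evaluated, reweighted so the gradient estimate is unbiased.
\theta_{t+1} = \theta_t + \eta_t \sum_{i=1}^{n} \sum_{k \in S_t}
    \frac{w_k}{\pi_k} \, \nabla_\theta \log L_k(\theta_t; y_i),
\qquad \pi_k = \Pr(k \in S_t)
```

Under this reading, the limiting variance described in the abstract would combine the sampling variability of $y_1, \dots, y_n$ with the extra noise introduced by drawing $S_t$.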

    Estimation of lineup efficiency effects in Basketball using play-by-play data

    The paper aims at defining a data-driven approach to team management in basketball. A model-based strategy, based on a modification of the adjusted plus-minus approach, is proposed for the analyses of the match progress. The main idea is to define a model based on the 5-man lineups instead of the single players. In this framework, given the large number of possible lineups, the regularization issue is quite relevant. The empirical application is based on the data of the current Italian championship (Serie A1). The play-by-play data are considered along with some information resulting from the game box scores

    Modified Profile Likelihood for Fixed-Effects Panel Data Models

    We show how modified profile likelihood methods, developed in the statistical literature, may be effectively applied to estimate the structural parameters of econometric models for panel data, with a remarkable reduction of bias with respect to ordinary likelihood methods. Initially, the implementation of these methods is illustrated for general models for panel data including individual-specific fixed effects and then, in more detail, for the truncated linear regression model and dynamic regression models for binary data formulated along with different specifications. Simulation studies show the good behavior of the inference based on the modified profile likelihood, even when compared to an ideal, although infeasible, procedure (in which the fixed effects are known) and also to alternative estimators existing in the econometric literature. The proposed estimation methods are implemented in an R package that we make available to the reader