
    Formal and Informal Model Selection with Incomplete Data

    Model selection and assessment with incomplete data pose challenges in addition to the ones encountered with complete data. There are two main reasons for this. First, many models describe characteristics of the complete data, even though only an incomplete subset is observed. Direct comparison between model and data is then less than straightforward. Second, many commonly used models are more sensitive to assumptions than in the complete-data situation, and some of their properties vanish when they are fitted to incomplete, unbalanced data. These and other issues are brought forward using two key examples, one of a continuous and one of a categorical nature. We argue that model assessment ought to consist of two parts: (i) assessment of a model's fit to the observed data and (ii) assessment of the sensitivity of inferences to unverifiable assumptions, that is, to how a model describes the unobserved data given the observed ones. Published in Statistical Science (http://dx.doi.org/10.1214/07-STS253) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Discussion of Likelihood Inference for Models with Unobservables: Another View

    Discussion of "Likelihood Inference for Models with Unobservables: Another View" by Youngjo Lee and John A. Nelder [arXiv:1010.0303]. Published in Statistical Science (http://dx.doi.org/10.1214/09-STS277A) by the Institute of Mathematical Statistics (http://www.imstat.org).

    A goodness-of-fit test for the random-effects distribution in mixed models

    In this paper, we develop a simple diagnostic test for the random-effects distribution in mixed models. The test is based on the gradient function, a graphical tool proposed by Verbeke and Molenberghs to check the impact of assumptions about the random-effects distribution on inferences in mixed models. Inference is conducted through the bootstrap. The proposed test is easy to implement and applicable in a general class of mixed models. The operating characteristics of the test are evaluated in a simulation study, and the method is further illustrated using two real data analyses.
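The bootstrap step in such a test follows the usual parametric-bootstrap recipe: refit the null model to data simulated under it, and locate the observed statistic in the resulting reference distribution. A minimal stdlib sketch of that recipe, with a deliberately simplified statistic (absolute sample skewness) standing in for the gradient-function-based statistic; all names here are hypothetical, not the paper's implementation:

```python
import random
import statistics

def bootstrap_pvalue(statistic, fit_null, simulate_null, data, n_boot=200, seed=1):
    """Parametric bootstrap p-value: refit the null model to each
    simulated data set and compare the observed statistic with the
    bootstrap reference distribution."""
    rng = random.Random(seed)
    params = fit_null(data)
    t_obs = statistic(data, params)
    exceed = 0
    for _ in range(n_boot):
        sim = simulate_null(params, len(data), rng)
        if statistic(sim, fit_null(sim)) >= t_obs:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)  # add-one smoothing keeps p > 0

def fit_null(xs):
    # Toy null model: a single normal distribution.
    return statistics.fmean(xs), statistics.stdev(xs)

def simulate_null(params, n, rng):
    mu, sd = params
    return [rng.gauss(mu, sd) for _ in range(n)]

def abs_skew(xs, params):
    # Simplified stand-in statistic: absolute standardised skewness.
    mu, sd = params
    return abs(sum(((x - mu) / sd) ** 3 for x in xs) / len(xs))
```

The same scaffold carries over to mixed models by swapping in model-specific `fit_null`, `simulate_null`, and statistic functions.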

    A combined beta and normal random-effects model for repeated, overdispersed binary and binomial data

    Non-Gaussian outcomes are often modeled using members of the so-called exponential family. Well-known members are the Bernoulli model for binary data, leading to logistic regression, and the Poisson model for count data, leading to Poisson regression. Two of the main reasons for extending this family are (1) the occurrence of overdispersion, meaning that the variability in the data is not adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of hierarchical structure in the data, stemming from clustering which, in turn, may result from repeatedly measuring the outcome, from observing various members of the same family, etc. The first issue is dealt with through a variety of overdispersion models, such as the beta-binomial model for grouped binary data and the negative-binomial model for counts. Clustering is often accommodated through the inclusion of random subject-specific effects. Such random effects are conventionally, though not always, assumed to be normally distributed. While both phenomena may occur simultaneously, models combining them are uncommon. This paper starts from the broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. We place particular emphasis on so-called conjugate random effects at the level of the mean for the first aspect and normal random effects embedded within the linear predictor for the second, even though our family is more general. The binary and binomial cases are our focus. Apart from model formulation, we present an overview of estimation methods, and then settle for maximum likelihood estimation with analytic-numerical integration. The methodology is applied to two datasets whose outcomes are binary and binomial, respectively.
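In a combined model of this kind, the success probability is, loosely, the product of a conjugate beta effect and the logistic part containing the normal random intercept. A stdlib simulation sketch of that data-generating structure, under assumed parameter values (the function name and values are hypothetical, and this is a sketch of the model, not the paper's estimation code):

```python
import math
import random

def simulate_combined(n_clusters, n_per_cluster, a, b, sigma, base_logit, seed=7):
    """Simulate repeated binary outcomes with two sets of random effects:
    a conjugate Beta(a, b) multiplier theta_i on the success probability,
    and a normal N(0, sigma^2) random intercept b_i in the linear predictor,
    so that P(Y_ij = 1) = theta_i * expit(base_logit + b_i)."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        theta_i = rng.betavariate(a, b)   # conjugate (beta) random effect
        b_i = rng.gauss(0.0, sigma)       # normal random intercept
        p = theta_i / (1.0 + math.exp(-(base_logit + b_i)))  # product stays in (0, 1)
        clusters.append([1 if rng.random() < p else 0 for _ in range(n_per_cluster)])
    return clusters
```

Because theta_i lies in (0, 1), the product with the logistic term is automatically a valid probability, which is one appeal of the conjugate formulation.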

    Hierarchical models with normal and conjugate random effects : a review

    Molenberghs, Verbeke, and Demétrio (2007) and Molenberghs et al. (2010) proposed a general framework to model hierarchical data subject to within-unit correlation and/or overdispersion. The framework extends classical overdispersion models as well as generalized linear mixed models. Subsequent work has examined various aspects that lead to the formulation of several extensions. A unified treatment of the model framework and key extensions is provided. Particular extensions discussed are: explicit calculation of correlation and other moment-based functions, joint modelling of several hierarchical sequences, versions with direct marginally interpretable parameters, zero-inflation in the count case, and influence diagnostics. The basic models and several extensions are illustrated using a set of key examples, one per data type (count, binary, multinomial, ordinal, and time-to-event).

    Longitudinal quantile regression in presence of informative drop-out through longitudinal-survival joint modeling

    We propose a joint model for a time-to-event outcome and a quantile of a continuous response repeatedly measured over time. The quantile and survival processes are associated via shared latent and manifest variables. Our joint model provides a flexible approach to handle informative drop-out in quantile regression. A general Monte Carlo Expectation Maximization strategy based on importance sampling is proposed, which is directly applicable under any distributional assumption for the longitudinal outcome and random effects, and parametric and non-parametric assumptions for the baseline hazard. Model properties are illustrated through a simulation study and an application to an original data set on dilated cardiomyopathies.
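The importance-sampling idea behind such a Monte Carlo EM strategy is to draw the random effects from a convenient proposal and reweight each draw by the ratio of the (unnormalised) complete-data likelihood to the proposal density. A toy, self-normalised version in stdlib Python; the normal target and proposal below are illustrative stand-ins, not the authors' implementation:

```python
import math
import random

def e_step_weights(loglik, draws, proposal_logpdf):
    """Self-normalised importance weights for a Monte Carlo E-step:
    w_k is proportional to exp(loglik(b_k) - proposal_logpdf(b_k))."""
    logw = [loglik(b) - proposal_logpdf(b) for b in draws]
    m = max(logw)  # subtract the max before exponentiating, for numerical stability
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    return [wi / total for wi in w]
```

E-step expectations then become weighted averages over the draws; because the weights are self-normalised, the normalising constants of target and proposal cancel, which is why unnormalised log-likelihoods suffice.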

    Clusters with random size: maximum likelihood versus weighted estimation

    There are many contemporary designs that do not use a random sample of a fixed, a priori determined size. With informative cluster sizes, the cluster size is influenced by the cluster's data, but here we address issues that occur even when the cluster size and the data are unrelated. First, fitting models to clusters of varying sizes is often more complicated than when all clusters have the same size. Second, in such cases there usually is no so-called complete sufficient statistic.
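The contrast between maximum-likelihood-style and weighted estimation can be made concrete with the simplest estimand, a common mean: pooling all observations weights each cluster by its size, whereas averaging the per-cluster means weights each cluster equally. A stdlib sketch (illustrative only; when cluster size is unrelated to the data, both estimators target the same mean):

```python
import statistics

def cluster_mean_estimators(clusters):
    """Two estimators of a common mean from clusters of varying size:
    pooling weights each cluster by its size; averaging the per-cluster
    means weights each cluster equally."""
    pooled = statistics.fmean(x for cluster in clusters for x in cluster)
    equal = statistics.fmean(statistics.fmean(cluster) for cluster in clusters)
    return pooled, equal
```

With informative cluster sizes the two estimators generally diverge, which is precisely what motivates the maximum likelihood versus weighted-estimation comparison in the title.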

    Missing data in trial-based cost-effectiveness analysis: An incomplete journey.

    Cost-effectiveness analyses (CEA) conducted alongside randomised trials provide key evidence for informing healthcare decision making, but missing data pose substantive challenges. Recently, there have been a number of developments in methods and guidelines addressing missing data in trials. However, it is unclear whether these developments have permeated CEA practice. This paper critically reviews the extent of and methods used to address missing data in recently published trial-based CEA. Issues of the Health Technology Assessment journal from 2013 to 2015 were searched. Fifty-two eligible studies were identified. Missing data were very common; the median proportion of trial participants with complete cost-effectiveness data was 63% (interquartile range: 47%-81%). The most common approach for the primary analysis was to restrict analysis to those with complete data (43%), followed by multiple imputation (30%). Half of the studies conducted some sort of sensitivity analyses, but only 2 (4%) considered possible departures from the missing-at-random assumption. Further improvements are needed to address missing data in cost-effectiveness analyses conducted alongside randomised trials. These should focus on limiting the extent of missing data, choosing an appropriate method for the primary analysis that is valid under contextually plausible assumptions, and conducting sensitivity analyses to departures from the missing-at-random assumption.
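The difference between the two most common approaches in the review, complete-case analysis and multiple imputation, can be illustrated with a deliberately simple toy: draw each missing value from the empirical distribution of the observed values, analyse every completed data set, and pool the point estimates. All names below are hypothetical, the draws assume a missing-at-random-like mechanism, and Rubin's variance pooling is omitted for brevity:

```python
import random
import statistics

def multiple_imputation_mean(values, n_imputations=20, seed=3):
    """Toy multiple imputation for a list with missing entries (None):
    draw each missing value from the empirical distribution of the
    observed values, analyse each completed data set, and pool the
    per-imputation means."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    estimates = []
    for _ in range(n_imputations):
        completed = [v if v is not None else rng.choice(observed) for v in values]
        estimates.append(statistics.fmean(completed))
    return statistics.fmean(estimates)

def complete_case_mean(values):
    """The restrict-to-complete-data comparator."""
    return statistics.fmean(v for v in values if v is not None)
```

In a real CEA the per-imputation analysis would be the full cost-effectiveness model and the pooled variance would follow Rubin's rules; the sketch only shows the draw-analyse-pool structure.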