Handling non-ignorable dropouts in longitudinal data: A conditional model based on a latent Markov heterogeneity structure
We illustrate a class of conditional models for the analysis of longitudinal
data subject to attrition within a random effects framework, where the
subject-specific random effects are assumed to be discrete and to follow a
time-dependent latent process. The latent process accounts for unobserved
heterogeneity and correlation between individuals in a dynamic fashion, and for
dependence between the observed process and the missing data mechanism. Of
particular interest is the case where the missing mechanism is non-ignorable.
To address this case, we introduce a conditional-on-dropout model. A shape
change in the random effects distribution is considered by directly modeling
the effect of the missing data process on the evolution of the latent
structure. To estimate the resulting model, we rely on the conditional maximum
likelihood approach, for which we outline an EM algorithm. The proposal
is illustrated via simulations and then applied to a dataset concerning skin
cancers. Comparisons with other well-established methods are also provided.
Probabilistic Inference from Arbitrary Uncertainty using Mixtures of Factorized Generalized Gaussians
This paper presents a general and efficient framework for probabilistic
inference and learning from arbitrary uncertain information. It exploits the
calculation properties of finite mixture models, conjugate families and
factorization. Both the joint probability density of the variables and the
likelihood function of the (objective or subjective) observation are
approximated by a special mixture model, in such a way that any desired
conditional distribution can be directly obtained without numerical
integration. We have developed an extended version of the expectation
maximization (EM) algorithm to estimate the parameters of mixture models from
uncertain training examples (indirect observations). As a consequence, any
piece of exact or uncertain information about both input and output values is
consistently handled in the inference and learning stages. This ability,
extremely useful in certain situations, is not found in most alternative
methods. The proposed framework is formally justified from standard
probabilistic principles and illustrative examples are provided in the fields
of nonparametric pattern classification, nonlinear regression and pattern
completion. Finally, experiments on a real application and comparative results
over standard databases provide empirical evidence of the utility of the method
in a wide range of applications.
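The claim that conditional distributions follow without numerical integration can be illustrated with a minimal sketch. It assumes ordinary Gaussian factors rather than the generalized Gaussians of the paper, exact (not uncertain) input, and a hypothetical component layout `(weight, mean_x, sd_x, mean_y, sd_y)`; it is not the authors' implementation. Because each component factorizes as p_k(x) p_k(y), conditioning on x only reweights the components.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def conditional_mean_y(x, components):
    """E[y | x] under a mixture whose components factorize over x and y.

    components: list of (weight, mean_x, sd_x, mean_y, sd_y) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # Posterior component weights given x: proportional to weight * p_k(x).
    resp = [w * normal_pdf(x, mx, sx) for w, mx, sx, _, _ in components]
    total = sum(resp)
    # The conditional mean is the reweighted average of the component means of y.
    return sum(r * my for r, (_, _, _, my, _) in zip(resp, components)) / total


components = [
    (0.5, -1.0, 1.0, 0.0, 1.0),
    (0.5, 1.0, 1.0, 10.0, 1.0),
]
print(conditional_mean_y(0.0, components))
```

At x = 0 both components are equally plausible, so the conditional mean sits halfway between the two component means of y. For inputs far from every component the weights can underflow to zero, which a production version would handle in log space.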
Integration of survey data and big observational data for finite population inference using mass imputation
Multiple data sources are becoming increasingly available for statistical
analyses in the era of big data. As an important example in finite-population
inference, we consider an imputation approach to combining a probability sample
with big observational data. Unlike the usual imputation for missing data
analysis, we create imputed values for all elements in the probability
sample. Such mass imputation is attractive in the context of survey data
integration (Kim and Rao, 2012). We extend mass imputation as a tool for data
integration of survey data and big non-survey data. The mass imputation methods
and their statistical properties are presented. The matching estimator of
Rivers (2007) is also covered as a special case. Variance estimation with
mass-imputed data is discussed. The simulation results demonstrate that the
proposed estimators outperform existing competitors in terms of robustness and
efficiency.
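A rough sketch of the mass imputation idea, under stated assumptions: the big observational data record both a covariate x and the outcome y, the probability sample records only x and a design weight, and imputation uses a one-dimensional nearest-neighbour match in the spirit of a matching estimator. The function names and data layout are hypothetical and do not reproduce the estimators or variance formulas of the paper.

```python
from bisect import bisect_left


def nearest_neighbor_impute(x_query, donors):
    """Return the y of the donor whose x is closest to x_query.

    donors: list of (x, y) pairs sorted by x.
    """
    xs = [d[0] for d in donors]
    i = bisect_left(xs, x_query)
    # Only the donors immediately left and right of the insertion point can be nearest.
    candidates = donors[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda d: abs(d[0] - x_query))[1]


def mass_imputed_mean(sample, big_data):
    """Weighted mean of imputed outcomes over every unit in the probability sample.

    sample:   list of (x, design_weight) pairs; y is never observed here.
    big_data: list of (x, y) pairs from the non-probability source.
    """
    donors = sorted(big_data)
    total_weight = sum(w for _, w in sample)
    weighted_sum = sum(w * nearest_neighbor_impute(x, donors) for x, w in sample)
    return weighted_sum / total_weight


big_data = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
sample = [(1.1, 2.0), (2.9, 1.0)]
print(mass_imputed_mean(sample, big_data))
```

Every sample unit receives an imputed value, and the design weights carry the representativeness of the probability sample into the final estimate.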
Monte Carlo modified profile likelihood in models for clustered data
The main focus of analysts who deal with clustered data is usually not on
the clustering variables, and hence the group-specific parameters are treated
as nuisance. If a fixed effects formulation is preferred and the total number
of clusters is large relative to the single-group sizes, classical frequentist
techniques relying on the profile likelihood are often misleading. The use of
alternative tools, such as modifications to the profile likelihood or
integrated likelihoods, for making accurate inference on a parameter of
interest can be complicated by the presence of nonstandard modelling and/or
sampling assumptions. We show here how to employ Monte Carlo simulation in
order to approximate the modified profile likelihood in some of these
unconventional frameworks. The proposed solution is widely applicable and is
shown to retain the usual properties of the modified profile likelihood. The
approach is examined in two instances particularly relevant in applications,
i.e. missing-data models and survival models with unspecified censoring
distribution. The effectiveness of the proposed solution is validated via
simulation studies and two clinical trial applications.
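Why the profile likelihood can mislead when clusters are many and small is visible in the classical Neyman-Scott setting, sketched below as a simulation. This is a textbook illustration, not the Monte Carlo modified profile likelihood of the paper: each cluster has its own nuisance mean, observations are normal with common variance, and the default settings (20000 clusters of size 2, unit variance, uniform cluster means) are assumptions of the sketch.

```python
import random


def simulate_variance_estimates(n_clusters=20000, cluster_size=2, sigma=1.0, seed=0):
    """Compare the profile-likelihood variance estimate with a degrees-of-freedom correction."""
    rng = random.Random(seed)
    within_ss = 0.0
    for _ in range(n_clusters):
        mu = rng.uniform(-5.0, 5.0)  # cluster-specific nuisance mean
        y = [rng.gauss(mu, sigma) for _ in range(cluster_size)]
        ybar = sum(y) / cluster_size  # maximum likelihood estimate of mu for this cluster
        within_ss += sum((v - ybar) ** 2 for v in y)
    # Profiling out every cluster mean divides by the total number of observations.
    profile_mle = within_ss / (n_clusters * cluster_size)
    # Dividing by the residual degrees of freedom removes the bias in this simple model.
    corrected = within_ss / (n_clusters * (cluster_size - 1))
    return profile_mle, corrected


profile_mle, corrected = simulate_variance_estimates()
print(profile_mle, corrected)
```

With clusters of size 2 the profile estimate has expectation sigma^2 / 2 however many clusters are added, while the corrected estimate targets sigma^2. In this toy model the correction is available in closed form; the settings discussed in the abstract are ones where no such closed form exists.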