Comparing diagnostic tests with missing data
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, and positive and negative predictive values on some subset of the data that fits the methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review models that use the dependence structure of the completely observed cases to incorporate the information from the partially categorized observations into the analysis, and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models, and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, naive partial analyses should be avoided.
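As a hypothetical illustration of the abstract's warning (a simulation sketch, not the paper's own analysis), the bias of a complete-case ("naive partial") sensitivity estimate can be seen when disease verification is missing at random given the test result, rather than completely at random:

```python
# Illustrative simulation (assumed scenario, not from the paper): when
# verification of true disease status depends on the test result (MAR,
# not MCAR), the verified-only sensitivity estimate is biased upward.
import random

random.seed(1)

def simulate(n=100_000, prev=0.3, sens=0.8, spec=0.9):
    rows = []
    for _ in range(n):
        d = random.random() < prev                       # true disease status
        t = random.random() < (sens if d else 1 - spec)  # test result
        # Verification is far more likely after a positive test:
        verified = random.random() < (0.9 if t else 0.2)
        rows.append((d, t, verified))
    return rows

rows = simulate()
# Complete-case (verified-only) sensitivity estimate:
tp = sum(1 for d, t, v in rows if v and d and t)
fn = sum(1 for d, t, v in rows if v and d and not t)
naive_sens = tp / (tp + fn)
# Full-data sensitivity, available only because this is a simulation:
true_sens = (sum(1 for d, t, _ in rows if d and t)
             / sum(1 for d, _, _ in rows if d))
print(round(naive_sens, 3), round(true_sens, 3))  # naive estimate inflated
```

Here the verified subsample over-represents test-positive patients, so the naive estimate is far above the true sensitivity of 0.8, which is exactly the kind of bias the review addresses.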
Sensitivity analysis for incomplete continuous data
In studies with missing data, statisticians typically identify the model via necessarily untestable assumptions and then perform sensitivity analyses to assess their effect on the conclusions. Both the parameterization and the identification of the model play an important role in translating the assumptions to non-statisticians and, consequently, in obtaining relevant information from experts or historical data. Specifically for continuous data, much of the earlier work has been developed under the assumption of normality and/or with hard-to-interpret sensitivity parameters. We derive a simple approach for estimating means, standard deviations and correlations that avoids parametric distributional assumptions for the outcomes. Adopting a pattern-mixture model parameterization, we use non-identifiable means, standard deviations, correlations or functions thereof as sensitivity parameters, which are more easily elicited.
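A minimal pattern-mixture sketch (illustrative only; the values and the shift parameter `delta` are assumptions, not the paper's estimator) shows why such sensitivity parameters are easy to elicit: the mean of the unobserved pattern is written as the observed mean plus a shift on the outcome's own scale.

```python
# Pattern-mixture identification sketch: E[Y | missing] is set to
# E[Y | observed] + delta, where delta is a nonidentifiable sensitivity
# parameter elicited directly in the outcome's units (delta = 0
# corresponds to no systematic difference between patterns).
import random
import statistics

random.seed(2)
y = [random.gauss(10, 2) for _ in range(5000)]
observed = [v for v in y if random.random() < 0.7]   # roughly 30% missing
p_miss = 1 - len(observed) / len(y)
mean_obs = statistics.fmean(observed)

for delta in (-1.0, 0.0, 1.0):                       # elicited shifts
    # Overall mean under the pattern-mixture identification:
    # E[Y] = (1 - p_miss) * E[Y | obs] + p_miss * (E[Y | obs] + delta)
    overall = (1 - p_miss) * mean_obs + p_miss * (mean_obs + delta)
    print(f"delta={delta:+.1f}  E[Y]={overall:.2f}")
```

Reporting the overall mean across a plausible range of `delta` is the sensitivity analysis: the data alone cannot distinguish these scenarios, but each value of `delta` has a direct subject-matter interpretation.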
Semi-parametric Bayesian analysis of binary responses with a continuous covariate subject to non-random missingness
Missingness in explanatory variables requires a model for the covariates even if the interest lies only in a model for the outcomes given the covariates. An incorrect specification of the models for the covariates or for the missingness mechanism may lead to biased inferences for the parameters of interest. Previously published articles either use semi-/non-parametric flexible distributions for the covariates and identify the model via a missing at random assumption, or employ parametric distributions for the covariates and allow a more general non-random missingness mechanism. We consider the analysis of binary responses, combining a missing not at random mechanism with a non-parametric model based on a Dirichlet process mixture for the continuous covariates. We illustrate the proposal with simulations and the analysis of a dataset.
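As a rough, self-contained sketch (pure standard library; parameter values are assumptions and this is not the paper's sampler), the stick-breaking construction underlying a Dirichlet process mixture for a continuous covariate looks like this:

```python
# Truncated stick-breaking representation of a Dirichlet process mixture
# of normals for a continuous covariate (illustration only; the paper's
# model additionally couples this with a missing-not-at-random mechanism
# for the binary responses).
import random

random.seed(3)

def stick_breaking(alpha, n_atoms=50):
    """Truncated weights: w_k = v_k * prod_{j<k} (1 - v_j), v_k ~ Beta(1, alpha)."""
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = random.betavariate(1, alpha)
        weights.append(v * remaining)
        remaining *= 1 - v
    return weights

def draw_covariate(weights, atoms, sd=0.5):
    """Sample from the implied countable mixture of normals."""
    k = random.choices(range(len(weights)), weights=weights)[0]
    return random.gauss(atoms[k], sd)

alpha = 2.0                                    # concentration parameter
weights = stick_breaking(alpha)
atoms = [random.gauss(0, 3) for _ in weights]  # atom locations from the base measure
sample = [draw_covariate(weights, atoms) for _ in range(1000)]
print(round(sum(weights), 3))                  # close to 1 for a generous truncation
```

The appeal for covariate modelling is that no fixed number of mixture components has to be chosen in advance: the concentration parameter `alpha` governs how many components receive appreciable weight.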
Inferential Implications of Over-Parametrization: A Case Study in Incomplete Categorical Data
In the context of either Bayesian or classical sensitivity analyses of over-parametrized models for incomplete categorical data, it is well known that the prior-dependence of posterior inferences for nonidentifiable parameters, or the choice of too-parsimonious over-parametrized models, may lead to erroneous conclusions. Nevertheless, some authors either pay no attention to which parameters are nonidentifiable or do not appropriately account for possible prior-dependence. We review the literature on this topic and consider simple examples to emphasize that in both inferential frameworks the subjective components can influence results in nontrivial ways, irrespective of the sample size. Specifically, we show that prior distributions commonly regarded as slightly informative or noninformative may actually be too informative for nonidentifiable parameters, and that the choice of over-parametrized models may drastically impact the results, suggesting that a careful examination of their effects should be considered before drawing conclusions.
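A toy numerical example (hypothetical, not taken from the paper) makes the point concrete: when the data identify only the sum of two probabilities, the posterior for the nonidentified split reproduces its prior, so two priors that both look vague give different answers at any sample size.

```python
# The data (k successes in n trials) inform s = a + b only; the split
# r = a / (a + b) never enters the likelihood, so its "posterior" equals
# its prior however large n is, and the inference for a = s * r shifts
# with the prior chosen for r. (Illustrative example, assumed setup.)
import math

GRID = [(i + 0.5) / 400 for i in range(400)]

def beta_grid_mean(a, b):
    """Mean of a Beta(a, b) density via a midpoint grid (self-normalized)."""
    logc = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    dens = [math.exp(logc + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))
            for x in GRID]
    return sum(x * d for x, d in zip(GRID, dens)) / sum(dens)

def post_mean_a(k, n, r_prior):
    # s | data ~ Beta(k + 1, n - k + 1) under a uniform prior (identified);
    # r is independent of the data, so its posterior is exactly its prior.
    return beta_grid_mean(k + 1, n - k + 1) * beta_grid_mean(*r_prior)

k, n = 600, 1000
for prior in ((1, 1), (2, 1)):   # two "noninformative-looking" Beta priors on r
    print(prior, round(post_mean_a(k, n, prior), 3))
    # the posterior mean of a differs with the prior, no matter how large n
```

Both Beta(1, 1) and Beta(2, 1) might be labelled weakly informative, yet they fix the answer for the nonidentified component; only a sensitivity analysis over such priors, not more data, reveals this.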