232,212 research outputs found

Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study

Author: A Burton
A Marshall
AH Herring
AM Wood
Andrea Marshall
DB Rubin
Douglas G Altman
F Barzi
FE Harrell
FH Kong
HY Chen
I White
J Schafer
J Scheffer
JL Schafer
KH Li
LM Collins
LQ Tang
M Hu
N Schenker
NJ Horton
P Royston
Patrick Royston
PD Faris
R Bender
R Development Core Team
R Oostenbrink
RJA Little
Roger L Holder
S Demissie
S Greenland
S van Buuren
SR Lipsitz
TG Clark
W Sauerbrei
W Vach
XL Meng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: There is no consensus on the most appropriate approach to handle missing covariate data within prognostic modelling studies. Therefore a simulation study was performed to assess the effects of different missing data techniques on the performance of a prognostic model. Methods: Datasets were generated to resemble the skewed distributions seen in a motivating breast cancer example. Multivariate missing data were imposed on four covariates using four different mechanisms: missing completely at random (MCAR), missing at random (MAR), missing not at random (MNAR) and a combination of all three mechanisms. Five amounts of incomplete cases from 5% to 75% were considered. Complete case analysis (CC), single imputation (SI) and five multiple imputation (MI) techniques available within the R statistical software were investigated: a) data augmentation (DA) approach assuming a multivariate normal distribution, b) DA assuming a general location model, c) regression switching imputation, d) regression switching with predictive mean matching (MICE-PMM) and e) flexible additive imputation models. A Cox proportional hazards model was fitted and appropriate estimates for the regression coefficients and model performance measures were obtained. Results: Performing a CC analysis produced unbiased regression estimates but inflated standard errors, which affected the significance of the covariates in the model with 25% or more missingness. Using SI underestimated the variability, resulting in poor coverage even with 10% missingness. Of the MI approaches, applying MICE-PMM produced, in general, the least biased estimates and better coverage for the incomplete covariates and better model performance for all mechanisms. However, this MI approach still produced biased regression coefficient estimates for the incomplete skewed continuous covariates when 50% or more cases had missing data imposed with an MCAR, MAR or combined mechanism. When the missingness depended on the incomplete covariates, i.e. MNAR, estimates were biased with more than 10% incomplete cases for all MI approaches. Conclusion: The results from this simulation study suggest that performing MICE-PMM may be the preferred MI approach provided that less than 50% of the cases have missing data and the missing data are not MNAR.
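The three missingness mechanisms named in the abstract above can be illustrated with a minimal Python sketch. It is not the study's simulation code (the study used R and breast-cancer-like skewed data); the function name `impose_missing`, the covariate names `x` and `z`, the sign-based thresholds and the 25% rate are all assumptions chosen only to make the distinction concrete.

```python
import random


def impose_missing(rows, mechanism, rate=0.25, seed=0):
    """Return a copy of rows with covariate 'x' set to None for some rows.

    rows: list of dicts with a fully observed 'z' and a covariate 'x'.
    MCAR: the chance of losing x ignores the data entirely.
    MAR:  the chance of losing x depends only on the observed z.
    MNAR: the chance of losing x depends on x itself.
    (Hypothetical illustration; thresholds and rates are assumptions.)
    """
    rng = random.Random(seed)
    out = []
    for row in rows:
        if mechanism == "MCAR":
            p = rate
        elif mechanism == "MAR":
            p = 2 * rate if row["z"] > 0 else 0.0
        elif mechanism == "MNAR":
            p = 2 * rate if row["x"] > 0 else 0.0
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        new_row = dict(row)  # copy so the complete data stay untouched
        if rng.random() < p:
            new_row["x"] = None
        out.append(new_row)
    return out


# Toy complete data: two independent standard normal covariates.
data_rng = random.Random(1)
data = [{"z": data_rng.gauss(0, 1), "x": data_rng.gauss(0, 1)} for _ in range(2000)]

mcar = impose_missing(data, "MCAR")
mar = impose_missing(data, "MAR")
mnar = impose_missing(data, "MNAR")
```

Under MNAR in this sketch, only rows whose own `x` is positive can lose it, so the observed values of `x` are no longer representative, which is the situation the abstract reports as problematic for all MI approaches.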

Crossref

Springer - Publisher Connector

University of Birmingham Research Portal

PubMed Central

UCL Discovery

Warwick Research Archives Portal Repository

Oxford University Research Archive

Fuzzy Logic in Clinical Practice Decision Support Systems

Author: Beliakov Gleb
Warren Jim
Zwaag Berend van der
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2000

Computerized clinical guidelines can provide significant benefits to health outcomes and costs; however, their effective implementation presents significant problems. The vagueness and ambiguity inherent in natural (textual) clinical guidelines are not readily amenable to formulating automated alerts or advice. Fuzzy logic allows us to formalize the treatment of vagueness in a decision support architecture. This paper discusses sources of fuzziness in clinical practice guidelines. We consider how fuzzy logic can be applied and give a set of heuristics for the clinical guideline knowledge engineer for addressing uncertainty in practice guidelines. We describe the specific applicability of fuzzy logic to the decision support behavior of Care Plan On-Line, an intranet-based chronic care planning system for General Practitioners.
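As a rough illustration of how a vague guideline phrase might be formalized with fuzzy logic, the sketch below grades two conditions between 0 and 1 and combines them with the common min operator for fuzzy AND. It is not taken from the paper or from Care Plan On-Line; the function names and every threshold (130 to 160 mmHg, 6 to 12 months) are hypothetical and carry no clinical meaning.

```python
def ramp_up(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fuzzy_and(a, b):
    """Fuzzy conjunction using the minimum of the two memberships."""
    return min(a, b)


def alert_strength(systolic_mmhg, months_since_review):
    """Degree to which 'elevated pressure AND review overdue' holds.

    Thresholds are illustrative assumptions, not clinical guidance.
    """
    elevated = ramp_up(systolic_mmhg, 130, 160)
    overdue = ramp_up(months_since_review, 6, 12)
    return fuzzy_and(elevated, overdue)


strength = alert_strength(145, 12)  # halfway 'elevated', fully 'overdue'
```

A decision support system could then show an alert only when the strength exceeds some cut-off, rather than switching abruptly at a single crisp threshold.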

CiteSeerX

Deakin Research Online

Crossref

University of Twente Research Information

Multiple Imputation Using Gaussian Copulas

Author: Bojinov Iavor
Hollenbach Florian M.
Metternich Nils W.
Minhas Shahryar
Volfovsky Alexander
Ward Michael D.
Publication date: 01/01/2018

Missing observations are pervasive throughout empirical research, especially in the social sciences. Despite multiple approaches to dealing adequately with missing data, many scholars still fail to address this vital issue. In this paper, we present a simple-to-use method for generating multiple imputations using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff, 2007) allows scholars to attain estimation results that have good coverage and small bias. The use of copulas to model the dependence among variables will enable researchers to construct valid joint distributions of the data, even without knowledge of the actual underlying marginal distributions. Multiple imputations are then generated by drawing observations from the resulting posterior joint distribution and replacing the missing values. Using simulated and observational data from published social science research, we compare imputation via Gaussian copulas with two other widely used imputation methods: MICE and Amelia II. Our results suggest that the Gaussian copula approach has a slightly smaller bias, higher coverage rates, and narrower confidence intervals compared to the other methods. This is especially true when the variables with missing data are not normally distributed. These results, combined with theoretical guarantees and ease of use, suggest that the approach examined provides an attractive alternative for applied researchers undertaking multiple imputations.

arXiv.org e-Print Archive

UCL Discovery

A conceptual approach to gene expression analysis enhanced by visual analytics

Author: Andrews Simon
Aufaure Marie-Aude
Burger Albert
McLeod Kenneth
Melo Cassio
Orphanides Constantinos
Publication venue: ACM New York
Publication date: 01/01/2013

The analysis of gene expression data is a complex task for biologists wishing to understand the role of genes in the formation of diseases such as cancer. Biologists need greater support when trying to discover, and comprehend, new relationships within their data. In this paper, we describe an approach to the analysis of gene expression data where overlapping groupings are generated by Formal Concept Analysis and interactively analyzed in a tool called CUBIST. The CUBIST workflow involves querying a semantic database and converting the result into a formal context, which can be simplified to make it manageable, before it is visualized as a concept lattice and associated charts.
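To show what a formal context and its overlapping groupings look like, here is a minimal Python sketch of Formal Concept Analysis on a toy context. It is not the CUBIST implementation; the gene and tissue names are invented, and the brute-force enumeration over attribute subsets is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical formal context: object (gene) -> attributes (tissues
# in which it is expressed). All names are made up for illustration.
context = {
    "geneA": {"heart", "liver"},
    "geneB": {"heart"},
    "geneC": {"liver", "brain"},
}


def extent(attrs, context):
    """All objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(objs, context, all_attrs):
    """All attributes shared by every object in objs."""
    shared = set(all_attrs)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing each attribute subset."""
    all_attrs = frozenset().union(*context.values())
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), r):
            objs = extent(frozenset(combo), context)
            concepts.add((objs, intent(objs, context, all_attrs)))
    return concepts


concepts = formal_concepts(context)
for objs, attrs in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(objs), sorted(attrs))
```

In this toy context `geneA` belongs both to the group of genes expressed in the heart and to the group expressed in the liver, which is the kind of overlapping grouping the abstract refers to; ordering the concepts by inclusion of their extents yields the concept lattice.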

CiteSeerX

Heriot Watt Pure

Crossref

Sheffield Hallam University Research Archive

Reasoning about context in uncertain pervasive computing environments

Author: A. Padovitz
A. Ranganathan
C.B. Anagnostopoulos
G. Bruce
H.J. Zimmermann
J.M. Mendel
J.R. Jang
K. Henricksen
P. Castro
P. Haghighi
Publication date: 01/01/2008

Crossref

Portsmouth University Research Portal (Pure)

An Empirical Approach to Temporal Reference Resolution

Author: Janyce Wiebe
Kenneth Mckeever
Thorsten Öhrström-Sandgren
Tom O'
Publication date: 01/01/1997

This paper presents the results of an empirical investigation of temporal reference resolution in scheduling dialogs. The algorithm adopted is primarily a linear-recency based approach that does not include a model of global focus. A fully automatic system has been developed and evaluated on unseen test data with good results. This paper presents the results of an intercoder reliability study, a model of temporal reference resolution that supports linear recency and has very good coverage, the results of the system evaluated on unseen test data, and a detailed analysis of the dialogs assessing the viability of the approach.

arXiv.org e-Print Archive

CiteSeerX