223 research outputs found

An online survey of adults with cystic fibrosis: accessing and using life expectancy information

Author: Keogh Ruth
Publication venue: London School of Hygiene & Tropical Medicine
Publication date: 12/03/2019

A spreadsheet containing a subset of the original data from all respondents (n=85) from an online questionnaire entitled "Online survey to gain understanding of what people with cystic fibrosis aged 16+ would like to learn about their life expectancy and other outcomes". The survey was conducted in July 2016. Responses to all multiple choice questions are included. Free text responses have been removed in accordance with information provided to the respondents. Ages have been categorised. The data do not contain any identifying information.

LSHTM Data Compass

Bayesian correction for covariate measurement error: a frequentist evaluation and comparison with regression calibration

Authors: Bartlett Jonathan W.; Keogh Ruth H.
Publication venue: SAGE Publications
Publication date: 20/03/2016

Bayesian approaches for handling covariate measurement error are well established, and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration (RC), arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to RC, and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next we describe the closely related maximum likelihood and multiple imputation approaches, and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of RC and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey

arXiv.org e-Print Archive

Crossref

LSHTM Research Online

A guide to interpreting estimated median age of survival in cystic fibrosis patient registry reports.

Authors: Keogh Ruth H; Stanojevic Sanja
Publication venue: Elsevier BV
Publication date: 01/03/2018

Survival statistics, estimated using data collected by national cystic fibrosis (CF) patient registries, are used to inform the CF community and monitor survival of CF populations. Annual registry reports typically give the median age of survival, though different registries use different estimation approaches and terminology, which has created confusion for the community. In this article we explain how median age of survival is estimated, what its interpretation is, and what assumptions and limitations are involved. Information on survival from birth is less useful for individuals who have already reached a certain age and we propose use of conditional survivor curves to address this. We provide recommendations for CF registries with the aim of facilitating clear and consistent reporting of survival statistics. Our recommendations are illustrated using data from the UK Cystic Fibrosis Registry

Crossref

LSHTM Research Online
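The abstract above contrasts median age of survival from birth with conditional survivor curves for people who have already reached a given age. A minimal sketch of that distinction follows. The survivor values are invented for illustration and are not taken from any registry, and the function names are hypothetical. It uses the common convention that the median is the first age at which the survivor function falls to 0.5 or below, and the identity S(t | alive at a) = S(t) / S(a).

```python
def median_survival_age(ages, surv):
    """Return the first age at which the survivor function is <= 0.5.

    Returns None if the curve never reaches 0.5 on the supplied grid.
    """
    for age, s in zip(ages, surv):
        if s <= 0.5:
            return age
    return None


def conditional_survivor(ages, surv, current_age):
    """Survivor curve conditional on being alive at current_age.

    Uses S(t | alive at a) = S(t) / S(a) for t >= a.
    current_age must be one of the grid points in ages.
    """
    s_now = dict(zip(ages, surv))[current_age]
    cond_ages = [a for a in ages if a >= current_age]
    cond_surv = [s / s_now for a, s in zip(ages, surv) if a >= current_age]
    return cond_ages, cond_surv


# Hypothetical survivor curve on a coarse 10-year grid (illustration only).
ages = [0, 10, 20, 30, 40, 50, 60]
surv = [1.00, 0.98, 0.92, 0.80, 0.60, 0.45, 0.30]

median_from_birth = median_survival_age(ages, surv)
cond_ages, cond_surv = conditional_survivor(ages, surv, 30)
median_given_30 = median_survival_age(cond_ages, cond_surv)

print(median_from_birth, median_given_30)
```

With these invented numbers the median from birth falls at an earlier age than the median for someone already aged 30, which is the reason the abstract gives for reporting conditional curves.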

A toolkit for measurement error correction, with a focus on nutritional epidemiology.

Authors: Keogh Ruth H; White Ian R
Publication venue: Wiley
Publication date: 04/02/2014

Exposure measurement error is a problem in many epidemiological studies, including those using biomarkers and measures of dietary intake. Measurement error typically results in biased estimates of exposure-disease associations, the severity and nature of the bias depending on the form of the error. To correct for the effects of measurement error, information additional to the main study data is required. Ideally, this is a validation sample in which the true exposure is observed. However, in many situations, it is not feasible to observe the true exposure, but there may be available one or more repeated exposure measurements, for example, blood pressure or dietary intake recorded at two time points. The aim of this paper is to provide a toolkit for measurement error correction using repeated measurements. We bring together methods covering classical measurement error and several departures from classical error: systematic, heteroscedastic and differential error. The correction methods considered are regression calibration, which is already widely used in the classical error setting, and moment reconstruction and multiple imputation, which are newer approaches with the ability to handle differential error. We emphasize practical application of the methods in nutritional epidemiology and other fields. We primarily consider continuous exposures in the exposure-outcome model, but we also outline methods for use when continuous exposures are categorized. The methods are illustrated using the data from a study of the association between fibre intake and colorectal cancer, where fibre intake is measured using a diet diary and repeated measures are available for a subset

Crossref

LSHTM Research Online

PubMed Central
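The toolkit abstract above mentions regression calibration with repeated exposure measurements under classical error. The sketch below covers only the simplest special case, and it is not the authors' implementation: one continuous exposure, two replicate measurements per person, and a linear outcome model, where regression calibration reduces to dividing the naive slope by an estimated reliability ratio. All data values and function names are hypothetical.

```python
from statistics import mean, variance


def slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def reliability_ratio(w1, w2):
    """Estimate var(X) / var(W) from two replicates W = X + U.

    Under classical error, var(W1 - W2) = 2 * var(U),
    so var(X) is estimated by var(W1) - var(U).
    """
    var_u = variance([a - b for a, b in zip(w1, w2)]) / 2
    var_w = variance(w1)
    return (var_w - var_u) / var_w


# Hypothetical replicate exposure measurements and outcome (illustration only).
w1 = [1.0, 2.0, 3.0, 4.0]
w2 = [2.0, 1.0, 4.0, 3.0]
y = [0.3 * w for w in w1]

naive = slope(w1, y)              # slope using the error-prone measurement
lam = reliability_ratio(w1, w2)   # estimated attenuation factor
corrected = naive / lam           # corrected slope in this special case

print(naive, lam, corrected)
```

Systematic, heteroscedastic, or differential error, which the abstract also covers, would need the other methods it lists and are outside this sketch.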

Handling missing data in matched case-control studies using multiple imputation.

Authors: Keogh Ruth H; Seaman Shaun R
Publication venue: Biometrics
Publication date: 03/08/2015

Analysis of matched case-control studies is often complicated by missing data on covariates. Analysis can be restricted to individuals with complete data, but this is inefficient and may be biased. Multiple imputation (MI) is an efficient and flexible alternative. We describe two MI approaches. The first uses a model for the data on an individual and includes matching variables; the second uses a model for the data on a whole matched set and avoids the need to model the matching variables. Within each approach, we consider three methods: full-conditional specification (FCS), joint model MI using a normal model, and joint model MI using a latent normal model. We show that FCS MI is asymptotically equivalent to joint model MI using a restricted general location model that is compatible with the conditional logistic regression analysis model. The normal and latent normal imputation models are not compatible with this analysis model. All methods allow for multiple partially-observed covariates, non-monotone missingness, and multiple controls per case. They can be easily applied in standard statistical software and valid variance estimates obtained using Rubin's Rules. We compare the methods in a simulation study. The approach of including the matching variables is most efficient. Within each approach, the FCS MI method generally yields the least-biased odds ratio estimates, but normal or latent normal joint model MI is sometimes more efficient. All methods have good confidence interval coverage. Data on colorectal cancer and fibre intake from the EPIC-Norfolk study are used to illustrate the methods, in particular showing how efficiency is gained relative to just using individuals with complete data

Crossref

LSHTM Research Online

PubMed Central

Apollo (Cambridge)
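The abstract above notes that valid variance estimates for the multiply imputed analyses are obtained using Rubin's Rules. A minimal sketch of that pooling step is given below. The per-imputation estimates and variances are invented for illustration, and the function name is hypothetical. The pooled estimate is the mean across imputations, and the total variance is the within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
from statistics import mean, variance


def rubins_rules(estimates, variances):
    """Pool m point estimates and their variances with Rubin's Rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratio estimates from m = 3 imputed data sets.
estimates = [0.4, 0.5, 0.6]
variances = [0.04, 0.05, 0.06]

pooled, total = rubins_rules(estimates, variances)
print(pooled, total)
```

In practice the estimates would come from fitting the conditional logistic regression model to each imputed data set; that fitting step is not shown here.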

Simulating data from marginal structural models for a survival time outcome

Authors: Keogh Ruth H; Seaman Shaun R
Publication date: 10/09/2023

Marginal structural models (MSMs) are often used to estimate causal effects of treatments on survival time outcomes from observational data when time-dependent confounding may be present. They can be fitted using, e.g., inverse probability of treatment weighting (IPTW). It is important to evaluate the performance of statistical methods in different scenarios, and simulation studies are a key tool for such evaluations. In such simulation studies, it is common to generate data in such a way that the model of interest is correctly specified, but this is not always straightforward when the model of interest is for potential outcomes, as is an MSM. Methods have been proposed for simulating from MSMs for a survival outcome, but these methods impose restrictions on the data-generating mechanism. Here we propose a method that overcomes these restrictions. The MSM can be a marginal structural logistic model for a discrete survival time or a Cox or additive hazards MSM for a continuous survival time. The hazard of the potential survival time can be conditional on baseline covariates, and the treatment variable can be discrete or continuous. We illustrate the use of the proposed simulation algorithm by carrying out a brief simulation study. This study compares the coverage of confidence intervals calculated in two different ways for causal effect estimates obtained by fitting an MSM via IPTW. Comment: 29 pages, 2 figures

arXiv.org e-Print Archive

Genetically enhanced recombinant lectins for glyco-selective analysis and purification

Authors: Clarke Paul A.; Keogh Damien; Larragy Ruth; O'Connell Michael; O'Connor Brendan; Thompson Roisin
Publication date: 16/06/2011

- Generation of a library of recombinant prokaryotic lectins (RPLs) through random mutagenesis of the carbohydrate binding sites of bacterial lectins.
- Characterisation of mutant lectins with respect to structure and specificity.
- Provision of mutant RPLs with enhanced affinity and/or altered specificity, alongside wild-type RPLs, for glycoprotein analysis and purification.

DCU Online Research Access Service

Using full-cohort data in nested case-control and case-cohort studies by multiple imputation.

Authors: Keogh Ruth H; White Ian R
Publication venue: Wiley
Publication date: 23/04/2013

In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure-disease association studies are therefore often based on nested case-control or case-cohort studies in which complete information is obtained only for sampled individuals. However, in the full cohort, there may be a large amount of information on cheaply available covariates and possibly a surrogate of the main exposure(s), which typically goes unused. We view the nested case-control or case-cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed. We use the fully observed data to fit the imputation models. We consider using approximate imputation models and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case-control and case-cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case-control studies and a weighted analysis for case-cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well

Crossref

LSHTM Research Online

Big data: Some statistical issues.

Authors: Cox DR; Kartsonaki Christiana; Keogh Ruth H
Publication venue: Elsevier BV
Publication date: 01/01/2018

A broad review is given of the impact of big data on various aspects of investigation. There is some but not total emphasis on issues in epidemiological research.

Crossref

LSHTM Research Online

Oxford University Research Archive