First CLADAG data mining prize : data mining for longitudinal data with different marketing campaigns
The CLAssification and Data Analysis Group (CLADAG) of the Italian
Statistical Society recently organised a competition, the 'Young Researcher Data
Mining Prize' sponsored by the SAS Institute. This paper was the winning entry
and in it we detail our approach to the problem proposed and our results. The main
methods used are linear regression, mixture models, Bayesian autoregressive and
Bayesian dynamic models.
Modelling treatment, age- and gender-specific recovery in acute injury studies
Background: Acute injury studies often measure physical ability repeatedly over
time through scores that have a finite range. This can result in a faster score change
at the beginning of the study than towards the end, motivating the investigation of
the rate of change. Additionally, the bounds of the score and their dependence on
covariates are often of interest.
Methods: We argue that transforming bounded data is not satisfactory in some
settings. Motivated by the Collaborative Ankle Support Trial (CAST), which investigated
different methods of immobilisation for severe ankle sprains, we developed a
model under the assumption that the recovery rate at a specific time is proportional to
the current score and the remaining score. This model enables a direct interpretation
of the covariate effects. We have re-analyzed the CAST data using these improved
methods, and explored novel relationships between age, gender and recovery rate.
Results: We confirm that using a below-knee cast is advantageous compared with a
tubular bandage with respect to the recovery rate. An age and gender effect on the
recovery rate and the maximum achievable score is demonstrated, with older female
patients recovering more slowly (age effect: -0.21, 95% confidence interval (CI) [-0.28, -0.14];
gender effect: -0.06, CI [-0.12, -0.004]) and achieving a lower maximum score
(age effect: -8.07, CI [-11.68, -4.01]; gender effect: -5.34, CI [-8.18, -2.50]) than younger
male patients.
Conclusions: Our model accurately describes repeated measurements on the
original scale, while accounting for the bounded nature of a score. We demonstrate
that recovery in acute injury trials can differ substantially by age and gender. Older
female patients are less likely to recover well from a sprain.
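The proportionality assumption stated in the Methods can be written as a differential equation. The following is a sketch under assumed notation, not necessarily the parameterisation used in the paper: y(t) is the score at time t, the lower bound is taken as 0, M is the maximum achievable score, k is the rate constant and y_0 is the baseline score. None of these symbols appear in the abstract.

```latex
\begin{align}
  \frac{dy}{dt} &= k \, y(t) \, \bigl(M - y(t)\bigr), \qquad y(0) = y_0, \\
  y(t) &= \frac{M}{1 + \dfrac{M - y_0}{y_0} \, e^{-k M t}} .
\end{align}
```

Under this form the score changes fastest while it is far from both bounds and levels off as it approaches M, which matches the pattern described in the Background. Covariate effects such as age and gender could then enter through k and M.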
Sensitivity analysis for causality in observational studies for regulatory science
Recognizing the importance of real-world data (RWD) for regulatory purposes,
the United States (US) Congress passed the 21st Century Cures Act, mandating
the development of Food and Drug Administration (FDA) guidance on regulatory
use of real-world evidence. The Forum on the Integration of Observational and
Randomized Data (FIORD) conducted a meeting bringing together various
stakeholder groups to build consensus around best practices for the use of RWD
to support regulatory science. Our companion paper describes in detail the
context and discussion carried out in the meeting, which includes a
recommendation to use a causal roadmap for complete pre-specification of study
designs using RWD. This article discusses one step of the roadmap: the
specification of a procedure for sensitivity analysis, defined as a procedure
for testing the robustness of substantive conclusions to violations of
assumptions made in the causal roadmap. We include a worked example of a
sensitivity analysis from an RWD study on the effectiveness of Nifurtimox in
treating Chagas disease, as well as an overview of various methods available
for sensitivity analysis in causal inference, emphasizing practical
considerations on their use for regulatory purposes.
Analysing the rate of change in a longitudinal study with missing data, taking into account the number of contact attempts
In longitudinal and multivariate settings, incomplete data due to missed visits, dropouts or non-return of
questionnaires are quite common.
A longitudinal trial in which potentially informative missingness occurs is the Collaborative Ankle Support
Trial (CAST). The aim of this study is to estimate the clinical effectiveness of four different methods of
mechanical support after severe ankle sprain. The clinical status of multiple subjects was measured at four
points in time via a questionnaire and, based on this, a continuous and bounded outcome score was calculated.
Motivated by this study, a model is proposed for continuous longitudinal data with non-ignorable or
informative missingness, taking into account the number of attempts made to contact initial non-responders.
The model combines a non-linear mixed model for the underlying response model with a logistic regression
model for the reminder process.
The outcome model enables us to analyze the rate of improvement including the dependence on explanatory
variables. The non-linear mixed model is derived under the assumption that the rate of improvement in a given
time interval is proportional to the current score and the still achievable score. Based on this assumption a
differential equation is solved in order to obtain the model of interest.
The response model relates the probability of response at each contact attempt and point in time to
covariates and to observed and missing outcomes.
Using this model, the impact of missingness on the rate of improvement is evaluated for different missingness
processes.
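The two components described above, a bounded growth curve for the outcome and a logistic regression for the response at each contact attempt, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the model fitted in the paper: the function names, the lower bound of 0, and all parameter values (k, upper, y0, b0, b_attempt, b_score) are hypothetical, and random effects are omitted for brevity.

```python
import math
import random


def logistic_score(t, k, upper, y0):
    """Closed-form solution of dy/dt = k * y * (upper - y) with y(0) = y0."""
    return upper / (1.0 + (upper - y0) / y0 * math.exp(-k * upper * t))


def response_probability(attempt, score, b0, b_attempt, b_score):
    """Logistic regression for responding at a given contact attempt.

    Letting the probability depend on the (possibly unobserved) score
    is what makes the missingness informative in this sketch.
    """
    eta = b0 + b_attempt * attempt + b_score * score
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_visit(t, rng, max_attempts=3):
    """Return (score, attempt) where attempt is None if no response occurs."""
    # Hypothetical parameter values chosen only for illustration.
    score = logistic_score(t, k=0.005, upper=90.0, y0=30.0)
    for attempt in range(1, max_attempts + 1):
        p = response_probability(attempt, score,
                                 b0=-1.0, b_attempt=0.3, b_score=0.02)
        if rng.random() < p:
            return score, attempt
    return score, None


if __name__ == "__main__":
    rng = random.Random(1)
    for t in (0.0, 1.0, 3.0, 9.0):
        print(t, simulate_visit(t, rng))
```

The score is recorded only when an attempt succeeds; the attempt number at which it succeeds is the extra information that a reminder-based model can use.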
Analysis of repeated measurements with missing data
This thesis discusses issues arising in the analysis of repeated measurement studies with
missing data.
The first part of the thesis is motivated by a study where continuous and bounded longitudinal
data form the outcome of interest. The aim of this study is to investigate the change
over time in the outcome variable and factors that influence this change. The analysis is
complicated because some patients withdraw from the study, leading to an incomplete data
set.
We propose a non-linear mixed model that specifies the rate of change and the
bounds of the outcome as a function of covariates. This mixed model has advantages over
transforming the data and is easy to interpret. We discuss different models for the covariance
structure of bounded continuous longitudinal data.
To explore the impact of missingness, we perform several sensitivity analyses. Further,
we propose a model for informative missingness, taking into account the number and
nature of reminders made to contact initial non-responders, and evaluate the impact of missingness
on estimates of change. We contrast this model with the traditional selection model,
where the missingness process is modelled.
Our investigations suggest that using the richer information of the reminder process
enables a more accurate choice of the covariates that induce missingness than modelling the
missingness process. Regarding the reminder process, we observe that phone calls are most
effective.
The second part of this thesis is motivated by dose-finding studies, where the number of
events per subject within a specified study period forms the primary outcome. These studies
aim to identify a target dose for which the new drug can be shown to be as effective as a
competitor medication. Given a pain-related outcome, we expect many patients to drop out
before the end of the study. The impact of missingness on the analysis and models for the
missingness process must be carefully considered.
The recurrent events are modelled as over-dispersed Poisson process data, with dose
as regressor. Additional covariates may be included. Constant and time-varying rate functions
are examined. Based on a range of such models, the impact of missingness on the
precision of the target dose estimation is evaluated by simulations. Five different analysis
methods are assessed: a complete case analysis; two analyses using different single imputation techniques; a direct likelihood analysis; and an analysis using pattern-mixture models.
The target dose estimation is robust if the same missingness process holds for the
target dose group and the active control group. This robustness is lost as soon as the missingness
mechanisms for the active control and the target dose differ. Of the methods explored,
the direct-likelihood approach performs best, even when a missing not at random
mechanism holds.
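The kind of data described in the second part, over-dispersed recurrent event counts with dose as regressor and incomplete follow-up, could be generated along the following lines. This is only an illustrative sketch and not the simulation design of the thesis: it assumes a gamma frailty to induce over-dispersion, a constant rate function, a log-linear dose effect, and exponential dropout that is unrelated to the outcome. All names and parameter values (b0, b1, shape, dropout_rate, study_length) are hypothetical.

```python
import math
import random


def simulate_subject(dose, rng, study_length=12.0, b0=0.5, b1=-0.1,
                     shape=2.0, dropout_rate=0.05):
    """Simulate one subject's event count and observed follow-up time."""
    # Gamma frailty with mean 1 induces over-dispersion relative to Poisson.
    frailty = rng.gammavariate(shape, 1.0 / shape)
    rate = frailty * math.exp(b0 + b1 * dose)
    # Exponential dropout time, censored at the end of the study.
    follow_up = min(rng.expovariate(dropout_rate), study_length)
    # Count events of a homogeneous Poisson process on [0, follow_up].
    events = 0
    t = rng.expovariate(rate)
    while t <= follow_up:
        events += 1
        t += rng.expovariate(rate)
    return events, follow_up


def crude_rate(dose, n, seed=0):
    """Total events divided by total observed follow-up in one dose group."""
    rng = random.Random(seed)
    data = [simulate_subject(dose, rng) for _ in range(n)]
    total_events = sum(e for e, _ in data)
    total_time = sum(f for _, f in data)
    return total_events / total_time


if __name__ == "__main__":
    for dose in (0.0, 2.5, 5.0):
        print(dose, round(crude_rate(dose, n=2000), 3))
```

Because dropout here does not depend on the event process, the exposure-adjusted rate stays close to exp(b0 + b1 * dose). Making dropout_rate depend on the frailty or on the dose group would be one way to explore the differing missingness mechanisms that the abstract reports as problematic.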
Higher Risk of Hypoglycemia with Glimepiride Versus Vildagliptin in Patients with Type 2 Diabetes is not Driven by High Doses of Glimepiride: Divergent Patient Susceptibilities?
In a previously published study, vildagliptin showed a reduced risk of hypoglycemia versus glimepiride as add-on therapy to metformin at similar efficacy. Glimepiride was titrated from a starting dose of 2 mg/day to a maximum dose of 6 mg/day. It is usually assumed that the increased hypoglycemia with glimepiride was driven by the 6 mg/day dose; it was therefore of interest to assess whether the risk of hypoglycemia is also different between vildagliptin and a low (2 mg/day) dose of glimepiride.