Survival analysis for longitudinal data
In longitudinal studies with a set of continuous or ordinal repeated response
variables it may be convenient to summarise the outcome as a threshold
event. The time to this event then becomes of interest. This is particularly
true of recent ophthalmological trials evaluating the effect of treatment
on the loss of visual acuity over time. However, the practice of employing
conventional survival analysis methods for testing the null hypothesis of
no treatment effect in these types of study is intrinsically flawed, as the
exact time to the threshold event is not measured. In this paper we obtain
a general likelihood for the unknown parameters when the underlying
survival model is parametric. We also recover the actual information available
in repeated-measures data for a variety of models and compare the results
with those obtained using a mis-specified model which assumes that the time
to the event is one of the possibly irregularly spaced inspection times.
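The flaw identified above is that the event time is only known to lie between two inspection times, which is interval censoring. As a hedged illustration of the kind of likelihood involved, the sketch below maximises an interval-censored likelihood for a parametric survival model; the Weibull choice, the parameter values and the inspection scheme are illustrative assumptions, not the paper's actual model.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)

# Simulate true Weibull event times (shape a=1.5, scale b=2.0) -- illustrative values.
a_true, b_true = 1.5, 2.0
t = b_true * rng.weibull(a_true, size=500)

# Irregular inspection times per subject: the event is only known to lie
# between the last inspection before it (L) and the first at or after it (R).
inspections = np.sort(rng.uniform(0, 6, size=(500, 8)), axis=1)
L = np.where(inspections < t[:, None], inspections, 0.0).max(axis=1)
R = np.where(inspections >= t[:, None], inspections, np.inf).min(axis=1)  # inf = right-censored

def neg_loglik(params):
    a, b = np.exp(params)                  # log-parameterisation keeps a, b > 0
    S = lambda u: np.exp(-(u / b) ** a)    # Weibull survivor function; S(inf) = 0
    return -np.sum(np.log(S(L) - S(R) + 1e-300))

res = minimize(neg_loglik, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)
```

Treating each interval contribution as S(L) - S(R) uses exactly the information the inspections carry, instead of pretending the event occurred at an inspection time.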
A logistic regression model for survival data
The near non-identifiability of one of the parameters in the Generalized
Time-Dependent Logistic (GTDL) survival model (MacKenzie, 1996, 1997)
is discussed. A new canonical 3-parameter logistic survival model,
in which all of the parameters are identifiable, is obtained. A direct connection
with Fisher's Z distribution is established. The properties of this
non-PH model are contrasted briefly with those of Cox's PH model. The new model
is used to investigate survival from lung cancer in a population study.
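A hedged numerical sketch of the family discussed above: one commonly quoted parameterisation of the GTDL hazard (an assumption here, to be checked against MacKenzie, 1996) is a constant rate multiplied by a logistic function of time, which yields a closed-form survivor function. The values of the parameters below are illustrative only.

```python
import numpy as np
from scipy.integrate import quad

# Assumed GTDL parameterisation (check against MacKenzie, 1996):
# hazard h(t) = lam * exp(alpha*t + eta) / (1 + exp(alpha*t + eta)),
# with eta = x'beta the linear predictor. Illustrative values:
lam, alpha, eta = 0.8, 0.5, -1.0

def hazard(t):
    z = np.exp(alpha * t + eta)
    return lam * z / (1.0 + z)

def survivor(t):
    # Closed form from integrating the hazard analytically:
    # S(t) = [(1 + e^{alpha t + eta}) / (1 + e^{eta})]^{-lam/alpha}
    return ((1 + np.exp(alpha * t + eta)) / (1 + np.exp(eta))) ** (-lam / alpha)

# The closed form should agree with numerical integration of the hazard.
t0 = 3.0
H, _ = quad(hazard, 0.0, t0)
print(abs(np.exp(-H) - survivor(t0)))  # ~0
```

Because the hazard is bounded above by lam, the model is non-PH: covariates shift the logistic term rather than scaling a baseline hazard proportionally.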
Advances in covariance modelling
Conventionally, in longitudinal studies, the mean structure has been thought
to be more important than the covariance structure between the repeated
measures on the same individual. Often it has been argued that, with
respect to the mean, the covariance was merely a 'nuisance parameter' and,
consequently, was not of 'scientific interest'. Today, however, one can see
that from a formal statistical standpoint the inferential problem is entirely
symmetric in both parameters. In recent years there has been a steady
stream of new results, and we pause to review some key advances in the
expanding field of covariance modelling. In particular, developments since the
seminal work of Pourahmadi (1999, 2000) are traced. While the main focus
is on longitudinal data with continuous responses, emerging approaches to
joint mean-covariance modelling in the GEE and GLMM arenas are also
considered briefly.
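The reparameterisation underlying much of the work reviewed here (Pourahmadi, 1999) can be stated compactly. For a response vector with covariance matrix $\Sigma$ there is a unique unit lower-triangular $T$ and diagonal $D$ such that

```latex
T \Sigma T^{\top} = D,
\qquad
y_t - \mu_t \;=\; \sum_{j=1}^{t-1} \phi_{tj}\,(y_j - \mu_j) + \varepsilon_t,
\qquad \varepsilon_t \sim (0,\, \sigma^2_t),
```

where the below-diagonal entries of $T$ are $-\phi_{tj}$ and $D = \mathrm{diag}(\sigma^2_1, \dots, \sigma^2_m)$. The generalised autoregressive parameters $\phi_{tj}$ and the log innovation variances $\log \sigma^2_t$ are unconstrained, so they can be modelled by regressions without risking a non-positive-definite $\Sigma$.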
Survival with primary lung cancer in Northern Ireland: 1991–1992
Lung cancer is a major cause of death in Western countries, but survival had never been studied in Northern Ireland (NI) on a population basis prior to this study.
Aims The primary aims were to describe the survival of patients with primary lung cancer, evaluate the effect of treatment, identify patient characteristics influencing survival and treatment, and describe current trends in survival.
Methods A population-based study identified all incident cases of primary lung cancer in NI during 1991–2 and followed them for 21 months. Their clinical notes were traced and relevant details abstracted. Survival status was monitored via the Registrar General’s Office, and ascertainment is thought to be near-complete. Appropriate statistical methods were used to analyse the survival data.
Results Some 855 incident cases were studied. Their 1-year survival was 24.5% with a median survival time of 4.7 months. Surgical patients had the best 1-year survival, 76.8%; however, adjustment suggested that about half of the benefit could be attributed to case-mix factors. Factors influencing treatment allocation were also identified, and a screening test showed the discordance between ‘model’ and ‘medic’: 210 patients were misclassified. Finally, the current trend in 1-year survival observed in the Republic of Ireland was best in the British Isles.
Conclusions Overall, survival remains poor. The better survival of surgical patients is due, in part, to their superior case-mix profiles. Survival with other therapies is less good, suggesting that the criteria for treatment might be relaxed with advantage, using a treatment model to aid decision-making.
Gilbert MacKenzie’s contribution to the discussion of ‘the Discussion Meeting on Probabilistic and statistical aspects of machine learning’
I should like first to congratulate both authors on presenting their very stimulating work in the spirit of the conference's theme of statisticians and machine learners working in alliance. There is no doubt in my mind that collaboration will lead to better, more parsimonious, models. My comments relate, mainly, to the second discussant's points on over-parametrization.
The Erne mobile intensive coronary care study: mortality, survival and the MICCU
This paper deals with the analysis and interpretation of data relating to mortality and survival
in the first year of operation of the Erne MICCU study in Co. Fermanagh.
Aims: We aimed to measure in-hospital mortality from AMI, on WHO criteria, identify factors influencing
mortality and survival and assess the performance of the MICCU.
Methods: All first admissions with suspected AMI to the CCU from the Fermanagh District in 1983-84
were studied. Some 297 patients were analysed. We recorded demographic data, previous history of heart
disease and co-morbidity, status of the current attack, delay to CCU, treatment and outcome. In total,
28 variables, grouped as (a) basic risk factors (18) and (b) clinical and treatment risk factors (10), were analysed.
Outcomes: In-hospital mortality and survival and performance of the MICCU.
Results: There were 329 admissions to the CCU of all types of which 297 (90.3%) were first admissions.
Of the 297, 170 (57.2%) had AMI on WHO criteria and 42 (14.1%) were dead at discharge. Crude 28-day
mortality and unadjusted survival were statistically significantly worse in the AMI group. The multi-factor
mortality analysis identified 5 variables influencing death at discharge. In relation to multi-factor survival,
the MPR Weibull model identified a set of 9 variables in which the treatment variables predominated over the
basic risk factors. The MICCU delivered patients to hospital statistically significantly earlier (5 hours on
average) than other modes of transport, but did not prevent more deaths than the ordinary ambulance.
Conclusions: There was no evidence of a direct, statistically significant, beneficial MICCU effect in either
the multi-factor mortality or the survival model. However, the apparently poor performance of the MICCU,
measured in terms of crude survival, resulted from an adverse case-mix which, when controlled for, suggested
a small MICCU benefit. The findings relate to the first year of operation of the Erne MICCU study, and
performance may improve in later years.
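The MPR (multi-parameter regression) Weibull model mentioned in the results regresses more than one distributional parameter on covariates. A common statement of the idea (an assumption here, to be checked against the MPR literature) puts regressions on both the scale and the shape of the Weibull hazard:

```latex
h(t \mid x) \;=\; \lambda(x)\,\gamma(x)\,t^{\gamma(x)-1},
\qquad
\log \lambda(x) = x^{\top}\beta,
\qquad
\log \gamma(x) = x^{\top}\alpha,
```

so that a treatment variable may alter the time-course of the hazard (through $\gamma$) as well as its level (through $\lambda$), which a single-parameter regression model cannot capture.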
Lasso model selection in multi-dimensional contingency tables?
We develop a Smooth Lasso for sparse, high-dimensional contingency tables and compare its performance with the usual Lasso and with the now-classical backwards elimination algorithm. In simulation, the usual Lasso had great difficulty, failing to identify the correct model irrespective of the sample size. By comparison, the Smooth Lasso performed better, improving with increasing sample size. The backwards elimination algorithm also performed well and was better than the Smooth Lasso at small sample sizes. Another potential difficulty is that Lasso methods do not respect the marginal constraints of hierarchy and so lead to non-hierarchical models, which are unscientific. Furthermore, even when one can demonstrate, classically, that some effects in the model are inestimable, the Lasso methods provide penalized estimates. These problems call Lasso methods into question.
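The point about inestimable effects can be made concrete with a minimal sketch (not the paper's Smooth Lasso): when a design matrix is rank-deficient, the classical normal equations are singular, yet a plain Lasso solver still returns penalized coefficient estimates. The ISTA implementation and the toy data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-deficient design: the third column duplicates the second, so its
# effect is classically inestimable (X'X is singular).
X = rng.normal(size=(50, 2))
X = np.column_stack([X, X[:, 1]])
y = X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=50)

def lasso_ista(X, y, lam, n_iter=5000):
    """Plain ISTA for 0.5*||y - Xb||^2 + lam*||b||_1:
    gradient step followed by soft-thresholding."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1/L, L = spectral norm of X'X
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = X.T @ (X @ b - y)
        b = b - step * g
        b = np.sign(b) * np.maximum(np.abs(b) - step * lam, 0.0)
    return b

print(np.linalg.matrix_rank(X.T @ X))   # 2, not 3: OLS cannot separate cols 2 and 3
b = lasso_ista(X, y, lam=1.0)
print(b)                                # finite penalized estimates regardless
```

The Lasso silently resolves the aliasing through the penalty, which is exactly the behaviour the abstract flags as problematic: a finite estimate is reported for an effect the data cannot identify.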
Space-time clustering revisited
The history of space-time clustering concepts and methods is reviewed
briefly. The space-time clustering model of Ederer et al is investigated in
detail. This method has been used extensively in the epidemiological literature,
but we show that the distribution of the main test statistic involved does not
follow the distribution proposed by the authors. We note, too, that the two indices
proposed are not statistically independent, leading to potential over-reporting
in the epidemiological literature. We obtain the correlation between the original
clustering indices and suggest a new combined test statistic which has the correct
null distribution. We develop a fuller spatial model and illustrate the methodology
using data from a study of the incidence of childhood leukaemia in Northern
Ireland.
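One generic way to obtain a correct null distribution for a space-time clustering statistic, when its analytic distribution is in doubt, is Monte Carlo permutation of times over locations. The sketch below uses the classical Knox statistic as a stand-in (it is not the Ederer et al method discussed above, and the thresholds and data are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
xy = rng.uniform(0, 10, size=(n, 2))   # case locations (arbitrary units)
t = rng.uniform(0, 365, size=n)        # case onset times (days)

def knox_stat(xy, t, d_max=1.0, t_max=30.0):
    """Knox statistic: number of case pairs close in both space and time."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    dt = np.abs(t[:, None] - t[None, :])
    close = (d < d_max) & (dt < t_max)
    iu = np.triu_indices(len(t), k=1)   # each unordered pair once
    return int(close[iu].sum())

obs = knox_stat(xy, t)
# Under the null there is no space-time link, so permuting the times over
# the locations generates the reference distribution directly.
null = np.array([knox_stat(xy, rng.permutation(t)) for _ in range(999)])
p = (1 + (null >= obs).sum()) / 1000
```

Because the reference distribution is generated under the null by construction, this sidesteps the problem of an incorrectly derived analytic distribution, at the cost of simulation effort.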
Prototype modelling of body surface maps using tangent distance
In recent years the tangent distance approach of Simard et al (1993) to
pattern recognition tasks such as handwritten character recognition has
proven successful. Simard's distance measure can be made locally invariant
to any set of transformations of the input and, when implemented in
one-nearest-neighbour classification of handwritten digits, outperformed all
other classification schemes.
Hastie et al (1996) propose prototype models which generalise the concept
of a centroid of a set of images in the Euclidean metric to a low-dimensional
hyperplane in the tangent metric; these prototypes can be used to
reduce lookup time in classification.
We propose to apply and extend the tangent distance approach to classify
a set of body surface maps, which are recordings of the electrical activity
of the heart, for a large number of patients with various cardiac conditions.
Using a grid of p electrodes attached to the anterior chest, we calculate a
number of p-dimensional observation vectors for each patient and classify
input maps on the basis of the overall distance of a map to prototypes derived
from training-set maps, over all included observation vectors.
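The invariance idea above can be sketched with the one-sided tangent distance: the distance from one pattern to the tangent hyperplane through another, spanned by the derivatives of the allowed transformations. The tangent vectors and dimensions below are illustrative assumptions, not the body-surface-map setup itself.

```python
import numpy as np

def tangent_distance(x, y, T):
    """One-sided tangent distance in the spirit of Simard et al (1993):
    min over a of ||(x + T a) - y||, i.e. the distance from y to the
    tangent hyperplane through x spanned by the columns of T."""
    a, *_ = np.linalg.lstsq(T, y - x, rcond=None)
    return np.linalg.norm(x + T @ a - y)

rng = np.random.default_rng(7)
p = 32                                  # e.g. number of chest electrodes (illustrative)
x = rng.normal(size=p)                  # a prototype observation vector
T = rng.normal(size=(p, 3))             # tangent vectors for 3 transformations
y = x + T @ np.array([0.5, -1.0, 2.0])  # y lies exactly on x's tangent plane

print(tangent_distance(x, y, T))        # ~0: invariant to those transformations
print(np.linalg.norm(x - y))            # plain Euclidean distance stays large
```

A pattern generated by transforming the prototype is at tangent distance zero even though its Euclidean distance is large, which is why the tangent metric suits classification under nuisance transformations.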
Modelling marginal covariance structures in linear mixed models
Pourahmadi (1999) provided a convenient reparameterisation of the marginal
covariance matrix arising in longitudinal studies. We exploit his work to
model the dependence of this covariance structure on baseline covariates,
time and their interaction. The rationale for this approach is the realisation
that in linear mixed models (LMMs) the assumption of a covariance structure
that is homogeneous with respect to the covariate space is a testable model
choice. Accordingly, we provide methods for testing this assumption and
re-analyse Kenward's (1987) cattle data set using our new model.
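Pourahmadi's reparameterisation can be computed directly: the modified Cholesky decomposition factors a covariance matrix into a unit lower-triangular matrix of (negated) autoregressive coefficients and a diagonal matrix of innovation variances. The sketch below is a minimal numerical illustration on a toy AR(1)-type covariance, not the paper's covariate-dependent model.

```python
import numpy as np

def modified_cholesky(Sigma):
    """Modified Cholesky decomposition (Pourahmadi, 1999): unique unit
    lower-triangular T and diagonal D with T @ Sigma @ T.T = D.
    Row t of T holds the negatives of the coefficients from regressing
    y_t on its predecessors y_1..y_{t-1}; D holds innovation variances."""
    m = Sigma.shape[0]
    T = np.eye(m)
    for t in range(1, m):
        phi = np.linalg.solve(Sigma[:t, :t], Sigma[:t, t])
        T[t, :t] = -phi
    D = np.diag(np.diag(T @ Sigma @ T.T))
    return T, D

# Toy AR(1)-type covariance, correlation rho^{|i-j|}
m, rho = 5, 0.6
Sigma = rho ** np.abs(np.subtract.outer(np.arange(m), np.arange(m))).astype(float)
T, D = modified_cholesky(Sigma)
print(np.allclose(T @ Sigma @ T.T, D))  # True: the factorisation diagonalises Sigma
```

Because the entries of T and log of the diagonal of D are unconstrained, regressing them on covariates and time, as the abstract describes, can never produce a non-positive-definite covariance.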