21 research outputs found
Statistical modelling of neuron degeneration
SUMMARY: Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis (ALS) and Alzheimer's disease are all examples of neurodegenerative disorders that result from the premature death of nerve cells or neurons. In order to understand the mechanisms through which these diseases advance, a number of models have been put forward to describe the decline in the numbers of surviving neurons. Such work has been hampered by the poor quality of estimates of the numbers of surviving neurons and also by questionable model selection techniques. Recent work has favoured the adoption of the exponential model to explain neurodegenerative decline. We present in this paper a methodology for challenging this model, using data from patients with ALS. We use a two-stage procedure to study motor unit numbers. The first stage involves determining the number of motor units in a muscle on several occasions over a period of time. For this we use the method of Ridall et al. (2007), which is based on reversible jump Markov chain Monte Carlo (RJMCMC). The second stage involves the analysis of the RJMCMC output by using a hidden Markov process of decline. Two such processes of decline are compared. The first is the exponential, where the rate parameter is constant. This is compared to a more general semi-parametric process where the rate parameter is allowed to vary over time. The rate is set to be piecewise constant between recordings, where the magnitudes of the changes in rate are weakly constrained by the length of the interval between recording occasions. Between-model comparisons are based on electrophysiological data collected from a group of ALS patients in whom motor units (MUs) are gradually lost, leading to progressive muscle weakness. By calculating marginal likelihoods, we find the Bayes factor in support of the exponential decline model against the more general alternative. This approach is illustrated with four ALS patients.
Predictions of the number of MUs lost, which incorporate both models, can also be made. We therefore believe that our methods have a role in formulating and evaluating biological models for neural degeneration of the motor system in ALS patients.
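As a rough illustration of the two processes of decline compared in this abstract, the sketch below computes the expected number of surviving motor units under a constant rate and under a rate that is piecewise constant between recordings, and forms a log Bayes factor from two log marginal likelihoods. The function names, the parameterisation and all numbers are assumptions made for illustration; the abstract does not give the authors' implementation.

```python
import math


def exponential_decline(n0, rate, t):
    """Expected survivors at time t when the loss rate is constant."""
    return n0 * math.exp(-rate * t)


def piecewise_decline(n0, rates, times):
    """Expected survivors at each recording time.

    times[0] is the first recording; rates[i] is the (assumed) constant
    loss rate on the interval from times[i] to times[i + 1].
    """
    expected = [float(n0)]
    for i, rate in enumerate(rates):
        dt = times[i + 1] - times[i]
        expected.append(expected[-1] * math.exp(-rate * dt))
    return expected


def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor for model A against model B from log marginal likelihoods."""
    return log_ml_a - log_ml_b


# Hypothetical recording times (months) and rates, for illustration only.
times = [0.0, 1.0, 3.0]
curve = piecewise_decline(100, [0.2, 0.2], times)
```

When every interval shares the same rate, the piecewise curve coincides with the exponential one, which is why the exponential model can be treated as a special case of the more general alternative.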
Bayesian sequential experimental design for binary response data with application to electromyographic experiments
We develop a sequential Monte Carlo approach for Bayesian analysis of the experimental design for binary response data. Our work is motivated by surface electromyographic (SEMG) experiments, which can be used to provide information about the functionality of subjects' motor units. These experiments involve a series of stimuli being applied to a motor unit, with whether or not the motor unit fires for each stimulus being recorded. The aim is to learn how the probability of firing depends on the applied stimulus (the so-called stimulus-response curve). One such excitability parameter is an estimate of the stimulus level for which the motor unit has a 50% chance of firing. Within such an experiment we are able to choose the next stimulus level based on the past observations. We show how sequential Monte Carlo can be used to analyse such data in an online manner. We then use the current estimate of the posterior distribution in order to choose the next stimulus level. The aim is to select a stimulus level that minimises the expected loss. We apply this loss function to the estimates of target quantiles from the stimulus-response curve. Through simulation we show that this approach is more efficient than existing sequential design methods for choosing the stimulus values. If applied in practice, it could more than halve the length of SEMG experiments.
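The design loop described in this abstract (update the posterior after each binary response, then pick the stimulus that minimises an expected loss) can be sketched as follows. This is a simplified illustration and not the authors' method: it swaps sequential Monte Carlo for a fixed grid over a single unknown, the 50% firing threshold, and it assumes a logistic stimulus-response curve with known slope and a posterior-variance loss. All names and values are hypothetical.

```python
import math


def fire_prob(stimulus, threshold, slope):
    """Assumed logistic curve: probability of firing is 0.5 at the threshold."""
    return 1.0 / (1.0 + math.exp(-slope * (stimulus - threshold)))


def update(grid, weights, stimulus, fired, slope):
    """Bayes update of the grid weights after one binary response."""
    new = []
    for th, w in zip(grid, weights):
        p = fire_prob(stimulus, th, slope)
        new.append(w * (p if fired else 1.0 - p))
    total = sum(new)
    return [w / total for w in new]


def posterior_mean(grid, weights):
    return sum(th * w for th, w in zip(grid, weights))


def posterior_variance(grid, weights):
    mean = posterior_mean(grid, weights)
    return sum(w * (th - mean) ** 2 for th, w in zip(grid, weights))


def expected_loss(grid, weights, stimulus, slope):
    """Expected posterior variance of the threshold if `stimulus` is applied next."""
    p_fire = sum(w * fire_prob(stimulus, th, slope) for th, w in zip(grid, weights))
    loss = 0.0
    for fired, p_outcome in ((True, p_fire), (False, 1.0 - p_fire)):
        if p_outcome > 0.0:
            updated = update(grid, weights, stimulus, fired, slope)
            loss += p_outcome * posterior_variance(grid, updated)
    return loss


def choose_next_stimulus(grid, weights, candidates, slope):
    """Pick the candidate stimulus with the smallest expected loss."""
    return min(candidates, key=lambda s: expected_loss(grid, weights, s, slope))


# Hypothetical set-up: thresholds on a grid from 0 to 10 with a uniform prior.
grid = [i * 0.5 for i in range(21)]
prior = [1.0 / len(grid)] * len(grid)
next_stimulus = choose_next_stimulus(grid, prior, [0.0, 5.0, 10.0], slope=1.0)
```

With a uniform prior the mid-range stimulus is chosen, since a response there splits the plausible thresholds most evenly. A particle-based version would replace the fixed grid with weighted samples that are resampled as data arrive.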
A study into drug-trying behaviour among young people in England : categorical analysis models in the Presence of missing data
This research reviewed the "Smoking, Drinking and Drug Use among Young People in England" 2010 survey (the Year 2010 Survey) in terms of its data collection, processing and analysis. The aims were to gain a better understanding of young people's drug-trying behaviour in England through appropriate handling of missing data, and to build upon previous work by developing and applying statistical methodologies for the analysis of multivariate categorical data collected by the Year 2010 Survey. The main work done in this research included: (1) modifying the original data set to arrive at a useful working data set; (2) conducting exploratory data analysis with the working data set to identify directions for further empirical investigation; (3) properly handling the missing data problem in the working data set; and (4) developing and applying advanced statistical methodologies to further analyse the working data set. Apart from supporting the main findings of the Year 2010 Survey that smoking, drinking and some drug-related socio-demographic covariates were positively associated with the students' drug-trying behaviour, additional significant results from the univariate logistic regression models, log-linear analysis models, two-parameter item response theory models and latent class analysis models showed that (1) the 15 drugs were highly and positively associated with each other, and each drug influenced the students' drug-trying behaviour to a different extent, and (2) generally, students' drug-trying behaviour could be further explained, to differing extents, by numerous smoking, drinking and drug-related socio-demographic factors. These additional findings contributed to a deeper understanding of the drug use problem, added evidence to the drug-related research literature and provided helpful guidance on formulating policies to combat the drug use problem in England.
Another contribution of this research was the development of a new methodology for backward elimination in latent class analysis models, which provided a more thorough evaluation of the optimal number of latent classes and of covariate elimination from the saturated model.
Motor unit number estimation via sequential Monte Carlo
A change in the number of motor units that operate a particular muscle is an important indicator for the progress of a neuromuscular disease and the efficacy of a therapy. Inference for realistic statistical models of the typical data produced when testing muscle function is difficult, and estimating the number of motor units is an ongoing statistical challenge. We consider a set of models for the data, each with a different number of working motor units, and present a novel method for Bayesian inference based on sequential Monte Carlo. This provides estimates of the marginal likelihood and, hence, a posterior probability for each model. Implementing this approach in practice requires a sequential Monte Carlo method that has excellent computational and Monte Carlo properties. We achieve this by benefiting from the model's conditional independence structure, where, given knowledge of which motor units fired as a result of a particular stimulus, parameters that specify the size of each unit's response are independent of the parameters defining the probability that a unit will respond at all. The scalability of our methodology relies on the natural conjugacy structure that we create for the former and an enforced, approximate, conjugate structure for the latter. A simulation study demonstrates the accuracy of our method, and inferences are consistent across two different datasets arising from the same rat tibial muscle
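The step from marginal likelihoods to a posterior probability for each candidate number of motor units can be illustrated with a small sketch. It assumes the log marginal likelihood of each model has already been estimated (for example by sequential Monte Carlo) and, unless told otherwise, places a uniform prior over models; the function name and the numbers are hypothetical.

```python
import math


def posterior_model_probs(log_marginal_likelihoods, log_priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    Normalisation is done on the log scale (subtracting the maximum first)
    so that very negative log marginal likelihoods do not underflow to zero.
    """
    k = len(log_marginal_likelihoods)
    if log_priors is None:
        # Assumption: every model is equally likely a priori.
        log_priors = [-math.log(k)] * k
    log_post = [lm + lp for lm, lp in zip(log_marginal_likelihoods, log_priors)]
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log marginal likelihoods for models with 5, 6 and 7 motor units.
probs = posterior_model_probs([-1002.0, -1000.0, -1001.0])
```

Here the second model receives the largest posterior probability, and the ratio of any two probabilities equals the corresponding Bayes factor because the prior is uniform.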
Bayesian Latent Variable Models for Biostatistical Applications
In this thesis we develop several kinds of latent variable models in order to address three types of biostatistical problem. The three problems are the treatment effect of carcinogens on tumour development, spatial interactions between plant species and motor unit number estimation (MUNE). The three types of data looked at are: highly heterogeneous longitudinal count data, quadrat counts of species on a rectangular lattice and, lastly, electrophysiological data consisting of measurements of compound muscle action potential (CMAP) area and amplitude. Chapter 1 sets out the structure and the development of ideas presented in this thesis from the point of view of model structure, model selection and efficiency of estimation. Chapter 2 is an introduction to the relevant literature that has influenced the development of this thesis. In Chapter 3 we use the EM algorithm for an application of an autoregressive hidden Markov model to describe longitudinal counts. The data are collected from experiments to test the effect of carcinogens on tumour growth in mice. Here we develop forward and backward recursions for calculating the likelihood and for estimation. Chapter 4 is the analysis of a similar kind of data using a more sophisticated model, incorporating random effects, but estimation this time is conducted from the Bayesian perspective. Bayesian model selection is also explored. In Chapter 5 we move to the two-dimensional lattice and construct a model for describing the spatial interaction of tree types. We also compare the merits of directed and undirected graphical models for describing the hidden lattice. Chapter 6 is the application of a Bayesian hierarchical model to MUNE, where the latent variable this time is multivariate Gaussian and dependent on a covariate, the stimulus. Model selection is carried out using the Bayesian Information Criterion (BIC). In Chapter 7 we approach the same problem by using the reversible jump methodology (Green, 1995), where this time we use a dual Gaussian-Binary representation of the latent data. We conclude in Chapter 8 with suggestions for the direction of new work. In this thesis, all of the estimation carried out on real data has only been performed once we have been satisfied that estimation is able to retrieve the parameters from simulated data.
Keywords: Amyotrophic lateral sclerosis (ALS), carcinogens, hidden Markov models (HMM), latent variable models, longitudinal data analysis, motor neurone disease (MND), partially ordered Markov models (POMMs), the pseudo autologistic model, reversible jump, spatial interactions
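The forward recursion mentioned for Chapter 3 can be sketched for a plain hidden Markov model with count data. This is a simplified illustration under stated assumptions: it uses Poisson emissions and omits the autoregressive term of the thesis model, and the function names and example values are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing count k, computed on the log scale."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def forward_log_likelihood(counts, init, trans, means):
    """Log-likelihood of a count sequence under an HMM via the scaled forward recursion.

    init[s]     : probability of starting in hidden state s
    trans[r][s] : probability of moving from state r to state s
    means[s]    : Poisson mean of the counts emitted in state s (assumed emission model)
    """
    n_states = len(init)
    alpha = [init[s] * poisson_pmf(counts[0], means[s]) for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(counts)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * poisson_pmf(counts[t], means[s])
                for s in range(n_states)
            ]
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical two-state example: a low-count and a high-count state.
counts = [2, 0, 3, 7, 9]
value = forward_log_likelihood(
    counts,
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    means=[1.5, 8.0],
)
```

The accumulated log scaling factors give the log-likelihood directly. A matching backward recursion would supply the smoothed state probabilities needed in the E-step of the EM algorithm.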
Joint modelling of goals and bookings in association football matches
A multivariate counting process formulation is developed for the quantification of association football event interdependencies which permits dynamic prediction as events unfold. We model data from English Premier League and Championship games from the 2009-10 and 2010-11 football seasons and assess predictive capacity using a model-based betting strategy, applied prospectively to available live spread betting prices. Both the scoreline and bookings status were predictive of match outcome. In particular, a red card led to increased goal rates for the non-penalised team and the home team scoring rate decreased once they were ahead. Overall the betting strategy profited with gains made in the bookings markets