thesis

Finite mixture models: visualisation, localised regression, and prediction

Abstract

This thesis first introduces a new graphical tool for summarising data that possess a mixture structure. The required summary statistics are computed from the posterior probabilities of class membership obtained from a fitted mixture model. Both real and simulated data are used to highlight the usefulness of the tool for visualising mixture data, compared with a traditional boxplot. The thesis then uses localised mixture models to produce predictions from time series data. Estimation in these models is achieved using a kernel-weighted version of the EM algorithm, with exponential kernels of different bandwidths used as weight functions. By modelling a mixture of local regressions at a target time point, each with a different bandwidth, informative estimated mixture probabilities are obtained that relate to the amount of information available in the data set. This information is expressed on a scale of resolution that corresponds to each bandwidth. Nadaraya-Watson and local linear estimators are used to carry out the localised estimation. For prediction at a future time point, a new bandwidth selection methodology and appropriate prediction methods are proposed for each local method, and are compared with competing forecasting routines. A simulation study is carried out to assess the predictive performance of this model. Finally, double-localised mixture models are presented, which can be used to improve predictions for one time series using additional information provided by other time series. Estimation for these models is achieved using a double-kernel-weighted version of the EM algorithm, employing exponential kernels with different horizontal bandwidths and normal kernels with different vertical bandwidths, focused around a target observation at a given time point. Nadaraya-Watson and local linear estimators are again used to carry out the double-localised estimation. For prediction at a future time point, different approaches are considered for each local method and are compared with competing forecasting routines. Real data are used to investigate the predictive performance of the localised and double-localised mixture models. The data used predominantly in this thesis are taken from the International Energy Agency (IEA).
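The kernel-weighted EM idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes Gaussian components, a local constant (Nadaraya-Watson type) fit, and a single shared bandwidth `h`, whereas the thesis works with several bandwidths. The names `exp_kernel` and `local_mixture_em`, the initialisation, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def exp_kernel(t, t0, h):
    """Exponential kernel weights centred on the target time t0 (assumed form)."""
    return np.exp(-np.abs(t - t0) / h)

def local_mixture_em(t, y, t0, h, k=2, n_iter=200, tol=1e-8):
    """Kernel-weighted EM for a k-component Gaussian mixture of local constants."""
    w = exp_kernel(t, t0, h)
    w = w / w.sum()                                 # normalise weights to sum to 1
    mu = np.quantile(y, (np.arange(k) + 0.5) / k)   # crude starting means
    sigma2 = np.full(k, np.var(y))
    mix = np.full(k, 1.0 / k)
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probabilities of class membership
        dens = np.exp(-0.5 * (y[:, None] - mu) ** 2 / sigma2)
        dens = dens / np.sqrt(2.0 * np.pi * sigma2)
        joint = mix * dens
        total = np.maximum(joint.sum(axis=1), 1e-300)
        r = joint / total[:, None]
        # M-step: every sum is weighted by the kernel weights
        wr = w[:, None] * r
        nk = wr.sum(axis=0)
        mix = nk                                    # sums to 1 because w does
        mu = (wr * y[:, None]).sum(axis=0) / nk
        sigma2 = (wr * (y[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma2 = np.maximum(sigma2, 1e-8)           # guard against collapse
        # Kernel-weighted log-likelihood at the pre-update parameters
        ll = np.sum(w * np.log(total))
        if abs(ll - prev) < tol:
            break
        prev = ll
    return mix, mu, sigma2

# Simulated two-regime series (hypothetical data, not from the thesis)
rng = np.random.default_rng(0)
t = np.arange(200.0)
labels = rng.random(200) < 0.5
y = np.where(labels, 5.0, 0.0) + 0.3 * rng.standard_normal(200)
mix, mu, sigma2 = local_mixture_em(t, y, t0=199.0, h=30.0)
```

A smaller `h` concentrates the weights on observations near the target time, so the fitted mixture reflects only recent behaviour; a larger `h` approaches an ordinary unweighted mixture fit.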
