5 research outputs found
A Hierarchical Dirichlet Process Model with Multiple Levels of Clustering for Human EEG Seizure Modeling
Driven by the multi-level structure of human intracranial
electroencephalogram (iEEG) recordings of epileptic seizures, we introduce a
new variant of a hierarchical Dirichlet Process---the multi-level clustering
hierarchical Dirichlet Process (MLC-HDP)---that simultaneously clusters
datasets on multiple levels. Our seizure dataset contains brain activity
recorded in typically more than a hundred individual channels for each seizure
of each patient. The MLC-HDP model clusters over channel-types, seizure-types,
and patient-types simultaneously. We describe this model and its implementation
in detail. We also present the results of a simulation study comparing the
MLC-HDP to a similar model, the Nested Dirichlet Process, and finally
demonstrate the MLC-HDP's use in modeling seizures across multiple patients. We
find the MLC-HDP's clustering to be comparable to independent human physician
clusterings. To our knowledge, the MLC-HDP model is the first in the epilepsy
literature capable of clustering seizures within and between patients. Comment: ICML201
An Unsupervised Approach to Modelling Visual Data
For very large visual datasets, producing expert ground-truth data for training supervised algorithms can represent a substantial human effort. In these situations there is scope for the use of unsupervised approaches that can model collections of images and automatically summarise their content. The primary motivation for this thesis comes from the problem of labelling large visual datasets of the seafloor obtained by an Autonomous Underwater Vehicle (AUV) for ecological analysis. It is expensive to label this data, as taxonomical experts for the specific region are required, whereas automatically generated summaries can be used to focus the efforts of experts and inform decisions on additional sampling. The contributions in this thesis arise from modelling this visual data in entirely unsupervised ways to obtain comprehensive visual summaries. Firstly, popular unsupervised image feature learning approaches are adapted to work with large datasets and unsupervised clustering algorithms. Next, using Bayesian models, the performance of rudimentary scene clustering is boosted by sharing clusters between multiple related datasets, such as regular photo albums or AUV surveys. These Bayesian scene clustering models are extended to simultaneously cluster sub-image segments to form unsupervised notions of “objects” within scenes. The frequency distribution of these objects within scenes is used as the scene descriptor for simultaneous scene clustering. Finally, this simultaneous clustering model is extended to make use of whole image descriptors, which encode rudimentary spatial information, as well as object frequency distributions to describe scenes. This is achieved by unifying the previously presented Bayesian clustering models, which also rectifies some of their weaknesses and limitations. Hence, the final contribution of this thesis is a practical unsupervised algorithm for modelling images from the super-pixel to album levels that is applicable to large datasets.
Towards scalable Bayesian nonparametric methods for data analytics
Turning big data into actionable information involves dealing with four dimensions of challenge in big data (called the four V’s): volume, variety, velocity, and veracity. In this study, we seek novel Bayesian nonparametric models and scalable learning algorithms that can deal with these challenges of the big data era.
Exploiting side information in Bayesian nonparametric models and their applications
My research exploits side information in advanced Bayesian nonparametric models. We have developed novel models for data clustering and medical data analysis and have also made our methods scalable to large-scale data. I have published my research in several journal and conference papers.
Bayesian nonparametric multilevel modelling and applications
Our research aims to contribute to multilevel modeling in data analytics. We address the tasks of multilevel clustering, multilevel regression, and classification. We provide a state-of-the-art solution for the critical problem