Tracking interacting targets in multi-modal sensors
PhD
Object tracking is one of the fundamental tasks in various applications such as surveillance,
sports, video conferencing and activity recognition. Factors such as occlusions,
illumination changes and the sensor's limited field of view make tracking a challenging
task. To overcome these challenges, this thesis focuses on using multiple
modalities such as audio and video for multi-target, multi-modal tracking. In particular,
this thesis presents contributions to four related research topics, namely, pre-processing of
input signals to reduce noise, multi-modal tracking, simultaneous detection and tracking,
and interaction recognition.
To improve the performance of detection algorithms, especially in the presence
of noise, this thesis investigates filtering of the input data through spatio-temporal feature
analysis as well as through frequency band analysis. The pre-processed data from multiple
modalities is then fused within a particle filter (PF). To further minimise the discrepancy
between the real and the estimated positions, we propose a strategy that associates the
hypotheses and the measurements with a real target, using a Weighted Probabilistic Data
Association (WPDA). Since the filtering involved in the detection process reduces the
available information and is inapplicable to low signal-to-noise ratio data, we investigate
simultaneous detection and tracking approaches and propose a multi-target track-before-detect
particle filter (MT-TBD-PF). The proposed MT-TBD-PF algorithm bypasses
the detection step and performs tracking in the raw signal. Finally, we apply the proposed
multi-modal tracking to recognise interactions between targets in regions within, as well
as outside the cameras' fields of view.
The efficiency of the proposed approaches is demonstrated on large uni-modal,
multi-modal and multi-sensor scenarios from real-world detection, tracking and event
recognition datasets and through participation in evaluation campaigns.
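
The predict, weight and resample cycle behind the particle filtering mentioned in the abstract above can be sketched as follows. This is a generic single-target bootstrap particle filter in one dimension, not the thesis's WPDA or MT-TBD-PF method. The random-walk motion model, the Gaussian likelihood, the noise levels and all function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Unnormalised Gaussian likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def particle_filter_step(particles, z, rng, process_sigma=0.5, meas_sigma=1.0):
    """One predict/update/resample cycle (hypothetical 1-D model)."""
    # Predict: propagate each particle with an assumed random-walk motion model.
    predicted = [x + rng.gauss(0.0, process_sigma) for x in particles]
    # Update: weight each particle by how well it explains the measurement.
    weights = [gaussian_likelihood(z, x, meas_sigma) for x in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle positions.
    estimate = sum(w * x for w, x in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(measurements, n_particles=500, seed=0):
    """Run the filter over a measurement sequence and return the estimates."""
    rng = random.Random(seed)
    # Assumed prior: particles spread uniformly over [-10, 10].
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles, estimate = particle_filter_step(particles, z, rng)
        estimates.append(estimate)
    return estimates


if __name__ == "__main__":
    print(track([2.0] * 10))
```

In a multi-modal setting, one plausible extension (again an assumption, not taken from the thesis) is to multiply the per-modality likelihoods in the update step before normalising the weights.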
MAC-REALM: A video content feature extraction and modelling framework
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A consequence of the “data deluge” is the exponential increase in digital video footage, while the ability to find relevant video clips diminishes. Traditional text-based search engines are no longer optimal for searching, as they cannot provide a granular search of the content inside video footage. To search video in a content-based manner, the content features of the video need to be extracted and modelled into a content model, which can then act as a searchable proxy for the video content. This thesis focuses on the extraction of syntactic and semantic content features and on content modelling, using machine-driven processes with little or no user interaction. Our abstract framework design extracts syntactic and semantic content features and compiles them into an integrated content model. The framework integrates a four-plane strategy that consists of: a pre-processing plane that removes redundant data and filters the media to improve its feature extraction properties; a syntactic feature extraction plane that extracts low-level syntactic features and mid-level syntactic features that have semantic attributes; a semantic relationship analysis and linkage plane, where the spatial and temporal relationships of all the content features are defined; and finally a content modelling stage, where the syntactic and semantic content features are integrated into a content model. Each of the four planes can be split into three layers, namely the content layer, where the content to be processed is stored; the application layer, where the content is converted into content descriptions; and the MPEG-7 layer, where content descriptions are serialised. Using MPEG-7 standards to produce the content model provides wide-ranging interoperability while facilitating granular multi-content type searches.
The framework aims to “bridge” the semantic gap by integrating the syntactic and semantic content features from extraction through to modelling. The framework design has been implemented as a prototype called MAC-REALM, which has been tested and evaluated for its effectiveness in extracting and modelling content features. Conclusions are drawn about the research output as a whole and whether it has met the objectives. Finally, future work is presented on how concept detection and crowd sourcing can be used with MAC-REALM.
A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium
When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation and, in addition, this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.
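
For readers unfamiliar with the CAR structure compared in the abstract above, the sketch below builds the precision matrix of the commonly used "proper CAR" parameterisation, Q = τ(D − ρW), where W is a symmetric 0/1 adjacency matrix and D is the diagonal matrix of neighbour counts. The abstract does not state which CAR variant the authors use, so this parameterisation, the function name and the toy three-site map are assumptions for illustration. The DAGAR construction is not sketched here.

```python
def car_precision(adjacency, rho, tau=1.0):
    """Precision matrix Q = tau * (D - rho * W) of an assumed proper CAR model.

    adjacency: symmetric 0/1 matrix W (list of lists) with a zero diagonal.
    rho: spatial dependence parameter, assumed to satisfy |rho| < 1.
    tau: overall precision scale.
    """
    n = len(adjacency)
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    for i in range(n):
        if adjacency[i][i] != 0:
            raise ValueError("adjacency matrix must have a zero diagonal")
        for j in range(n):
            if adjacency[i][j] != adjacency[j][i]:
                raise ValueError("adjacency matrix must be symmetric")
    # D holds the number of neighbours of each site on its diagonal.
    degrees = [sum(row) for row in adjacency]
    return [
        [
            tau * ((degrees[i] if i == j else 0.0) - rho * adjacency[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy map: three sites in a row, so site 1 borders sites 0 and 2.
    W = [
        [0, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
    ]
    for row in car_precision(W, rho=0.5):
        print(row)
```

In this parameterisation ρ scales the off-diagonal entries of the precision matrix and does not directly equal the correlation between neighbouring sites, which is the interpretability gap the abstract attributes to the CAR model.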
A Statistical Approach to the Alignment of fMRI Data
Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. Because anatomical and functional structure varies across subjects, image alignment is necessary. We define a probabilistic model to describe functional alignment. Imposing a prior distribution, such as the matrix von Mises-Fisher distribution, on the orthogonal transformation parameter embeds anatomical information in the estimation of the parameters, i.e., it penalizes the combination of spatially distant voxels. Real applications show an improvement in the classification and interpretability of the results compared to various functional alignment methods.