5 research outputs found

Component-level aggregation of probabilistic PCA mixtures using variational-Bayes

Author: Bruneau Pierrick
Gelgon Marc
Picarougne Fabien
Publication venue: HAL CCSD
Publication date: 01/02/2011

Technical Report. This report is an extended version of our ICPR'2010 paper. This paper proposes a technique for aggregating mixtures of probabilistic principal component analyzers, which are a powerful probabilistic generative model for coping with high-dimensional, nonlinear data sets. Aggregation is carried out through Bayesian estimation with a specific prior and an original variational scheme. We demonstrate how such models may be aggregated by accessing model parameters only, rather than original data, which can be advantageous for learning from distributed data sets. Experimental results illustrate the effectiveness of the proposal.

INRIA a CCSD electronic archive server

Robust gait recognition under variable covariate conditions

Author: Bashir Khalid
Publication date: 01/01/2010

PhD thesis. Gait is a weak biometric when compared to face, fingerprint or iris because it can be easily affected by various conditions. These are known as the covariate conditions and include clothing, carrying, speed, shoes and view, among others. In the presence of variable covariate conditions, gait recognition is a hard problem yet to be solved, with no working system reported. In this thesis, a novel gait representation, the Gait Flow Image (GFI), is proposed to extract more discriminative information from a gait sequence. GFI extracts the relative motion of body parts in different directions in separate motion descriptors. Compared to the existing model-free gait representations, GFI is more discriminative and robust to changes in covariate conditions. In this thesis, gait recognition approaches are evaluated without assuming cooperative subjects, i.e. both the gallery and the probe sets consist of gait sequences under different and unknown covariate conditions. The results indicate that the performance of the existing approaches drops drastically under this more realistic set-up. It is argued that selecting the gait features which are invariant to changes in covariate conditions is the key to developing a gait recognition system without subject cooperation. To this end, the Gait Entropy Image (GEnI) is proposed to perform automatic feature selection on each pair of gallery and probe gait sequences. Moreover, an Adaptive Component and Discriminant Analysis is formulated which seamlessly integrates the feature selection method with subspace analysis for fast and robust recognition. Among the various factors that affect the performance of gait recognition, change in viewpoint poses the biggest problem and is treated separately. A novel approach to address this problem is proposed in this thesis by using the Gait Flow Image in a cross-view gait recognition framework in which the view angle of a probe gait sequence is unknown.
A Gaussian Process classification technique is formulated to estimate the view angle of each probe gait sequence. To measure the similarity of gait sequences across view angles, the correlation of gait sequences from different views is modelled using Canonical Correlation Analysis and the correlation strength is used as a similarity measure. This differs from existing approaches, which reconstruct gait features in different views through 2D view transformation or 3D calibration. Without explicit reconstruction, the proposed method can cope with feature mismatch across views and is more robust against feature noise.
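The abstract above names the Gait Entropy Image but does not define it. As a rough illustration only, the sketch below assumes that such an image holds the per-pixel Shannon entropy of the foreground/background label over a sequence of binary silhouettes; that definition, the function name `gait_entropy_image`, and the nested-list frame format are all assumptions made here, not details taken from the thesis.

```python
import math


def gait_entropy_image(silhouettes):
    """Per-pixel Shannon entropy (in bits) over a sequence of binary frames.

    `silhouettes` is a list of equally sized 2D lists holding 0 (background)
    or 1 (foreground). This is an assumed, illustrative definition.
    """
    n_frames = len(silhouettes)
    height = len(silhouettes[0])
    width = len(silhouettes[0][0])
    entropy = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Fraction of frames in which this pixel is foreground.
            p_fg = sum(frame[y][x] for frame in silhouettes) / n_frames
            for p in (p_fg, 1.0 - p_fg):
                if p > 0.0:  # treat 0 * log(0) as 0
                    entropy[y][x] -= p * math.log2(p)
    return entropy
```

Under this assumed definition, pixels that never change across the sequence score zero, while pixels that switch between foreground and background score higher, which is one way a per-pixel map could be used to weight or select features.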

Queen Mary Research Online

OpenGrey Repository

Spatial and temporal background modelling of non-stationary visual scenes

Author: Russell David Mark
Publication date: 01/01/2009

PhD thesis. The prevalence of electronic imaging systems in everyday life has become increasingly apparent in recent years. Applications are to be found in medical scanning, automated manufacture, and perhaps most significantly, surveillance. Metropolitan areas, shopping malls, and road traffic management all employ and benefit from an unprecedented quantity of video cameras for monitoring purposes. But the high cost and limited effectiveness of employing humans as the final link in the monitoring chain has driven scientists to seek solutions based on machine vision techniques. Whilst the field of machine vision has enjoyed consistent rapid development in the last 20 years, some of the most fundamental issues still remain to be solved in a satisfactory manner. Central to a great many vision applications is the concept of segmentation, and in particular, most practical systems perform background subtraction as one of the first stages of video processing. This involves separation of ‘interesting foreground’ from the less informative but persistent background. But the definition of what is ‘interesting’ is somewhat subjective, and liable to be application specific. Furthermore, the background may be interpreted as including the visual appearance of normal activity of any agents present in the scene, human or otherwise. Thus a background model might be called upon to absorb lighting changes, moving trees and foliage, or normal traffic flow and pedestrian activity, in order to effect what might be termed in ‘biologically-inspired’ vision as pre-attentive selection. This challenge is one of the Holy Grails of the computer vision field, and consequently the subject has received considerable attention.
This thesis sets out to address some of the limitations of contemporary methods of background segmentation by investigating methods of inducing local mutual support amongst pixels in three starkly contrasting paradigms: (1) locality in the spatial domain, (2) locality in the short-term time domain, and (3) locality in the domain of cyclic repetition frequency. Conventional per-pixel models, such as those based on Gaussian Mixture Models, offer no spatial support between adjacent pixels at all. At the other extreme, eigenspace models impose a structure in which every image pixel bears the same relation to every other pixel. But Markov Random Fields permit definition of arbitrary local cliques by construction of a suitable graph, and are used here to facilitate a novel structure capable of exploiting probabilistic local co-occurrence of adjacent Local Binary Patterns. The result is a method exhibiting strong sensitivity to multiple learned local pattern hypotheses, whilst relying solely on monochrome image data. Many background models enforce temporal consistency constraints on a pixel in an attempt to confirm background membership before it is accepted as part of the model, and typically some control over this process is exercised by a learning rate parameter. But in busy scenes, a true background pixel may be visible for a relatively small fraction of the time and in a temporally fragmented fashion, thus hindering such background acquisition. However, support in terms of temporal locality may still be achieved by using Combinatorial Optimization to derive short-term background estimates which induce a similar consistency, but are considerably more robust to disturbance. A novel technique is presented here in which the short-term estimates act as ‘pre-filtered’ data from which a far more compact eigen-background may be constructed. Many scenes entail elements exhibiting repetitive periodic behaviour.
Some road junctions employing traffic signals are among these, yet little is to be found in the literature regarding the explicit modelling of such periodic processes in a scene. Previous work focussing on gait recognition has demonstrated approaches based on recurrence of self-similarity by which local periodicity may be identified. The present work harnesses and extends this method in order to characterize scenes displaying multiple distinct periodicities by building a spatio-temporal model. The model may then be used to highlight abnormality in scene activity. Furthermore, a Phase Locked Loop technique with a novel phase detector is detailed, enabling such a model to maintain correct synchronization with scene activity in spite of noise and drift of periodicity. This thesis contends that these three approaches are all manifestations of the same broad underlying concept: local support in each of the space, time and frequency domains, and furthermore, that the support can be harnessed practically, as will be demonstrated experimentally.
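The Local Binary Patterns mentioned in the abstract above can be illustrated with the basic 8-neighbour operator on a monochrome image. The sketch below is a generic textbook form, not the thesis's Markov Random Field model; the function name `lbp_code`, the clockwise neighbour ordering, and the nested-list image format are assumptions chosen for illustration.

```python
def lbp_code(image, y, x):
    """Basic 8-neighbour Local Binary Pattern code of interior pixel (y, x).

    `image` is a 2D list of grey levels. Each neighbour contributes one bit,
    set when the neighbour is at least as bright as the centre pixel.
    """
    centre = image[y][x]
    # Clockwise from the top-left neighbour; the ordering is an assumed convention.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code
```

Because each bit depends only on whether a neighbour is brighter or darker than the centre, the code is unchanged by any monotonic change in overall brightness, which fits a method that relies on monochrome data alone.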

Queen Mary Research Online

OpenGrey Repository

Individual and group dynamic behaviour patterns in bound spaces

Author: Gasiorowski Pawel
Publication date: 01/12/2017

The behaviour analysis of individual and group dynamics in closed spaces is a subject of extensive research in both academia and industry. However, despite recent technological advancements, the problem of implementing the existing methods for visual behaviour data analysis in production systems remains difficult, and the applications are available only in special cases in which resourcing is not a problem. Most of the approaches concentrate on direct extraction and classification of the visual features from the video footage for recognising the dynamic behaviour directly from the source. The adoption of such an approach allows the elementary actions of moving objects to be recognised directly, which is a difficult task on its own. The major factor that impacts the performance of the methods for video analytics is the necessity to combine processing of an enormous volume of video data with complex analysis of this data using computationally resource-demanding analytical algorithms. This is not feasible for many applications, which must work in real time. In this research, an alternative simulation-based approach for behaviour analysis has been adopted. It can potentially reduce the requirements for extracting information from real video footage for the purpose of the analysis of the dynamic behaviour. This can be achieved by combining only limited data extracted from the original video footage with symbolic data about the events registered on the scene, which is generated by a 3D simulation synchronized with the original footage. Additionally, by incorporating some physical laws and the logic of dynamic behaviour directly in the 3D model of the visual scene, this framework makes it possible to capture the behavioural patterns using simple syntactic pattern recognition methods.
Extensive experiments with the prototype implementation indicate that the 3D simulation generates sufficiently rich data to allow the dynamic behaviour to be analysed in real time with sufficient adequacy, without the need for precise physical data, using only limited data about the objects on the scene, their location and dynamic characteristics. This research can have wide applicability in different areas where video analytics is necessary, ranging from public safety and video surveillance to marketing research to computer games and animation. Its limitations are linked to the dependence on some preliminary processing of the video footage, which is still less detailed and computationally demanding than the methods which directly use the video frames of the original footage.

London Met Repository

Model Selection for Unsupervised Learning of Visual Context

DOI: 10.1007/s11263-005-5024-8

Author: Shaogang Gong
Tao Xiang

Abstract. This study addresses the problem of choosing the most suitable probabilistic model selection criterion for unsupervised learning of visual context of a dynamic scene using mixture models. A rectified Bayesian Information Criterion (BICr) and a Completed Likelihood Akaike’s Information Criterion (CL-AIC) are formulated to estimate the optimal model order (complexity) for a given visual scene. Both criteria are designed to overcome poor model selection by existing popular criteria when the data sample size varies from small to large and the true mixture distribution kernel functions differ from the assumed ones. Extensive experiments on learning visual context for dynamic scene modelling are carried out to demonstrate the effectiveness of BICr and CL-AIC, compared to that of existing popular model selection criteria including BIC, AIC and Integrated Completed Likelihood (ICL). Our study suggests that for learning visual context using a mixture model, BICr is the most appropriate criterion given sparse data, while CL-AIC should be chosen given moderate or large data sample sizes.
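For orientation, the two baseline criteria named in the abstract above, BIC and AIC, can be sketched in their standard textbook forms. The abstract does not give the formulas for BICr or CL-AIC, so those are not implemented here; the function names and the `candidates` dictionary layout are hypothetical choices made for this illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Standard Akaike Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_samples):
    """Standard Bayesian Information Criterion (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def select_order(candidates, n_samples, criterion="bic"):
    """Return the model order with the lowest criterion score.

    `candidates` maps a model order (e.g. number of mixture components)
    to a (log_likelihood, n_params) pair for the model fitted at that order.
    """
    def score(order):
        log_likelihood, n_params = candidates[order]
        if criterion == "bic":
            return bic(log_likelihood, n_params, n_samples)
        return aic(log_likelihood, n_params)

    return min(candidates, key=score)
```

The BIC penalty grows with the logarithm of the sample size while the AIC penalty does not, so the two criteria can prefer different model orders as the amount of data changes, which is the regime the abstract is concerned with.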

CiteSeerX