
    Automatic Video-based Analysis of Human Motion


    From Visual Saliency to Video Behaviour Understanding

    In a world of ever increasing amounts of video data, we are forced to abandon traditional methods of scene interpretation by fully manual means. Under such circumstances, some form of automation is highly desirable, but this can be a very open-ended issue of high complexity. Dealing with such large amounts of data is a non-trivial task that requires efficient selective extraction of the parts of a scene that have the potential to develop a higher semantic meaning, alone or in combination with others. In particular, the types of video data in need of automated analysis tend to be outdoor scenes with high levels of activity generated by either foreground or background. Such dynamic scenes add considerable complexity to the problem, since we cannot rely on motion energy alone to detect regions of interest. Furthermore, the behaviour of these regions of motion can differ greatly, while still being highly dependent, both spatially and temporally, on the movement of other objects within the scene. Modelling these dependencies, whilst eliminating as much redundancy as possible from the feature extraction process, are the challenges addressed by this thesis. In the first half, finding the right mechanism to extract and represent meaningful features from dynamic scenes with no prior knowledge is investigated. Meaningful or salient information is treated as the parts of a scene that stand out or seem unusual or interesting to us. The novelty of the work is that it is able to select salient scales in both space and time at which a particular spatio-temporal volume is considered interesting relative to the rest of the scene. By quantifying the temporal saliency values of regions of motion, it is possible to consider their importance in both the long and the short term. Variations in entropy over spatio-temporal scales are used to select a context-dependent measure of the local scene dynamics.
A method of quantifying temporal saliency is devised based on the variation of the entropy of the intensity distribution in a spatio-temporal volume over increasing scales. Entropy is used instead of traditional filter methods because the stability or predictability of the intensity distribution over scales of a local spatio-temporal region can be defined more robustly relative to the context of its neighbourhood, even for regions exhibiting high intensity variation due to strong texture. Results show that it is possible to extract both locally salient features and globally salient temporal features from contrasting scenarios. In the second part of the thesis, focus shifts towards binding these spatio-temporally salient features together so that some semantic meaning can be inferred from their interaction. Interaction, in this sense, refers to any form of temporally correlated behaviour between any salient regions of motion in a scene. Feature binding as a mechanism for interactive behaviour understanding is particularly important if we consider that regions of interest may not be particularly significant individually, but convey much more semantic information when considered in combination. Temporally correlated behaviour is identified and classified using accumulated co-occurrences of salient features at two levels. Firstly, co-occurrences are accumulated for spatio-temporally proximate salient features to form a local representation. Then, at the next level, the co-occurrences of these locally spatio-temporally bound features are accumulated again in order to discover unusual behaviour in the scene. The novelty of this work is that no assumptions are made about whether interacting regions should be spatially proximate. Furthermore, no prior knowledge of the scene topology is used. Results show that it is possible to detect unusual interactions between regions of motion, which can visually infer higher levels of semantics.
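As a rough illustration of the idea, the entropy of a local intensity histogram can be measured over increasing spatio-temporal scales: flat regions stay at low entropy at every scale, while dynamic textured regions do not. The sketch below assumes a greyscale video held as a `(t, y, x)` NumPy array; the function names, histogram binning, and scale choices are illustrative, not taken from the thesis:

```python
import numpy as np

def intensity_entropy(patch, bins=16):
    """Shannon entropy (bits) of the intensity histogram of a volume."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def saliency_over_scales(video, t, y, x, scales=(2, 4, 8)):
    """Entropy of the spatio-temporal volume centred at (t, y, x) over
    increasing cubic scales; its variation is the saliency cue."""
    ents = []
    for s in scales:
        patch = video[max(t - s, 0):t + s,
                      max(y - s, 0):y + s,
                      max(x - s, 0):x + s]
        ents.append(intensity_entropy(patch))
    return np.array(ents)

# A dynamic textured region keeps high entropy across scales;
# a flat static region stays at zero.
rng = np.random.default_rng(0)
video = np.zeros((32, 64, 64))
video[:, 20:40, 20:40] = rng.random((32, 20, 20))  # flickering texture
flat = saliency_over_scales(video, 16, 5, 5)
busy = saliency_over_scales(video, 16, 30, 30)
```

Comparing the entropy profile of a volume against those of its neighbourhood then gives the context-dependent measure described above.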
In the final part of the thesis, a more specific investigation of human behaviour is addressed through classification and detection of interactions between two human subjects. Here, further modifications are made to the feature extraction process in order to quantify the spatio-temporal saliency of a region of motion. These features are then grouped to find the people in the scene. A loose pose distribution model is then extracted for each person, and canonical correlation analysis is used to find salient correlations between the poses of two interacting people. The resulting canonical factors can be formed into trajectories and used for classification, with the Levenshtein distance used to categorise the features. The novelty of the work is that the interactions do not have to be spatially connected or proximate to be recognised. Furthermore, the data used is outdoors and cluttered, with a non-stationary background. Results show that co-occurrence techniques have the potential to provide a more generalised, compact, and meaningful representation of dynamic interactive scene behaviour. Funded by the EPSRC, part-funded by QinetiQ Ltd; a travel grant was also contributed by the RAEng.
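To make the final classification step concrete, here is a minimal sketch of quantising a canonical-factor trajectory into a symbol sequence and categorising it by Levenshtein (edit) distance to labelled templates. The trajectories, labels, and quantisation levels are invented for illustration; the thesis's actual feature pipeline is far richer:

```python
import numpy as np

def levenshtein(a, b):
    """Edit distance between two symbol sequences (dynamic programming)."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,          # deletion
                          d[i, j - 1] + 1,          # insertion
                          d[i - 1, j - 1] + cost)   # substitution
    return int(d[-1, -1])

def quantise(trajectory, n_levels=4):
    """Map a 1-D canonical-factor trajectory to discrete symbols."""
    lo, hi = trajectory.min(), trajectory.max()
    idx = ((trajectory - lo) / (hi - lo + 1e-9) * n_levels).astype(int)
    return tuple(np.minimum(idx, n_levels - 1))

def classify(query, labelled):
    """Nearest-neighbour classification by edit distance."""
    return min(labelled, key=lambda item: levenshtein(query, item[0]))[1]

# Invented symbol trajectories standing in for two interaction classes.
hug = (0, 1, 2, 2, 1, 0)
handshake = (0, 2, 0, 2, 0, 2)
templates = [(hug, "hug"), (handshake, "handshake")]
query = (0, 1, 2, 1, 0)   # a slightly shortened hug-like trajectory
label = classify(query, templates)
```

Because the edit distance tolerates insertions and deletions, trajectories of different lengths can still be compared directly.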

    Learning object behaviour models

    The human visual system is capable of interpreting a remarkable variety of often subtle, learnt, characteristic behaviours. For instance we can determine the gender of a distant walking figure from their gait, interpret a facial expression as that of surprise, or identify suspicious behaviour in the movements of an individual within a car-park. Machine vision systems wishing to exploit such behavioural knowledge have been limited by the inaccuracies inherent in hand-crafted models and the absence of a unified framework for the perception of powerful behaviour models. The research described in this thesis attempts to address these limitations, using a statistical modelling approach to provide a framework in which detailed behavioural knowledge is acquired from the observation of long image sequences. The core of the behaviour modelling framework is an optimised sample-set representation of the probability density in a behaviour space defined by a novel temporal pattern formation strategy. This representation of behaviour is both concise and accurate and facilitates the recognition of actions or events and the assessment of behaviour typicality. The inclusion of generative capabilities is achieved via the addition of a learnt stochastic process model, thus facilitating the generation of predictions and realistic sample behaviours. Experimental results demonstrate the acquisition of behaviour models and suggest a variety of possible applications, including automated visual surveillance, object tracking, gesture recognition, and the generation of realistic object behaviours within animations, virtual worlds, and computer generated film sequences. The utility of the behaviour modelling framework is further extended through the modelling of object interaction. 
Two separate approaches are presented, and a technique is developed which, using learnt models of joint behaviour together with a stochastic tracking algorithm, can be used to equip a virtual object with the ability to interact in a natural way. Experimental results demonstrate the simulation of a plausible virtual partner during interaction between a user and the machine.
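A sample-set representation of a density, and the behaviour-typicality score it yields, can be sketched in a few lines: points observed in behaviour space during training serve directly as the samples, and the typicality of a new observation is its kernel density under them. This is a generic kernel-density sketch with assumed parameters, not the optimised sample-set method of the thesis:

```python
import numpy as np

def typicality(samples, query, bandwidth=0.5):
    """Kernel density estimate at `query`, using the observed behaviour
    points themselves as the sample-set representation of the density."""
    sq = np.sum((samples - query) ** 2, axis=1)
    return float(np.exp(-sq / (2 * bandwidth ** 2)).mean())

# Training samples: 2-D behaviour-space points from 'typical' activity.
rng = np.random.default_rng(1)
typical = rng.normal(0.0, 0.3, size=(200, 2))

normal_score = typicality(typical, np.array([0.0, 0.0]))  # well supported
odd_score = typicality(typical, np.array([3.0, 3.0]))     # far from data
```

Thresholding the score flags atypical behaviours; sampling from the stored points (plus kernel noise) gives the generative direction mentioned above.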

    Machine learning applied to radar-based people detection (Aprendizagem automática aplicada à deteção de pessoas baseada em radar)

    The present dissertation describes the development and implementation of a radar-based system able to detect people amidst other objects moving in an indoor scenario. The detection methods implemented exploit radar data processed by a system that includes data acquisition, pre-processing, feature extraction, and the application of these data to machine learning models specifically designed to attain the objective of target classification. Beyond the basic theoretical research necessary for its successful development, the work contemplates an important component of software development and experimental tests. Among others, the following topics were covered in this dissertation: the study of radar working principles and hardware; radar signal processing; techniques of clutter removal, feature extraction, and data clustering applied to radar signals; implementation and hyperparameter tuning of machine learning classification systems; and the study of multi-target detection and tracking methods. The people detection application was tested in different indoor scenarios, including a static radar and a radar dynamically deployed by a mobile robot. The application can be executed in real time and performs multiple target detection and classification using basic clustering and tracking algorithms. A study of the effects of multiple-target detection on the performance of the application is presented, together with an assessment of the efficiency of the different classification methods. The envisaged applications of the proposed detection system include intrusion detection in indoor environments and the acquisition of anonymised data for people tracking and counting in public spaces such as hospitals and schools. (Master's degree in Computer and Telematics Engineering.)
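The processing chain described above (clutter removal, feature extraction, classification) can be caricatured in a few lines. The sketch below uses a mean-over-slow-time clutter filter, two toy features, and a nearest-centroid classifier with invented centroids; the dissertation's actual radar features and learnt models are of course more elaborate:

```python
import numpy as np

def remove_clutter(frames):
    """Static clutter removal: subtract the mean echo over slow time."""
    return frames - frames.mean(axis=0, keepdims=True)

def extract_features(detection):
    """Toy per-detection features: mean residual power and its spread."""
    return np.array([detection.mean(), detection.std()])

def nearest_centroid(feature, centroids):
    """Classify a feature vector by its nearest class centroid."""
    labels = list(centroids)
    dists = [np.linalg.norm(feature - centroids[k]) for k in labels]
    return labels[int(np.argmin(dists))]

# 10 slow-time frames x 8 range bins: a static wall plus a moving target.
frames = np.ones((10, 8))
frames[np.arange(10), np.arange(10) % 8] += 2.0  # target walks through bins
clean = remove_clutter(frames)                    # wall echo cancels out

# Invented centroids: people produce a larger residual spread here.
centroids = {"person": np.array([0.0, 0.6]), "object": np.array([0.0, 0.1])}
label = nearest_centroid(extract_features(clean), centroids)
```

The same structure carries over when the nearest-centroid step is replaced by a trained classifier and the frames come from a live radar stream.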

    Vision-based techniques for gait recognition

    Global security concerns have driven a proliferation of video surveillance devices. Intelligent surveillance systems seek to discover possible threats automatically and raise alerts. Being able to identify the surveyed object can help determine its threat level. The current generation of devices provides digital video data to be analysed for time-varying features to assist in the identification process. Commonly, people queue up to access a facility and approach a video camera in full frontal view. In this environment, a variety of biometrics are available - for example, gait, which includes temporal features such as stride period. Gait can be measured unobtrusively at a distance. The video data will also include face features, which are short-range biometrics. In this way, one can combine biometrics naturally using one set of data. In this paper we survey current techniques of gait recognition and modelling, together with the environment in which the research was conducted. We also discuss in detail the issues arising from deriving gait data, such as perspective and occlusion effects, together with the associated computer vision challenges of reliably tracking human movement. After highlighting these issues and challenges related to gait processing, we discuss frameworks combining gait with other biometrics. We then provide motivations for a novel paradigm in biometrics-based human recognition, i.e. the use of the fronto-normal view of gait as a far-range biometric combined with biometrics operating at a near distance.
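As an example of the temporal features mentioned above, the stride period can be estimated from any periodic gait signal, such as the silhouette width per frame, by locating the strongest autocorrelation peak. The synthetic signal, frame rate, and lag window below are illustrative assumptions, not a method from the survey:

```python
import numpy as np

def stride_period(signal, fps=25, min_lag=10, max_lag=None):
    """Estimate the stride period (seconds) of a periodic gait signal,
    e.g. silhouette width per frame, from its autocorrelation peak."""
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags >= 0
    if max_lag is None:
        max_lag = len(x) // 2
    lag = int(np.argmax(ac[min_lag:max_lag])) + min_lag
    return lag / fps

# Synthetic silhouette-width signal: one stride every 25 frames at 25 fps.
t = np.arange(200)
width = 10 + np.sin(2 * np.pi * t / 25)
period = stride_period(width)
```

Restricting the search to a plausible lag window keeps the estimator from locking onto the zero-lag peak or onto multiples of the true period.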

    Achieving illumination invariance using image filters

    In this chapter we describe a novel framework for automatic face recognition in the presence of varying illumination, primarily applicable to matching face sets or sequences. The framework is based on simple image processing filters that compete with unprocessed greyscale input to yield a single matching score between individuals. By performing all numerically intensive computation offline, our method (i) retains the matching efficiency of simple image filters, while (ii) achieving greatly increased robustness, as all online processing is performed in closed form. Evaluated on a large, real-world data corpus, the proposed framework was shown to be successful in video-based recognition across a wide range of illumination, pose, and face motion pattern changes.
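The filter-versus-greyscale idea can be illustrated with a toy example: a high-pass style filter (image minus its local mean) discards the slowly varying illumination component, so a fused score can draw on whichever channel is reliable. The filter, fusion weight, and data below are assumptions for illustration, not the chapter's actual filters:

```python
import numpy as np

def highpass(img):
    """Image minus its 3x3 local mean: keeps texture, discards the
    slowly varying illumination component."""
    pad = np.pad(img, 1, mode="edge")
    h, w = img.shape
    local_mean = sum(pad[i:i + h, j:j + w]
                     for i in range(3) for j in range(3)) / 9.0
    return img - local_mean

def match_score(a, b, alpha=0.5):
    """Fused matching score (lower is better): a weighted sum of the
    raw greyscale distance and the filtered distance."""
    raw = np.mean((a - b) ** 2)
    filt = np.mean((highpass(a) - highpass(b)) ** 2)
    return alpha * raw + (1.0 - alpha) * filt

rng = np.random.default_rng(2)
face = rng.random((16, 16))
brighter = face + 0.3          # same face under a global brightness offset

raw_only = match_score(face, brighter, alpha=1.0)   # fooled by lighting
filt_only = match_score(face, brighter, alpha=0.0)  # illumination invariant
```

The weight between the two channels plays the role of the offline-learnt competition between filtered and unprocessed inputs.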