108 research outputs found

    Dual-rate background subtraction approach for estimating traffic queue parameters in urban scenes

    Get PDF
    This study proposes traffic queue-parameter estimation based on background subtraction. An appropriate combination of two background models is used: a short-term model, very sensitive to moving vehicles, and a long-term model capable of retaining as foreground temporarily stopped vehicles at intersections or traffic lights. Experimental results in typical urban scenes demonstrate the suitability of the proposed approach. Its main advantage is the low computational cost, avoiding specific motion detection algorithms or post-processing operations after foreground vehicle detection.Ministerio de Educación y Ciencia DPI2010-19154Consejería de Innovación, Ciencia y Empresa P07-TIC-0262

    Fuzzy integral for moving object detection

    Full text link

    Motion Segmentation Aided Super Resolution Image Reconstruction

    Get PDF
    This dissertation addresses Super Resolution (SR) Image Reconstruction focusing on motion segmentation. The main thrust is Information Complexity guided Gaussian Mixture Models (GMMs) for Statistical Background Modeling. In the process of developing our framework we also focus on two other topics; motion trajectories estimation toward global and local scene change detections and image reconstruction to have high resolution (HR) representations of the moving regions. Such a framework is used for dynamic scene understanding and recognition of individuals and threats with the help of the image sequences recorded with either stationary or non-stationary camera systems. We introduce a new technique called Information Complexity guided Statistical Background Modeling. Thus, we successfully employ GMMs, which are optimal with respect to information complexity criteria. Moving objects are segmented out through background subtraction which utilizes the computed background model. This technique produces superior results to competing background modeling strategies. The state-of-the-art SR Image Reconstruction studies combine the information from a set of unremarkably different low resolution (LR) images of static scene to construct an HR representation. The crucial challenge not handled in these studies is accumulating the corresponding information from highly displaced moving objects. In this aspect, a framework of SR Image Reconstruction of the moving objects with such high level of displacements is developed. Our assumption is that LR images are different from each other due to local motion of the objects and the global motion of the scene imposed by non-stationary imaging system. Contrary to traditional SR approaches, we employed several steps. These steps are; the suppression of the global motion, motion segmentation accompanied by background subtraction to extract moving objects, suppression of the local motion of the segmented out regions, and super-resolving accumulated information coming from moving objects rather than the whole scene. This results in a reliable offline SR Image Reconstruction tool which handles several types of dynamic scene changes, compensates the impacts of camera systems, and provides data redundancy through removing the background. The framework proved to be superior to the state-of-the-art algorithms which put no significant effort toward dynamic scene representation of non-stationary camera systems

    Automatic object classification for surveillance videos.

    Get PDF
    PhDThe recent popularity of surveillance video systems, specially located in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems consists on automatic object classification, which still remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on the inherent visual features. However, psychological studies have demonstrated that human beings can routinely categorise objects according to their behaviour. The existing gap in the understanding between the features automatically extracted by a computer, such as appearance-based features, and the concepts unconsciously perceived by human beings but unattainable for machines, or the behaviour features, is most commonly known as semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring together machine and human understanding towards object classification. Thus, a Surveillance Media Management is proposed to automatically detect and classify objects by analysing the physical properties inherent in their appearance (machine understanding) and the behaviour patterns which require a higher level of understanding (human understanding). Finally, a probabilistic multimodal fusion algorithm bridges the gap performing an automatic classification considering both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments conducted demonstrated that the combination of machine and human understanding substantially enhanced the object classification performance. Finally, the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems

    Activity understanding and unusual event detection in surveillance videos

    Get PDF
    PhDComputer scientists have made ceaseless efforts to replicate cognitive video understanding abilities of human brains onto autonomous vision systems. As video surveillance cameras become ubiquitous, there is a surge in studies on automated activity understanding and unusual event detection in surveillance videos. Nevertheless, video content analysis in public scenes remained a formidable challenge due to intrinsic difficulties such as severe inter-object occlusion in crowded scene and poor quality of recorded surveillance footage. Moreover, it is nontrivial to achieve robust detection of unusual events, which are rare, ambiguous, and easily confused with noise. This thesis proposes solutions for resolving ambiguous visual observations and overcoming unreliability of conventional activity analysis methods by exploiting multi-camera visual context and human feedback. The thesis first demonstrates the importance of learning visual context for establishing reliable reasoning on observed activity in a camera network. In the proposed approach, a new Cross Canonical Correlation Analysis (xCCA) is formulated to discover and quantify time delayed pairwise correlations of regional activities observed within and across multiple camera views. This thesis shows that learning time delayed pairwise activity correlations offers valuable contextual information for (1) spatial and temporal topology inference of a camera network, (2) robust person re-identification, and (3) accurate activity-based video temporal segmentation. Crucially, in contrast to conventional methods, the proposed approach does not rely on either intra-camera or inter-camera object tracking; it can thus be applied to low-quality surveillance videos featuring severe inter-object occlusions. Second, to detect global unusual event across multiple disjoint cameras, this thesis extends visual context learning from pairwise relationship to global time delayed dependency between regional activities. Specifically, a Time Delayed Probabilistic Graphical Model (TD-PGM) is proposed to model the multi-camera activities and their dependencies. Subtle global unusual events are detected and localised using the model as context-incoherent patterns across multiple camera views. In the model, different nodes represent activities in different decomposed re3 gions from different camera views, and the directed links between nodes encoding time delayed dependencies between activities observed within and across camera views. In order to learn optimised time delayed dependencies in a TD-PGM, a novel two-stage structure learning approach is formulated by combining both constraint-based and scored-searching based structure learning methods. Third, to cope with visual context changes over time, this two-stage structure learning approach is extended to permit tractable incremental update of both TD-PGM parameters and its structure. As opposed to most existing studies that assume static model once learned, the proposed incremental learning allows a model to adapt itself to reflect the changes in the current visual context, such as subtle behaviour drift over time or removal/addition of cameras. Importantly, the incremental structure learning is achieved without either exhaustive search in a large graph structure space or storing all past observations in memory, making the proposed solution memory and time efficient. Forth, an active learning approach is presented to incorporate human feedback for on-line unusual event detection. Contrary to most existing unsupervised methods that perform passive mining for unusual events, the proposed approach automatically requests supervision for critical points to resolve ambiguities of interest, leading to more robust detection of subtle unusual events. The active learning strategy is formulated as a stream-based solution, i.e. it makes decision on-the-fly on whether to request label for each unlabelled sample observed in sequence. It selects adaptively two active learning criteria, namely likelihood criterion and uncertainty criterion to achieve (1) discovery of unknown event classes and (2) refinement of classification boundary. The effectiveness of the proposed approaches is validated using videos captured from busy public scenes such as underground stations and traffic intersections

    An Analytic Training Approach for Recognition in Still Images and Videos

    Get PDF
    This dissertation proposes a general framework to efficiently identify the objects of interest (OI) in still images and its application can be further extended to human action recognition in videos. The frameworks utilized in this research to process still images and videos are similar in architecture except they have different content representations. Initially, global level analysis is employed to extract distinctive feature sets from an input data. For the global analysis of data the bidirectional two dimensional principal component analysis (2D-PCA) is employed to preserve correlation amongst neighborhood pixels. Furthermore, to cope with the inherent limitations within the holistic approach local information is introduced into the framework. The local information of OI is identified utilizing FERNS and affine SIFT (ASIFT) approaches for spatial and temporal datasets, respectively. For supportive local information, the feature detection is followed by an effective pruning strategy to divide these features into inliers and outliers. A cluster of inliers represents local features which exhibit stable behavior and geometric consistency. Incremental learning is a significant but often overlooked problem in action recognition. The final part of this dissertation proposes a new action recognition algorithm based on sequential learning and adaptive representation of the human body using Pyramid of Histogram of Oriented Gradients (PHOG) features. The changing shape and appearance of human body parts is tracked based on the weak appearance constancy assumption. The constantly changing shape of an OI is maximally covered by the small blocks to approximate the body contour of a segmented foreground object. In addition, the analytically determined learning phase guarantees lower computational burden for classification. The utilization of a minimum number of video frames in a causal way to recognize an action is also explored in this dissertation. The use of PHOG features adaptively extracted from individual frames allows the recognition of an incoming action video using a small group of frames which eliminates the need of large look-ahead

    Extraction of biomedical indicators from gait videos

    Get PDF
    Gait has been an extensively investigated topic in recent years. Through the analysis of gait it is possible to detect pathologies, which makes this analysis very important to assess anomalies and, consequently, help in the diagnosis and rehabilitation of patients. There are some systems for analyzing gait, but they are usually either systems with subjective evaluations or systems used in specialized laboratories with complex equipment, which makes them very expensive and inaccessible. However, there has been a significant effort of making available simpler and more accurate systems for gait analysis and classification. This dissertation reviews recent gait analysis and classification systems, presents a new database with videos of 21 subjects, simulating 4 different pathologies as well as normal gait, and also presents a web application that allows the user to remotely access an automatic classification system and thus obtain the expected classification and heatmaps for the given input. The classification system is based on the use of gait representation images such as the Gait Energy Image (GEI) and the Skeleton Gait Energy Image (SEI), which are used as input to a VGG-19 Convolutional Neural Network (CNN) that is used to perform classification. This classification system is a vision-based system. To sum up, the developed web application aims to show the usefulness of the classification system, making it possible for anyone to access it.A marcha tem sido um tema muito investigado nos últimos anos. Através da análise da marcha é possível detetar patologias, o que torna esta análise muito importante para avaliar anómalias e consequentemente, ajudar no diagnóstico e na reabilitação dos pacientes. Existem alguns sistemas para analisar a marcha, mas habitualmente, ou estão sujeitos a uma interpretação subjetiva, ou são sistemas usados em laboratórios especializados com equipamento complexo, o que os torna muito dispendiosos e inacessíveis. No entanto, tem havido um esforço significativo com o objectivo de disponibilizar sistemas mais simples e mais precisos para análise e classificação da marcha. Esta dissertação revê os sistemas de análise e classificação da marcha desenvolvidos recentemente, apresenta uma nova base de dados com vídeos de 21 sujeitos, a simular 4 patologias diferentes bem como marcha normal, e apresenta também uma aplicação web que permite ao utilizador aceder remotamente a um sistema automático de classificação e assim, obter a classificação prevista e mapas de características respectivos de acordo com a entrada dada. O sistema de classificação baseia-se no uso de imagens de representação da marcha como a "Gait Energy Image" (GEI) e "Skeleton Gait Energy Image" (SEI), que são usadas como entrada numa rede neuronal convolucional VGG-19 que é usada para realizar a classificação. Este sistema de classificação corresponde a um sistema baseado na visão. Em suma, a aplicação web desenvolvida tem como finalidade mostrar a utilidade do sistema de classificação, tornando possível o acesso a qualquer pessoa

    Detection and representation of moving objects for video surveillance

    Get PDF
    In this dissertation two new approaches have been introduced for the automatic detection of moving objects (such as people and vehicles) in video surveillance sequences. The first technique analyses the original video and exploits spatial and temporal information to find those pixels in the images that correspond to moving objects. The second technique analyses video sequences that have been encoded according to a recent video coding standard (H.264/AVC). As such, only the compressed features are analyzed to find moving objects. The latter technique results in a very fast and accurate detection (up to 20 times faster than the related work). Lastly, we investigated how different XML-based metadata standards can be used to represent information about these moving objects. We proposed the usage of Semantic Web Technologies to combine information described according to different metadata standards
    corecore