
    Selected topics in video coding and computer vision

    Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment.
    In the state-of-the-art video coding standard H.264/AVC, intra prediction is defined on a hierarchical quad-tree based block partitioning structure, which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that the intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry-adaptive algorithms.
    In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all-weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principal Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both the OSU Infrared Image Database and the WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms.
    Multi-view video alignment is a process that facilitates the fusion of non-synchronized multi-view video sequences for various applications, including automatic video-based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to the 3-D case. We also present a novel method to obtain the ground truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences.
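
The abstract above mentions phase correlation as the basis for temporal alignment. As a rough illustration only, the sketch below shows the classical 1-D form of phase correlation, recovering an integer circular shift between two signals. It is not the thesis's 3-D generalization and does not attempt sub-frame accuracy; the function name `phase_correlation_shift` and the test signal are assumptions made for this example.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer circular shift d such that b[n] is close to a[n - d].

    Hypothetical helper for illustration; 1-D only, integer shifts only.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 1:
        raise ValueError("a and b must be 1-D arrays of equal length")

    # Cross-power spectrum of the two signals.
    cross = np.fft.fft(b) * np.conj(np.fft.fft(a))
    # Keep only the phase; the small floor avoids division by zero.
    cross /= np.maximum(np.abs(cross), 1e-12)

    # The inverse transform peaks at the shift between the signals.
    corr = np.fft.ifft(cross).real
    peak = int(np.argmax(corr))

    # Map the peak index to a signed shift in (-n/2, n/2].
    n = a.size
    return peak if peak <= n // 2 else peak - n

# Usage: a random signal delayed by 3 samples (circularly).
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
delayed = np.roll(signal, 3)
shift = phase_correlation_shift(signal, delayed)
```

A sub-sample estimate would typically require interpolating around the correlation peak, which this sketch leaves out.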

    Comparison of fusion methods for thermo-visual surveillance tracking

    In this paper, we evaluate the appearance tracking performance of multiple fusion schemes that combine information from standard CCTV and thermal infrared spectrum video for the tracking of surveillance objects, such as people, faces, bicycles and vehicles. We show results on numerous real-world multimodal surveillance sequences, tracking challenging objects whose appearance changes rapidly. Based on these results, we determine the most promising fusion scheme.

    Thermal Cameras and Applications: A Survey


    Long-Term Occupancy Analysis using Graph-Based Optimisation in Thermal Imagery

    This paper presents a robust occupancy analysis system for thermal imaging. Reliable detection of people is very hard in crowded scenes, due to occlusions and segmentation problems. We therefore propose a framework that optimises the occupancy analysis over long periods by including information on the transition in occupancy, when people enter or leave the monitored area. In stable periods, with no activity close to the borders, people are detected and counted, and each count contributes to a weighted histogram. When activity close to the border is detected, local tracking is applied in order to identify a crossing. After a full sequence, the number of people during all periods is estimated using a probabilistic graph search optimisation. The system is tested on a total of 51,000 frames, captured in sports arenas. The mean error for a 30-minute period containing 3-13 people is 4.44 %, which is half of the error percentage obtained by detection only, and better than the results of comparable work. The framework is also tested on a publicly available dataset from an outdoor scene, which demonstrates the generality of the method.
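
To make the weighted-histogram step in the abstract above more concrete, here is a minimal sketch of how per-frame people counts from one stable period might be pooled into a single estimate. The abstract does not say how weights are assigned or how the histogram feeds the graph search, so the per-frame weights, the "pick the heaviest bin" rule, and the name `weighted_count_estimate` are all assumptions for illustration.

```python
from collections import defaultdict

def weighted_count_estimate(counts, weights):
    """Pool per-frame counts into a weighted histogram and return its mode.

    Hypothetical helper: `weights` stands in for some per-frame confidence,
    which the abstract does not define.
    """
    if len(counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    if not counts:
        raise ValueError("at least one frame is required")

    # Accumulate the weight of every frame into the bin of its count.
    histogram = defaultdict(float)
    for count, weight in zip(counts, weights):
        histogram[count] += weight

    # The bin with the largest accumulated weight is the period's estimate.
    best = max(histogram, key=histogram.get)
    return best, dict(histogram)

# Usage: six frames from one stable period (made-up values).
frame_counts = [5, 5, 6, 4, 5, 6]
frame_weights = [1.0, 0.5, 0.75, 0.25, 1.0, 0.5]
estimate, histogram = weighted_count_estimate(frame_counts, frame_weights)
```

In the described system the histograms of all periods are combined with the detected border crossings in a probabilistic graph search; that step is not sketched here.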

    Supervised evolutionary learning: Use of histogram of gradients and particle swarm algorithm for pedestrian detection and tracking in infrared image sequences

    Pedestrian detection and tracking in images has recently become one of the major issues in image processing and statistical identification. Evolutionary learning-based approaches can substantially improve performance in this setting. Pedestrian tracking and identification still suffer from low detection accuracy, high processing time, and uncertain results. Researchers are therefore looking for new processing models that can accurately track a person's position while moving. This study presents a hybrid algorithm for the automatic detection of pedestrian position. Unlike methods that analyze visible images, it examines the thermal and infrared components of pedestrians while walking, and combines a neural network with maximum learning capability, a wavelet kernel (wavelet transform), and particle swarm optimization (PSO) to find the parameters of the learner model. Gradient histograms are highly effective for extracting features from infrared images, and the neural network achieves its goal (pedestrian detection and tracking) by maximizing learning. Despite this maximum learning capability, the proposed method trains quickly, and results on various data sets in this field are analyzed. The results indicate a negligible error in following pedestrian movements in infrared sequences. The authors recommend neural networks for their precision and suggest tuning their hyperparameters with evolutionary algorithms.
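
The abstract above uses particle swarm optimization to search for model parameters. The sketch below is a generic, textbook-style PSO minimizer, not the authors' implementation: the inertia and acceleration coefficients, the swarm size, and the toy quadratic objective (a stand-in for a validation error) are assumptions chosen for illustration.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, n_iters=200, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO.

    Hypothetical helper; `bounds` is a list of (low, high) pairs.
    """
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size

    # Random initial positions inside the box, zero initial velocity.
    pos = rng.uniform(low, high, size=(n_particles, dim))
    vel = np.zeros_like(pos)

    # Personal bests and the global best found so far.
    best_pos = pos.copy()
    best_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(best_val))
    g_pos, g_val = best_pos[g].copy(), best_val[g]

    w, c1, c2 = 0.72, 1.49, 1.49  # assumed inertia and attraction weights
    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Pull each particle toward its own best and the swarm's best.
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g_pos - pos)
        pos = np.clip(pos + vel, low, high)

        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]

        g = int(np.argmin(best_val))
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), best_val[g]

    return g_pos, float(g_val)

# Usage: a toy objective with its minimum at (1, -2).
target = np.array([1.0, -2.0])
params, error = pso_minimize(lambda p: float(np.sum((p - target) ** 2)),
                             bounds=[(-5, 5), (-5, 5)])
```

In a hyperparameter-tuning setting, the objective would instead train or evaluate the learner for a given parameter vector and return its validation error.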

    Posture Estimation for Improved Photogrammetric Localization of Pedestrians in Monocular Infrared Imagery

    Target tracking within conventional video imagery poses a significant challenge that is increasingly being addressed via complex algorithmic solutions. The complexity of this problem can be fundamentally attributed to the ambiguity associated with the actual 3D scene position of a given tracked object in relation to its observed position in 2D image space. Recent work has tackled this challenge head-on by returning to classical photogrammetry, within the context of current target detection and classification techniques, as a means of recovering the true 3D position of pedestrian targets within the bounds of current accuracy norms. A key limitation in such approaches is the assumption of posture: that the observed pedestrian is at full height stance within the scene. Whilst prior work has shown the effects of statistical height variation to be negligible, variations in the posture of the target may still pose a significant source of potential error. Here we present a method that addresses this issue via the use of regression-based pedestrian posture estimation. This is demonstrated for variations in pedestrian target height ranging from 0.4 to 2 m over a distance-to-target range of 7 to 30 m.
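
To illustrate why the posture assumption in the abstract above matters, the sketch below uses the standard pinhole relation, distance = focal length × real height / image height, to range a target from its height in pixels. The paper's camera model and its posture regression are not described in the abstract, so the ideal pinhole camera, the focal length, and all numeric values here are assumptions.

```python
def range_from_height(focal_px, real_height_m, pixel_height):
    """Distance to a target under an ideal pinhole camera model.

    Hypothetical helper: focal_px is the focal length in pixels,
    real_height_m the assumed physical height of the target, and
    pixel_height its observed height in the image.
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_px * real_height_m / pixel_height

# Made-up example: a target spans 68 pixels with an 800-pixel focal length.
focal_px = 800.0
observed_px = 68.0

# Assuming a full-height stance of 1.7 m gives roughly 20 m.
standing_range = range_from_height(focal_px, 1.7, observed_px)

# If the person is actually crouching at 0.85 m, the range is roughly 10 m.
crouching_range = range_from_height(focal_px, 0.85, observed_px)
```

Because the estimated distance scales linearly with the assumed height, halving the true height while keeping the full-height assumption doubles the range estimate, which is the kind of error a posture estimate is meant to correct.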