18 research outputs found

    A comprehensive review of vehicle detection using computer vision

    Vehicle detection is a crucial step in designing intelligent transport systems (ITS). The challenges of vehicle detection on urban roads arise from camera position, background variations, occlusion, multiple foreground objects, and vehicle pose. The current study provides a synopsis of state-of-the-art vehicle detection techniques, categorized into motion-based and appearance-based techniques, ranging from frame differencing and background subtraction to feature extraction, a comparatively more complex approach. The advantages and disadvantages of the techniques are also highlighted, with a conclusion as to the most accurate one for vehicle detection.
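Frame differencing, the simplest of the motion-based techniques named in the abstract above, can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the function name `frame_difference`, the threshold value, and the toy frames are assumptions chosen for illustration, and frames are plain nested lists of grayscale values.

```python
def frame_difference(prev_frame, curr_frame, threshold=25):
    """Return a binary motion mask: 1 where |curr - prev| > threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Two tiny 3x4 grayscale frames; a bright "vehicle" moves one pixel to the right.
frame_a = [
    [10, 10, 10, 10],
    [10, 200, 10, 10],
    [10, 10, 10, 10],
]
frame_b = [
    [10, 10, 10, 10],
    [10, 10, 200, 10],
    [10, 10, 10, 10],
]

mask = frame_difference(frame_a, frame_b)
for row in mask:
    print(row)
```

Both the position the object left and the position it entered are flagged, which is one reason frame differencing alone tends to give only a rough outline of a moving vehicle.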

    Selective eigenbackgrounds method for background subtraction in crowed scenes

    Background Subtraction Methods in Video Streams: A Review

    Background subtraction is one of the most important steps in image and video processing. Some parts of an image or video are unnecessary for processing and should be removed, because they increase execution time or memory requirements. Several subtraction methods have been presented to date, but finding the best-suited method remains an issue, which this study addresses. Furthermore, each process needs a specific subtraction technique, and knowing this helps researchers achieve faster and better performance in their research. This paper presents a comparative study of several existing background subtraction methods, ranging from simple background subtraction to more complex statistical techniques. The goal of this study is to provide a view of the strengths and drawbacks of the widely used methods. The methods are compared on their memory requirements, their computational time, and their robustness across different videos. It is also hoped that this analysis helps researchers address the difficulty of selecting the most suitable method for background subtraction.
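As a rough illustration of the simple end of the range this review covers, the sketch below keeps a running-average background model and thresholds each frame against it. It is an assumed textbook formulation, not a method taken from the paper; the names `update_background` and `foreground_mask`, the learning rate `alpha`, and the threshold are hypothetical.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the model: B <- (1 - alpha) * B + alpha * F."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose distance from the background model exceeds threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# A static 2x3 scene observed for a few frames, then an object appears.
empty = [[50, 50, 50], [50, 50, 50]]
with_object = [[50, 220, 50], [50, 50, 50]]

background = [[float(v) for v in row] for row in empty]
for _ in range(10):
    background = update_background(background, empty)

mask = foreground_mask(background, with_object)
print(mask)
```

This model stores a single value per pixel and needs one multiply-add per pixel per frame, which is the kind of memory and time cost the comparison in the abstract is concerned with.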

    ENERGY-EFFICIENT LIGHTWEIGHT ALGORITHMS FOR EMBEDDED SMART CAMERAS: DESIGN, IMPLEMENTATION AND PERFORMANCE ANALYSIS

    An embedded smart camera is a stand-alone unit that not only captures images, but also includes a processor, memory and communication interface. Battery-powered, embedded smart cameras introduce many additional challenges since they have very limited resources, such as energy, processing power and memory. When camera sensors are added to an embedded system, the problem of limited resources becomes even more pronounced. Hence, computer vision algorithms running on these camera boards should be lightweight and efficient. This thesis is about designing and developing computer vision algorithms that are aware of, and successfully overcome, the limitations of embedded platforms (in terms of power consumption and memory usage). In particular, we are interested in object detection and tracking methodologies and their impact on the performance and battery life of the CITRIC camera (the embedded smart camera employed in this research). This thesis aims to prolong the lifetime of the embedded smart platform without affecting the reliability of the system during surveillance tasks. Therefore, the reader is walked through the whole design process, from development and simulation, through implementation and optimization, to testing and performance analysis. The work presented in this thesis carries out not only software optimization, but also hardware-level operations during the stages of object detection and tracking. The performance of the algorithms introduced in this thesis is comparable to that of state-of-the-art object detection and tracking methods, such as Mixture of Gaussians, Eigen segmentation, and color and coordinate tracking. Unlike the traditional methods, the newly designed algorithms present a notable reduction in memory requirements, as well as in memory accesses per pixel.
    To accomplish the proposed goals, this work attempts to interconnect different levels of the embedded system architecture to make the platform more efficient in terms of energy and resource savings. Thus, the proposed algorithms are optimized at the API, middleware, and hardware levels to access the pixel information of the CMOS sensor directly. Only the required pixels are acquired, in order to reduce unnecessary communication overhead. Experimental results show that, by exploiting the architectural capabilities of an embedded platform, a 41.24% decrease in energy consumption and a 107.2% increase in battery life can be achieved. Compared to traditional object detection and tracking methods, the proposed work provides an additional 8 hours of continuous processing on 4 AA batteries, increasing the lifetime of the camera to 15.5 hours.
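The battery-life figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded hours stated above; the variable names are illustrative, and the baseline lifetime is inferred rather than reported.

```python
# Figures quoted in the abstract.
extended_lifetime_h = 15.5   # camera lifetime with the proposed methods
additional_hours = 8.0       # extra hours versus traditional methods

# Implied lifetime with traditional methods (inferred, not stated in the abstract).
baseline_lifetime_h = extended_lifetime_h - additional_hours

# Relative increase in battery life, in percent.
increase_pct = 100.0 * additional_hours / baseline_lifetime_h

print(f"baseline: {baseline_lifetime_h} h, increase: {increase_pct:.1f}%")
```

The rounded hours imply an increase of roughly 107%, in line with the reported 107.2%; the small gap is presumably due to rounding in the quoted hours.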

    Rejection based multipath reconstruction for background estimation in video sequences with stationary objects

    This is the author's version of a work that was accepted for publication in Computer Vision and Image Understanding. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Computer Vision and Image Understanding, vol. 147 (2016), DOI 10.1016/j.cviu.2016.03.012.
    Background estimation in video consists of extracting a foreground-free image from a set of training frames. Moving and stationary objects may affect the visibility of the background, thus invalidating the assumption, common in the related literature, that the background is the temporally dominant data. In this paper, we present a temporal-spatial block-level approach for background estimation in video that copes with both moving and stationary objects. First, a Temporal Analysis module obtains a compact representation of the training data by motion filtering and dimensionality reduction. Then, a threshold-free hierarchical clustering determines a set of candidates to represent the background for each spatial location (block). Second, a Spatial Analysis module iteratively reconstructs the background using these candidates. For each spatial location, multiple reconstruction hypotheses (paths) are explored to obtain its neighboring locations by enforcing inter-block similarity and intra-block homogeneity constraints in terms of color discontinuity, color dissimilarity and variability. The experimental results show that the proposed approach outperforms the related state of the art on challenging video sequences in the presence of moving and stationary objects.
    This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R) and by the TEC department (Universidad Autónoma de Madrid).
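The "temporally dominant" assumption mentioned in this abstract can be made concrete with a per-pixel temporal median, a common baseline that is swapped in here purely for illustration and is not the block-level method the paper proposes. The function name, the gray levels `BG` and `OBJ`, and the one-row frames are hypothetical.

```python
from statistics import median


def temporal_median_background(frames):
    """Estimate each background pixel as the median of its values over time."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


BG, OBJ = 40, 200  # hypothetical gray levels for background and object

# Case 1: an object passes through pixel (0, 1) in 1 of 5 frames.
moving = [[[BG, BG, BG]] for _ in range(5)]
moving[2][0][1] = OBJ

# Case 2: an object stays parked on pixel (0, 1) for 4 of 5 frames.
stationary = [[[BG, OBJ, BG]] for _ in range(5)]
stationary[0][0][1] = BG

moving_bg = temporal_median_background(moving)          # object is filtered out
stationary_bg = temporal_median_background(stationary)  # object leaks into the estimate
print(moving_bg)
print(stationary_bg)
```

In the second case the parked object occupies the pixel for most of the frames, so the median returns the object's value rather than the background's. This is the failure mode that motivates handling stationary objects explicitly.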

    Méthodes de vision à la motion et leurs applications

    Motion detection is a basic operation often used in computer vision, whether for pedestrian detection, anomaly detection, video scene analysis, or real-time object tracking. Although a very large number of articles have been published on the subject, several questions remain open. For example, it is still unclear how to detect moving objects in videos containing hard-to-handle situations such as significant background motion and illumination changes. Moreover, there is no consensus on how to quantify the performance of motion detection methods. It is also often difficult to incorporate motion information into high-level operations such as pedestrian detection. In this thesis, I address four problems related to motion detection: 1. How can motion detection methods be evaluated effectively? To answer this question, we set up an evaluation procedure for such methods. This led to the creation of the world's largest 100% annotated database dedicated to motion detection and to the organization of an international competition (CVPR 2014). I also explored different evaluation metrics as well as strategies for combining motion detection methods. 2. Manually annotating every moving object in a large number of videos is an immense challenge when creating a video analysis database. Although automatic and semi-automatic segmentation methods exist, they are never accurate enough to produce "ground truth" results. To solve this problem, we proposed an interactive method for segmenting moving objects based on deep learning. The results obtained are as accurate as those obtained by a human while being 40 times faster. 3. Pedestrian detection methods are very often used in video analysis. Unfortunately, they sometimes suffer from a large number of false positives or false negatives, depending on how the method's parameters are tuned. To increase the performance of pedestrian detection methods, we proposed a nonlinear filter based on motion detection that greatly reduces the number of false positives. 4. Background initialization is the process of recovering the background image of a video without the moving objects. Although a large number of methods have been proposed, just as for motion detection, there is no database or evaluation procedure for such methods. We therefore set up the world's largest database for this type of application and organized an international competition (ICPR 2016).
    Abstract: Motion detection is a basic video analytic operation on which many high-level computer vision tasks are built, e.g., pedestrian detection, anomaly detection, scene understanding and object tracking strategies. Even though a large number of motion detection methods have been proposed in the last decades, some important questions are still unanswered, including: (1) how to separate the foreground from the background accurately even under extremely challenging circumstances? (2) how to evaluate different motion detection methods? and (3) how to use motion information extracted by motion detection to help improve high-level computer vision tasks? In this thesis, we address four problems related to motion detection: 1. How can we benchmark motion detection methods, and on which videos? Current datasets are either too small, with a limited number of scenarios, or only provide bounding-box ground truth that indicates the rough location of foreground objects. As a solution, we built the largest and most objective motion detection dataset in the world, with pixel-accurate ground truth, to evaluate and compare motion detection methods. We also explore various evaluation metrics as well as different combination strategies. 2. Providing pixel-accurate ground truth is a huge challenge when building a motion detection dataset. While automatic labeling methods suffer from too high a false detection rate to be used as ground truth, manual labeling of hundreds of thousands of frames is extremely time consuming. To solve this problem, we proposed an interactive deep learning method for segmenting moving objects from videos. The proposed method can reach human-level accuracy while lowering the labeling time by a factor of 40. 3. Pedestrian detectors always suffer from either false positive detections or false negative detections, depending on parameter tuning. Unfortunately, manual adjustment of parameters for a large number of videos is not feasible in practice. In order to make pedestrian detectors more robust on a large variety of videos, we combined motion detection with various state-of-the-art pedestrian detectors. This is done by a novel motion-based nonlinear filtering process which improves detectors by a significant margin. 4. Scene background initialization is the process by which a method tries to recover the RGB background image of a video without foreground objects in it. However, one of the reasons that background modeling is challenging is that there is no good dataset and benchmarking framework to estimate the performance of background modeling methods. To fix this problem, we proposed an extensive survey as well as a novel benchmarking framework for scene background initialization.
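With pixel-accurate ground truth of the kind described above, a detector's output mask can be scored pixel by pixel. The sketch below computes precision, recall and F-measure, which are commonly used for this purpose; whether the thesis uses exactly these metrics is an assumption, and the function name `mask_scores` and the toy masks are illustrative.

```python
def mask_scores(predicted, ground_truth):
    """Compute precision, recall and F-measure for binary foreground masks."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1          # foreground correctly detected
            elif p and not g:
                fp += 1          # background wrongly marked as foreground
            elif g and not p:
                fn += 1          # foreground pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
predicted = [
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]

precision, recall, f_measure = mask_scores(predicted, ground_truth)
print(precision, recall, f_measure)
```

A bounding-box ground truth could not support this kind of per-pixel scoring, which is why the abstract stresses pixel-accurate annotation.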

    Annex 16 : automated traffic monitoring for complex road conditions

    Recent advancements in computer vision and machine learning techniques have made traffic monitoring systems highly effective in well-structured traffic conditions such as highways. However, these systems struggle to handle the complex and irregular conditions found in developing countries, owing to a lack of infrastructure and regulation. This research breaks the problem down into sub-tasks such as vehicle detection, vehicle tracking, and vehicle recognition, then combines them into one pipeline that can be used for traffic monitoring. Implementing the final pipeline involves improving and aggregating existing techniques. Results demonstrate the potential of these techniques for automated traffic monitoring.

    Hierarchical improvement of foreground segmentation masks in background subtraction

    A plethora of algorithms have been defined for foreground segmentation, a fundamental stage for many computer vision applications. In this work, we propose a post-processing framework to improve the foreground segmentation performance of background subtraction algorithms. We define a hierarchical framework for extending segmented foreground pixels to undetected foreground object areas and for removing erroneously segmented foreground. First, we create a motion-aware hierarchical image segmentation of each frame that prevents merging foreground and background image regions. Then, we estimate the quality of the foreground mask through the fitness of the binary regions in the mask and the hierarchy of segmented regions. Finally, the improved foreground mask is obtained as an optimal labeling by jointly exploiting foreground quality and spatial color relations in a pixel-wise fully-connected Conditional Random Field. Experiments are conducted over four large and heterogeneous datasets with varied challenges (CDNET2014, LASIESTA, SABS and BMC), demonstrating the capability of the proposed framework to improve background subtraction results.
    This work was partially supported by the Spanish Government (HAVideo, TEC2014-53176-R