3D Vehicle Extraction and Tracking from Multiple Viewpoints for Traffic Monitoring by using Probability Fusion Map
This paper presents a novel solution for vehicle occlusion handling and 3D measurement in traffic monitoring, based on data fusion from multiple stationary cameras. Compared with conventional single-camera methods for traffic monitoring, our approach fuses video data from different viewpoints into a common probability fusion map (PFM) and extracts targets from it. The proposed PFM concept handles and fuses data efficiently to estimate the probability of vehicle appearance, which real outdoor experiments verify to be more reliable than a single-camera solution. An AMF-based shadow modeling algorithm is also proposed in this paper in order to remove shadows in the road area and extract the proper vehicle regions.
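The fusion step can be sketched as follows. This is an illustrative stand-in, not the paper's exact formulation: it assumes each camera's foreground probability map has already been projected onto a common ground-plane grid, and fuses them by a weighted average before thresholding to extract vehicle regions.

```python
import numpy as np

def fuse_probability_maps(per_camera_maps, weights=None):
    """Fuse per-camera vehicle-appearance probability maps (already
    projected onto a common ground-plane grid) into a single PFM.
    Sketch only: fusion here is a weighted average of the estimates."""
    maps = np.stack(per_camera_maps, axis=0)          # (n_cams, H, W)
    if weights is None:
        weights = np.full(len(per_camera_maps), 1.0 / len(per_camera_maps))
    weights = np.asarray(weights).reshape(-1, 1, 1)
    return np.clip((weights * maps).sum(axis=0), 0.0, 1.0)

# Two toy 2x2 ground-plane maps from two viewpoints
cam_a = np.array([[0.9, 0.1], [0.2, 0.8]])
cam_b = np.array([[0.7, 0.3], [0.0, 1.0]])
pfm = fuse_probability_maps([cam_a, cam_b])
vehicles = pfm > 0.5      # threshold the fused map to extract targets
```

Averaging is one of several reasonable fusion rules; a product rule (treating cameras as independent evidence) would penalize cells that any single view rejects.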
Forecasting People Trajectories and Head Poses by Jointly Reasoning on Tracklets and Vislets
In this work, we explore the correlation between people trajectories and
their head orientations. We argue that people trajectory and head pose
forecasting can be modelled as a joint problem. Recent approaches on trajectory
forecasting leverage short-term trajectories (aka tracklets) of pedestrians to
predict their future paths. In addition, sociological cues, such as expected
destination or pedestrian interaction, are often combined with tracklets. In
this paper, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between
positions and head orientations (vislets) thanks to a joint unconstrained
optimization of full covariance matrices during the LSTM backpropagation. We
additionally exploit the head orientations as a proxy for the visual attention,
when modeling social interactions. MX-LSTM predicts future pedestrians location
and head pose, increasing the standard capabilities of the current approaches
on long-term trajectory forecasting. Compared to the state-of-the-art, our
approach shows better performances on an extensive set of public benchmarks.
MX-LSTM is particularly effective when people move slowly, i.e. the most
challenging scenario for all other models. The proposed approach also allows
for accurate predictions on a longer time horizon.Comment: Accepted at IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE
INTELLIGENCE 2019. arXiv admin note: text overlap with arXiv:1805.0065
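The "unconstrained optimization of full covariance matrices" can be illustrated for a single 2-D output. A common trick (shown here as a sketch, not MX-LSTM's exact loss) is to have the network emit the entries of a Cholesky factor L, with the diagonal passed through exp(), so that Sigma = L L^T is positive definite for any unconstrained parameter values, and then minimize the Gaussian negative log-likelihood:

```python
import numpy as np

def full_cov_nll(pred_mean, chol_params, target):
    """NLL of a 2-D target under a Gaussian whose full covariance is
    parameterized without constraints via its Cholesky factor L
    (Sigma = L @ L.T is positive definite by construction).
    Illustrative only; MX-LSTM optimizes such full-covariance terms
    jointly for positions and vislets during LSTM backpropagation."""
    l11, l21, l22 = chol_params
    L = np.array([[np.exp(l11), 0.0],
                  [l21, np.exp(l22)]])     # exp() keeps the diagonal positive
    sigma = L @ L.T
    diff = np.asarray(target) - np.asarray(pred_mean)
    return 0.5 * (diff @ np.linalg.inv(sigma) @ diff
                  + np.log(np.linalg.det(sigma))
                  + 2 * np.log(2 * np.pi))

# Hypothetical example: predicted position vs. observed position
nll = full_cov_nll(pred_mean=[12.0, 7.5],
                   chol_params=(0.1, 0.3, -0.2),
                   target=[12.4, 7.1])
```

Because no parameter value can produce an invalid covariance, the loss can be backpropagated through the LSTM without projection or clipping steps.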
A Novel Approach for Image Localization Using SVM Classifier and PSO Algorithm for Vehicle Tracking
In this paper, we propose a novel methodology for vehicular image localization that combines surveillance-image object identification, using a local gradient model, with vehicle localization based on the time of action. Features are extracted from aerial images of different traffic densities using the Histogram of Oriented Gradients (HOG) descriptor; these features depend on the locations, angles, positions, and heights of the cameras mounted on the junction board. Vehicle localization is then obtained from the different times of action of the vehicles under consideration. A Support Vector Machine (SVM) classifier combined with Particle Swarm Optimization (PSO) is also proposed in this work. Several experimental analyses are performed to evaluate the efficiency of the optimization methods in the proposed system, and the outcomes reveal its effectiveness in terms of classification precision, recall, and F-measure.
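The PSO component can be sketched in isolation. The abstract does not specify the objective or encoding, so the snippet below is a generic particle swarm minimizer applied to a toy quadratic standing in for a classifier-tuning objective (e.g. one minus cross-validated SVM accuracy over hypothetical parameters such as C and gamma):

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=50, seed=0):
    """Minimal particle swarm optimizer: each particle tracks its own
    best position, and all particles are attracted toward both their
    personal best and the global best. Coefficients are conventional
    defaults, not values from the paper."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective standing in for (1 - cross-validated SVM accuracy)
best, best_val = pso_minimize(lambda p: (p[0] - 3) ** 2 + (p[1] + 1) ** 2,
                              bounds=[(-5, 5), (-5, 5)])
```

In the paper's setting the objective would evaluate the SVM on held-out data, so each fitness call is expensive; the small swarm and iteration budget above reflect that.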
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
In fisheye images, rich distinct distortion patterns are regularly
distributed in the image plane. These distortion patterns are independent of
the visual content and provide informative cues for rectification. To make the
best use of such rectification cues, we introduce SimFIR, a simple framework for
fisheye image rectification based on self-supervised representation learning.
Technically, we first split a fisheye image into multiple patches and extract
their representations with a Vision Transformer (ViT). To learn fine-grained
distortion representations, we then associate different image patches with
their specific distortion patterns based on the fisheye model, and further
subtly design an innovative unified distortion-aware pretext task for their
learning. The transfer performance on the downstream rectification task is
remarkably boosted, which verifies the effectiveness of the learned
representations. Extensive experiments are conducted, and the quantitative and
qualitative results demonstrate the superiority of our method over the
state-of-the-art algorithms as well as its strong generalization ability on
real-world fisheye images.
Comment: Accepted to ICCV 202
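The pretext-task idea can be illustrated with a simplification. In a fisheye image, distortion grows with distance from the image center, so patches at similar radii share similar distortion patterns. The sketch below derives a pseudo-label for each patch from its normalized radius; predicting such labels from patch appearance is a stand-in for SimFIR's distortion-aware pretext task (the paper's actual grouping uses the fisheye camera model, and the label count here is arbitrary):

```python
import numpy as np

def patch_distortion_labels(img_size, patch, n_levels=4):
    """Assign each patch of a square fisheye image a pseudo-label based
    on the radial distance from the patch center to the image center.
    Simplified stand-in for a distortion-aware pretext task."""
    h = w = img_size
    cy = cx = (img_size - 1) / 2.0
    labels = np.zeros((h // patch, w // patch), dtype=int)
    max_r = np.hypot(cy, cx)                    # radius at the image corner
    for i in range(h // patch):
        for j in range(w // patch):
            py = i * patch + patch / 2.0        # patch center coordinates
            px = j * patch + patch / 2.0
            r = np.hypot(py - cy, px - cx) / max_r
            labels[i, j] = min(int(r * n_levels), n_levels - 1)
    return labels

labels = patch_distortion_labels(img_size=64, patch=16)  # 4x4 grid of labels
```

A ViT trained to predict these labels from patch pixels must encode distortion rather than content, which is the property the learned representations transfer to rectification.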
Motion detection and tracking algorithms in video streams
Detecting and tracking moving objects in a video stream are fundamental and critical tasks in many computer vision applications. In this paper we present an increase in the effectiveness of algorithms for moving-object detection and tracking. For this, we use an additive minimax similarity function. A background reconstruction algorithm is developed, and the moving-object detection and tracking algorithms are modified on the basis of the additive minimax similarity function. Experimental results are presented on the time expenses of moving-object detection and tracking.
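The overall pipeline can be sketched with a common baseline. The abstract does not specify the additive minimax similarity function or the reconstruction procedure, so the snippet below substitutes a standard per-pixel temporal median for background reconstruction and a thresholded absolute difference for detection, purely to show where each stage sits:

```python
import numpy as np

def reconstruct_background(frames):
    """Background reconstruction via the per-pixel temporal median of a
    frame buffer -- a common baseline, standing in for the paper's
    similarity-function-based reconstruction."""
    return np.median(np.stack(frames, axis=0), axis=0)

def detect_moving(frame, background, thresh=25):
    """Foreground mask: pixels whose deviation from the reconstructed
    background exceeds a threshold."""
    return np.abs(frame.astype(float) - background) > thresh

# Three toy 1-D "frames": a bright object moves across a dark scene
f1 = np.array([0, 200, 0, 0], dtype=float)
f2 = np.array([0, 0, 200, 0], dtype=float)
f3 = np.array([0, 0, 0, 200], dtype=float)
bg = reconstruct_background([f1, f2, f3])   # the median suppresses the mover
mask = detect_moving(f2, bg)
```

The median suppresses transient objects because each pixel is covered by the mover in at most one of the buffered frames, so the background estimate stays clean.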
A semantic autonomous video surveillance system for dense camera networks in smart cities
This paper presents a proposal for an intelligent video surveillance system able to
detect and identify abnormal and alarming situations by analyzing object movement. The
system is designed to minimize video processing and transmission, thus allowing a large
number of cameras to be deployed on the system, and therefore making it suitable for its
usage as an integrated safety and security solution in Smart Cities. Alarm detection is
based on parameters of the moving objects and their trajectories, and is carried out
using semantic reasoning and ontologies. This means that the system employs a
high-level conceptual language that is easy for human operators to understand, is capable
of raising enriched alarms with descriptions of what is happening in the image, and can
automate reactions to them, such as alerting the appropriate emergency services through
the Smart City safety network.
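The rule-based flavor of such alarm detection can be sketched as follows. This is a toy stand-in for the paper's ontology-based semantic reasoning: the rule format, object attributes, and thresholds are all hypothetical, but the output illustrates the "enriched alarm with a description" idea:

```python
def raise_alarms(tracked_objects, rules):
    """Match tracked-object descriptions against high-level rules and
    return human-readable alarm messages. Hypothetical rule format:
    (rule name, predicate over object attributes, message template)."""
    alarms = []
    for obj in tracked_objects:
        for _name, predicate, template in rules:
            if predicate(obj):
                alarms.append(template.format(**obj))
    return alarms

rules = [
    ("speeding_in_pedestrian_area",
     lambda o: o["class"] == "vehicle" and o["zone"] == "pedestrian"
               and o["speed_kmh"] > 10,
     "Vehicle moving at {speed_kmh} km/h inside pedestrian zone {zone_id}"),
    ("loitering",
     lambda o: o["class"] == "person" and o["dwell_s"] > 300,
     "Person loitering for {dwell_s} s near {zone_id}"),
]

objects = [
    {"class": "vehicle", "zone": "pedestrian", "zone_id": "Z3",
     "speed_kmh": 35, "dwell_s": 0},
    {"class": "person", "zone": "plaza", "zone_id": "Z1",
     "speed_kmh": 4, "dwell_s": 20},
]
alarms = raise_alarms(objects, rules)
```

In the actual system an ontology reasoner infers such conclusions from class hierarchies rather than hard-coded lambdas, which is what allows operators to author rules in a high-level conceptual language.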
Boosting video tracking performance by means of Tabu Search in Intelligent Visual Surveillance Systems
In this paper, we present a fast and efficient technique for the data association problem applied to visual tracking systems. The visual tracking process is formulated as a combinatorial hypothesis search with a heuristic evaluation function that takes into account structural and specific information such as distance, shape, and color. We introduce a Tabu Search algorithm that performs the search in an indirect space. A novel problem formulation allows us to transform any solution into the real search space, which is needed for fitness calculation, in linear time. This new formulation and the use of auxiliary structures yield a fast transformation from the blob-to-track assignment space to the space of real track shapes and positions (while calculating fitness incrementally), which is key to producing efficient and fast results. Previous approaches are based on statistical techniques or on evolutionary algorithms; these techniques are quite efficient and robust, although they cannot converge as fast as our approach.
This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.
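A minimal Tabu Search over blob-to-track assignments can be sketched as follows. This is a simplification of the paper's indirect-space search: here the solution is a permutation assigning one blob per track, neighbors are generated by swapping two assignments, recently used swaps are kept tabu for a fixed tenure, and the evaluation function is reduced to a plain distance cost (the paper's fitness also uses shape and color, and is computed incrementally):

```python
import random

def tabu_assign(cost, n_iters=200, tabu_tenure=5, seed=0):
    """Tabu Search over blob-to-track assignments, minimizing total
    association cost. cost[t][b] is the cost of assigning blob b to
    track t; the returned list maps each track to a blob index."""
    rng = random.Random(seed)
    n = len(cost)
    sol = list(range(n))                        # sol[track] = blob index
    def total(s):
        return sum(cost[t][s[t]] for t in range(n))
    best, best_val = sol[:], total(sol)
    tabu = {}                                   # swap -> iteration it expires
    for it in range(n_iters):
        candidates = []
        for a in range(n):
            for b in range(a + 1, n):
                if tabu.get((a, b), -1) >= it:  # move is still tabu
                    continue
                s = sol[:]
                s[a], s[b] = s[b], s[a]
                candidates.append((total(s), (a, b), s))
        if not candidates:
            break
        val, move, s = min(candidates)          # best non-tabu neighbor
        sol = s
        tabu[move] = it + tabu_tenure           # forbid reversing the swap
        if val < best_val:
            best, best_val = sol[:], val
    return best, best_val

# 3 tracks x 3 blobs distance matrix
cost = [[1, 9, 9],
        [9, 9, 1],
        [9, 1, 9]]
assignment, total_cost = tabu_assign(cost)
```

The tabu list is what lets the search accept worsening moves without cycling, which is why it can escape local optima that greedy nearest-neighbor association gets stuck in.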
Visual surveillance by dynamic visual attention method
This paper describes a method for visual surveillance based on biologically motivated dynamic visual attention in video image sequences. Our system is based on the extraction and integration of local (pixels and spots) as well as global (objects) features. Our approach defines a method for generating an active attention focus on a dynamic scene for surveillance purposes. The system segments the scene in accordance with a set of predefined features, including gray-level, motion, and shape features, giving rise to two classes of objects: vehicles and pedestrians. The solution proposed to the selective visual attention problem consists of decomposing the input images of an indefinite sequence into their moving objects, defining which of these elements are of interest to the user at a given moment, and keeping attention on those elements through time. Feature extraction and integration are solved by incorporating charge and discharge mechanisms (based on the permanency effect), as well as mechanisms of lateral interaction. All these mechanisms have proved good enough to segment the scene into moving objects and background.
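The charge/discharge mechanism can be sketched as a per-pixel accumulator. The snippet below is a minimal illustration of the permanency-effect idea, not the paper's implementation: pixels flagged as moving are charged toward saturation, static pixels discharge toward zero, and the attention focus can then be drawn to persistently active regions (the charge/discharge constants are illustrative):

```python
import numpy as np

def permanency_update(charge, motion_mask, inc=32, dec=16, max_charge=255):
    """One charge/discharge step of a permanency-style accumulator:
    moving pixels gain charge (capped at max_charge), static pixels
    lose charge (floored at zero)."""
    charge = charge.astype(int)                 # astype copies the array
    charge[motion_mask] = np.minimum(charge[motion_mask] + inc, max_charge)
    charge[~motion_mask] = np.maximum(charge[~motion_mask] - dec, 0)
    return charge

charge = np.zeros((1, 4), dtype=int)
moving = np.array([[True, True, False, False]])
for _ in range(8):                      # an object keeps moving over two pixels
    charge = permanency_update(charge, moving)
focus = charge >= 200                   # attention on persistently active pixels
```

Because the discharge rate is slower than the charge rate, briefly occluded objects keep residual charge, which helps maintain attention on them through time.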