17 research outputs found

    3D Vehicle Extraction and Tracking from Multiple Viewpoints for Traffic Monitoring by using Probability Fusion Map

    This paper presents a novel solution to vehicle occlusion and 3D measurement for traffic monitoring by fusing data from multiple stationary cameras. Compared with conventional single-camera methods in traffic monitoring, our approach fuses video data from different viewpoints into a common probability fusion map (PFM) and extracts targets. The proposed PFM concept handles and fuses data efficiently in order to estimate the probability of vehicle appearance, and real outdoor experiments verify that it is more reliable than a single-camera solution. An AMF-based shadow modeling algorithm is also proposed in order to remove shadows on the road area and extract the proper vehicle regions.

    Forecasting People Trajectories and Head Poses by Jointly Reasoning on Tracklets and Vislets

    In this work, we explore the correlation between people's trajectories and their head orientations. We argue that trajectory and head pose forecasting can be modelled as a joint problem. Recent approaches to trajectory forecasting leverage short-term trajectories (aka tracklets) of pedestrians to predict their future paths. In addition, sociological cues, such as expected destination or pedestrian interaction, are often combined with tracklets. In this paper, we propose MiXing-LSTM (MX-LSTM) to capture the interplay between positions and head orientations (vislets) thanks to a joint unconstrained optimization of full covariance matrices during the LSTM backpropagation. We additionally exploit the head orientations as a proxy for visual attention when modeling social interactions. MX-LSTM predicts future pedestrian locations and head poses, extending the standard capabilities of current approaches to long-term trajectory forecasting. Compared to the state of the art, our approach shows better performance on an extensive set of public benchmarks. MX-LSTM is particularly effective when people move slowly, i.e. the most challenging scenario for all other models. The proposed approach also allows for accurate predictions on a longer time horizon. Comment: Accepted at IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019. arXiv admin note: text overlap with arXiv:1805.0065

    A Novel Approach for Image Localization Using SVM Classifier and PSO Algorithm for Vehicle Tracking

    In this paper, we propose a novel methodology for vehicular image localization that incorporates surveillance image object identification, using a local gradient model, and vehicle localization using the time of action. The aerial images of different traffic densities are obtained using the Histograms of Oriented Gradients (HOG) descriptor. These features are acquired simply based on the locations, angles, positions, and heights of the cameras set on the junction board. The localization of the vehicular image is obtained based on the different times of action of the vehicles under consideration. A Support Vector Machines (SVM) classifier, as well as Particle Swarm Optimization (PSO), is also proposed in this work. Different experimental analyses are performed to calculate the efficiency of the optimization methods in the proposed system. Experimental outcomes reveal the effectiveness of the approach in terms of classification precision, recall, and F-measure.

    SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning

    In fisheye images, rich distinct distortion patterns are regularly distributed in the image plane. These distortion patterns are independent of the visual content and provide informative cues for rectification. To make the best of such rectification cues, we introduce SimFIR, a simple framework for fisheye image rectification based on self-supervised representation learning. Technically, we first split a fisheye image into multiple patches and extract their representations with a Vision Transformer (ViT). To learn fine-grained distortion representations, we then associate different image patches with their specific distortion patterns based on the fisheye model, and further design a unified distortion-aware pretext task for their learning. The transfer performance on the downstream rectification task is remarkably boosted, which verifies the effectiveness of the learned representations. Extensive experiments are conducted, and the quantitative and qualitative results demonstrate the superiority of our method over state-of-the-art algorithms as well as its strong generalization ability on real-world fisheye images. Comment: Accepted to ICCV 202

    Motion detection and tracking algorithms in video streams

    Moving object detection and tracking in video streams are fundamental and critical tasks in many computer vision applications. This paper presents improvements in the effectiveness of algorithms for moving object detection and tracking. For this, we use an additive minimax similarity function. A background reconstruction algorithm is developed. The moving object detection and tracking algorithms are modified on the basis of the additive minimax similarity function. Experimental results are reported in terms of the time cost of moving object detection and tracking.

    A semantic autonomous video surveillance system for dense camera networks in smart cities

    This paper presents a proposal for an intelligent video surveillance system able to detect and identify abnormal and alarming situations by analyzing object movement. The system is designed to minimize video processing and transmission, thus allowing a large number of cameras to be deployed on the system and making it suitable for use as an integrated safety and security solution in Smart Cities. Alarm detection is based on parameters of the moving objects and their trajectories, and is performed using semantic reasoning and ontologies. This means that the system employs a high-level conceptual language that is easy for human operators to understand, and that it is capable of raising enriched alarms with descriptions of what is happening in the image and of automating reactions to them, such as alerting the appropriate emergency services using the Smart City safety network.

    Boosting video tracking performance by means of Tabu Search in Intelligent Visual Surveillance Systems

    In this paper, we present a fast and efficient technique for the data association problem applied to visual tracking systems. The visual tracking process is formulated as a combinatorial hypotheses search with a heuristic evaluation function that takes into account structural and specific information such as distance, shape, color, etc. We introduce a Tabu Search algorithm which performs a search on an indirect space. A novel problem formulation allows us to transform any solution into the real search space, which is needed for fitness calculation, in linear time. This new formulation and the use of auxiliary structures yield a fast transformation from a blob-to-track assignment space to the real shape and position of tracks space (while calculating fitness in an incremental fashion), which is key to producing efficient and fast results. Other previous approaches are based on statistical techniques or on evolutionary algorithms. These techniques are quite efficient and robust, although they cannot converge as fast as our approach. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

    Visual surveillance by dynamic visual attention method

    This paper describes a method for visual surveillance based on biologically motivated dynamic visual attention in video image sequences. Our system is based on the extraction and integration of local (pixels and spots) as well as global (objects) features. Our approach defines a method for generating an active attention focus on a dynamic scene for surveillance purposes. The system segments the scene in accordance with a set of predefined features, including gray level, motion and shape features, giving rise to two classes of objects: vehicle and pedestrian. The proposed solution to the selective visual attention problem consists of decomposing the input images of an indefinite image sequence into their moving objects, defining which of these elements are of interest to the user at a given moment, and keeping attention on those elements through time. Feature extraction and integration are solved by incorporating mechanisms of charge and discharge (based on the permanency effect), as well as mechanisms of lateral interaction. All these mechanisms have proved to be good enough to segment the scene into moving objects and background.