46,882 research outputs found

    Understanding Traffic Density from Large-Scale Web Camera Data

    Full text link
    Understanding traffic density from large-scale web camera (webcam) videos is a challenging problem because such videos have low spatial and temporal resolution, high occlusion and large perspective. To deeply understand traffic density, we explore both deep learning based and optimization based methods. To avoid individual vehicle detection and tracking, both methods map the image into vehicle density map, one based on rank constrained regression and the other one based on fully convolution networks (FCN). The regression based method learns different weights for different blocks in the image to increase freedom degrees of weights and embed perspective information. The FCN based method jointly estimates vehicle density map and vehicle count with a residual learning framework to perform end-to-end dense prediction, allowing arbitrary image resolution, and adapting to different vehicle scales and perspectives. We analyze and compare both methods, and get insights from optimization based method to improve deep model. Since existing datasets do not cover all the challenges in our work, we collected and labelled a large-scale traffic video dataset, containing 60 million frames from 212 webcams. Both methods are extensively evaluated and compared on different counting tasks and datasets. FCN based method significantly reduces the mean absolute error from 10.99 to 5.31 on the public dataset TRANCOS compared with the state-of-the-art baseline.Comment: Accepted by CVPR 2017. Preprint version was uploaded on http://welcome.isr.tecnico.ulisboa.pt/publications/understanding-traffic-density-from-large-scale-web-camera-data

    FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

    Full text link
    In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short term memory networks (LSTM) in a residual learning fashion. Such design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of densities in each frame, which significantly accelerates the training of networks. To preserve feature map resolution, we propose a Hyper-Atrous combination to integrate atrous convolution in FCN and combine feature maps of different convolution layers. FCN-rLSTM enables refined feature representation and a novel end-to-end trainable mapping from pixels to vehicle count. We extensively evaluated the proposed method on different counting tasks with three datasets, with experimental results demonstrating their effectiveness and robustness. In particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21 on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201

    Measuring traffic flow and lane changing from semi-automatic video processing

    Get PDF
    Comprehensive databases are needed in order to extend our knowledge on the behavior of vehicular traffic. Nevertheless data coming from common traffic detectors is incomplete. Detectors only provide vehicle count, detector occupancy and speed at discrete locations. To enrich these databases additional measurements from other data sources, like video recordings, are used. Extracting data from videos by actually watching the entire length of the recordings and manually counting is extremely time-consuming. The alternative is to set up an automatic video detection system. This is also costly in terms of money and time, and generally does not pay off for sporadic usage on a pilot test. An adaptation of the semi-automatic video processing methodology proposed by Patire (2010) is presented here. It makes possible to count flow and lane changes 90% faster than actually counting them by looking at the video. The method consists in selecting some specific lined pixels in the video, and converting them into a set of space – time images. The manual time is only spent in counting from these images. The method is adaptive, in the sense that the counting is always done at the maximum speed, not constrained by the video playback speed. This allows going faster when there are a few counts and slower when a lot of counts happen. This methodology has been used for measuring off-ramp flows and lane changing at several locations in the B-23 freeway (Soriguera & Sala, 2014). Results show that, as long as the video recordings fulfill some minimum requirements in framing and quality, the method is easy to use, fast and reliable. This method is intended for research purposes, when some hours of video recording have to be analyzed, not for long term use in a Traffic Management Center.Postprint (published version

    Available seat counting in public rail transport

    Get PDF
    Surveillance cameras are found almost everywhere today, including vehicles for public transport. A lot of research has already been done on video analysis in open spaces. However, the conditions in a vehicle for public transport differ from these in open spaces, as described in detail in this paper. A use case described in this paper is on counting the available seats in a vehicle using surveillance cameras. We propose an algorithm based on Laplace edge detection, combined with background subtraction

    Learned Scalable Video Coding For Humans and Machines

    Full text link
    Video coding has traditionally been developed to support services such as video streaming, videoconferencing, digital TV, and so on. The main intent was to enable human viewing of the encoded content. However, with the advances in deep neural networks (DNNs), encoded video is increasingly being used for automatic video analytics performed by machines. In applications such as automatic traffic monitoring, analytics such as vehicle detection, tracking and counting, would run continuously, while human viewing could be required occasionally to review potential incidents. To support such applications, a new paradigm for video coding is needed that will facilitate efficient representation and compression of video for both machine and human use in a scalable manner. In this manuscript, we introduce the first end-to-end learnable video codec that supports a machine vision task in its base layer, while its enhancement layer supports input reconstruction for human viewing. The proposed system is constructed based on the concept of conditional coding to achieve better compression gains. Comprehensive experimental evaluations conducted on four standard video datasets demonstrate that our framework outperforms both state-of-the-art learned and conventional video codecs in its base layer, while maintaining comparable performance on the human vision task in its enhancement layer. We will provide the implementation of the proposed system at www.github.com upon completion of the review process.Comment: 14 pages, 16 figure

    An Intelligent Monitoring System of Vehicles on Highway Traffic

    Full text link
    Vehicle speed monitoring and management of highways is the critical problem of the road in this modern age of growing technology and population. A poor management results in frequent traffic jam, traffic rules violation and fatal road accidents. Using traditional techniques of RADAR, LIDAR and LASAR to address this problem is time-consuming, expensive and tedious. This paper presents an efficient framework to produce a simple, cost efficient and intelligent system for vehicle speed monitoring. The proposed method uses an HD (High Definition) camera mounted on the road side either on a pole or on a traffic signal for recording video frames. On the basis of these frames, a vehicle can be tracked by using radius growing method, and its speed can be calculated by calculating vehicle mask and its displacement in consecutive frames. The method uses pattern recognition, digital image processing and mathematical techniques for vehicle detection, tracking and speed calculation. The validity of the proposed model is proved by testing it on different highways.Comment: 5 page
    • …
    corecore