35,165 research outputs found

    Measuring traffic flow and lane changing from semi-automatic video processing

    Get PDF
    Comprehensive databases are needed in order to extend our knowledge on the behavior of vehicular traffic. Nevertheless data coming from common traffic detectors is incomplete. Detectors only provide vehicle count, detector occupancy and speed at discrete locations. To enrich these databases additional measurements from other data sources, like video recordings, are used. Extracting data from videos by actually watching the entire length of the recordings and manually counting is extremely time-consuming. The alternative is to set up an automatic video detection system. This is also costly in terms of money and time, and generally does not pay off for sporadic usage on a pilot test. An adaptation of the semi-automatic video processing methodology proposed by Patire (2010) is presented here. It makes possible to count flow and lane changes 90% faster than actually counting them by looking at the video. The method consists in selecting some specific lined pixels in the video, and converting them into a set of space – time images. The manual time is only spent in counting from these images. The method is adaptive, in the sense that the counting is always done at the maximum speed, not constrained by the video playback speed. This allows going faster when there are a few counts and slower when a lot of counts happen. This methodology has been used for measuring off-ramp flows and lane changing at several locations in the B-23 freeway (Soriguera & Sala, 2014). Results show that, as long as the video recordings fulfill some minimum requirements in framing and quality, the method is easy to use, fast and reliable. This method is intended for research purposes, when some hours of video recording have to be analyzed, not for long term use in a Traffic Management Center.Postprint (published version

    FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

    Full text link
    In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short term memory networks (LSTM) in a residual learning fashion. Such design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of densities in each frame, which significantly accelerates the training of networks. To preserve feature map resolution, we propose a Hyper-Atrous combination to integrate atrous convolution in FCN and combine feature maps of different convolution layers. FCN-rLSTM enables refined feature representation and a novel end-to-end trainable mapping from pixels to vehicle count. We extensively evaluated the proposed method on different counting tasks with three datasets, with experimental results demonstrating their effectiveness and robustness. In particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21 on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201

    People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting

    Get PDF
    In this paper we propose a technique to adapt a convolutional neural network (CNN) based object counter to additional visual domains and object types while still preserving the original counting function. Domain-specific normalisation and scaling operators are trained to allow the model to adjust to the statistical distributions of the various visual domains. The developed adaptation technique is used to produce a singular patch-based counting regressor capable of counting various object types including people, vehicles, cell nuclei and wildlife. As part of this study a challenging new cell counting dataset in the context of tissue culture and patient diagnosis is constructed. This new collection, referred to as the Dublin Cell Counting (DCC) dataset, is the first of its kind to be made available to the wider computer vision community. State-of-the-art object counting performance is achieved in both the Shanghaitech (parts A and B) and Penguins datasets while competitive performance is observed on the TRANCOS and Modified Bone Marrow (MBM) datasets, all using a shared counting model.Comment: 10 page

    Counting with Focus for Free

    Get PDF
    This paper aims to count arbitrary objects in images. The leading counting approaches start from point annotations per object from which they construct density maps. Then, their training objective transforms input images to density maps through deep convolutional networks. We posit that the point annotations serve more supervision purposes than just constructing density maps. We introduce ways to repurpose the points for free. First, we propose supervised focus from segmentation, where points are converted into binary maps. The binary maps are combined with a network branch and accompanying loss function to focus on areas of interest. Second, we propose supervised focus from global density, where the ratio of point annotations to image pixels is used in another branch to regularize the overall density estimation. To assist both the density estimation and the focus from segmentation, we also introduce an improved kernel size estimator for the point annotations. Experiments on six datasets show that all our contributions reduce the counting error, regardless of the base network, resulting in state-of-the-art accuracy using only a single network. Finally, we are the first to count on WIDER FACE, allowing us to show the benefits of our approach in handling varying object scales and crowding levels. Code is available at https://github.com/shizenglin/Counting-with-Focus-for-FreeComment: ICCV, 201
    corecore