44,426 research outputs found

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Author: Costeira João P.
Moura José M. F.
Wu Guanhang
Zhang Shanghang
Publication date: 31/07/2017

In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low-quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion, and large perspective, which makes most existing methods ineffective. To overcome the limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network that jointly estimates vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short-term memory networks (LSTM) in a residual learning fashion. This design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of densities in each frame, which significantly accelerates the training of the networks. To preserve feature map resolution, we propose a Hyper-Atrous combination that integrates atrous convolution in the FCN and combines feature maps of different convolution layers. FCN-rLSTM enables refined feature representation and a novel end-to-end trainable mapping from pixels to vehicle count. We extensively evaluated the proposed method on different counting tasks with three datasets, and the experimental results demonstrate its effectiveness and robustness. In particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21 on TRANCOS and from 2.74 to 1.53 on WebCamT. The training process is accelerated by a factor of 5 on average.
Comment: Accepted by International Conference on Computer Vision (ICCV), 201
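
The residual counting idea in the FCN-rLSTM abstract can be illustrated with a small sketch. It assumes the final count is the sum of the per-pixel densities plus a correction term, and it also shows how a mean absolute error such as the one reported would be computed. The density maps, the residual values (stand-ins for what the LSTM branch would output), and all function names are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: all names and numbers are illustrative only.

def density_sum(density_map):
    """Base count estimate: sum of all pixel densities in one frame."""
    return sum(sum(row) for row in density_map)

def residual_count(density_map, residual):
    """Final count = sum of densities + residual correction."""
    return density_sum(density_map) + residual

def mean_absolute_error(predicted, actual):
    """MAE between predicted and ground-truth per-frame counts."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Two toy 2x2 density maps (one per frame).
frames = [
    [[0.0, 0.5], [0.5, 1.0]],    # densities sum to 2.0
    [[0.25, 0.25], [0.5, 2.0]],  # densities sum to 3.0
]
residuals = [0.5, -0.5]  # assumed stand-ins for the learned residuals
truth = [3, 2]           # assumed ground-truth vehicle counts

preds = [residual_count(f, r) for f, r in zip(frames, residuals)]
mae = mean_absolute_error(preds, truth)
print(preds, mae)
```

Because the density sum already gives a rough count, the residual term only has to model a small correction, which is one plausible reading of why the abstract reports faster training.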

arXiv.org e-Print Archive

Crossref

Understanding Traffic Density from Large-Scale Web Camera Data

Author: Costeira João P.
Moura José M. F.
Wu Guanhang
Zhang Shanghang
Publication date: 30/06/2017

Understanding traffic density from large-scale web camera (webcam) videos is a challenging problem because such videos have low spatial and temporal resolution, high occlusion, and large perspective. To deeply understand traffic density, we explore both deep learning based and optimization based methods. To avoid individual vehicle detection and tracking, both methods map the image into a vehicle density map, one based on rank-constrained regression and the other based on fully convolutional networks (FCN). The regression based method learns different weights for different blocks in the image to increase the degrees of freedom of the weights and embed perspective information. The FCN based method jointly estimates the vehicle density map and the vehicle count with a residual learning framework to perform end-to-end dense prediction, allowing arbitrary image resolution and adapting to different vehicle scales and perspectives. We analyze and compare both methods, and use insights from the optimization based method to improve the deep model. Since existing datasets do not cover all the challenges in our work, we collected and labelled a large-scale traffic video dataset containing 60 million frames from 212 webcams. Both methods are extensively evaluated and compared on different counting tasks and datasets. The FCN based method significantly reduces the mean absolute error from 10.99 to 5.31 on the public dataset TRANCOS compared with the state-of-the-art baseline.
Comment: Accepted by CVPR 2017. Preprint version was uploaded on http://welcome.isr.tecnico.ulisboa.pt/publications/understanding-traffic-density-from-large-scale-web-camera-data

arXiv.org e-Print Archive

Crossref

Scheduling for Multi-Camera Surveillance in LTE Networks

Author: Chen Wen-Tsuen
Wang Chih-Hang
Yang De-Nian
Publication date: 28/03/2015

Wireless surveillance in cellular networks has become increasingly important, and commercial LTE surveillance cameras are now available. Nevertheless, most scheduling algorithms in the literature are throughput-, fairness-, or profit-based approaches, which are not suitable for wireless surveillance. In this paper, therefore, we explore the resource allocation problem for a multi-camera surveillance system in 3GPP Long Term Evolution (LTE) uplink (UL) networks. We minimize the number of allocated resource blocks (RBs) while guaranteeing the coverage requirement for surveillance systems in LTE UL networks. Specifically, we formulate the Camera Set Resource Allocation Problem (CSRAP) and prove that the problem is NP-hard. We then propose an Integer Linear Programming formulation for general cases to find the optimal solution. Moreover, we present a baseline algorithm and devise an approximation algorithm to solve the problem. Simulation results based on a real surveillance map and synthetic datasets show that the number of allocated RBs can be effectively reduced compared to the existing approach for LTE networks.
Comment: 9 pages, 10 figure
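
To make the trade-off in the LTE scheduling abstract concrete, the sketch below selects cameras so that every target is covered while keeping the total number of resource blocks low. It uses a generic greedy weighted set-cover heuristic, which is a swapped-in technique chosen for illustration and not the approximation algorithm or ILP from the paper. The camera names, RB costs, and target sets are invented.

```python
# Hypothetical sketch: a generic greedy weighted set-cover heuristic.
# This is NOT the paper's algorithm; all data below is made up.

def greedy_camera_selection(cameras, targets):
    """cameras maps a name to (rb_cost, set_of_covered_targets).

    Repeatedly picks the camera with the lowest RB cost per newly
    covered target until every target is covered.
    """
    uncovered = set(targets)
    chosen = []
    total_rbs = 0
    while uncovered:
        best, best_ratio = None, None
        # Sorted iteration keeps tie-breaking deterministic.
        for name, (cost, covers) in sorted(cameras.items()):
            if name in chosen:
                continue
            gain = len(covers & uncovered)
            if gain == 0:
                continue
            ratio = cost / gain
            if best_ratio is None or ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("coverage requirement cannot be met")
        cost, covers = cameras[best]
        chosen.append(best)
        total_rbs += cost
        uncovered -= covers
    return chosen, total_rbs

cameras = {
    "cam_a": (4, {"t1", "t2", "t3"}),
    "cam_b": (2, {"t3", "t4"}),
    "cam_c": (3, {"t1", "t2"}),
    "cam_d": (5, {"t1", "t2", "t3", "t4"}),
}
selection, total = greedy_camera_selection(cameras, {"t1", "t2", "t3", "t4"})
print(selection, total)
```

A greedy rule like this gives no optimality guarantee in general, which is consistent with the abstract's point that the exact problem is NP-hard and needs either an ILP solver or a dedicated approximation algorithm.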

arXiv.org e-Print Archive

Crossref

Energy Consumption Of Visual Sensor Networks: Impact Of Spatio-Temporal Coverage

Author: Andreopoulos Yiannis
Buranapanichkit Dujdow
Cesana Matteo
Redondi Alessandro
Tagliasacchi Marco
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Wireless visual sensor networks (VSNs) are expected to play a major role in future IEEE 802.15.4 personal area networks (PAN) under recently established collision-free medium access control (MAC) protocols, such as the IEEE 802.15.4e-2012 MAC. In such environments, the VSN energy consumption is affected by the number of camera sensors deployed (spatial coverage), as well as the number of captured video frames out of which each node processes and transmits data (temporal coverage). In this paper, we explore this aspect for uniformly formed VSNs, i.e., networks comprising identical wireless visual sensor nodes connected to a collection node via a balanced cluster-tree topology, with each node producing independent, identically distributed bitstream sizes after processing the video frames captured within each network activation interval. We derive analytic results for the energy-optimal spatio-temporal coverage parameters of such VSNs under a priori known bounds for the number of frames to process per sensor and the number of nodes to deploy within each tier of the VSN. Our results are parametric to the probability density function characterizing the bitstream size produced by each node and the energy consumption rates of the system of interest. Experimental results reveal that our analytic results are always within 7% of the energy consumption measurements for a wide range of settings. In addition, results obtained via a multimedia subsystem show that the optimal spatio-temporal settings derived by the proposed framework allow for a substantial reduction of energy consumption in comparison to ad hoc settings. As such, our analytic modeling is useful for early-stage studies of possible VSN deployments under collision-free MAC protocols prior to costly and time-consuming experiments in the field.
Comment: to appear in IEEE Transactions on Circuits and Systems for Video Technology, 201

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

Real Time Airborne Monitoring for Disaster and Traffic Applications

Author: Kurz Franz
Leitloff Jens
Meynberg Oliver
Reinartz Peter
Rosenbaum Dominik
Publication date: 01/04/2011

Remote sensing applications such as disaster or mass event monitoring need the acquired data and extracted information within a very short time span. Airborne sensors can acquire the data quickly, and on-board processing combined with data downlink is the fastest way to meet this requirement. For this purpose, a new low-cost airborne frame camera system, named 3K-camera, has been developed at the German Aerospace Center (DLR). The pixel size ranges from 15 cm to 50 cm and the swath width from 2.5 km to 8 km. Within two minutes, an area of approximately 10 km x 8 km can be monitored. Image data are processed onboard on five computers using data from a real-time GPS/IMU system, including direct georeferencing. Thanks to high-frequency image acquisition (3 images/second), moving objects such as vehicles and people can be monitored, allowing detailed wide-area traffic monitoring.
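
The figures quoted in the 3K-camera abstract can be combined into a short back-of-the-envelope calculation. The sketch below derives the area coverage rate and the number of ground pixels across the swath. Pairing the 15 cm pixel size with the 2.5 km swath and the 50 cm pixel size with the 8 km swath is an assumption made here for illustration; the abstract does not state which values belong together.

```python
# Back-of-the-envelope arithmetic from the figures in the abstract.
# The pairing of pixel size with swath width is an assumption.

area_km2 = 10 * 8        # monitored area, about 10 km x 8 km
seconds = 2 * 60         # monitoring time of two minutes
coverage_rate = area_km2 / seconds  # km^2 covered per second

# Ground pixels across the swath at both ends of the quoted ranges.
pixels_narrow = round(2500 / 0.15)  # 2.5 km swath at 15 cm pixels
pixels_wide = round(8000 / 0.50)    # 8 km swath at 50 cm pixels

print(f"{coverage_rate:.3f} km^2/s")
print(pixels_narrow, pixels_wide)
```

Under the assumed pairing, both ends of the range give roughly the same number of pixels across the swath, which is what one would expect from a fixed sensor flown at different altitudes.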

Institute of Transport Research:Publications

Simulation of Mixed Critical In-vehicular Networks

Author: Korf Franz
Meyer Philipp
Schmidt Thomas C.
Steinbach Till
Publication date: 01/08/2018

Future automotive applications, ranging from advanced driver assistance to autonomous driving, will greatly increase demands on in-vehicular networks. Data flows with high bandwidth or low latency requirements, and in particular many additional communication relations, will introduce a new level of complexity to the in-car communication system. It is expected that future communication backbones that interconnect sensors and actuators with ECUs in cars will be built on Ethernet technologies. However, signalling from different application domains demands network services with tailored attributes, including real-time transmission protocols as defined in the TSN Ethernet extensions. These QoS constraints will increase network complexity even further. Event-based simulation is a key technology for mastering the challenges of in-car network design. This chapter introduces the domain-specific aspects and simulation models for in-vehicular networks and presents an overview of the car-centric network design process. Starting from a domain-specific description language, we cover the corresponding simulation models with their workflows and apply our approach to a related case study for an in-car network of a premium car.

arXiv.org e-Print Archive

REPOSIT