Understanding Traffic Density from Large-Scale Web Camera Data
Understanding traffic density from large-scale web camera (webcam) videos is
a challenging problem because such videos have low spatial and temporal
resolution, high occlusion, and large perspective variation. To deeply understand traffic
density, we explore both deep-learning-based and optimization-based methods. To
avoid individual vehicle detection and tracking, both methods map the image
into a vehicle density map, one based on rank-constrained regression and the
other based on fully convolutional networks (FCN). The regression-based
method learns different weights for different blocks of the image to increase
the degrees of freedom of the weights and to embed perspective information. The FCN-based
method jointly estimates vehicle density map and vehicle count with a residual
learning framework to perform end-to-end dense prediction, allowing arbitrary
image resolution, and adapting to different vehicle scales and perspectives. We
analyze and compare both methods, and draw insights from the optimization-based
method to improve the deep model. Since existing datasets do not cover all the
challenges in our work, we collected and labelled a large-scale traffic video
dataset, containing 60 million frames from 212 webcams. Both methods are
extensively evaluated and compared on different counting tasks and datasets.
The FCN-based method significantly reduces the mean absolute error from 10.99 to
5.31 on the public dataset TRANCOS compared with the state-of-the-art baseline.
Comment: Accepted by CVPR 2017. A preprint version was uploaded at
http://welcome.isr.tecnico.ulisboa.pt/publications/understanding-traffic-density-from-large-scale-web-camera-data
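The core idea of the density-map formulation above, common to both methods, is that the vehicle count is the integral of the density map, so counting needs no per-vehicle detection. A minimal numpy sketch of that counting step (the toy density values are hypothetical, not the paper's FCN output):

```python
import numpy as np

def count_from_density(density_map):
    """Vehicle count is the integral (sum) of the predicted density map.

    This mirrors the density-map formulation in spirit; the paper's actual
    FCN and regression models are not reproduced here.
    """
    return float(np.sum(density_map))

# Toy example: two Gaussian-like blobs, each integrating to 1.0, represent
# two vehicles regardless of their apparent scale in the image.
blob = np.array([[0.05, 0.1, 0.05],
                 [0.1,  0.4, 0.1],
                 [0.05, 0.1, 0.05]])  # sums to 1.0
density = np.zeros((8, 8))
density[0:3, 0:3] += blob
density[4:7, 4:7] += blob
print(count_from_density(density))  # → 2.0
```

Because summation is resolution-agnostic, this is also what lets the FCN accept arbitrary image sizes.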
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays a key role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we show some applications that take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
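The survey's framing of PAR as multi-label learning can be illustrated concretely: one sigmoid output per attribute, so a pedestrian may carry several attributes at once. The attribute names and weights below are a hypothetical sketch, not a model from the survey:

```python
import numpy as np

def multi_label_predict(features, w, threshold=0.5):
    """Multi-label prediction: independent sigmoid per attribute.

    Unlike multi-class softmax, any subset of attributes can be active
    simultaneously, which is the relation to PAR the survey analyses.
    """
    logits = features @ w
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs > threshold

attributes = ["male", "backpack", "long-hair"]  # hypothetical attribute set
w = np.array([[2.0, -1.0, 0.5],                 # hypothetical learned weights
              [0.0,  3.0, -2.0]])
feats = np.array([1.0, 1.0])
print(dict(zip(attributes, multi_label_predict(feats, w))))
```

Training such a head with a per-attribute binary cross-entropy loss is the standard multi-label counterpart to the multi-task view discussed in the survey.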
Spatial Mixture-of-Experts
Many data have an underlying dependence on spatial location; it may be
weather on the Earth, a simulation on a mesh, or a registered image. Yet this
structure is rarely taken advantage of, and it violates common assumptions made by
many neural network layers, such as translation equivariance. Further, many
works that do incorporate locality fail to capture fine-grained structure. To
address this, we introduce the Spatial Mixture-of-Experts (SMoE) layer, a
sparsely-gated layer that learns spatial structure in the input domain and
routes experts at a fine-grained level to utilize it. We also develop new
techniques to train SMoEs, including a self-supervised routing loss and damping
expert errors. Finally, we show strong results for SMoEs on numerous tasks, and
set new state-of-the-art results for medium-range weather prediction and
post-processing ensemble weather forecasts.
Comment: 20 pages, 3 figures; NeurIPS 202
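The fine-grained routing described above can be sketched with a per-location top-1 gate: every spatial position is scored against each expert and handled by the single top-scoring one. This is a minimal numpy sketch with linear experts and a linear gate; the SMoE paper's actual gating, routing loss, and error damping are not reproduced:

```python
import numpy as np

def smoe_route(x, gate_w, expert_ws):
    """Per-location sparse routing (hypothetical linear experts).

    x:         (H, W, C) input features
    gate_w:    (C, E) gating weights -> one score per expert per location
    expert_ws: list of E (C, D) expert weight matrices
    Each spatial position is routed to its top-scoring expert, so experts
    can specialize on fine-grained spatial structure.
    """
    scores = x @ gate_w                      # (H, W, E)
    choice = scores.argmax(axis=-1)          # (H, W) expert index per position
    out = np.zeros(x.shape[:2] + (expert_ws[0].shape[1],))
    for e, w in enumerate(expert_ws):
        mask = choice == e
        out[mask] = x[mask] @ w              # only routed positions use expert e
    return out, choice

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 3))
gate_w = rng.normal(size=(3, 2))
experts = [rng.normal(size=(3, 5)) for _ in range(2)]
y, choice = smoe_route(x, gate_w, experts)
print(y.shape)  # (4, 4, 5)
```

Because the routing map `choice` is tied to spatial position rather than shared across translations, the layer can learn location-specific behavior that translation-equivariant layers cannot.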
Recurrent Graph Convolutional Networks for Spatiotemporal Prediction of Snow Accumulation Using Airborne Radar
The accurate prediction and estimation of annual snow accumulation has grown
in importance as we deal with the effects of climate change and the increase of
global atmospheric temperatures. Airborne radar sensors, such as the Snow
Radar, are able to measure accumulation rate patterns at a large scale and
monitor the effects of ongoing climate change on Greenland's precipitation and
run-off. The Snow Radar's use of an ultra-wide bandwidth enables a fine
vertical resolution that helps in capturing internal ice layers. Given the
amount of snow accumulation in previous years using the radar data, in this
paper, we propose a machine learning model based on recurrent graph
convolutional networks to predict the snow accumulation in recent consecutive
years at a certain location. We found that the model performs better and
more consistently than equivalent non-geometric and non-temporal models.
Comment: Accepted to IEEE Radar Conference 2023. 6 pages, 4 figures, 2 tables
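The building block of such a model is a graph convolution applied at each timestep, with the output carried forward as a recurrent state. A minimal numpy sketch of that propagation step, using the standard symmetric GCN normalization (the paper's exact recurrent architecture and radar-derived graph are assumptions not reproduced here):

```python
import numpy as np

def graph_conv(adj, x, w):
    """One graph-convolution step: propagate node features over the graph.

    Uses symmetric normalization D^{-1/2}(A + I)D^{-1/2} as in standard
    GCNs. adj: (N, N) adjacency, x: (N, F) features, w: (F, G) weights.
    """
    a = adj + np.eye(adj.shape[0])            # add self-loops
    d = 1.0 / np.sqrt(a.sum(axis=1))
    a_norm = a * d[:, None] * d[None, :]      # symmetric normalization
    return a_norm @ x @ w

# Toy recurrence over two locations: reuse the output as the next step's
# hidden state, mimicking how a recurrent GCN carries information across years.
adj = np.array([[0., 1.], [1., 0.]])
x = np.array([[1., 0.], [0., 1.]])
w = np.eye(2)
h = x
for _ in range(2):
    h = graph_conv(adj, h, w)
print(h.shape)  # (2, 2)
```

In the full model, the graph edges would encode spatial proximity between radar measurement locations and a gated recurrent cell would replace the plain reuse of `h`.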
SageFormer: Series-Aware Graph-Enhanced Transformers for Multivariate Time Series Forecasting
Multivariate time series forecasting plays a critical role in diverse
domains. While recent advancements in deep learning methods, especially
Transformers, have shown promise, there remains a gap in addressing the
significance of inter-series dependencies. This paper introduces SageFormer, a
Series-aware Graph-enhanced Transformer model designed to effectively capture
and model dependencies between series using graph structures. SageFormer
tackles two key challenges: effectively representing diverse temporal patterns
across series and mitigating redundant information among series. Importantly,
the proposed series-aware framework seamlessly integrates with existing
Transformer-based models, augmenting their ability to model inter-series
dependencies. Through extensive experiments on real-world and synthetic
datasets, we showcase the superior performance of SageFormer compared to
previous state-of-the-art approaches.
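The inter-series dependency modeling described above can be sketched as graph message passing across the series dimension before the (not shown) Transformer encoder: each series is blended with its graph neighbors. The fixed adjacency below is a hypothetical stand-in for SageFormer's learned series graph:

```python
import numpy as np

def mix_series(x, adj):
    """Blend each series with its graph neighbors (inter-series mixing).

    x:   (S, T) multivariate series, one row per series
    adj: (S, S) series-dependency graph (hypothetical fixed weights here;
         SageFormer learns this structure)
    """
    a = adj / adj.sum(axis=1, keepdims=True)   # row-normalize neighbor weights
    return a @ x                               # neighbor-weighted series

x = np.array([[1., 2., 3.],
              [3., 2., 1.]])
adj = np.array([[1., 1.],     # series 0 depends on both series
                [0., 1.]])    # series 1 depends only on itself
print(mix_series(x, adj))
```

Feeding the mixed representation into any Transformer backbone is what makes the series-aware framework a drop-in augmentation for existing models.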
SpatialRank: Urban Event Ranking with NDCG Optimization on Spatiotemporal Data
The problem of urban event ranking aims at predicting the top-k most risky
locations of future events such as traffic accidents and crimes. This problem
is of fundamental importance to public safety and urban administration
especially when limited resources are available. The problem is, however,
challenging due to complex and dynamic spatio-temporal correlations between
locations, uneven distribution of urban events in space, and the difficulty of
correctly ranking nearby locations with similar features. Prior works on event
forecasting mostly aim at accurately predicting the actual risk score or counts
of events for all the locations. Rankings obtained as such usually have low
quality due to prediction errors. Learning-to-rank methods directly optimize
measures such as Normalized Discounted Cumulative Gain (NDCG), but cannot
handle the spatiotemporal autocorrelation existing among locations. In this
paper, we bridge the gap by proposing a novel spatial event ranking approach
named SpatialRank. SpatialRank features adaptive graph convolution layers that
dynamically learn the spatiotemporal dependencies across locations from data.
In addition, the model optimizes through surrogates a hybrid NDCG loss with a
spatial component to better rank neighboring spatial locations. We design an
importance-sampling scheme with a spatial filtering algorithm to effectively evaluate
the loss during training. Comprehensive experiments on three real-world
datasets demonstrate that SpatialRank can effectively identify the top riskiest
locations of crimes and traffic accidents and outperform state-of-the-art methods
in terms of NDCG by up to 12.7%.
Comment: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
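The NDCG measure that SpatialRank optimizes (through smooth surrogates, since the metric itself is non-differentiable) can be computed directly for a given ranking. A minimal numpy sketch with hypothetical per-location risk labels:

```python
import numpy as np

def ndcg_at_k(relevance, scores, k):
    """NDCG@k: discounted gain of the score-induced ranking, normalized by
    the ideal (relevance-sorted) ranking.

    relevance, scores: 1-D arrays over locations (hypothetical example).
    """
    order = np.argsort(-scores)[:k]                    # top-k by predicted score
    discounts = np.log2(np.arange(2, k + 2))           # positions 1..k
    gains = (2.0 ** relevance[order] - 1) / discounts
    ideal = np.sort(relevance)[::-1][:k]
    ideal_gains = (2.0 ** ideal - 1) / discounts
    return gains.sum() / ideal_gains.sum()

rel = np.array([3., 0., 1., 2.])        # hypothetical true risk per location
perfect = ndcg_at_k(rel, rel, k=3)      # ranking by true risk is ideal
print(round(perfect, 3))  # → 1.0
```

The spatial component added by SpatialRank penalizes mis-ordered *neighboring* locations on top of this global measure, which plain NDCG ignores.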