3,663 research outputs found
Class-Agnostic Counting
Nearly all existing counting methods are designed for a specific object
class. Our work, however, aims to create a counting model able to count any
class of object. To achieve this goal, we formulate counting as a matching
problem, enabling us to exploit the image self-similarity property that
naturally exists in object counting problems. We make the following three
contributions: first, a Generic Matching Network (GMN) architecture that can
potentially count any object in a class-agnostic manner; second, by
reformulating the counting problem as one of matching objects, we can take
advantage of the abundance of video data labeled for tracking, which contains
natural repetitions suitable for training a counting model. Such data enables
us to train the GMN. Third, to customize the GMN to different user
requirements, an adapter module is used to specialize the model with minimal
effort, i.e. using a few labeled examples, and adapting only a small fraction
of the trained parameters. This is a form of few-shot learning, which is
practical for domains where labels are limited due to requiring expert
knowledge (e.g. microbiology). We demonstrate the flexibility of our method on
a diverse set of existing counting benchmarks: specifically cells, cars, and
human crowds. The model achieves competitive performance on cell and crowd
counting datasets, and surpasses the state-of-the-art on the car dataset using
only three training images. When training on the entire dataset, the proposed
method outperforms all previous methods by a large margin.Comment: Asian Conference on Computer Vision (ACCV), 201
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of
labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to
design accurate algorithms or training models for crowded scene understanding.
Our overall approach is composed of two components: a procedural simulation
framework for generating crowd movements and behaviors, and a procedural
rendering framework to generate different videos or images. Each video or image
is automatically labeled based on the environment, number of pedestrians,
density, behavior, flow, lighting conditions, viewpoint, noise, etc.
Furthermore, we can increase the realism by combining synthetically-generated
behaviors with real-world background videos. We demonstrate the benefits of
LCrowdV over prior lableled crowd datasets by improving the accuracy of
pedestrian detection and crowd behavior classification algorithms. LCrowdV
would be released on the WWW
FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras
In this paper, we develop deep spatio-temporal neural networks to
sequentially count vehicles from low quality videos captured by city cameras
(citycams). Citycam videos have low resolution, low frame rate, high occlusion
and large perspective, making most existing methods lose their efficacy. To
overcome limitations of existing methods and incorporate the temporal
information of traffic video, we design a novel FCN-rLSTM network to jointly
estimate vehicle density and vehicle count by connecting fully convolutional
neural networks (FCN) with long short term memory networks (LSTM) in a residual
learning fashion. Such design leverages the strengths of FCN for pixel-level
prediction and the strengths of LSTM for learning complex temporal dynamics.
The residual learning connection reformulates the vehicle count regression as
learning residual functions with reference to the sum of densities in each
frame, which significantly accelerates the training of networks. To preserve
feature map resolution, we propose a Hyper-Atrous combination to integrate
atrous convolution in FCN and combine feature maps of different convolution
layers. FCN-rLSTM enables refined feature representation and a novel end-to-end
trainable mapping from pixels to vehicle count. We extensively evaluated the
proposed method on different counting tasks with three datasets, with
experimental results demonstrating their effectiveness and robustness. In
particular, FCN-rLSTM reduces the mean absolute error (MAE) from 5.31 to 4.21
on TRANCOS, and reduces the MAE from 2.74 to 1.53 on WebCamT. Training process
is accelerated by 5 times on average.Comment: Accepted by International Conference on Computer Vision (ICCV), 201
PEOPLE COUNTING AND RUNNER IDENTIFICATION IN ATHLETIC RACES
The objective of this project is to create software capable of analyzing a video sequence of running competitions. The analysis consists of detecting the runners, tracking them with the intention of knowing their position when they cross the finish line and counting them. Another functionality of the system will be recognizing the bib numbers, thus making it possible for every runner to get their time. The software was developed studying different techniques of object detection, tracking and character recognition to try to choose the best for this specific application. A set of experiments has been performed to validate the proposed system
Cyclist Detection, Tracking, and Trajectory Analysis in Urban Traffic Video Data
The major objective of this thesis work is examining computer vision and machine learning detection methods, tracking algorithms and trajectory analysis for cyclists in traffic video data and developing an efficient system for cyclist counting. Due to the growing number of cyclist accidents on urban roads, methods for collecting information on cyclists are of significant importance to the Department of Transportation. The collected information provides insights into solving critical problems related to transportation planning, implementing safety countermeasures, and managing traffic flow efficiently. Intelligent Transportation System (ITS) employs automated tools to collect traffic information from traffic video data. In comparison to other road users, such as cars and pedestrians, the automated cyclist data collection is relatively a new research area. In this work, a vision-based method for gathering cyclist count data at intersections and road segments is developed. First, we develop methodology for an efficient detection and tracking of cyclists. The combination of classification features along with motion based properties are evaluated to detect cyclists in the test video data. A Convolutional Neural Network (CNN) based detector called You Only Look Once (YOLO) is implemented to increase the detection accuracy. In the next step, the detection results are fed into a tracker which is implemented based on the Kernelized Correlation Filters (KCF) which in cooperation with the bipartite graph matching algorithm allows to track multiple cyclists, concurrently. Then, a trajectory rebuilding method and a trajectory comparison model are applied to refine the accuracy of tracking and counting. The trajectory comparison is performed based on semantic similarity approach. The proposed counting method is the first cyclist counting method that has the ability to count cyclists under different movement patterns. The trajectory data obtained can be further utilized for cyclist behavioral modeling and safety analysis
- …