3,315 research outputs found
LCrowdV: Generating Labeled Videos for Simulation-based Crowd Behavior Learning
We present a novel procedural framework to generate an arbitrary number of
labeled crowd videos (LCrowdV). The resulting crowd video datasets are used to
design accurate algorithms or training models for crowded scene understanding.
Our overall approach is composed of two components: a procedural simulation
framework for generating crowd movements and behaviors, and a procedural
rendering framework to generate different videos or images. Each video or image
is automatically labeled based on the environment, number of pedestrians,
density, behavior, flow, lighting conditions, viewpoint, noise, etc.
Furthermore, we can increase the realism by combining synthetically-generated
behaviors with real-world background videos. We demonstrate the benefits of
LCrowdV over prior labeled crowd datasets by improving the accuracy of
pedestrian detection and crowd behavior classification algorithms. LCrowdV
will be released on the WWW.
Recognition of human interactions with vehicles using 3-D models and dynamic context
This dissertation describes two distinct methods for human-vehicle interaction recognition: one for ground-level videos and the other for aerial videos. For ground-level videos, this dissertation presents a novel methodology that estimates the detailed status of a scene involving multiple humans and vehicles. The system tracks their configuration even when they perform complex interactions with severe occlusion, such as when four persons exit a car together. The motivation is to identify the 3-D states of vehicles (e.g. status of doors) and their relations with persons, which is necessary to analyze complex human-vehicle interactions (e.g. breaking into or stealing a vehicle), and the motion of humans and car doors, which is needed to detect atomic human-vehicle interactions. A probabilistic algorithm has been designed to track humans and analyze their dynamic relationships with vehicles using a dynamic context. We have focused on two ideas. One is that many simple events can be detected based on a low-level analysis, and these detected events must be contextually consistent with the human/vehicle status tracking results. The other is that the motion cue interferes with states in the current and future frames, and analyzing the motion is critical to detecting such simple events. Our approach updates the probability of a person (or a vehicle) having a particular state based on these basic observed events. The probabilistic inference is made for the tracking process to match event-based evidence and motion-based evidence. For aerial videos, the object resolution is low, the visual cues are vague, and the detection and tracking of objects is less reliable as a consequence. Any method that requires accurate tracking of objects or exact matching of event definitions is better avoided. To address these issues, we present a temporal-logic-based approach that does not require training from event examples.
At the low level, we employ dynamic programming to perform fast model fitting between the tracked vehicle and the rendered 3-D vehicle models. At the semantic level, given the localized event region of interest (ROI), we verify the time series of human-vehicle relationships against the pre-specified event definitions in a piecewise fashion. With special interest in recognizing a person getting into and out of a vehicle, we have tested our method on a subset of the VIRAT Aerial Video dataset and achieved superior results.
Electrical and Computer Engineerin
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
In recent years, numerous effective multi-object tracking (MOT) methods have been
developed because of the wide range of applications. Existing performance
evaluations of MOT methods usually separate the object tracking step from the
object detection step by using the same fixed object detection results for
comparisons. In this work, we perform a comprehensive quantitative study on the
effects of object detection accuracy on the overall MOT performance, using the
new large-scale University at Albany DETection and tRACking (UA-DETRAC)
benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging
video sequences captured from real-world traffic scenes (over 140,000 frames
with rich annotations, including occlusion, weather, vehicle category,
truncation, and vehicle bounding boxes) for object detection, object tracking
and MOT systems. We evaluate complete MOT systems constructed from combinations
of state-of-the-art object detection and object tracking methods. Our analysis
shows the complex effects of object detection accuracy on MOT system
performance. Based on these observations, we propose new evaluation tools and
metrics for MOT systems that consider both object detection and object tracking
for comprehensive analysis.
Comment: 18 pages, 11 figures, accepted by CVI
Crowd detection and counting using a static and dynamic platform: state of the art
Automated object detection and crowd density estimation are popular and important areas of visual surveillance research. The last decades have witnessed much significant research in this field; however, it remains a challenging problem for automatic visual surveillance. The ever-increasing research in crowd dynamics and crowd motion necessitates a detailed and updated survey of the different techniques and trends in this field. This paper presents a survey of crowd detection and crowd density estimation from moving platforms and reviews the different methods employed for this purpose. The review categorizes and delineates several detection and counting estimation methods that have been applied to the examination of scenes from static and moving platforms.
Design, implementation and evaluation of automated surveillance systems
Pattern recognition has reached a level of sophistication that lets us recognize different
types of events, even dangers, and act accordingly to minimize the impact of a difficult
situation and handle it in the best possible way. However, we believe that more efficient
applications can still be achieved with more accurate algorithms. Our application sets out
to incorporate the new programming paradigm, neural networks. Our initial idea was to
explore the alternative offered by the new convolutional neural networks, whose example
videos showed the high detection and identification rate that, for example, YOLOv2 could
achieve. After comparing their characteristics, we found that YOLOv3 offered a good balance
between accuracy and speed, as we will discuss later. Because of the low detection rate, we
will use Kalman filters to help with the re-identification of people and objects.
In this project we will also study the video surveillance alternatives available from
companies in the sector and see what kinds of products they offer and, in addition, we will
look at the work of research groups at other universities that most closely resembles our objective. We will therefore use this neural network to detect events such as abandoned backpacks and to show the traffic density at specific locations, and we will use a more traditional methodology, optical flow, to detect abnormal behavior in a crowd.
Automatic surveillance systems are getting more and more sophisticated with the increasing calculation
power that computers are reaching. The aim of this project is to take advantage of these tools and,
with the new classification and detection technology brought by neural networks, develop a surveillance
application that can recognize certain behaviours (namely the detection of lost backpacks and suitcases,
the detection of abnormal crowd activity, and a heatmap of density occupation). To develop this program,
Python was selected as the programming language, with YOLO and OpenCV forming the spine of
this project. After testing the code, it was found that, due to the constraints of the detection for
small objects, the project does not perform as it should for real-world use, but it still shows potential
for the detection of lost backpacks in certain videos from the GBA dataset [1] and PETS2006 dataset [2].
The abnormal activity detection for crowds is made with a simple algorithm that seems to perform well,
detecting the anomalies in all of the testing dataset used, generated by the University of Minnesota [3].
Finally, the heatmap correctly displays the projection of people on the ground for five seconds, just as
intended. The objective of this software is to be part of the core of what could be a future application
with more modules that will be able to perform fully automated surveillance tasks and gather useful
information; these advances and future proposals are explained in this report.
Máster Universitario en Ingeniería Industrial (M141
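The five-second density heatmap mentioned in this abstract can be illustrated with a minimal sketch. It is not the project's code: the class name `SlidingHeatmap`, the grid size, and the input format are hypothetical, and it assumes that person detections have already been projected onto ground-plane grid cells upstream and that frames arrive with non-decreasing timestamps.

```python
from collections import deque


class SlidingHeatmap:
    """Occupancy grid that remembers ground positions for a fixed time window.

    Hypothetical illustration: detections are assumed to be already mapped
    to (row, col) ground-plane cells, and timestamps must not decrease.
    """

    def __init__(self, rows, cols, window_s=5.0):
        self.rows = rows
        self.cols = cols
        self.window_s = window_s
        self.events = deque()  # (timestamp, row, col), oldest first
        self.grid = [[0] * cols for _ in range(rows)]

    def _expire(self, now):
        # Drop positions older than the window and undo their contribution.
        while self.events and now - self.events[0][0] > self.window_s:
            _, r, c = self.events.popleft()
            self.grid[r][c] -= 1

    def add_frame(self, timestamp, cells):
        """Record the grid cells occupied by detected people at `timestamp`."""
        self._expire(timestamp)
        for r, c in cells:
            if 0 <= r < self.rows and 0 <= c < self.cols:
                self.events.append((timestamp, r, c))
                self.grid[r][c] += 1

    def counts(self):
        # Return a copy so callers cannot modify the internal grid.
        return [row[:] for row in self.grid]


heat = SlidingHeatmap(rows=2, cols=3, window_s=5.0)
heat.add_frame(0.0, [(0, 0), (1, 2)])
heat.add_frame(2.0, [(0, 0)])
heat.add_frame(6.0, [(1, 1)])  # the t=0.0 positions are now older than 5 s
print(heat.counts())  # [[1, 0, 0], [0, 1, 0]]
```

Keeping the events in a deque means each position is added once and removed once, so the grid can be rendered every frame without rescanning the last five seconds of detections.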