2,740 research outputs found
A Scheme for the Detection and Tracking of People Tuned for Aerial Image Sequences
Abstract. This paper addresses the problem of detecting and tracking a large number of individuals in aerial image sequences that have been taken from high altitude. We propose a method which can handle the numerous challenges that are associated with this task and demonstrate its quality on several test sequences. Moreover, this paper contains several contributions that also improve object detection and tracking in other domains. We show how to build an effective object detector in a flexible way which incorporates the shadow of an object and enhanced features for shape and color. Furthermore, the performance of the detector is boosted by an improved way to collect background samples for the classifier training. Finally, we describe a tracking-by-detection method that can handle frequent misses and a very large number of similar objects.
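As a rough illustration of what "tracking-by-detection that can handle frequent misses" can mean in practice, the following Python sketch associates detections to tracks by greedy nearest-neighbour matching and keeps a track alive for a few frames without a match. It is not the method of the paper: the names `Track`, `update_tracks`, `max_dist` and `max_misses`, and the greedy matching rule itself, are assumptions made for this example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Track:
    track_id: int
    position: tuple  # (x, y) in pixels
    misses: int = 0  # consecutive frames without a matching detection


def update_tracks(tracks, detections, next_id=0, max_dist=5.0, max_misses=3):
    """Advance all tracks by one frame (hypothetical greedy association).

    Each track takes the nearest unmatched detection within max_dist.
    A track without a match is kept until it has missed more than
    max_misses consecutive frames, which tolerates detector misses.
    """
    unmatched = list(detections)
    survivors = []
    for track in tracks:
        best = min(
            unmatched,
            key=lambda d: hypot(d[0] - track.position[0], d[1] - track.position[1]),
            default=None,
        )
        if best is not None and hypot(
            best[0] - track.position[0], best[1] - track.position[1]
        ) <= max_dist:
            track.position = best
            track.misses = 0
            unmatched.remove(best)
        else:
            track.misses += 1
        if track.misses <= max_misses:
            survivors.append(track)
    # Every detection left over starts a new track.
    for det in unmatched:
        survivors.append(Track(next_id, det))
        next_id += 1
    return survivors, next_id
```

With many similar-looking objects close together, a greedy rule like this can swap identities; a global assignment (for example the Hungarian algorithm) or appearance cues would be the usual next step.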
An Appearance-Based Tracking Algorithm for Aerial Search and Rescue Purposes
The automation of the Wilderness Search and Rescue (WiSAR) task aims for high levels of understanding of various scenery. In addition, working in unfriendly and complex environments may cause a time delay in the operation and consequently put human lives at stake. In order to
address this problem, Unmanned Aerial Vehicles (UAVs), which provide potential support to the
conventional methods, are used. These vehicles are provided with reliable human detection and
tracking algorithms, in order to be able to find and track the bodies of the victims in complex
environments, and a robust control system to maintain safe distances from the detected bodies.
In this paper, a human detection method based on the color and depth data captured from onboard sensors
is proposed. Moreover, the proposal of computing data association from the skeleton pose and a
visual appearance measurement allows the tracking of multiple people with invariance to the scale,
translation and rotation of the point of view with respect to the target objects. The system has been
validated with real and simulation experiments, and the obtained results show the ability to track
multiple individuals even after long-term disappearances. Furthermore, the simulations present the
robustness of the implemented reactive control system as a promising tool for assisting the pilot to
perform approaching maneuvers in a safe and smooth manner. This research is supported by Madrid Community project SEGVAUTO 4.0 (P2018/EMT-4362)
and by the Spanish Government CICYT projects (TRA2015-63708-R and TRA2016-78886-C3-1-R), and Ministerio
de Educación, Cultura y Deporte para la Formación de Profesorado Universitario (FPU14/02143). Also,
we gratefully acknowledge the support of the NVIDIA Corporation with the donation of the GPUs used for
this research.
AerialMPTNet: Multi-Pedestrian Tracking in Aerial Imagery Using Temporal and Graphical Features
Multi-pedestrian tracking in aerial imagery has several applications such as
large-scale event monitoring, disaster management, search-and-rescue missions,
and as input into predictive crowd dynamic models. Due to the challenges such
as the large number and the tiny size of the pedestrians (e.g., 4 x 4 pixels)
with their similar appearances as well as different scales and atmospheric
conditions of the images with their extremely low frame rates (e.g., 2 fps),
current state-of-the-art algorithms including the deep learning-based ones are
unable to perform well. In this paper, we propose AerialMPTNet, a novel
approach for multi-pedestrian tracking in geo-referenced aerial imagery by
fusing appearance features from a Siamese Neural Network, movement predictions
from a Long Short-Term Memory, and pedestrian interconnections from a GraphCNN.
In addition, to address the lack of diverse aerial pedestrian tracking
datasets, we introduce the Aerial Multi-Pedestrian Tracking (AerialMPT) dataset
consisting of 307 frames with 44,740 annotated pedestrians. We believe that
AerialMPT is the largest and most diverse dataset to this date and will be
released publicly. We evaluate AerialMPTNet on AerialMPT and KIT AIS, and
benchmark with several state-of-the-art tracking methods. Results indicate that
AerialMPTNet significantly outperforms other methods on accuracy and
time-efficiency. Comment: ICPR 202
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still poses many challenges because several tasks must be performed in real time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV-based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground using a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic using an algorithm based on histogram equalization and RGB-Local Binary Pattern (RGB-LBP). If changes are present, the mosaic is updated. The second mode performs real-time classification, again using our improved Faster R-CNN model, which is useful for time-critical operations. Thanks to different design features, the system works in real time and performs mosaicking and change detection tasks at low altitude, thus allowing the classification even of small objects. The proposed system was tested using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system with well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection.
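To make the Local Binary Pattern idea mentioned above more concrete, here is a minimal Python sketch of the basic 8-neighbour LBP code, a per-channel LBP histogram, and a simple histogram distance that could flag a changed region. It is only an illustration under assumptions: the thesis does not specify its RGB-LBP variant here, and the functions `lbp_code`, `lbp_histogram` and `histogram_distance`, the bit ordering, and the L1-based distance are all choices made for this example. For an RGB image, the histogram would be computed once per channel.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch.

    A neighbour contributes a 1 bit when its value is >= the centre value.
    Bits are assigned clockwise starting at the top-left neighbour
    (an arbitrary but fixed ordering assumed for this sketch).
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(channel):
    """256-bin histogram of LBP codes over the interior pixels of one channel.

    `channel` is a list of equally long rows of intensity values.
    """
    hist = [0] * 256
    height, width = len(channel), len(channel[0])
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            patch = [row[c - 1:c + 2] for row in channel[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


def histogram_distance(h1, h2):
    """Half the L1 distance between the normalised histograms, in [0, 1].

    Assumes both histograms contain at least one count.
    """
    t1, t2 = sum(h1), sum(h2)
    return 0.5 * sum(abs(a / t1 - b / t2) for a, b in zip(h1, h2))
```

A change detector along these lines would compare the histograms of the same mosaic tile from two flights and report a change when the distance exceeds a threshold chosen on validation data.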
End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks
In this work we present a novel end-to-end framework for tracking and
classifying a robot's surroundings in complex, dynamic and only partially
observable real-world environments. The approach deploys a recurrent neural
network to filter an input stream of raw laser measurements in order to
directly infer object locations, along with their identity in both visible and
occluded areas. To achieve this we first train the network using unsupervised
Deep Tracking, a recently proposed theoretical framework for end-to-end space
occupancy prediction. We show that by learning to track on a large amount of
unsupervised data, the network creates a rich internal representation of its
environment which we in turn exploit through the principle of inductive
transfer of knowledge to perform the task of its semantic classification. As a
result, we show that only a small amount of labelled data suffices to steer the
network towards mastering this additional task. Furthermore we propose a novel
recurrent neural network architecture specifically tailored to tracking and
semantic classification in real-world robotics applications. We demonstrate the
tracking and classification performance of the method on real-world data
collected at a busy road junction. Our evaluation shows that the proposed
end-to-end framework compares favourably to a state-of-the-art, model-free
tracking solution and that it outperforms a conventional one-shot training
scheme for semantic classification.
Class-Agnostic Counting
Nearly all existing counting methods are designed for a specific object
class. Our work, however, aims to create a counting model able to count any
class of object. To achieve this goal, we formulate counting as a matching
problem, enabling us to exploit the image self-similarity property that
naturally exists in object counting problems. We make the following three
contributions: first, a Generic Matching Network (GMN) architecture that can
potentially count any object in a class-agnostic manner; second, by
reformulating the counting problem as one of matching objects, we can take
advantage of the abundance of video data labeled for tracking, which contains
natural repetitions suitable for training a counting model. Such data enables
us to train the GMN. Third, to customize the GMN to different user
requirements, an adapter module is used to specialize the model with minimal
effort, i.e. using a few labeled examples, and adapting only a small fraction
of the trained parameters. This is a form of few-shot learning, which is
practical for domains where labels are limited due to requiring expert
knowledge (e.g. microbiology). We demonstrate the flexibility of our method on
a diverse set of existing counting benchmarks: specifically cells, cars, and
human crowds. The model achieves competitive performance on cell and crowd
counting datasets, and surpasses the state-of-the-art on the car dataset using
only three training images. When training on the entire dataset, the proposed
method outperforms all previous methods by a large margin. Comment: Asian Conference on Computer Vision (ACCV), 201
Video surveillance systems-current status and future trends
This survey attempts to document the present status of video surveillance systems. The main components of a surveillance system are presented and studied thoroughly. Algorithms for image enhancement, object detection, object tracking, object recognition and item re-identification are presented. The most common modalities utilized by surveillance systems are discussed, with emphasis on video, in terms of available resolutions and new imaging approaches such as High Dynamic Range video. The most important features and analytics are presented, along with the most common approaches for image and video quality enhancement. Distributed computational infrastructures (Cloud, Fog and Edge Computing) are discussed, describing the advantages and disadvantages of each approach. The most important deep learning algorithms are presented, along with the smart analytics that they utilize. Augmented reality and the role it can play in a surveillance system are reported, before the challenges and future trends of surveillance are discussed.
Digital Image Access & Retrieval
The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.