Efficient Labelling of Pedestrian Supervisions
Object detection is fundamental to intelligent visual perception by computers, because objects are the basic building blocks of higher-level image understanding. Among the numerous categories of real-world objects, pedestrians are among the most important, owing to the potential benefits of successful pedestrian detection. State-of-the-art pedestrian detectors are often trained with supervised machine learning algorithms, which necessitate costly and often tedious manual annotation of pedestrians in the form of precise bounding boxes. In this paper, a novel weakly supervised learning algorithm is proposed to train a pedestrian detector that requires, instead of bounding boxes, only annotations of the estimated centres of pedestrians. The algorithm makes use of a pedestrian prior learnt in an unsupervised way from the video, and this prior is fused with the given weak supervision information in a systematic manner. By evaluating on publicly available datasets, we demonstrate that our weakly supervised algorithm reduces the cost of manual annotation of pedestrians by a factor of more than four while achieving similar performance to a pedestrian detector trained with standard bounding box annotations.
Unsupervised learning of generative topic saliency for person re-identification
(c) 2014. The copyright of this document resides with its authors.
It may be distributed unchanged freely in print or electronic forms. Existing approaches to person re-identification (re-id) are dominated by supervised learning methods that focus on learning optimal similarity distance metrics. However, supervised models require a large number of manually labelled pairs of person images across every pair of camera views, which limits their ability to scale to large camera networks. To overcome this problem, this paper proposes a novel unsupervised re-id modelling approach based on generative probabilistic topic modelling. Given abundant unlabelled data, our topic model learns simultaneously to (1) discover localised person foreground appearance saliency (salient image patches) that is more informative for re-id matching, and (2) remove busy background clutter surrounding a person. Extensive experiments demonstrate that the proposed model outperforms existing unsupervised re-id methods with significantly reduced model complexity. At the same time, it retains re-id accuracy comparable to state-of-the-art supervised re-id methods, without any need for pair-wise labelled training data.
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
TrackletMapper: Ground Surface Segmentation and Mapping from Traffic Participant Trajectories
Robustly classifying ground infrastructure such as roads and street crossings
is an essential task for mobile robots operating alongside pedestrians. While
many semantic segmentation datasets are available for autonomous vehicles,
models trained on such datasets exhibit a large domain gap when deployed on
robots operating in pedestrian spaces. Manually annotating images recorded from
pedestrian viewpoints is both expensive and time-consuming. To overcome this
challenge, we propose TrackletMapper, a framework for annotating ground surface
types such as sidewalks, roads, and street crossings from object tracklets
without requiring human-annotated data. To this end, we project the robot
ego-trajectory and the paths of other traffic participants into the ego-view
camera images, creating sparse semantic annotations for multiple types of
ground surfaces from which a ground segmentation model can be trained. We
further show that the model can be self-distilled for additional performance
benefits by aggregating a ground surface map and projecting it into the camera
images, creating a denser set of training annotations compared to the sparse
tracklet annotations. We qualitatively and quantitatively validate our findings
on a novel large-scale dataset for mobile robots operating in pedestrian areas.
Code and dataset will be made available at
http://trackletmapper.cs.uni-freiburg.de.
Comment: 19 pages, 14 figures, CoRL 2022 v
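The projection step described in the TrackletMapper abstract (trajectory points projected into the ego-view camera image to form sparse labels) can be illustrated with a minimal sketch. This is not the authors' code: it assumes a plain pinhole camera without lens distortion, assumes the trajectory points are already expressed in the camera frame, and uses hypothetical names (`project_point`, `sparse_labels`) and hypothetical class ids.

```python
from typing import List, Optional, Tuple

# Assumed convention: x right, y down, z forward, in the camera frame.
Point3D = Tuple[float, float, float]


def project_point(p: Point3D, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[int, int]]:
    """Pinhole projection of one camera-frame point to pixel coordinates (u, v)."""
    x, y, z = p
    if z <= 0.0:
        # Points at or behind the image plane are not visible.
        return None
    u = fx * x / z + cx
    v = fy * y / z + cy
    return int(round(u)), int(round(v))


def sparse_labels(trajectories: List[Tuple[int, List[Point3D]]],
                  width: int, height: int,
                  fx: float, fy: float, cx: float, cy: float,
                  unlabeled: int = 0) -> List[List[int]]:
    """Build a sparse label mask: each projected trajectory point marks one pixel.

    `trajectories` pairs a (hypothetical) surface class id with the 3D points
    of one tracklet, e.g. 1 for sidewalk from pedestrians, 2 for road from cars.
    """
    mask = [[unlabeled] * width for _ in range(height)]
    for class_id, points in trajectories:
        for p in points:
            uv = project_point(p, fx, fy, cx, cy)
            if uv is None:
                continue
            u, v = uv
            # Keep only points that land inside the image.
            if 0 <= u < width and 0 <= v < height:
                mask[v][u] = class_id
    return mask
```

Calling `sparse_labels` with one tracklet per traffic participant yields a mask in which most pixels stay unlabeled, which is the sense in which such annotations are sparse.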
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays an important role in video surveillance. Many
algorithms have been proposed to handle this task. The goal of this paper is to
review existing works based on traditional methods or on deep learning
networks. Firstly, we introduce the background of pedestrian attribute
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and the corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criteria. Thirdly, we
analyse the concepts of multi-task learning and multi-label learning, and
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have been widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attribute groups, part-based
methods, etc. Fifthly, we show some applications which take pedestrian
attributes into consideration and achieve better performance. Finally, we
summarize this paper and give several possible research directions for
pedestrian attribute recognition. The project page of this paper can be found
at the following website:
https://sites.google.com/view/ahu-pedestrianattributes/.
Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
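The link between multi-label learning and pedestrian attribute recognition mentioned in the abstract above can be made concrete with a small sketch. It shows one common formulation, not necessarily the one the survey presents: each attribute is treated as an independent binary label and trained with a per-attribute binary cross-entropy on raw scores (logits). The function names and the attribute order are hypothetical.

```python
import math
from typing import List


def multilabel_bce(logits: List[float], targets: List[int]) -> float:
    """Mean binary cross-entropy over attributes, computed from raw logits.

    Uses the numerically stable form max(z, 0) - z*t + log(1 + exp(-|z|)),
    which equals -[t*log(sigmoid(z)) + (1-t)*log(1 - sigmoid(z))].
    """
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def predict_attributes(logits: List[float]) -> List[int]:
    """An attribute is predicted present when sigmoid(z) > 0.5, i.e. z > 0."""
    return [1 if z > 0.0 else 0 for z in logits]


# Hypothetical attribute order: [wears_hat, carries_backpack, long_hair]
example_logits = [2.0, -1.5, 0.3]
example_targets = [1, 0, 0]
loss = multilabel_bce(example_logits, example_targets)
predicted = predict_attributes(example_logits)
```

Unlike single-label classification with a softmax, this setup lets any subset of attributes be active for the same pedestrian.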