3,102 research outputs found
Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in computer vision
community due to it plays an important role in video surveillance. Many
algorithms has been proposed to handle this task. The goal of this paper is to
review existing works using traditional methods or based on deep learning
networks. Firstly, we introduce the background of pedestrian attributes
recognition (PAR, for short), including the fundamental concepts of pedestrian
attributes and corresponding challenges. Secondly, we introduce existing
benchmarks, including popular datasets and evaluation criterion. Thirdly, we
analyse the concept of multi-task learning and multi-label learning, and also
explain the relations between these two learning algorithms and pedestrian
attribute recognition. We also review some popular network architectures which
have widely applied in the deep learning community. Fourthly, we analyse
popular solutions for this task, such as attributes group, part-based,
\emph{etc}. Fifthly, we shown some applications which takes pedestrian
attributes into consideration and achieve better performance. Finally, we
summarized this paper and give several possible research directions for
pedestrian attributes recognition. The project page of this paper can be found
from the following website:
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
End-to-end people detection in crowded scenes
Current people detectors operate either by scanning an image in a sliding
window fashion or by classifying a discrete set of proposals. We propose a
model that is based on decoding an image into a set of people detections. Our
system takes an image as input and directly outputs a set of distinct detection
hypotheses. Because we generate predictions jointly, common post-processing
steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM
layer for sequence generation and train our model end-to-end with a new loss
function that operates on sets of detections. We demonstrate the effectiveness
of our approach on the challenging task of detecting people in crowded scenes.Comment: 9 pages, 7 figures. Submitted to NIPS 2015. Supplementary material
video: http://www.youtube.com/watch?v=QeWl0h3kQ2
SLAM for Visually Impaired People: A Survey
In recent decades, several assistive technologies for visually impaired and
blind (VIB) people have been developed to improve their ability to navigate
independently and safely. At the same time, simultaneous localization and
mapping (SLAM) techniques have become sufficiently robust and efficient to be
adopted in the development of assistive technologies. In this paper, we first
report the results of an anonymous survey conducted with VIB people to
understand their experience and needs; we focus on digital assistive
technologies that help them with indoor and outdoor navigation. Then, we
present a literature review of assistive technologies based on SLAM. We discuss
proposed approaches and indicate their pros and cons. We conclude by presenting
future opportunities and challenges in this domain.Comment: 26 pages, 5 tables, 3 figure
Expanding Navigation Systems by Integrating It with Advanced Technologies
Navigation systems provide the optimized route from one location to another. It is mainly assisted by external technologies such as Global Positioning System (GPS) and satellite-based radio navigation systems. GPS has many advantages such as high accuracy, available anywhere, reliable, and self-calibrated. However, GPS is limited to outdoor operations. The practice of combining different sources of data to improve the overall outcome is commonly used in various domains. GIS is already integrated with GPS to provide the visualization and realization aspects of a given location. Internet of things (IoT) is a growing domain, where embedded sensors are connected to the Internet and so IoT improves existing navigation systems and expands its capabilities. This chapter proposes a framework based on the integration of GPS, GIS, IoT, and mobile communications to provide a comprehensive and accurate navigation solution. In the next section, we outline the limitations of GPS, and then we describe the integration of GIS, smartphones, and GPS to enable its use in mobile applications. For the rest of this chapter, we introduce various navigation implementations using alternate technologies integrated with GPS or operated as standalone devices
An Adaptive Human Activity-Aided Hand-Held Smartphone-Based Pedestrian Dead Reckoning Positioning System
Pedestrian dead reckoning (PDR), enabled by smartphones’ embedded inertial sensors, is widely applied as a type of indoor positioning system (IPS). However, traditional PDR faces two challenges to improve its accuracy: lack of robustness for different PDR-related human activities and positioning error accumulation over elapsed time. To cope with these issues, we propose a novel adaptive human activity-aided PDR (HAA-PDR) IPS that consists of two main parts, human activity recognition (HAR) and PDR optimization. (1) For HAR, eight different locomotion-related activities are divided into two classes: steady-heading activities (ascending/descending stairs, stationary, normal walking, stationary stepping, and lateral walking) and non-steady-heading activities (door opening and turning). A hierarchical combination of a support vector machine (SVM) and decision tree (DT) is used to recognize steady-heading activities. An autoencoder-based deep neural network (DNN) and a heading range-based method to recognize door opening and turning, respectively. The overall HAR accuracy is over 98.44%. (2) For optimization methods, a process automatically sets the parameters of the PDR differently for different activities to enhance step counting and step length estimation. Furthermore, a method of trajectory optimization mitigates PDR error accumulation utilizing the non-steady-heading activities. We divided the trajectory into small segments and reconstructed it after targeted optimization of each segment. Our method does not use any a priori knowledge of the building layout, plan, or map. Finally, the mean positioning error of our HAA-PDR in a multilevel building is 1.79 m, which is a significant improvement in accuracy compared with a baseline state-of-the-art PDR system
Object Detection in 20 Years: A Survey
Object detection, as of one the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed up techniques, and the recent state of the art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc, and makes an in-deep
analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
A Navigation and Augmented Reality System for Visually Impaired People
In recent years, we have assisted with an impressive advance in augmented reality systems and computer vision algorithms, based on image processing and artificial intelligence. Thanks to these technologies, mainstream smartphones are able to estimate their own motion in 3D space with high accuracy. In this paper, we exploit such technologies to support the autonomous mobility of people with visual disabilities, identifying pre-defined virtual paths and providing context information, reducing the distance between the digital and real worlds. In particular, we present ARIANNA+, an extension of ARIANNA, a system explicitly designed for visually impaired people for indoor and outdoor localization and navigation. While ARIANNA is based on the assumption that landmarks, such as QR codes, and physical paths (composed of colored tapes, painted lines, or tactile pavings) are deployed in the environment and recognized by the camera of a common smartphone, ARIANNA+ eliminates the need for any physical support thanks to the ARKit library, which we exploit to build a completely virtual path. Moreover, ARIANNA+ adds the possibility for the users to have enhanced interactions with the surrounding environment, through convolutional neural networks (CNNs) trained to recognize objects or buildings and enabling the possibility of accessing contents associated with them. By using a common smartphone as a mediation instrument with the environment, ARIANNA+ leverages augmented reality and machine learning for enhancing physical accessibility. The proposed system allows visually impaired people to easily navigate in indoor and outdoor scenarios simply by loading a previously recorded virtual path and providing automatic guidance along the route, through haptic, speech, and sound feedback
- …