3,102 research outputs found

    Pedestrian Attribute Recognition: A Survey

    Full text link
    Recognizing pedestrian attributes is an important task in computer vision community due to it plays an important role in video surveillance. Many algorithms has been proposed to handle this task. The goal of this paper is to review existing works using traditional methods or based on deep learning networks. Firstly, we introduce the background of pedestrian attributes recognition (PAR, for short), including the fundamental concepts of pedestrian attributes and corresponding challenges. Secondly, we introduce existing benchmarks, including popular datasets and evaluation criterion. Thirdly, we analyse the concept of multi-task learning and multi-label learning, and also explain the relations between these two learning algorithms and pedestrian attribute recognition. We also review some popular network architectures which have widely applied in the deep learning community. Fourthly, we analyse popular solutions for this task, such as attributes group, part-based, \emph{etc}. Fifthly, we shown some applications which takes pedestrian attributes into consideration and achieve better performance. Finally, we summarized this paper and give several possible research directions for pedestrian attributes recognition. The project page of this paper can be found from the following website: \url{https://sites.google.com/view/ahu-pedestrianattributes/}.Comment: Check our project page for High Resolution version of this survey: https://sites.google.com/view/ahu-pedestrianattributes

    End-to-end people detection in crowded scenes

    Full text link
    Current people detectors operate either by scanning an image in a sliding window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.Comment: 9 pages, 7 figures. Submitted to NIPS 2015. Supplementary material video: http://www.youtube.com/watch?v=QeWl0h3kQ2

    SLAM for Visually Impaired People: A Survey

    Full text link
    In recent decades, several assistive technologies for visually impaired and blind (VIB) people have been developed to improve their ability to navigate independently and safely. At the same time, simultaneous localization and mapping (SLAM) techniques have become sufficiently robust and efficient to be adopted in the development of assistive technologies. In this paper, we first report the results of an anonymous survey conducted with VIB people to understand their experience and needs; we focus on digital assistive technologies that help them with indoor and outdoor navigation. Then, we present a literature review of assistive technologies based on SLAM. We discuss proposed approaches and indicate their pros and cons. We conclude by presenting future opportunities and challenges in this domain.Comment: 26 pages, 5 tables, 3 figure

    Expanding Navigation Systems by Integrating It with Advanced Technologies

    Get PDF
    Navigation systems provide the optimized route from one location to another. It is mainly assisted by external technologies such as Global Positioning System (GPS) and satellite-based radio navigation systems. GPS has many advantages such as high accuracy, available anywhere, reliable, and self-calibrated. However, GPS is limited to outdoor operations. The practice of combining different sources of data to improve the overall outcome is commonly used in various domains. GIS is already integrated with GPS to provide the visualization and realization aspects of a given location. Internet of things (IoT) is a growing domain, where embedded sensors are connected to the Internet and so IoT improves existing navigation systems and expands its capabilities. This chapter proposes a framework based on the integration of GPS, GIS, IoT, and mobile communications to provide a comprehensive and accurate navigation solution. In the next section, we outline the limitations of GPS, and then we describe the integration of GIS, smartphones, and GPS to enable its use in mobile applications. For the rest of this chapter, we introduce various navigation implementations using alternate technologies integrated with GPS or operated as standalone devices

    An Adaptive Human Activity-Aided Hand-Held Smartphone-Based Pedestrian Dead Reckoning Positioning System

    Get PDF
    Pedestrian dead reckoning (PDR), enabled by smartphones’ embedded inertial sensors, is widely applied as a type of indoor positioning system (IPS). However, traditional PDR faces two challenges to improve its accuracy: lack of robustness for different PDR-related human activities and positioning error accumulation over elapsed time. To cope with these issues, we propose a novel adaptive human activity-aided PDR (HAA-PDR) IPS that consists of two main parts, human activity recognition (HAR) and PDR optimization. (1) For HAR, eight different locomotion-related activities are divided into two classes: steady-heading activities (ascending/descending stairs, stationary, normal walking, stationary stepping, and lateral walking) and non-steady-heading activities (door opening and turning). A hierarchical combination of a support vector machine (SVM) and decision tree (DT) is used to recognize steady-heading activities. An autoencoder-based deep neural network (DNN) and a heading range-based method to recognize door opening and turning, respectively. The overall HAR accuracy is over 98.44%. (2) For optimization methods, a process automatically sets the parameters of the PDR differently for different activities to enhance step counting and step length estimation. Furthermore, a method of trajectory optimization mitigates PDR error accumulation utilizing the non-steady-heading activities. We divided the trajectory into small segments and reconstructed it after targeted optimization of each segment. Our method does not use any a priori knowledge of the building layout, plan, or map. Finally, the mean positioning error of our HAA-PDR in a multilevel building is 1.79 m, which is a significant improvement in accuracy compared with a baseline state-of-the-art PDR system

    Object Detection in 20 Years: A Survey

    Full text link
    Object detection, as of one the most fundamental and challenging problems in computer vision, has received great attention in recent years. Its development in the past two decades can be regarded as an epitome of computer vision history. If we think of today's object detection as a technical aesthetics under the power of deep learning, then turning back the clock 20 years we would witness the wisdom of cold weapon era. This paper extensively reviews 400+ papers of object detection in the light of its technical evolution, spanning over a quarter-century's time (from the 1990s to 2019). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed up techniques, and the recent state of the art detection methods. This paper also reviews some important detection applications, such as pedestrian detection, face detection, text detection, etc, and makes an in-deep analysis of their challenges as well as technical improvements in recent years.Comment: This work has been submitted to the IEEE TPAMI for possible publicatio

    A Navigation and Augmented Reality System for Visually Impaired People

    Get PDF
    In recent years, we have assisted with an impressive advance in augmented reality systems and computer vision algorithms, based on image processing and artificial intelligence. Thanks to these technologies, mainstream smartphones are able to estimate their own motion in 3D space with high accuracy. In this paper, we exploit such technologies to support the autonomous mobility of people with visual disabilities, identifying pre-defined virtual paths and providing context information, reducing the distance between the digital and real worlds. In particular, we present ARIANNA+, an extension of ARIANNA, a system explicitly designed for visually impaired people for indoor and outdoor localization and navigation. While ARIANNA is based on the assumption that landmarks, such as QR codes, and physical paths (composed of colored tapes, painted lines, or tactile pavings) are deployed in the environment and recognized by the camera of a common smartphone, ARIANNA+ eliminates the need for any physical support thanks to the ARKit library, which we exploit to build a completely virtual path. Moreover, ARIANNA+ adds the possibility for the users to have enhanced interactions with the surrounding environment, through convolutional neural networks (CNNs) trained to recognize objects or buildings and enabling the possibility of accessing contents associated with them. By using a common smartphone as a mediation instrument with the environment, ARIANNA+ leverages augmented reality and machine learning for enhancing physical accessibility. The proposed system allows visually impaired people to easily navigate in indoor and outdoor scenarios simply by loading a previously recorded virtual path and providing automatic guidance along the route, through haptic, speech, and sound feedback
    • …
    corecore