
    A Joint 3D-2D based Method for Free Space Detection on Roads

    In this paper, we address the problem of road segmentation and free space detection in the context of autonomous driving. Traditional methods either use 3-dimensional (3D) cues such as point clouds obtained from LIDAR, RADAR or stereo cameras, or 2-dimensional (2D) cues such as lane markings, road boundaries and object detection. Typical 3D point clouds do not have enough resolution to detect fine differences in height, such as between road and pavement. Image-based 2D cues fail when encountering uneven road textures, such as those due to shadows, potholes, lane markings or road restoration. We propose a novel free road space detection technique combining both 2D and 3D cues. In particular, we use CNN-based road segmentation from 2D images and plane/box fitting on sparse depth data obtained from SLAM as priors to formulate an energy minimization using a conditional random field (CRF) for road pixel classification. While the CNN learns the road texture and is unaffected by depth boundaries, the 3D information helps in overcoming texture-based classification failures. Finally, we use the obtained road segmentation with the 3D depth data from monocular SLAM to detect the free space for navigation purposes. Our experiments on the KITTI odometry dataset, the CamVid dataset, as well as videos captured by us, validate the superiority of the proposed approach over the state of the art. Comment: Accepted for publication at IEEE WACV 201

    Person Re-identification by Local Maximal Occurrence Representation and Metric Learning

    Person re-identification is an important technique towards automatic search of a person's presence in a surveillance video. Two fundamental problems are critical for person re-identification: feature representation and metric learning. An effective feature representation should be robust to illumination and viewpoint changes, and a discriminant metric should be learned to match various person images. In this paper, we propose an effective feature representation called Local Maximal Occurrence (LOMO), and a subspace and metric learning method called Cross-view Quadratic Discriminant Analysis (XQDA). The LOMO feature analyzes the horizontal occurrence of local features, and maximizes the occurrence to make a stable representation against viewpoint changes. In addition, to handle illumination variations, we apply the Retinex transform and a scale-invariant texture operator. To learn a discriminant metric, we propose to learn a discriminant low-dimensional subspace by cross-view quadratic discriminant analysis, and simultaneously, a QDA metric is learned on the derived subspace. We also present a practical computation method for XQDA, as well as its regularization. Experiments on four challenging person re-identification databases, VIPeR, QMUL GRID, CUHK Campus, and CUHK03, show that the proposed method improves the state-of-the-art rank-1 identification rates by 2.2%, 4.88%, 28.91%, and 31.55% on the four databases, respectively. Comment: This paper has been accepted by CVPR 2015. For source codes and extracted features please visit http://www.cbsr.ia.ac.cn/users/scliao/projects/lomo_xqda

    Fast and robust 3D feature extraction from sparse point clouds

    Matching 3D point clouds, a critical operation in map building and localization, is difficult with Velodyne-type sensors due to the sparse and non-uniform point clouds that they produce. Standard methods from dense 3D point clouds are generally not effective. In this paper, we describe a feature-based approach using Principal Components Analysis (PCA) of neighborhoods of points, which results in mathematically principled line and plane features. The key contribution in this work is to show how this type of feature extraction can be done efficiently and robustly even on non-uniformly sampled point clouds. The resulting detector runs in real-time and can be easily tuned to have a low false positive rate, simplifying data association. We evaluate the performance of our algorithm on an autonomous car at the MCity Test Facility using a Velodyne HDL-32E, and we compare our results against the state-of-the-art NARF keypoint detector. © 2016 IEEE
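
    As a rough illustration of the general idea of PCA on a point neighborhood (not the paper's actual detector), the sketch below labels a small set of 3D points as a line, a plane, or neither by comparing the eigenvalues of their covariance matrix. The function name `classify_neighborhood` and both ratio thresholds are hypothetical choices made for this example only.

```python
import numpy as np


def classify_neighborhood(points, line_ratio=0.1, plane_ratio=0.1):
    """Label a neighborhood of 3D points as 'line', 'plane', or 'none'.

    points: array-like of shape (N, 3).
    line_ratio, plane_ratio: illustrative thresholds (assumptions, not
    values taken from the paper).
    Returns (label, vector): the line direction for 'line', the plane
    normal for 'plane', and None otherwise.
    """
    pts = np.asarray(points, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 3 or pts.shape[0] < 3:
        return "none", None

    # Covariance of the neighborhood around its centroid.
    centered = pts - pts.mean(axis=0)
    cov = centered.T @ centered / pts.shape[0]

    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    evals, evecs = np.linalg.eigh(cov)
    smallest, middle, largest = evals

    if largest <= 0.0:
        return "none", None  # all points coincide

    # Line: one dominant direction of variance.
    if middle / largest < line_ratio:
        return "line", evecs[:, 2]

    # Plane: two comparable directions, negligible variance in the third.
    if smallest / middle < plane_ratio:
        return "plane", evecs[:, 0]

    return "none", None


if __name__ == "__main__":
    line_pts = [(t, 2.0 * t, 3.0 * t) for t in range(10)]
    plane_pts = [(x, y, 0.0) for x in range(5) for y in range(5)]
    print(classify_neighborhood(line_pts)[0])
    print(classify_neighborhood(plane_pts)[0])
```

    A real detector for sparse, non-uniformly sampled scans would also need a neighborhood search and robust handling of sampling density, which this sketch leaves out.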

    Don't Look Back: Robustifying Place Categorization for Viewpoint- and Condition-Invariant Place Recognition

    When a human drives a car along a road for the first time, they later recognize where they are on the return journey, typically without needing to look in their rear-view mirror or turn around to look back, despite significant viewpoint and appearance change. Such navigation capabilities are typically attributed to our semantic visual understanding of the environment [1], which goes beyond geometry to recognizing the types of places we are passing through, such as "passing a shop on the left" or "moving through a forested area". Humans are in effect using place categorization [2] to perform specific place recognition even when the viewpoint is reversed by 180 degrees. Recent advances in deep neural networks have enabled high-performance semantic understanding of visual places and scenes, opening up the possibility of emulating what humans do. In this work, we develop a novel methodology for using the semantics-aware higher-order layers of deep neural networks to recognize specific places from within a reference database. To further improve the robustness to appearance change, we develop a descriptor normalization scheme that builds on the success of normalization schemes for pure appearance-based techniques such as SeqSLAM [3]. Using two different datasets, one road-based and one pedestrian-based, we evaluate the performance of the system in performing place recognition on reverse traversals of a route with a limited field-of-view camera and no turn-back-and-look behaviours, and compare it to existing state-of-the-art techniques and vanilla off-the-shelf features. The results demonstrate significant improvements over the existing state of the art, especially for extreme perceptual challenges that involve both great viewpoint change and environmental appearance change. We also provide experimental analyses of the contributions of the various system components. Comment: 9 pages, 11 figures, ICRA 201