
    Indoor assistance for visually impaired people using a RGB-D camera

    In this paper, a navigational aid for visually impaired people is presented. The system uses an RGB-D camera to perceive the environment and implements self-localization, obstacle detection and obstacle classification. The novelty of this work is threefold. First, self-localization is performed by means of a novel camera tracking approach that uses both depth and color information. Second, to provide the user with semantic information, obstacles are classified as walls, doors, steps and a residual class that covers isolated objects and bumpy parts on the floor. Third, in order to guarantee real-time performance, the system is accelerated by offloading parallel operations to the GPU. Experiments demonstrate that the whole system runs at 9 Hz.

    SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences

    While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches. To this end, we find sparse matches across two stereo image pairs that are detected without any prior regularization and perform dense interpolation preserving geometric and motion boundaries by using edge information. A few iterations of variational energy minimization are performed to refine our results, which are thoroughly evaluated on the KITTI benchmark and additionally compared to the state of the art on MPI Sintel. For application in an automotive context, we further show that an optional ego-motion model helps to boost performance and blends smoothly into our approach to produce a segmentation of the scene into static and dynamic parts. Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 201

    Is the Pedestrian going to Cross? Answering by 2D Pose Estimation

    Our recent work suggests that, thanks to today's powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on results obtained on non-naturalistic sequences (Daimler dataset), i.e. sequences choreographed specifically for the study. Fortunately, a new publicly available dataset (JAAD) has recently appeared that allows developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question "is the pedestrian going to cross?" Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering this question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-of-the-art results. Comment: This is a paper presented at the IEEE Intelligent Vehicles Symposium (IEEE IV 2018)

    Fast Multi-frame Stereo Scene Flow with Motion Segmentation

    We propose a new multi-frame method for efficiently computing scene flow (dense depth and optical flow) and camera ego-motion for a dynamic scene observed from a moving stereo camera rig. Our technique also segments out moving objects from the rigid scene. In our method, we first estimate the disparity map and the 6-DOF camera motion using stereo matching and visual odometry. We then identify regions inconsistent with the estimated camera motion and compute per-pixel optical flow only in these regions. This flow proposal is fused with the camera motion-based flow proposal using fusion moves to obtain the final optical flow and motion segmentation. This unified framework benefits all four tasks (stereo, optical flow, visual odometry and motion segmentation), leading to overall higher accuracy and efficiency. Our method is currently ranked third on the KITTI 2015 scene flow benchmark. Furthermore, our CPU implementation runs in 2-3 seconds per frame, which is 1-3 orders of magnitude faster than the top six methods. We also report a thorough evaluation on challenging Sintel sequences with fast camera and object motion, where our method consistently outperforms OSF [Menze and Geiger, 2015], which is currently ranked second on the KITTI benchmark. Comment: 15 pages. To appear at IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). Our results were submitted to KITTI 2015 Stereo Scene Flow Benchmark in November 201