27,823 research outputs found

    Region of Interest Generation for Pedestrian Detection using Stereo Vision

    Get PDF
    Pedestrian detection is an active research area in the field of computer vision. The sliding window paradigm is usually followed to extract all possible detector windows, however, it is very time consuming. Subsequently, stereo vision using a pair of camera is preferred to reduce the search space that includes the depth information. Disparity map generation using feature correspondence is an integral part and a prior task to depth estimation. In our work, we apply the ORB features to fasten the feature correspondence process. Once the ROI generation phase is over, the extracted detector window is represented by low level histogram of oriented gradient (HOG) features. Subsequently, Linear Support Vector Machine (SVM) is applied to classify them as either pedestrian or non-pedestrian. The experimental results reveal that ORB driven depth estimation is at least seven times faster than the SURF descriptor and ten times faster than the SIFT descriptor

    A sliding mode approach to visual motion estimation

    Get PDF
    The problem of estimating motion from a sequence of images has been a major research theme in machine vision for many years and remains one of the most challenging ones. In this work, we use sliding mode observers to estimate the motion of a moving body with the aid of a CCD camera. We consider a variety of dynamical systems which arise in machine vision applications and develop a novel identication procedure for the estimation of both constant and time varying parameters. The basic procedure introduced for parameter estimation is to recast image feature dynamics linearly in terms of unknown parameters and construct a sliding mode observer to produce asymptotically correct estimates of the observed image features, and then use “equivalent control” to explicitly compute parameters. Much of our analysis has been substantiated by computer simulations and real experiments

    Deep Detection of People and their Mobility Aids for a Hospital Robot

    Full text link
    Robots operating in populated environments encounter many different types of people, some of whom might have an advanced need for cautious interaction, because of physical impairments or their advanced age. Robots therefore need to recognize such advanced demands to provide appropriate assistance, guidance or other forms of support. In this paper, we propose a depth-based perception pipeline that estimates the position and velocity of people in the environment and categorizes them according to the mobility aids they use: pedestrian, person in wheelchair, person in a wheelchair with a person pushing them, person with crutches and person using a walker. We present a fast region proposal method that feeds a Region-based Convolutional Network (Fast R-CNN). With this, we speed up the object detection process by a factor of seven compared to a dense sliding window approach. We furthermore propose a probabilistic position, velocity and class estimator to smooth the CNN's detections and account for occlusions and misclassifications. In addition, we introduce a new hospital dataset with over 17,000 annotated RGB-D images. Extensive experiments confirm that our pipeline successfully keeps track of people and their mobility aids, even in challenging situations with multiple people from different categories and frequent occlusions. Videos of our experiments and the dataset are available at http://www2.informatik.uni-freiburg.de/~kollmitz/MobilityAidsComment: 7 pages, ECMR 2017, dataset and videos: http://www2.informatik.uni-freiburg.de/~kollmitz/MobilityAids

    Real-Time RGB-D based Template Matching Pedestrian Detection

    Full text link
    Pedestrian detection is one of the most popular topics in computer vision and robotics. Considering challenging issues in multiple pedestrian detection, we present a real-time depth-based template matching people detector. In this paper, we propose different approaches for training the depth-based template. We train multiple templates for handling issues due to various upper-body orientations of the pedestrians and different levels of detail in depth-map of the pedestrians with various distances from the camera. And, we take into account the degree of reliability for different regions of sliding window by proposing the weighted template approach. Furthermore, we combine the depth-detector with an appearance based detector as a verifier to take advantage of the appearance cues for dealing with the limitations of depth data. We evaluate our method on the challenging ETH dataset sequence. We show that our method outperforms the state-of-the-art approaches.Comment: published in ICRA 201

    LDSO: Direct Sparse Odometry with Loop Closure

    Full text link
    In this paper we present an extension of Direct Sparse Odometry (DSO) to a monocular visual SLAM system with loop closure detection and pose-graph optimization (LDSO). As a direct technique, DSO can utilize any image pixel with sufficient intensity gradient, which makes it robust even in featureless areas. LDSO retains this robustness, while at the same time ensuring repeatability of some of these points by favoring corner features in the tracking frontend. This repeatability allows to reliably detect loop closure candidates with a conventional feature-based bag-of-words (BoW) approach. Loop closure candidates are verified geometrically and Sim(3) relative pose constraints are estimated by jointly minimizing 2D and 3D geometric error terms. These constraints are fused with a co-visibility graph of relative poses extracted from DSO's sliding window optimization. Our evaluation on publicly available datasets demonstrates that the modified point selection strategy retains the tracking accuracy and robustness, and the integrated pose-graph optimization significantly reduces the accumulated rotation-, translation- and scale-drift, resulting in an overall performance comparable to state-of-the-art feature-based systems, even without global bundle adjustment

    Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty

    Full text link
    This work proposes a robust visual odometry method for structured environments that combines point features with line and plane segments, extracted through an RGB-D camera. Noisy depth maps are processed by a probabilistic depth fusion framework based on Mixtures of Gaussians to denoise and derive the depth uncertainty, which is then propagated throughout the visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are used to model the uncertainties of the feature parameters and pose is estimated by combining the three types of primitives based on their uncertainties. Performance evaluation on RGB-D sequences collected in this work and two public RGB-D datasets: TUM and ICL-NUIM show the benefit of using the proposed depth fusion framework and combining the three feature-types, particularly in scenes with low-textured surfaces, dynamic objects and missing depth measurements.Comment: Major update: more results, depth filter released as opensource, 34 page

    Realtime State Estimation with Tactile and Visual sensing. Application to Planar Manipulation

    Full text link
    Accurate and robust object state estimation enables successful object manipulation. Visual sensing is widely used to estimate object poses. However, in a cluttered scene or in a tight workspace, the robot's end-effector often occludes the object from the visual sensor. The robot then loses visual feedback and must fall back on open-loop execution. In this paper, we integrate both tactile and visual input using a framework for solving the SLAM problem, incremental smoothing and mapping (iSAM), to provide a fast and flexible solution. Visual sensing provides global pose information but is noisy in general, whereas contact sensing is local, but its measurements are more accurate relative to the end-effector. By combining them, we aim to exploit their advantages and overcome their limitations. We explore the technique in the context of a pusher-slider system. We adapt iSAM's measurement cost and motion cost to the pushing scenario, and use an instrumented setup to evaluate the estimation quality with different object shapes, on different surface materials, and under different contact modes
    corecore