4,856 research outputs found

    Image Matching Using SIFT, SURF, BRIEF and ORB: Performance Comparison for Distorted Images

    Fast and robust image matching is a very important task with various applications in computer vision and robotics. In this paper, we compare the performance of three different image matching techniques, i.e., SIFT, SURF, and ORB, against different kinds of transformations and deformations such as scaling, rotation, noise, fish eye distortion, and shearing. For this purpose, we manually apply different types of transformations to the original images and compute matching evaluation parameters such as the number of key points in images, the matching rate, and the execution time required for each algorithm, and we show which algorithm is the most robust against each kind of distortion. Index Terms: Image matching, scale invariant feature transform (SIFT), speed up robust feature (SURF), robust independent elementary features (BRIEF), oriented FAST, rotated BRIEF (ORB). Comment: 5 pages, 6 figures, In Proceedings of the 2015 Newfoundland Electrical and Computer Engineering Conference, St. John's, Canada, November, 201

    A Hybrid SLAM and Object Recognition System for Pepper Robot

    Humanoid robots are playing increasingly important roles in real-life tasks, especially when it comes to indoor applications. Providing robust solutions for tasks such as indoor environment mapping, self-localisation and object recognition is essential to make robots more autonomous and hence more human-like. The well-known Aldebaran service robot Pepper is a suitable candidate for achieving these goals. In this paper, a hybrid system combining a Simultaneous Localisation and Mapping (SLAM) algorithm with object recognition is developed and tested with the Pepper robot in real-world conditions for the first time. The ORB SLAM 2 algorithm was taken as a seminal work in our research. Then, an object recognition technique based on Scale-Invariant Feature Transform (SIFT) and Random Sample Consensus (RANSAC) was combined with SLAM to recognise and localise objects in the mapped indoor environment. The results of our experiments showed the system's applicability for the Pepper robot in real-world scenarios. Moreover, we made our source code available for the community at https://github.com/PaolaArdon/Salt-Pepper. Comment: All authors contributed equally, listed in alphabetical order

    CFORB: Circular FREAK-ORB Visual Odometry

    We present a novel Visual Odometry algorithm entitled Circular FREAK-ORB (CFORB). This algorithm detects features using the well-known ORB algorithm [12] and computes feature descriptors using the FREAK algorithm [14]. CFORB is invariant to both rotation and scale changes, and is suitable for use in environments with uneven terrain. Two visual geometric constraints have been utilized in order to remove invalid feature descriptor matches. These constraints have not previously been utilized in a Visual Odometry algorithm. A variation to circular matching [16] has also been implemented. This allows features to be matched between images without having to be dependent upon the epipolar constraint. This algorithm has been run on the KITTI benchmark dataset and achieves a competitive average translational error of 3.73% and average rotational error of 0.0107 deg/m. CFORB has also been run in an indoor environment and achieved an average translational error of 3.70%. After running CFORB in a highly textured environment with an approximately uniform feature spread across the images, the algorithm achieves an average translational error of 2.4% and an average rotational error of 0.009 deg/m.

    MODS: Fast and Robust Method for Two-View Matching

    A novel algorithm for wide-baseline matching called MODS - Matching On Demand with view Synthesis - is presented. The MODS algorithm is experimentally shown to solve a broader range of wide-baseline problems than the state of the art while being nearly as fast as standard matchers on simple problems. The apparent robustness vs. speed trade-off is finessed by the use of progressively more time-consuming feature detectors and by on-demand generation of synthesized images that is performed until a reliable estimate of geometry is obtained. We introduce an improved method for tentative correspondence selection, applicable both with and without view synthesis. A modification of the standard first to second nearest distance rule increases the number of correct matches by 5-20% at no additional computational cost. Performance of the MODS algorithm is evaluated on several standard publicly available datasets, and on a new set of geometrically challenging wide baseline problems that is made public together with the ground truth. Experiments show that MODS outperforms the state of the art in robustness and speed. Moreover, MODS performs well on other classes of difficult two-view problems like matching of images from different modalities, with wide temporal baseline or with significant lighting changes. Comment: Version accepted to CVIU. arXiv admin note: text overlap with arXiv:1306.385
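For readers unfamiliar with the "standard first to second nearest distance rule" that the MODS abstract refers to, the following is a minimal Python sketch of that baseline rule only. It is not the MODS modification, which the abstract does not describe. The function name `ratio_test_matches`, the threshold `max_ratio=0.8`, and the toy 2-D descriptors are illustrative assumptions.

```python
import math


def ratio_test_matches(query, train, max_ratio=0.8):
    """Return (query_idx, train_idx) pairs that pass the standard ratio test.

    A tentative match is kept only when the nearest descriptor is clearly
    closer than the second nearest one. `max_ratio` is an assumed threshold.
    """
    matches = []
    for qi, q in enumerate(query):
        # Distances from this query descriptor to every candidate, ascending.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train))
        if len(dists) < 2:
            continue  # the rule needs a second nearest neighbour
        (d1, best), (d2, _) = dists[0], dists[1]
        # Reject ambiguous matches whose runner-up is almost as close.
        if d1 < max_ratio * d2:
            matches.append((qi, best))
    return matches


# Toy descriptors (hypothetical data): the first query has one clear
# neighbour, the second is equally close to two candidates and is rejected.
query = [(0.0, 0.0), (5.0, 5.0)]
train = [(0.1, 0.0), (4.0, 4.0), (6.0, 6.0)]
matches = ratio_test_matches(query, train)
```

The second query descriptor is dropped because its two nearest candidates are equidistant, which is exactly the kind of ambiguous correspondence the rule is meant to filter out.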

    Local Multi-Grouped Binary Descriptor with Ring-based Pooling Configuration and Optimization

    Local binary descriptors are attracting increasing attention due to their great advantages in computational speed, which allow real-time performance in numerous image/vision applications. Various methods have been proposed to learn data-dependent binary descriptors. However, most existing binary descriptors aim overly at computational simplicity at the expense of significant information loss, which causes ambiguity in similarity measures based on Hamming distance. In this paper, considering that multiple features might share complementary information, we present a novel local binary descriptor, referred to as Ring-based Multi-Grouped Descriptor (RMGD), to bridge the performance gap between current binary and floating-point descriptors. Our contributions are two-fold. Firstly, we introduce a new pooling configuration based on spatial ring-region sampling, which allows binary tests on the full set of pairwise regions with different shapes, scales and distances. This leads to a more meaningful description than existing methods, which normally apply a limited set of pooling configurations. Then, an extended Adaboost is proposed for efficient bit selection by emphasizing high variance and low correlation, achieving a highly compact representation. Secondly, the RMGD is computed from multiple image properties from which binary strings are extracted. We cast the integration of multi-grouped features as a rankSVM or sparse SVM learning problem, so that different features can compensate strongly for each other, which is the key to discriminativeness and robustness. The performance of RMGD was evaluated on a number of publicly available benchmarks, where the RMGD outperforms the state-of-the-art binary descriptors significantly. Comment: To appear in IEEE Trans. on Image Processing, 201
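As a small illustration of the Hamming-distance similarity measure mentioned in the RMGD abstract, the sketch below compares two binary descriptors stored as byte strings. It shows only the generic distance, not the RMGD descriptor itself. The function name `hamming_distance` and the 16-bit example descriptors are assumptions made for illustration.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two binary descriptors differ."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    # XOR leaves a 1 wherever the bits differ; count those bits per byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two hypothetical 16-bit descriptors that differ in exactly two bits.
desc_a = bytes([0b10110010, 0b00001111])
desc_b = bytes([0b10010011, 0b00001111])
distance = hamming_distance(desc_a, desc_b)
```

Because the distance is an integer bounded by the descriptor length, many different descriptors can sit at the same distance from a query, which is one way to read the ambiguity the abstract attributes to short binary codes.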

    Characterizing SLAM Benchmarks and Methods for the Robust Perception Age

    The diversity of SLAM benchmarks affords extensive testing of SLAM algorithms to understand their performance, individually or in relative terms. The ad-hoc creation of these benchmarks does not necessarily illuminate the particular weak points of a SLAM algorithm when performance is evaluated. In this paper, we propose to use a decision tree to identify challenging benchmark properties for state-of-the-art SLAM algorithms and important components within the SLAM pipeline regarding their ability to handle these challenges. Establishing what factors of a particular sequence lead to track failure or degradation relative to these characteristics is important if we are to arrive at a strong understanding of the core computational needs of a robust SLAM algorithm. Likewise, we argue that it is important to profile the computational performance of the individual SLAM components for use when benchmarking. In particular, we advocate the use of time-dilation during ROS bag playback, or what we refer to as slo-mo playback. Using slo-mo to benchmark SLAM instantiations can provide clues to how SLAM implementations should be improved at the computational component level. Three prevalent VO/SLAM algorithms and two low-latency algorithms of our own are tested on selected typical sequences, which are generated from benchmark characterization, to further demonstrate the benefits achieved from computationally efficient components. Comment: 7 pages, 5 figures, accepted at ICRA 2019 Workshop on Dataset Generation and Benchmarking of SLAM Algorithms for Robotics and VR/A

    LoopSmart: Smart Visual SLAM Through Surface Loop Closure

    We present a visual simultaneous localization and mapping (SLAM) framework for closing surface loops. It combines sparse feature matching with dense surface alignment. Sparse feature matching is used for visual odometry and for global camera pose fine-tuning when dense loops are detected, while dense surface alignment is used to close large loops and to solve the surface mismatching problem. To achieve smart dense surface loop closure, a highly efficient CUDA-based global point cloud registration method and a map-content-dependent loop verification method are proposed. In extensive experiments on different datasets, our method outperforms state-of-the-art ones in terms of both camera trajectory and surface reconstruction accuracy.

    Image Identification Using SIFT Algorithm: Performance Analysis against Different Image Deformations

    Image identification is one of the most challenging tasks in different areas of computer vision. Scale-invariant feature transform (SIFT) is an algorithm to detect and describe local features in images so that they can be used as image matching criteria. In this paper, the performance of the SIFT matching algorithm against various image distortions such as rotation, scaling, fisheye and motion distortion is evaluated, and false and true positive rates for a large number of image pairs are calculated and presented. We also evaluate the distribution of the matched keypoint orientation difference for each image deformation. Comment: 4 pages, 11 figures, In Proceedings of the 2015 Newfoundland Electrical and Computer Engineering Conference, St. John's, Canada, November, 201

    New Feature Detection Mechanism for Extended Kalman Filter Based Monocular SLAM with 1-Point RANSAC

    We present a different approach to feature point detection for improving the accuracy of SLAM using a single monocular camera. Traditionally, Harris corner detection, SURF or FAST corner detectors are used for finding feature points of interest in the image. We replace this with another approach, which involves building a non-linear scale-space representation of images using the Perona and Malik diffusion equation and computing the scale-normalized Hessian at multiple scale levels (KAZE feature). The feature points so detected are used to estimate the state and pose of a mono camera using an extended Kalman filter. By using accelerated KAZE features and a more rigorous feature rejection routine combined with 1-point RANSAC for outlier rejection, short-baseline matching of features is significantly improved, even with fewer feature points, especially in the presence of motion blur. We present a comparative study of our proposal with FAST and show improved localization accuracy in terms of absolute trajectory error. Comment: Accepted in Third International Conference of Mining Intelligence and Knowledge Exploration (MIKE) 201
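To make the Perona and Malik diffusion mentioned in the abstract above more concrete, here is a minimal one-dimensional Python sketch of a single explicit diffusion step. It is a simplification and not the paper's or KAZE's implementation, which work on 2-D images and may use other numerical schemes. The function name `perona_malik_step`, the contrast parameter `kappa`, the step size `dt`, and the sample signal are all assumptions.

```python
def perona_malik_step(signal, kappa=10.0, dt=0.2):
    """Apply one explicit Perona-Malik diffusion step to a 1-D signal.

    The conductivity g = 1 / (1 + (grad / kappa)^2) is close to 1 for small
    differences (noise is smoothed) and close to 0 for large differences
    (edges are preserved). Boundary samples are left unchanged.
    """
    def conductivity(grad):
        return 1.0 / (1.0 + (grad / kappa) ** 2)

    out = list(signal)
    for i in range(1, len(signal) - 1):
        east = signal[i + 1] - signal[i]  # difference to the right neighbour
        west = signal[i - 1] - signal[i]  # difference to the left neighbour
        out[i] = signal[i] + dt * (
            conductivity(east) * east + conductivity(west) * west
        )
    return out


# Hypothetical signal: a step edge from about 0 to about 100 with small bumps.
noisy_edge = [0.0, 2.0, 0.0, 100.0, 102.0, 100.0]
smoothed = perona_malik_step(noisy_edge)
```

After one step the small bumps on either side of the edge shrink, while the large jump between the third and fourth samples is almost untouched. This edge-preserving behaviour is what distinguishes a non-linear scale space from Gaussian blurring.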

    ENFT: Efficient Non-Consecutive Feature Tracking for Robust Structure-from-Motion

    Structure-from-motion (SfM) largely relies on feature tracking. In image sequences, if disjointed tracks caused by objects moving in and out of the field of view, occasional occlusion, or image noise are not handled well, the corresponding SfM result can be affected. This problem becomes more severe for large-scale scenes, which typically require capturing multiple sequences to cover the whole scene. In this paper, we propose an efficient non-consecutive feature tracking (ENFT) framework to match interrupted tracks distributed in different subsequences or even in different videos. Our framework consists of steps for solving the feature 'dropout' problem when indistinctive structures, noise or large image distortion exist, and for rapidly recognizing and joining common features located in different subsequences. In addition, we contribute an effective segment-based coarse-to-fine SfM algorithm for robustly handling large datasets. Experimental results on challenging video data demonstrate the effectiveness of the proposed system. Comment: 15 pages, 12 figures