
    Stereo vision-based obstacle avoidance module on 3D point cloud data

    Get PDF
    This paper deals with building a 3D vision-based obstacle avoidance and navigation system. For an autonomous system to work in real-life conditions, it must be able to gather data about its surrounding environment, interpret that data, and take appropriate action. In particular, it must be able to navigate cluttered, unorganized environments, avoid collision with any present obstacle (defined here as any data with vertical orientation), and react when the environment changes. This work proposes a two-step strategy. First, obstacle position and orientation are extracted from point cloud data using plane-based segmentation, and the segmented obstacle points are mapped, relative to the camera, into an occupancy grid to obtain obstacle cluster positions; the grid map is recorded for future use and for global navigation. Second, the obstacle positions in the grid map are used to plan a navigation path towards the target goal that avoids the obstacle positions, and the path is modified, based on the timed elastic band method, to avoid collision whenever the environment is updated or the platform's movement deviates from the planned path
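
    A minimal sketch of the first step, under stated assumptions: Open3D's RANSAC plane segmentation strips the dominant plane from the cloud, and the remaining points are rasterized into a 2D occupancy grid relative to the camera. The file name, cell size, and map extent are illustrative, and this is not the paper's exact pipeline (which also uses obstacle orientation and timed-elastic-band replanning).

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.pcd")  # hypothetical input cloud

# Fit the dominant plane (assumed to be the ground) and keep the rest as obstacles.
plane_model, inliers = pcd.segment_plane(distance_threshold=0.02,
                                         ransac_n=3,
                                         num_iterations=1000)
obstacles = pcd.select_by_index(inliers, invert=True)

# Project obstacle points onto the x-y plane and mark occupied grid cells.
pts = np.asarray(obstacles.points)
cell = 0.05                        # 5 cm cells (assumption)
extent = 10.0                      # 10 m half-width, map centred on the camera
n = int(2 * extent / cell)
grid = np.zeros((n, n), dtype=bool)
ij = np.floor((pts[:, :2] + extent) / cell).astype(int)
valid = np.all((ij >= 0) & (ij < n), axis=1)
grid[ij[valid, 0], ij[valid, 1]] = True    # occupied cells relative to the camera
```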

    RANSAC for Robotic Applications: A Survey

    Get PDF
    Random Sample Consensus, most commonly abbreviated as RANSAC, is a robust estimation method for the parameters of a model contaminated by a sizable percentage of outliers. In its simplest form, the process starts with a sampling of the minimum data needed to perform an estimation, followed by an evaluation of its adequacy, and further repetitions of this process until some stopping criterion is met. Multiple variants have been proposed in which this workflow is modified, typically tweaking one or several of these steps to improve computing time or the quality of the parameter estimates. RANSAC is widely applied in the field of robotics, for example, for finding geometric shapes (planes, cylinders, spheres, etc.) in point clouds or for estimating the best transformation between different camera views. In this paper, we present a review of the current state of the art of the RANSAC family of methods, with a special interest in applications in robotics. This work has been partially funded by the Basque Government, Spain, under Research Teams Grant number IT1427-22 and under ELKARTEK LANVERSO Grant number KK-2022/00065; the Spanish Ministry of Science (MCIU), the State Research Agency (AEI), the European Regional Development Fund (FEDER), under Grant number PID2021-122402OB-C21 (MCIU/AEI/FEDER, UE); and the Spanish Ministry of Science, Innovation and Universities, under Grant FPU18/04737
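
    The basic workflow described above fits in a few lines. Below is a generic sketch for 2D line fitting: sample a minimal set (two points), fit, count inliers, and keep the best hypothesis. The fixed iteration count and inlier tolerance are illustrative; many of the surveyed variants modify exactly these steps.

```python
import numpy as np

def ransac_line(points, n_iters=1000, inlier_tol=0.05, seed=None):
    """points: (N, 2) array. Returns ((a, b, c) with ax+by+c=0, boolean inlier mask)."""
    rng = np.random.default_rng(seed)
    best_line, best_mask = None, None
    for _ in range(n_iters):
        # Minimal sample: two distinct points define a line hypothesis.
        p, q = points[rng.choice(len(points), size=2, replace=False)]
        d = q - p
        norm = np.hypot(d[0], d[1])
        if norm < 1e-12:
            continue
        a, b = -d[1] / norm, d[0] / norm          # unit normal of the line
        c = -(a * p[0] + b * p[1])
        # Score the hypothesis: points within the tolerance are inliers.
        dist = np.abs(points @ np.array([a, b]) + c)
        mask = dist < inlier_tol
        if best_mask is None or mask.sum() > best_mask.sum():
            best_line, best_mask = (a, b, c), mask
    return best_line, best_mask
```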

    Confidence Estimation in Image-Based Localization

    Get PDF
    Image-based localization aims at estimating the camera position and orientation, briefly referred to as the camera pose, from a given image. Estimating the camera pose is needed in several applications, such as augmented reality, odometry, and self-driving cars. A main challenge is to develop an algorithm for large, varying environments, such as buildings or whole cities. During the past decade several algorithms have tackled this challenge and, despite the promising results, the task is far from being solved. Several applications, however, need a reliable pose estimate; in odometry applications, for example, the camera pose is used to correct the drift error accumulated by inertial sensor measurements. It is therefore important to be able to assess the confidence of the estimated pose and to discriminate between correct and incorrect poses within a prefixed error threshold. A common approach is to use the number of inliers produced in the RANSAC loop to evaluate how good an estimate is; in particular, this is used to choose the best pose for a given image from a set of candidates. This metric, however, is not very robust, especially for indoor scenes, which present several repetitive patterns, such as long textureless walls or similar objects. Although other metrics have been proposed, they aim at improving the accuracy of the algorithm by grading candidate poses for the same query image; they can thus recognize the best pose among a given set but cannot be used to grade the overall confidence of the final pose. In this thesis, we formalize confidence estimation as a binary classification problem and investigate how to quantify the confidence of an estimated camera pose. As opposed to previous work, this new research question is posed after the whole visual localization pipeline and makes it possible to compare poses from different query images. In addition to the number of inliers, other factors, such as the spatial distribution of inliers, are considered. A neural network is then used to generate a novel robust metric, able to evaluate the confidence for different query images. The proposed method is benchmarked using InLoc, a challenging dataset for indoor pose estimation. It is also shown that the proposed confidence metric is independent of the dataset used for training and can be applied to different datasets and pipelines
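
    As a rough sketch of the idea (not the thesis's exact features or architecture): a small network maps per-pose cues, such as the inlier count, inlier ratio, and spatial spread of the inliers, to a confidence in [0, 1], trained as a binary classifier on poses labelled correct or incorrect by their ground-truth error. Feature choices and layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

def pose_features(inlier_xy, num_inliers, num_matches):
    """inlier_xy: (K, 2) tensor of inlier image coordinates (hypothetical features)."""
    spread = inlier_xy.std(dim=0).mean() if len(inlier_xy) > 1 else torch.tensor(0.0)
    ratio = num_inliers / max(num_matches, 1)
    return torch.tensor([float(num_inliers), float(ratio), float(spread)])

classifier = nn.Sequential(
    nn.Linear(3, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),   # confidence that the pose is within the threshold
)
loss_fn = nn.BCELoss()  # trained against binary labels from ground-truth pose error
```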

    CMOS-3D smart imager architectures for feature detection

    Get PDF
    This paper reports a multi-layered smart image sensor architecture for feature extraction based on detection of interest points. The architecture is conceived for 3-D integrated circuit technologies consisting of two layers (tiers) plus memory. The top tier includes sensing and processing circuitry aimed to perform Gaussian filtering and generate Gaussian pyramids in a fully concurrent way. The circuitry in this tier operates in the mixed-signal domain. It embeds in-pixel correlated double sampling, a switched-capacitor network for Gaussian pyramid generation, analog memories, and a comparator for in-pixel analog-to-digital conversion. This tier can be further split into two for improved resolution: one containing the sensors and another containing a capacitor per sensor plus the mixed-signal processing circuitry. The bottom tier embeds digital circuitry intended for the calculation of the Harris, Hessian, and difference-of-Gaussian detectors. The overall system can hence be configured by the user to detect interest points using whichever of these three algorithms is best suited to the practical application. The paper describes the different kinds of algorithms featured and the circuitry employed at the top and bottom tiers. The Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions. Funding: Xunta de Galicia 10PXIB206037PR; Ministerio de Ciencia e Innovación TEC2009-12686, IPT-2011-1625-430000; Office of Naval Research N00014111031
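
    A software analogue (not the chip's mixed-signal circuitry) of what the two tiers compute: a Gaussian pyramid, difference-of-Gaussian levels from adjacent scales, and interest points as thresholded local maxima. The sigma values and threshold are illustrative.

```python
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_pyramid(image, sigmas=(1.0, 1.6, 2.56, 4.1)):
    """Blur at increasing scales and take differences of adjacent levels."""
    blurred = [gaussian_filter(image.astype(float), s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]

def interest_points(dog_level, thresh=0.02):
    """Local 3x3 maxima of one DoG level above a contrast threshold."""
    return (dog_level == maximum_filter(dog_level, size=3)) & (dog_level > thresh)
```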

    Local Descriptors Optimized for Average Precision

    Full text link
    Extraction of local feature descriptors is a vital stage in the solution pipelines for numerous computer vision tasks. Learning-based approaches improve performance in certain tasks, but still cannot replace handcrafted features in general. In this paper, we improve the learning of local feature descriptors by optimizing the performance of descriptor matching, which is a common stage that follows descriptor extraction in local-feature-based pipelines and can be formulated as nearest neighbor retrieval. Specifically, we directly optimize a ranking-based retrieval performance metric, Average Precision, using deep neural networks. This general-purpose solution can also be viewed as a listwise learning-to-rank approach, which is advantageous compared to recent local ranking approaches. On standard benchmarks, descriptors learned with our formulation achieve state-of-the-art results in patch verification, patch retrieval, and image matching. Comment: 13 pages, 8 figures. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
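
    For reference, the ranking metric being optimized looks as follows for a single query; the paper trains through a differentiable (histogram-based) approximation of this quantity, which is omitted here.

```python
import numpy as np

def average_precision(scores, is_match):
    """scores: (N,) similarities to the query; is_match: (N,) boolean relevance labels."""
    order = np.argsort(-scores)                    # rank candidates by similarity
    rel = is_match[order].astype(float)
    cum_rel = np.cumsum(rel)
    precision_at_k = cum_rel / (np.arange(len(rel)) + 1)
    return (precision_at_k * rel).sum() / max(rel.sum(), 1.0)
```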

    Feature Based Calibration of a Network of Kinect Sensors

    Get PDF
    The availability of affordable depth sensors in conjunction with common RGB cameras, such as the Microsoft Kinect, can provide robots with a complete and instantaneous representation of the current surrounding environment. However, traditional methods for calibrating multiple-camera systems have some drawbacks, such as requiring human intervention. In this thesis, we propose an automatic and reliable calibration framework that can easily estimate the extrinsic parameters of a Kinect sensor network. Our framework includes feature extraction, Random Sample Consensus, and camera pose estimation from high-accuracy correspondences. We also present a robustness analysis of the position estimation algorithms. The results show that our system provides precise estimates under a certain amount of noise. Keywords: Kinect, Multiple Camera Calibration, Feature Point Extraction, Correspondence, RANSAC
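
    A minimal sketch of the pose-estimation step at the end of such a framework: given matched 3D points between two Kinects (after feature matching and RANSAC filtering), the rigid transform can be recovered in closed form with the Kabsch/SVD method. The correspondences are assumed to be given.

```python
import numpy as np

def rigid_transform(src, dst):
    """src, dst: (N, 3) matched points. Returns R (3x3), t (3,) with dst ≈ R @ src + t."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)                 # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    # Correct an improper rotation (reflection) if the determinant is negative.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cd - R @ cs
    return R, t
```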

    Part localization for robotic manipulation

    Get PDF
    The new generation of collaborative robots allows small robot arms to work alongside human workers, e.g. the YuMi robot, a dual-arm robot with two 7-DOF arms designed for precise manipulation of small objects. For the further acceptance of such robots in industry, methods and sensor systems have to be developed that allow them to perform tasks such as grasping a specific object. If the robot is to grasp an object, it has to localize the object relative to itself. This is a task of object recognition in computer vision, the art of localizing predefined objects in image sensor data. This master thesis presents a pipeline for recognizing a single isolated model in a point cloud. The system uses point cloud data generated from a 3D CAD model and describes its characteristics using local feature descriptors. These are then matched with the descriptors of the point cloud data from the scene to find the 6-DoF pose of the model in the robot coordinate frame. This initial pose estimate is then refined by a registration method such as ICP. A robot-camera calibration is also performed. The contributions of this thesis are as follows: the system uses FPFH (Fast Point Feature Histogram) for describing the local region and a hypothesize-and-test paradigm, e.g. RANSAC, in the matching process. This contrasts with several approaches that rely on Point Pair Features as feature descriptors and geometric hashing, e.g. a voting scheme, as the matching process
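
    The described pipeline maps closely onto Open3D's registration module; below is a hedged sketch, with all file names, voxel sizes, radii, and thresholds as illustrative assumptions rather than the thesis's actual parameters.

```python
import open3d as o3d

# Load and downsample the CAD-derived model cloud and the scene cloud.
source = o3d.io.read_point_cloud("model.pcd").voxel_down_sample(0.005)
target = o3d.io.read_point_cloud("scene.pcd").voxel_down_sample(0.005)
for pcd in (source, target):
    pcd.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=0.01, max_nn=30))

# FPFH descriptors of the local regions.
fpfh_s, fpfh_t = (o3d.pipelines.registration.compute_fpfh_feature(
                      pcd, o3d.geometry.KDTreeSearchParamHybrid(radius=0.025, max_nn=100))
                  for pcd in (source, target))

# Hypothesize-and-test matching: RANSAC over feature correspondences.
coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
    source, target, fpfh_s, fpfh_t, mutual_filter=True,
    max_correspondence_distance=0.015,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
    ransac_n=3,
    criteria=o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))

# Refine the initial pose estimate with point-to-point ICP.
refined = o3d.pipelines.registration.registration_icp(
    source, target, 0.01, coarse.transformation,
    o3d.pipelines.registration.TransformationEstimationPointToPoint())
print(refined.transformation)   # 6-DoF pose of the model in the scene frame
```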