60,563 research outputs found

    A Machine Learning and Point Cloud Processing based Approach for Object Detection and Pose Estimation: Design, Implementation, and Validation

    This thesis presents an approach to automating a forklift for lifting and handling pallets. More specifically, the project develops a solution for autonomous object detection and pose estimation using Machine Learning (ML), point cloud processing, and arithmetic calculations. The project is based on a real-life scenario identified together with the industrial partner Red Rock: a forklift operation in which the machine is supposed to identify, lift, and handle pallets autonomously. A key to achieving this automation is to localize and classify the pallet and to estimate its Six Dimensional (6D) pose, which includes its (x, y, z) position and (pitch, roll, yaw) orientation. From a position directly in front of the pallet, the pose estimation must be performed at a distance of around 2 meters and at angles from 0° to ±45°. A systematic solution consisting of two major phases, object detection and pose estimation, is developed to achieve the project goal. For object detection, the You Only Look Once X (YOLOX)-S ML algorithm is selected and implemented. The algorithm is pre-trained on the COCO dataset and then fine-tuned by transfer learning on the Logistics Objects in Context (LOCO) dataset so that it can detect pallets in an industrial environment. To speed up inference, the algorithm is optimized with the Intel OpenVINO toolkit, which improves inference latency by a factor of more than 2.5 on a Central Processing Unit (CPU). The output of the YOLOX-S algorithm is a bounding box around the pallet, and a custom struct links object detection and pose estimation together. The pose estimation algorithm converts the Two Dimensional (2D) bounding box data into Three Dimensional (3D) vectors, keeping only the relevant points in the point cloud and filtering out all irrelevant points from the environment. A series of arithmetic calculations is then applied to the filtered point cloud, including Random Sample Consensus (RANSAC) and vector operations; RANSAC is used to find the largest vertical plane of the identified pallet. Based on the object detection output and the pose estimation calculations, a 3D vector and a 3D point are found, which together give the pallet's pose. Several tests and experiments have been performed to evaluate and validate the developed solution. The tests are based on a purpose-built ground truth setup consisting of an AprilTag marker, which provides a robust and precise ground truth measurement. Results from the standstill experiment show that the algorithm can estimate the position within 0.3 and 7.5 millimeters for the x and y axes, while the z-axis estimate stayed within 1.6 and 28.6 millimeters. The pitch estimate stayed within 3.65° and 5.21°, and the yaw estimate within 0.86° and 2.64°. Overall, the standstill test results report the best and worst cases, at 0° and 45° respectively.
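The pose-estimation phase described in this abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names, the camera frame (x right, y down, z forward), the per-point pixel coordinates, and the thresholds are all assumptions made for illustration. It crops a point cloud to a 2D bounding box, fits the largest roughly vertical plane with RANSAC, and derives a yaw angle and a centre point from that plane.

```python
import math
import random


def crop_to_box(points, pixels, box):
    """Keep the 3D points whose image pixel (u, v) lies inside the 2D box."""
    u_min, v_min, u_max, v_max = box
    return [p for p, (u, v) in zip(points, pixels)
            if u_min <= u <= u_max and v_min <= v <= v_max]


def plane_through(a, b, c):
    """Unit normal n and offset d of the plane n.p + d = 0 through a, b, c."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    n = [ab[1] * ac[2] - ab[2] * ac[1],
         ab[2] * ac[0] - ab[0] * ac[2],
         ab[0] * ac[1] - ab[1] * ac[0]]
    length = math.sqrt(sum(x * x for x in n))
    if length < 1e-9:  # collinear sample, no unique plane
        return None
    n = [x / length for x in n]
    return n, -sum(n[i] * a[i] for i in range(3))


def ransac_plane(points, threshold=0.01, iterations=200, max_ny=0.7, seed=0):
    """Return ((normal, d), inliers) for the vertical plane with most inliers.

    max_ny is an assumed cut-off: planes whose normal is mostly along the
    vertical (y) axis are floor-like and are skipped.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_through(*rng.sample(points, 3))
        if model is None or abs(model[0][1]) > max_ny:
            continue
        n, d = model
        inliers = [p for p in points
                   if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def yaw_and_centre(normal, inliers):
    """Yaw in degrees about the vertical y axis, and centroid of the inliers."""
    nx, _, nz = normal
    if nz > 0:  # make the normal point back towards the camera
        nx, nz = -nx, -nz
    yaw = math.degrees(math.atan2(nx, -nz))
    centre = tuple(sum(p[i] for p in inliers) / len(inliers) for i in range(3))
    return yaw, centre
```

In this sketch the plane normal plays the role of the "3D vector" and the inlier centroid the role of the "3D point". A production pipeline would more likely use a point cloud library for the plane fit instead of this pure-Python loop.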

    Autonomous Pick-and-Place Procedure with an Industrial Robot Using Multiple 3D Sensors for Object Detection and Obstacle Avoidance

    Master's thesis in Mechatronics (MAS500). This thesis proposes a full-pipeline autonomous pick-and-place procedure that integrates perception, planning, grasping, and control for executing tasks towards long-term industrial automation. Within perception, we demonstrate the detection of a large object (the target), including estimation of its position and orientation (pose) in the 3D world. Obstacles in the work area are then mapped, using the proposed filtering, before motion planning and navigation of an industrial robot to the target's pose. The target is then picked using a custom-built, motorized, 3D-printed end gripper and placed at a desired location in the robot's reachable environment. Point cloud based, model-free obstacle avoidance is performed throughout the whole process. The complete pipeline targets typical tasks in various industries, including the offshore, logistics, and warehouse domains: scanning the scene, then picking a bulky object and placing it at another position with minimal or no human intervention.

    Collaborative and Cooperative Robotics Applications using Visual Perception

    The objective of this thesis is to develop novel integrated strategies for collaborative and cooperative robotic applications. Industrial robots commonly operate in structured environments and in work cells separated from human operators. Collaborative robots, by contrast, can share the workspace and collaborate with humans or other robots to perform complex tasks. These robots often operate in unstructured environments, so they need sensors and algorithms to obtain information about changes in the environment. Advanced vision and control techniques have been analyzed to evaluate their performance and their applicability to industrial tasks, and some selected techniques have then been applied for the first time in an industrial context. A Peg-in-Hole task has been chosen as the first case study, since it has been extensively studied but remains challenging: it requires accuracy both in determining the hole poses and in positioning the robot. Two solutions have been developed and tested, and the experimental results are discussed to highlight the advantages and disadvantages of each technique. Grasping partially known objects in unstructured environments is one of the most challenging issues in robotics. It is a complex task that requires addressing multiple subproblems, including object localization and grasp pose detection. For this class of problems, too, several vision techniques have been analyzed, and one of them has been adapted for use in industrial scenarios. Moreover, as a second case study, a robot-to-robot object handover task in a partially structured environment, without explicit communication between the robots, has been developed and validated. Finally, the two case studies have been integrated in two real industrial setups to demonstrate the applicability of the strategies to solving industrial problems.

    Fast tomographic inspection of cylindrical objects

    This paper presents a method for improved analysis of objects with an axial symmetry using X-ray Computed Tomography (CT). Cylindrical coordinates about an axis fixed to the object form the most natural basis for checking certain characteristics of objects with such symmetry, as often occurs with industrial parts. Because the sampling grid corresponds to the object, it can be down-sampled, which reduces the reconstruction time. This is necessary for in-line applications and fast quality inspection. With algebraic reconstruction, it permits the use of a pre-computed initial volume that is well suited to fitting a series of scans in which objects of the same type have different positions and orientations, as is often encountered in an industrial setting. Weighted back-projection can also be included to improve stability when some regions are more likely to change. Building on a Cartesian grid reconstruction code, the feasibility of reusing the existing ray-tracers is checked against other research in the same field. Comment: 13 pages, 13 figures. Submitted to Journal Of Nondestructive Evaluation (https://www.springer.com/journal/10921)
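The idea of a sampling grid attached to the object's axis can be sketched as follows. This is only an illustration of the coordinate mapping, not the paper's reconstruction code: the function names, the cell-centre convention, and the pose representation (an origin on the axis plus a 3x3 rotation matrix) are assumptions.

```python
import math


def cylindrical_grid(n_r, n_phi, n_z, radius, height):
    """Yield (r, phi, z) cell centres of a grid aligned with the object's axis."""
    for k in range(n_z):
        z = (k + 0.5) * height / n_z
        for i in range(n_r):
            r = (i + 0.5) * radius / n_r
            for j in range(n_phi):
                yield r, 2.0 * math.pi * j / n_phi, z


def to_scanner_frame(r, phi, z, origin, rotation):
    """Map a grid point to Cartesian scanner coordinates.

    origin is a point on the object's axis; rotation is a 3x3 matrix (list of
    rows) whose third column is the axis direction in the scanner frame.
    """
    local = (r * math.cos(phi), r * math.sin(phi), z)
    return tuple(origin[a] + sum(rotation[a][b] * local[b] for b in range(3))
                 for a in range(3))
```

Because the grid moves with the object, the same `(n_r, n_phi, n_z)` layout can be reused for every scan and only `origin` and `rotation` change. Lowering `n_phi` is one way to down-sample along the direction in which an axially symmetric part varies least.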