A Machine Learning and Point Cloud Processing based Approach for Object Detection and Pose Estimation: Design, Implementation, and Validation
This thesis presents an automated forklift approach for lifting and handling pallets. More
specifically, the project develops a solution for autonomous object detection and pose
estimation using Machine Learning (ML), point cloud processing, and arithmetic calculations.
The project is based on a real-life scenario identified together with the industrial partner
Red Rock, which involves a forklift operation where the machine is supposed to identify,
lift, and handle pallets autonomously. A key to achieving this automation is to localize and
classify the pallet and to estimate the Six Dimensional (6D) pose of the pallet, which
includes its (x, y, z) position and (pitch, roll, yaw) orientation. From a position directly in
front of the pallet, the pose estimation must be performed at a distance of around 2 meters
and at angles from 0° to ±45°.
A systematic solution consisting of two major phases, object detection and pose estimation,
is developed to achieve the project goal. For object detection, the You Only Look Once X
(YOLOX)-S ML algorithm is selected and implemented. The algorithm is pre-trained on
the COCO dataset and then transfer-learned on the Logistics Objects in Context
(LOCO) dataset so that it can detect pallets in an industrial environment. To speed up
detection inference, the algorithm is optimized with the Intel OpenVINO toolkit, which
improves inference latency by a factor of more than 2.5 on a Central Processing Unit (CPU).
The output of the YOLOX-S algorithm is a bounding box around the pallet, and a custom
struct links object detection and pose estimation together. The pose estimation algorithm
converts the Two Dimensional (2D) bounding box data into Three Dimensional (3D) vectors,
keeping only the relevant points in the point cloud and filtering all irrelevant points out
of the environment. A series of arithmetic calculations is then applied to the filtered point
cloud, including Random Sample Consensus (RANSAC) and vector operations, where
RANSAC calculates the largest vertical plane of the identified pallet. Based on the
object detection output and the pose estimation calculations, a 3D vector and a 3D point
that together give the pallet’s pose are found.
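The crop-then-RANSAC step described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function names, the per-point pixel coordinates, the camera frame convention (x right, y vertical, z forward), the thresholds, and the least-squares refit are all assumptions made for illustration.

```python
import numpy as np

def crop_to_bbox(points, pixels, bbox):
    """Keep 3D points whose image pixel (u, v) lies inside the 2D box.

    Assumes each 3D point comes with its pixel coordinates, as in an
    organized RGB-D point cloud (an assumption, not stated in the source).
    """
    x_min, y_min, x_max, y_max = bbox
    u, v = pixels[:, 0], pixels[:, 1]
    mask = (u >= x_min) & (u <= x_max) & (v >= y_min) & (v <= y_max)
    return points[mask]

def ransac_plane(points, n_iters=200, threshold=0.01,
                 max_vertical_component=0.3, rng=None):
    """Fit a roughly vertical plane n.p + d = 0; return (normal, d, inlier_mask)."""
    rng = np.random.default_rng(rng)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:
            continue  # the three sampled points are (nearly) collinear
        normal = normal / norm
        if abs(normal[1]) > max_vertical_component:
            continue  # plane is not close to vertical (y assumed vertical)
        d = -normal @ sample[0]
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask = mask
    if not best_mask.any():
        raise ValueError("no vertical plane found")
    # Refine the best candidate with a least-squares fit on its inliers.
    inliers = points[best_mask]
    centroid = inliers.mean(axis=0)
    _, _, vt = np.linalg.svd(inliers - centroid, full_matrices=False)
    normal = vt[-1]  # direction of least variance
    return normal, -normal @ centroid, best_mask

def yaw_from_normal(normal):
    """Yaw in degrees of the plane relative to the camera's forward (z) axis."""
    n = -normal if normal[2] > 0 else normal  # make the normal face the camera
    return float(np.degrees(np.arctan2(n[0], -n[2])))
```

In this sketch the plane normal plays the role of the 3D vector, and the centroid of the inliers could serve as the 3D point; how the thesis actually derives the two is not specified in the abstract.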
Several tests and experiments have been performed to evaluate and validate the developed
solution. The tests are based on a purpose-built ground truth setup consisting of an AprilTag
marker, which provides a robust and precise ground truth measurement. Results from the
standstill experiment show that the algorithm can estimate the position within 0.3 and 7.5
millimeters for the x and y axes, and within 1.6 and 28.6 millimeters for the z-axis.
The pitch orientation is estimated within 3.65° and 5.21°, and the yaw
orientation within 0.86° and 2.64°. Overall, the standstill tests evaluate
the best and worst cases, respectively, within 0° and 45°.
Autonomous Pick-and-Place Procedure with an Industrial Robot Using Multiple 3D Sensors for Object Detection and Obstacle Avoidance
Master's thesis in Mechatronics (MAS500)
This thesis proposes a full-pipeline autonomous pick-and-place procedure that integrates perception, planning, grasping, and control for executing tasks aimed at long-term industrial automation. Within perception, we demonstrate the detection of a large object (the target), including estimation of its position and orientation (pose) in the 3D world. Obstacles in the work area are then mapped, with the proposed filtering applied, prior to motion planning and navigation of an industrial robot to the target’s pose. The target is picked using a custom-built, motorized, 3D-printed end gripper and placed at a desired location in the robot’s reachable environment. Point-cloud-based, model-free obstacle avoidance is performed throughout the whole process. The complete pipeline is targeted at typical tasks in various industries, including the offshore, logistics, and warehouse domains: scanning the scene, then picking a bulky object and placing it at another position with minimal or no human intervention.
Collaborative and Cooperative Robotics Applications using Visual Perception
The objective of this thesis is to develop novel integrated strategies for collaborative and cooperative robotic applications. Industrial robots commonly operate in structured environments and in work cells separated from human operators. Nowadays, collaborative robots are capable of sharing the workspace and collaborating with humans or other robots to perform complex tasks. These robots often operate in unstructured environments, so they need sensors and algorithms to obtain information about changes in the environment.
Advanced vision and control techniques have been analyzed to evaluate their performance and their applicability to industrial tasks. Then, some selected techniques have been applied for the first time to an industrial context. A Peg-in-Hole task has been chosen as first case study, since it has been extensively studied but still remains challenging: it requires accuracy both in the determination of the hole poses and in the robot positioning.
Two solutions have been developed and tested. Experimental results are discussed to highlight the advantages and disadvantages of each technique. Grasping partially known objects in unstructured environments is one of the most challenging issues in robotics. It is a complex task that requires addressing multiple subproblems, including object localization and grasp pose detection.
Several vision techniques have also been analyzed for this class of problems, and one of them has been adapted for use in industrial scenarios. Moreover, as a second case study, a robot-to-robot object handover task in a partially structured environment, without explicit communication between the robots, has been developed and validated.
Finally, the two case studies have been integrated in two real industrial setups to demonstrate the applicability of the strategies to solving industrial problems.
Fast tomographic inspection of cylindrical objects
This paper presents a method for improved analysis of objects with an axial
symmetry using X-ray Computed Tomography (CT). Cylindrical coordinates about an
axis fixed to the object form the most natural base to check certain
characteristics of objects that contain such symmetry, as often occurs with
industrial parts. The sampling grid corresponds with the object, allowing for
down-sampling and hence a shorter reconstruction time. This is necessary for
in-line applications and fast quality inspection. With algebraic reconstruction
it permits the use of a pre-computed initial volume perfectly suited to fit a
series of scans where same-type objects can have different positions and
orientations, as often encountered in an industrial setting. Weighted
back-projection can also be included when some regions are more likely subject
to change, to improve stability. Building on a Cartesian grid reconstruction
code, the feasibility of reusing the existing ray-tracers is checked against
other research in the same field.
Comment: 13 pages, 13 figures. Submitted to Journal of Nondestructive
Evaluation (https://www.springer.com/journal/10921)
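The idea of an object-fixed cylindrical sampling grid can be sketched as follows. This is only an illustration under stated assumptions, not the paper's reconstruction code: the function names, the grid sizes, and the rigid pose (rotation matrix plus translation) that places the object in the scanner frame are hypothetical.

```python
import numpy as np

def cylindrical_grid(n_r, n_phi, n_z, radius, height):
    """Sample points (r, phi, z) about the object's own axis.

    Returns an array of shape (n_r, n_phi, n_z, 3) holding the Cartesian
    coordinates of each sample in the object frame.
    """
    r = np.linspace(0.0, radius, n_r)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    z = np.linspace(-height / 2.0, height / 2.0, n_z)
    rr, pp, zz = np.meshgrid(r, phi, z, indexing="ij")
    return np.stack([rr * np.cos(pp), rr * np.sin(pp), zz], axis=-1)

def to_scanner_frame(grid_xyz, rotation, translation):
    """Place the object-fixed grid at the object's pose in the scanner frame."""
    return grid_xyz @ rotation.T + translation
```

Because the grid follows the object, the same sampling layout can be reused for each same-type object by changing only the pose, and a coarse angular resolution (small `n_phi`) may suffice where the part is close to axially symmetric.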