Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning
This paper primarily focuses on evaluating and benchmarking the robustness of
visual representations in the context of object assembly tasks. Specifically,
it investigates the alignment and insertion of objects with geometrical
extrusions and intrusions, commonly referred to as a peg-in-hole task. The
accuracy required to detect and orient the peg and the hole geometry in SE(3)
space for successful assembly poses significant challenges. Addressing this, we
employ a general framework in visuomotor policy learning that utilizes visual
pretraining models as vision encoders. Our study investigates the robustness of
this framework when applied to a dual-arm manipulation setup, specifically its
robustness to grasp variations. Our quantitative analysis shows that existing pretrained
models fail to capture the essential visual features necessary for this task.
However, a visual encoder trained from scratch consistently outperforms the
frozen pretrained models. Moreover, we discuss rotation representations and
associated loss functions that substantially improve policy learning. We
present a novel task scenario designed to evaluate the progress in visuomotor
policy learning, with a specific focus on improving the robustness of intricate
assembly tasks that require both geometrical and spatial reasoning. Videos,
additional experiments, dataset, and code are available at
https://bit.ly/geometric-peg-in-hole
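The abstract credits much of the improvement to the choice of rotation representation and its loss function, without specifying which. A common pairing in visuomotor policy learning is the continuous 6D rotation representation (Zhou et al., 2019) with a geodesic loss; the sketch below illustrates that pairing as one plausible choice, not necessarily the paper's exact one.

```python
import numpy as np

def rotation_6d_to_matrix(d6):
    """Gram-Schmidt map from a continuous 6D rotation representation
    (Zhou et al., 2019) to a 3x3 rotation matrix."""
    a1, a2 = d6[:3], d6[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=-1)  # columns form an orthonormal basis

def geodesic_loss(R_pred, R_true):
    """Geodesic angle between two rotations, a common rotation loss."""
    cos = (np.trace(R_pred.T @ R_true) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))
```

Unlike Euler angles or quaternions, this 6D map is continuous over SO(3), which tends to make regression targets easier for a policy network to fit.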
PolyFit: A Peg-in-hole Assembly Framework for Unseen Polygon Shapes via Sim-to-real Adaptation
The study addresses the foundational and challenging task of peg-in-hole
assembly in robotics, where misalignments caused by sensor inaccuracies and
mechanical errors often result in insertion failures or jamming. This research
introduces PolyFit, representing a paradigm shift by transitioning from a
reinforcement learning approach to a supervised learning methodology. PolyFit
is a Force/Torque (F/T)-based supervised learning framework designed for 5-DoF
peg-in-hole assembly. It utilizes F/T data for accurate extrinsic pose
estimation and adjusts the peg pose to rectify misalignments. Extensive
training in a simulated environment involves a dataset encompassing a diverse
range of peg-hole shapes, extrinsic poses, and their corresponding contact F/T
readings. To enhance extrinsic pose estimation, a multi-point contact strategy
is integrated into the model input, recognizing that identical F/T readings can
indicate different poses. The study proposes a sim-to-real adaptation method
for real-world application, using a sim-real paired dataset to enable effective
generalization to complex and unseen polygon shapes. PolyFit achieves
impressive peg-in-hole success rates of 97.3% and 96.3% for seen and unseen
shapes in simulations, respectively. Real-world evaluations further demonstrate
substantial success rates of 86.7% and 85.0%, highlighting the robustness and
adaptability of the proposed method.
Comment: 8 pages, 8 figures, 3 tables
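PolyFit's core idea, supervised learning from multi-point contact F/T readings to an extrinsic pose correction, can be sketched with a linear stand-in model on synthetic data. All shapes and names below are our assumptions for illustration; the paper trains a network on simulated contact data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: each sample stacks 6-D wrench readings from
# 3 contact points, since a single F/T reading can be ambiguous between
# poses; the target is a 5-DoF pose correction (x, y, z, rx, ry).
n_samples, n_contacts = 200, 3
X = rng.normal(size=(n_samples, n_contacts * 6))
W_true = rng.normal(size=(n_contacts * 6, 5))
Y = X @ W_true + 0.01 * rng.normal(size=(n_samples, 5))

# Supervised fit (least squares as a linear stand-in for the learned model).
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

def estimate_pose_correction(ft_stack):
    """Map stacked multi-point F/T readings to a 5-DoF peg pose correction."""
    return ft_stack @ W_hat
```

The multi-point stacking is the key structural point: concatenating wrenches from several contacts disambiguates poses that a single reading cannot separate.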
CFVS: Coarse-to-Fine Visual Servoing for 6-DoF Object-Agnostic Peg-In-Hole Assembly
Robotic peg-in-hole assembly remains a challenging task due to its high
accuracy demand. Previous work tends to simplify the problem by restricting the
degrees of freedom of the end-effector or by limiting the distance between the
target and the initial pose, which prevents such methods from being deployed
in real-world manufacturing. Thus, we present a Coarse-to-Fine Visual Servoing
(CFVS) peg-in-hole method, achieving 6-DoF end-effector motion control based on
3D visual feedback. CFVS can handle arbitrary tilt angles and large initial
alignment errors through a fast pose estimation before refinement. Furthermore,
by introducing a confidence map to ignore the irrelevant contour of objects,
CFVS is robust against noise and can deal with various targets beyond training
data. Extensive experiments show CFVS outperforms state-of-the-art methods and
obtains 100%, 91%, and 82% average success rates in 3-DoF, 4-DoF, and 6-DoF
peg-in-hole, respectively.
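The coarse-to-fine structure can be illustrated with a toy servo loop: a fast but noisy initial pose estimate, followed by iterative refinement driven by (simulated) visual feedback. Everything below is a schematic stand-in with made-up gains, not the CFVS networks.

```python
import numpy as np

def coarse_estimate(target, noise_scale=0.05, rng=None):
    """Fast but rough initial pose estimate (stand-in for the coarse stage)."""
    rng = rng or np.random.default_rng(0)
    return target + rng.normal(scale=noise_scale, size=target.shape)

def refine(pose, target, gain=0.5, tol=1e-3, max_iters=50):
    """Fine stage: a proportional visual-servoing loop that repeatedly
    shrinks the observed pose error."""
    for _ in range(max_iters):
        error = target - pose          # error from (simulated) visual feedback
        if np.linalg.norm(error) < tol:
            break
        pose = pose + gain * error     # proportional servo step
    return pose

target = np.zeros(6)                   # 6-DoF target pose
pose = refine(coarse_estimate(target), target)
```

The coarse stage tolerates large initial misalignment; the fine loop only has to correct a small residual, which is what makes arbitrary tilt angles tractable.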
Camera geometry determination based on circular's shape for peg-in-hole task
A simple, inexpensive system that performs the required tasks effectively is the most preferable in industry. The peg-in-hole task is widely performed in manufacturing using vision systems and sensors; however, these typically require complex algorithms and high degree-of-freedom (DOF) mechanisms with fine movement, which increases cost. Currently, a forklift-like robot controlled by an operator with a wired controller picks up, one by one, the copper wire spools arranged side by side on a shelf and carries them to the inspection area. A holder and puller attached to the robot are used to pick up each spool. Because of the robot's structure, it is difficult for the operator to ensure that the stem is properly inserted into the hole (the peg-in-hole problem). Moreover, the holder design is not universal and is not applicable to other companies: the spool can only be grasped and pulled out from the front side and cannot be grasped with a robot arm and gripper. In this study, a vision system is developed to solve the peg-in-hole problem by enabling the robot to perform the insertion and pick up the spool autonomously, using no sensor other than a low-cost camera that captures real-time video of the copper wire spool. Inspired by how humans perceive an object's orientation from its shape, the system determines the camera orientation based on the spool image condition and the yaw angle from the center of the camera (CFOV) to CHS. The performance of the proposed system is evaluated through a detection-rate analysis, and the project is developed in MATLAB. The analysis is conducted in a controlled environment with a camera-to-spool distance of 50-110 cm and camera yaw angles between -20º and 20º. To ensure the puller does not scratch the spool, a mathematical equation is derived to calculate the puller tolerance.
Using this, the system estimates the spool position from the camera orientation and distance calculations; the system is simple and cost-effective. A Modified Circular Hough Transform (MCHT) method is proposed to eliminate false circles and outliers and is compared with the existing Circular Hough Transform (CHT) method. The analysis shows a detection success rate of 96%, outperforming the CHT method. The proposed system calculates the distance and camera orientation from the spool image condition with a low error rate, solving the peg-in-hole problem without a force/torque sensor. In conclusion, seven analyses (image pre-processing, image segmentation, object classification, comparison between CHT and MCHT, illumination measurement, distance calculation, and yaw angle analysis) were experimentally tested, including the comparison with the existing method. The proposed system achieved all of its objectives.
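One way to read the MCHT step is as outlier rejection over candidate circles returned by a Hough transform. The sketch below filters candidates by deviation from the median radius; the (x, y, r) candidate format and the tolerance are our assumptions, not details from the paper.

```python
import numpy as np

def filter_false_circles(circles, radius_tol=0.2):
    """Reject outlier detections, in the spirit of the proposed MCHT step:
    keep only circles whose radius lies close to the median detected radius.
    The (x, y, r) candidate format is an assumption for illustration."""
    circles = np.asarray(circles, dtype=float)
    r_med = np.median(circles[:, 2])
    keep = np.abs(circles[:, 2] - r_med) <= radius_tol * r_med
    return circles[keep]

# Three consistent spool-rim candidates plus one false circle.
candidates = [(100, 120, 40), (102, 118, 41), (300, 50, 12), (99, 121, 39)]
valid = filter_false_circles(candidates)
```

Because the spool rim has a roughly known radius at a given distance, radius consistency alone removes most spurious circles before distance and yaw are computed.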
Robotic Assembly Control Reconfiguration Based on Transfer Reinforcement Learning for Objects with Different Geometric Features
Robotic force-based compliance control is a preferred approach to achieve
high-precision assembly tasks. When the geometric features of assembly objects
are asymmetric or irregular, reinforcement learning (RL) agents are gradually
incorporated into the compliance controller to adapt to complex force-pose
mapping, which is hard to model analytically. Since the force-pose mapping
depends strongly on geometric features, a compliance controller is only
optimal for the current geometric features. To reduce the learning cost of assembly
objects with different geometric features, this paper is devoted to answering
how to reconfigure existing controllers for new assembly objects with different
geometric features. In this paper, model-based parameters are first
reconfigured based on the proposed Equivalent Theory of Compliance Law (ETCL).
Then the RL agent is transferred based on the proposed Weighted Dimensional
Policy Distillation (WDPD) method. The experimental results demonstrate that the
control reconfiguration method requires less learning time and achieves better
control performance, confirming the validity of the proposed methods.
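The weighted distillation idea can be sketched as a per-dimension weighted imitation loss between a student policy's actions and the source (teacher) policy's actions, where action dimensions that matter more for the new geometry receive larger weights. This is our schematic reading of WDPD, not the paper's exact formulation.

```python
import numpy as np

def weighted_distillation_loss(student_actions, teacher_actions, dim_weights):
    """Per-dimension weighted squared-error distillation objective (sketch).
    student_actions, teacher_actions: (N, D) action batches;
    dim_weights: (D,) importance weights per action dimension."""
    sq_err = (student_actions - teacher_actions) ** 2
    return float(np.mean(sq_err @ dim_weights))
```

Minimizing this loss transfers the existing agent's behavior while letting the weights emphasize the dimensions where the new object's force-pose mapping differs most.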
Stochastic Search Methods for Mobile Manipulators
Mobile manipulators are a potential solution to the increasing need for additional flexibility and mobility in industrial applications. However, they tend to lack the accuracy and precision achieved by fixed manipulators, especially in scenarios where both the manipulator and the autonomous vehicle move simultaneously. This paper analyzes the problem of dynamically evaluating the positioning error of mobile manipulators. In particular, it investigates the use of Bayesian methods to predict the position of the end-effector in the presence of uncertainty propagated from the mobile platform. The precision of the mobile manipulator is evaluated through its ability to intercept retroreflective markers using a photoelectric sensor attached to the end-effector. Compared to a deterministic search approach, we observed improved robustness with comparable search times, thereby enabling effective calibration of the mobile manipulator.
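The central idea, predicting the end-effector position under uncertainty propagated from the mobile platform, can be sketched with Monte Carlo propagation through a toy planar kinematic model. The kinematics, covariances, and sample count below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def forward_kinematics(base_pose, arm_offset):
    """Planar stand-in FK: end-effector = base position plus rotated offset."""
    x, y, theta = base_pose
    c, s = np.cos(theta), np.sin(theta)
    dx, dy = arm_offset
    return np.array([x + c * dx - s * dy, y + s * dx + c * dy])

# Uncertain base pose: mean and covariance from the platform's localization.
base_mean = np.array([1.0, 2.0, 0.1])
base_cov = np.diag([0.01, 0.01, 0.005])

# Monte Carlo propagation: a simple predictive distribution over the
# end-effector position induced by platform uncertainty.
samples = rng.multivariate_normal(base_mean, base_cov, size=2000)
ee_samples = np.array([forward_kinematics(p, (0.5, 0.0)) for p in samples])
ee_mean = ee_samples.mean(axis=0)
ee_cov = np.cov(ee_samples.T)
```

The resulting predictive covariance is what a stochastic search can exploit: instead of scanning deterministically, it concentrates sensor sweeps where the marker is most likely to be.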
Human-in-the-Loop Task and Motion Planning for Imitation Learning
Imitation learning from human demonstrations can teach robots complex
manipulation skills, but is time-consuming and labor intensive. In contrast,
Task and Motion Planning (TAMP) systems are automated and excel at solving
long-horizon tasks, but they are difficult to apply to contact-rich tasks. In
this paper, we present Human-in-the-Loop Task and Motion Planning (HITL-TAMP),
a novel system that leverages the benefits of both approaches. The system
employs a TAMP-gated control mechanism, which selectively gives and takes
control to and from a human teleoperator. This enables the human teleoperator
to manage a fleet of robots, maximizing data collection efficiency. The
collected human data is then combined with an imitation learning framework to
train a TAMP-gated policy, leading to superior performance compared to training
on full task demonstrations. We compared HITL-TAMP to a conventional
teleoperation system -- users gathered more than 3x the number of demos given
the same time budget. Furthermore, proficient agents (75%+ success) could be
trained from just 10 minutes of non-expert teleoperation data. Finally, we
collected 2.1K demos with HITL-TAMP across 12 contact-rich, long-horizon tasks
and show that the system often produces near-perfect agents. Videos and
additional results at https://hitltamp.github.io
Comment: Conference on Robot Learning (CoRL) 202
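The TAMP-gated control mechanism can be sketched as a dispatcher that lets the planner execute the segments it can solve and hands contact-rich segments to the human teleoperator. The segment format and policy interfaces below are hypothetical stand-ins, not the HITL-TAMP API.

```python
def tamp_gated_control(segments, tamp_policy, human_policy):
    """Route each task segment to the TAMP system when it is solvable by
    planning, otherwise to the human teleoperator (schematic sketch)."""
    trajectory = []
    for seg in segments:
        controller = tamp_policy if seg["tamp_solvable"] else human_policy
        trajectory.append((seg["name"], controller(seg)))
    return trajectory

segments = [
    {"name": "reach", "tamp_solvable": True},
    {"name": "insert", "tamp_solvable": False},  # contact-rich: human takes over
    {"name": "retract", "tamp_solvable": True},
]
log = tamp_gated_control(
    segments,
    tamp_policy=lambda s: "tamp",
    human_policy=lambda s: "human",
)
```

Because the human only supplies the short contact-rich segments, one teleoperator can serve several robots that are each waiting at their own gate, which is what drives the reported data-collection efficiency.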
Multifingered robot hand compliant manipulation based on vision-based demonstration and adaptive force control
Multifingered dexterous manipulation is quite challenging in robotics, and one remaining issue is how to achieve compliant behaviors. In this work, we propose a human-in-the-loop learning-control approach for acquiring compliant grasping and manipulation skills with a multifingered robot hand. The approach takes a depth image of the human hand as input and generates the desired force commands for the robot. A markerless vision-based teleoperation system is used for task demonstration, and an end-to-end neural network model (TeachNet) is trained to map the pose of the human hand to the joint angles of the robot hand in real time. To endow the robot hand with compliant, human-like behaviors, an adaptive force control strategy is designed to predict the desired force commands based on the pose difference between the robot hand and the human hand during the demonstration. The force controller is derived from a computational model of the biomimetic control strategy in human motor learning, which adapts the control variables (impedance and feedforward force) online during the execution of the reference joint angles. The simultaneous adaptation of the impedance and feedforward profiles enables the robot to interact with the environment compliantly. Our approach has been verified in both simulation and real-world task scenarios on a multifingered robot hand, the Shadow Hand, and has shown more reliable performance than the widely used position control mode for obtaining compliant grasping and manipulation behaviors.
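The simultaneous adaptation of impedance and feedforward force can be sketched as a simple per-joint update law: both terms grow with tracking error and slowly decay when tracking is good, loosely following human-motor-learning models. The gains and structure below are illustrative assumptions, not the paper's controller.

```python
import numpy as np

def adaptive_force_step(q, q_ref, k, f_ff, gamma_k=5.0, gamma_f=1.0, forget=0.001):
    """One step of a biomimetic adaptive controller (sketch): stiffness k and
    feedforward force f_ff are adapted online from the tracking error, with a
    small forgetting term so both relax when tracking is accurate."""
    error = q_ref - q
    k = k + gamma_k * error**2 - forget * k          # impedance adaptation
    f_ff = f_ff + gamma_f * error - forget * f_ff    # feedforward adaptation
    u = k * error + f_ff                             # commanded joint effort
    return u, k, f_ff
```

Running this per joint at the control rate yields the compliant behavior described above: stiffness rises only where the pose difference demands it, and relaxes elsewhere so the hand stays soft in contact.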