
    Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning

    This paper evaluates and benchmarks the robustness of visual representations in the context of object assembly tasks. Specifically, it investigates the alignment and insertion of objects with geometrical extrusions and intrusions, commonly referred to as a peg-in-hole task. The accuracy required to detect and orient the peg and hole geometry in SE(3) space for successful assembly poses significant challenges. Addressing this, we employ a general visuomotor policy learning framework that uses visual pretraining models as vision encoders. We study the robustness of this framework when applied to a dual-arm manipulation setup, specifically with respect to grasp variations. Our quantitative analysis shows that existing pretrained models fail to capture the essential visual features necessary for this task, whereas a visual encoder trained from scratch consistently outperforms the frozen pretrained models. Moreover, we discuss rotation representations and associated loss functions that substantially improve policy learning. We present a novel task scenario designed to evaluate progress in visuomotor policy learning, with a specific focus on improving the robustness of intricate assembly tasks that require both geometrical and spatial reasoning. Videos, additional experiments, the dataset, and code are available at https://bit.ly/geometric-peg-in-hole.
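
    The rotation-representation point above can be made concrete with a small sketch. The abstract does not specify which representation the authors settled on, so the snippet below assumes the commonly used continuous 6D representation decoded by Gram-Schmidt, paired with a geodesic-angle loss; it is an illustration of the general idea, not the paper's implementation.

```python
# Illustrative sketch (assumed technique, not taken from the paper): the 6D
# rotation representation decoded to SO(3) via Gram-Schmidt, with the geodesic
# angle of the relative rotation as a training loss.
import numpy as np

def rotation_6d_to_matrix(x6):
    """Map a 6D vector (two stacked 3-vectors) to a rotation matrix in SO(3)."""
    a1, a2 = x6[:3], x6[3:]
    b1 = a1 / np.linalg.norm(a1)           # first orthonormal axis
    a2 = a2 - np.dot(b1, a2) * b1           # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)            # second orthonormal axis
    b3 = np.cross(b1, b2)                   # third axis completes the frame
    return np.stack([b1, b2, b3], axis=1)

def geodesic_loss(R_pred, R_gt):
    """Angle (radians) of the relative rotation R_pred^T R_gt."""
    cos_theta = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# Example: a slightly noisy 6D prediction scored against an identity target.
pred = rotation_6d_to_matrix(np.array([1.0, 0.05, 0.0, 0.0, 1.0, 0.02]))
print(geodesic_loss(pred, np.eye(3)))       # small angle, in radians
```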

    PolyFit: A Peg-in-hole Assembly Framework for Unseen Polygon Shapes via Sim-to-real Adaptation

    The study addresses the foundational and challenging task of peg-in-hole assembly in robotics, where misalignments caused by sensor inaccuracies and mechanical errors often result in insertion failures or jamming. This research introduces PolyFit, which shifts from a reinforcement learning approach to a supervised learning methodology. PolyFit is a force/torque (F/T)-based supervised learning framework designed for 5-DoF peg-in-hole assembly. It uses F/T data for accurate extrinsic pose estimation and adjusts the peg pose to rectify misalignments. Extensive training in a simulated environment uses a dataset covering a diverse range of peg-hole shapes, extrinsic poses, and their corresponding contact F/T readings. To enhance extrinsic pose estimation, a multi-point contact strategy is integrated into the model input, recognizing that identical F/T readings can correspond to different poses. The study also proposes a sim-to-real adaptation method for real-world application, using a sim-real paired dataset to enable effective generalization to complex and unseen polygon shapes. PolyFit achieves peg-in-hole success rates of 97.3% and 96.3% for seen and unseen shapes in simulation, respectively. Real-world evaluations further demonstrate substantial success rates of 86.7% and 85.0%, highlighting the robustness and adaptability of the proposed method. (Comment: 8 pages, 8 figures, 3 tables)
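
    To make the multi-point contact idea concrete, here is a deliberately simplified sketch (not the PolyFit model): F/T readings from several contact poses are concatenated into one input vector, and a plain least-squares regressor stands in for the paper's neural network. All sizes and the synthetic data below are assumptions for illustration only.

```python
# Minimal stand-in for F/T-based extrinsic pose estimation: each sample stacks
# force/torque readings from several contact points, since a single 6D reading
# can be ambiguous between different poses.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_contacts = 500, 3          # hypothetical sizes
ft_dim = 6                              # Fx, Fy, Fz, Tx, Ty, Tz per contact
pose_dim = 5                            # 5-DoF correction: x, y, z, rx, ry

# Synthetic stand-in data; in the paper this comes from simulated contacts.
X = rng.normal(size=(n_samples, n_contacts * ft_dim))
true_W = rng.normal(size=(n_contacts * ft_dim, pose_dim))
y = X @ true_W + 0.01 * rng.normal(size=(n_samples, pose_dim))

# Fit a linear map as the simplest possible estimator (the paper trains a
# neural network; the multi-point input structure is the point here).
W, *_ = np.linalg.lstsq(X, y, rcond=None)

ft_stack = rng.normal(size=(n_contacts * ft_dim,))  # one new multi-contact reading
pose_correction = ft_stack @ W                      # used to adjust the peg pose
print(pose_correction.shape)                        # (5,)
```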

    CFVS: Coarse-to-Fine Visual Servoing for 6-DoF Object-Agnostic Peg-In-Hole Assembly

    Robotic peg-in-hole assembly remains a challenging task due to its high accuracy demand. Previous work tends to simplify the problem by restricting the degrees of freedom of the end-effector or limiting the distance between the target and the initial pose, which prevents such methods from being deployed in real-world manufacturing. We therefore present a Coarse-to-Fine Visual Servoing (CFVS) peg-in-hole method that achieves 6-DoF end-effector motion control based on 3D visual feedback. CFVS can handle arbitrary tilt angles and large initial alignment errors through a fast pose estimation before refinement. Furthermore, by introducing a confidence map to ignore irrelevant object contours, CFVS is robust against noise and can deal with various targets beyond the training data. Extensive experiments show that CFVS outperforms state-of-the-art methods, obtaining average success rates of 100%, 91%, and 82% in 3-DoF, 4-DoF, and 6-DoF peg-in-hole, respectively.
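
    The coarse-to-fine structure can be sketched as two stages: a fast, rough pose estimate followed by an iterative visual-servoing refinement. The toy loop below is my own illustration of that control structure, with stand-in functions in place of the paper's point-cloud pose estimator and confidence-map filtering.

```python
# Rough sketch of a coarse-to-fine servoing loop (illustrative, not CFVS code):
# a coarse estimate moves the peg near the hole, then a fine loop reduces the
# remaining 6-DoF error with proportional updates until it is within tolerance.
import numpy as np

def coarse_estimate(true_offset, rng):
    """Stand-in for the fast global pose estimator: roughly right, noisy."""
    return true_offset + rng.normal(scale=0.02, size=6)

def observe_error(current, true_offset, rng):
    """Stand-in for 3D visual feedback of the residual alignment error."""
    return (true_offset - current) + rng.normal(scale=0.001, size=6)

rng = np.random.default_rng(1)
true_offset = np.array([0.10, -0.05, 0.20, 0.3, -0.1, 0.05])  # xyz (m) + rpy (rad)

current = coarse_estimate(true_offset, rng)        # coarse stage
for step in range(50):                             # fine servoing stage
    err = observe_error(current, true_offset, rng)
    if np.linalg.norm(err) < 1e-3:                 # insertion tolerance
        break
    current += 0.5 * err                           # proportional correction
print(step, np.linalg.norm(true_offset - current))
```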

    Camera geometry determination based on circular's shape for peg-in-hole task

    A system that is simple, inexpensive, and effective at the required task is the most preferable in industry. The peg-in-hole task is widely used in manufacturing and is typically handled with vision systems and sensors, but this requires complex algorithms and high degree-of-freedom (DOF) mechanisms with fine movement, which increases cost. Currently, a forklift-like robot controlled by an operator with wired controllers picks up copper wire spools, arranged side by side on a shelf, one by one and carries them to the inspection area. A holder and puller attached to the robot are used to pick up each spool. Because of the robot's structure, it is difficult for the operator to ensure that the stem is properly inserted into the hole (the peg-in-hole problem). Moreover, the holder design is not universal and is not applicable to other companies: the spool can only be grasped and pulled out from the front side and cannot be grasped with a robot arm and gripper. In this study, a vision system is developed to solve the peg-in-hole problem by enabling the robot to perform the insertion and pick up the spool autonomously, without any sensors except a low-cost camera. The camera captures images of the copper wire spool as real-time video. Inspired by how humans perceive object orientation from shape, the system determines the camera orientation based on the condition of the spool image and the yaw angle from the center of the camera (CFOV) to the CHS. The performance of the proposed system is analyzed through a detection-rate analysis. The system is developed in MATLAB, and the analysis is carried out in a controlled environment with a camera-to-spool distance of 50-110 cm and camera orientations between -20° and 20° of yaw. To ensure the puller does not scratch the spool, a mathematical equation is derived to calculate the puller tolerance, allowing the system to estimate the spool position from the camera orientation and the calculated distance. The resulting system is simple and cost-effective. A Modified Circular Hough Transform (MCHT) method is proposed to eliminate false circles and outliers and is tested against the existing Circular Hough Transform (CHT) method. The analysis shows a detection success rate of 96%, demonstrating that MCHT outperforms CHT, and the proposed system calculates the distance and camera orientation from the spool image with a low error rate. It therefore solves the peg-in-hole problem without a force/torque sensor. In total, seven analyses were experimentally tested, covering image pre-processing, image segmentation, object classification, the comparison between CHT and MCHT, illumination measurement, distance calculation, and yaw angle, including a comparison with the existing method. The proposed system achieved all of its objectives.
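
    As a rough illustration of the circle-detection and distance-estimation pipeline described above (the thesis itself was implemented in MATLAB), the Python/OpenCV sketch below detects a circular spool face with a standard Hough transform and recovers distance from the pinhole model Z = f * D / d, where f is the focal length in pixels, D the real spool diameter, and d the detected diameter in pixels. The focal length and spool diameter are made-up example values, and the plain Hough transform stands in for the proposed MCHT.

```python
# Illustrative pipeline: synthetic spool image -> Hough circle detection ->
# pinhole distance and yaw-offset estimate. Values are assumed, not from the thesis.
import cv2
import numpy as np

# Synthetic test image: a dark spool face drawn as a filled circle.
img = np.full((480, 640), 255, dtype=np.uint8)
cv2.circle(img, (320, 240), 60, 40, -1)
blurred = cv2.GaussianBlur(img, (9, 9), 2)

circles = cv2.HoughCircles(
    blurred, cv2.HOUGH_GRADIENT, dp=1.2, minDist=100,
    param1=100, param2=30, minRadius=20, maxRadius=200)

FOCAL_PX = 800.0         # assumed focal length in pixels (from calibration)
SPOOL_DIAMETER_M = 0.15  # assumed real spool face diameter in metres

if circles is not None:
    x, y, r = circles[0, 0]                   # strongest detected circle
    distance_m = FOCAL_PX * SPOOL_DIAMETER_M / (2.0 * r)
    # Yaw offset of the spool centre from the image centre (pinhole model).
    yaw_deg = np.degrees(np.arctan2(x - img.shape[1] / 2.0, FOCAL_PX))
    print(f"radius={r:.1f}px  distance={distance_m:.2f}m  yaw={yaw_deg:.1f}deg")
```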

    Robotic Assembly Control Reconfiguration Based on Transfer Reinforcement Learning for Objects with Different Geometric Features

    Robotic force-based compliance control is a preferred approach for high-precision assembly tasks. When the geometric features of the assembly objects are asymmetric or irregular, reinforcement learning (RL) agents are increasingly incorporated into the compliance controller to adapt to a complex force-pose mapping that is hard to model analytically. Since the force-pose mapping depends strongly on geometric features, a compliance controller is only optimal for the current geometric features. To reduce the learning cost for assembly objects with different geometric features, this paper addresses how to reconfigure existing controllers for new assembly objects. The model-based parameters are first reconfigured based on the proposed Equivalent Theory of Compliance Law (ETCL). Then the RL agent is transferred based on the proposed Weighted Dimensional Policy Distillation (WDPD) method. The experimental results demonstrate that the control reconfiguration method requires less time and achieves better control performance, confirming the validity of the proposed methods.
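
    For readers unfamiliar with force-based compliance control, the sketch below shows a generic admittance law in which the measured wrench is mapped through a compliance matrix into a small, bounded pose correction; this is my own minimal stand-in, not the paper's ETCL or WDPD formulation, and the compliance values are assumed for illustration.

```python
# Generic admittance step (illustrative): wrench -> bounded pose correction.
# An RL policy in this line of work would further modulate such corrections
# per geometric feature; that part is not reproduced here.
import numpy as np

def admittance_step(wrench, compliance_diag, max_step=1e-3):
    """Map a 6D wrench [Fx,Fy,Fz,Tx,Ty,Tz] to a bounded 6D pose correction."""
    delta = np.asarray(compliance_diag) * np.asarray(wrench)
    return np.clip(delta, -max_step, max_step)

# Example: a lateral contact force nudges the peg sideways; stiffer about z.
compliance = [2e-5, 2e-5, 1e-5, 5e-4, 5e-4, 5e-4]  # assumed m/N and rad/(N*m)
wrench = [12.0, -3.0, -25.0, 0.1, 0.0, 0.0]         # measured contact wrench
print(admittance_step(wrench, compliance))
```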

    Stochastic Search Methods for Mobile Manipulators

    Mobile manipulators are a potential solution to the increasing need for additional flexibility and mobility in industrial applications. However, they tend to lack the accuracy and precision achieved by fixed manipulators, especially in scenarios where both the manipulator and the autonomous vehicle move simultaneously. This paper analyzes the problem of dynamically evaluating the positioning error of mobile manipulators. In particular, it investigates the use of Bayesian methods to predict the position of the end-effector in the presence of uncertainty propagated from the mobile platform. The precision of the mobile manipulator is evaluated through its ability to intercept retroreflective markers using a photoelectric sensor attached to the end-effector. Compared to a deterministic search approach, we observed improved robustness with comparable search times, thereby enabling effective calibration of the mobile manipulator.
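
    A toy version of the Bayesian idea is sketched below: the marker position is held as a Gaussian belief, each noisy observation is fused with a standard Gaussian product (Kalman-style) update, and the next probe would be placed at the posterior mean rather than on a fixed deterministic grid. The noise values and the 2D setup are assumptions for illustration, not the paper's model.

```python
# Gaussian belief over a marker position, refined by noisy observations.
import numpy as np

def gaussian_update(mu, var, z, var_z):
    """Fuse a prior N(mu, var) with an observation z having noise variance var_z."""
    k = var / (var + var_z)              # Kalman-style gain
    return mu + k * (z - mu), (1 - k) * var

rng = np.random.default_rng(2)
true_pos = np.array([0.42, -0.13])       # unknown marker position (m)
mu, var = np.zeros(2), np.full(2, 0.05)  # prior, e.g. from platform odometry
obs_var = 0.004                          # assumed observation noise variance

for _ in range(10):
    z = true_pos + rng.normal(scale=np.sqrt(obs_var), size=2)
    mu, var = gaussian_update(mu, var, z, obs_var)

print(mu, var)                           # posterior concentrates near true_pos
```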

    Human-in-the-Loop Task and Motion Planning for Imitation Learning

    Imitation learning from human demonstrations can teach robots complex manipulation skills but is time-consuming and labor-intensive. In contrast, Task and Motion Planning (TAMP) systems are automated and excel at solving long-horizon tasks, yet they are difficult to apply to contact-rich tasks. In this paper, we present Human-in-the-Loop Task and Motion Planning (HITL-TAMP), a novel system that leverages the benefits of both approaches. The system employs a TAMP-gated control mechanism that selectively hands control to, and takes it back from, a human teleoperator. This allows one human teleoperator to manage a fleet of robots, maximizing data collection efficiency. The collected human data is then combined with an imitation learning framework to train a TAMP-gated policy, leading to superior performance compared to training on full task demonstrations. Compared to a conventional teleoperation system, users gathered more than 3x the number of demonstrations with HITL-TAMP in the same time budget. Furthermore, proficient agents (75%+ success) could be trained from just 10 minutes of non-expert teleoperation data. Finally, we collected 2.1K demonstrations with HITL-TAMP across 12 contact-rich, long-horizon tasks and show that the system often produces near-perfect agents. Videos and additional results at https://hitltamp.github.io. (Comment: Conference on Robot Learning (CoRL) 2023)
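
    The TAMP-gated handover can be pictured as a loop over plan segments in which free-space segments are executed by the planner and contact-rich segments are delegated to the human teleoperator. The schematic below is my own reading of that mechanism, not the HITL-TAMP code; the segment names and the boolean gate are illustrative.

```python
# Schematic TAMP-gated handover loop: the planner keeps control except for
# segments flagged as contact-rich, which are handed to the human demonstrator.
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    contact_rich: bool   # True -> delegated to the human teleoperator

def execute_with_planner(seg: Segment) -> None:
    print(f"[TAMP]  executing {seg.name}")

def execute_with_human(seg: Segment) -> None:
    print(f"[HUMAN] demonstrating {seg.name}")   # demonstration data is logged here

plan = [
    Segment("reach pre-grasp", contact_rich=False),
    Segment("grasp peg", contact_rich=True),
    Segment("transport to hole", contact_rich=False),
    Segment("insert peg", contact_rich=True),
]

for seg in plan:
    (execute_with_human if seg.contact_rich else execute_with_planner)(seg)
```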

    Multifingered robot hand compliant manipulation based on vision-based demonstration and adaptive force control

    Dexterous manipulation with a multifingered hand remains quite challenging in robotics, and one open issue is how to achieve compliant behaviors. In this work, we propose a human-in-the-loop learning-control approach for acquiring compliant grasping and manipulation skills for a multifingered robot hand. The approach takes a depth image of the human hand as input and generates the desired force commands for the robot. A markerless vision-based teleoperation system is used for task demonstration, and an end-to-end neural network model (TeachNet) is trained to map the pose of the human hand to the joint angles of the robot hand in real time. To endow the robot hand with compliant, human-like behavior, an adaptive force control strategy is designed to predict the desired force commands from the pose difference between the robot hand and the human hand during the demonstration. The force controller is derived from a computational model of the biomimetic control strategy in human motor learning, which adapts the control variables (impedance and feedforward force) online during execution of the reference joint angles. The simultaneous adaptation of the impedance and feedforward profiles enables the robot to interact with the environment compliantly. Our approach has been verified in both simulated and real-world task scenarios on a multifingered robot hand, the Shadow Hand, and has shown more reliable performance than the widely used position control mode for obtaining compliant grasping and manipulation behaviors.
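
    A single-joint sketch of the adaptive idea is given below: stiffness and feedforward torque both grow with tracking error and decay through a forgetting term, so the controller stays compliant when the error is small. This is a simplified stand-in inspired by biomimetic motor-learning controllers, not the paper's exact law; all gains and the constant external load are assumed values.

```python
# Toy single-joint adaptive impedance + feedforward controller (illustrative).
import numpy as np

alpha, beta, gamma = 50.0, 5.0, 0.1     # adaptation and forgetting rates (assumed)
k, ff = 10.0, 0.0                       # initial stiffness (N*m/rad), feedforward (N*m)
q, dq = 0.0, 0.0                        # joint angle and velocity
q_ref = 0.5                             # reference joint angle (rad)
dt, inertia, ext_torque = 0.001, 0.05, -1.0   # constant external load as a stand-in

for _ in range(3000):
    e = q_ref - q
    k  += (alpha * e * e - gamma * k) * dt    # impedance adaptation
    ff += (beta * e - gamma * ff) * dt        # feedforward adaptation
    tau = k * e - 1.0 * dq + ff               # control torque (unit damping)
    ddq = (tau + ext_torque) / inertia
    dq += ddq * dt
    q += dq * dt

print(f"final error {q_ref - q:.4f} rad, stiffness {k:.1f}, feedforward {ff:.2f}")
```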