213 research outputs found

    GenPose: Generative Category-level Object Pose Estimation via Diffusion Models

    Object pose estimation plays a vital role in embodied AI and computer vision, enabling intelligent agents to comprehend and interact with their surroundings. Despite the practicality of category-level pose estimation, current approaches struggle with partially observed point clouds, which can admit several plausible poses (the multi-hypothesis issue). In this study, we propose a novel solution by reframing category-level object pose estimation as conditional generative modeling, departing from traditional point-to-point regression. Leveraging score-based diffusion models, we estimate object poses by sampling candidates from the diffusion model and aggregating them through a two-step process: filtering out outliers via likelihood estimation and subsequently mean-pooling the remaining candidates. To avoid the costly integration process when estimating the likelihood, we introduce an alternative method that trains an energy-based model from the original score-based model, enabling end-to-end likelihood estimation. Our approach achieves state-of-the-art performance on the REAL275 dataset, surpassing 50% and 60% on the strict 5°2cm and 5°5cm metrics, respectively. Furthermore, our method demonstrates strong generalizability to novel categories sharing similar symmetric properties without fine-tuning and can readily adapt to object pose tracking tasks, yielding results comparable to the current state-of-the-art baselines.
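
    The two-step aggregation described in this abstract (drop low-likelihood candidates, then mean-pool the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_candidates`, the `keep_ratio` parameter, and the use of plain vectors (for example, translation components) are assumptions made for illustration. Rotations would need a proper rotation average rather than an arithmetic mean, and the log-likelihood values would come from a learned model rather than being supplied by hand.

```python
import numpy as np


def aggregate_candidates(candidates, log_likelihoods, keep_ratio=0.6):
    """Filter candidates by likelihood, then mean-pool the survivors.

    candidates:      (N, D) array of sampled hypotheses (hypothetical layout).
    log_likelihoods: (N,) array, one score per candidate (higher is better).
    keep_ratio:      fraction of candidates kept after ranking (assumed knob).
    """
    candidates = np.asarray(candidates, dtype=float)
    log_likelihoods = np.asarray(log_likelihoods, dtype=float)

    # Keep at least one candidate so the mean is always defined.
    n_keep = max(1, int(round(keep_ratio * len(candidates))))

    # Indices of the n_keep highest-likelihood candidates.
    keep_idx = np.argsort(log_likelihoods)[::-1][:n_keep]

    # Mean-pool the remaining candidates into a single estimate.
    return candidates[keep_idx].mean(axis=0)


# Toy usage: the third candidate is an outlier with a very low likelihood.
estimate = aggregate_candidates(
    [[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [10.0, 10.0, 10.0]],
    [-1.0, -1.5, -50.0],
    keep_ratio=2 / 3,
)
```

    Ranking by likelihood before pooling keeps a single implausible sample from dragging the mean away from the dominant mode.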

    GraspGF: Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping

    The use of anthropomorphic robotic hands for assisting individuals in situations where human hands may be unavailable or unsuitable has gained significant importance. In this paper, we propose a novel task called human-assisting dexterous grasping that aims to train a policy for controlling a robotic hand's fingers to assist users in grasping objects. Unlike conventional dexterous grasping, this task presents a more complex challenge, as the policy needs to adapt to diverse user intentions in addition to the object's geometry. We address this challenge by proposing an approach consisting of two sub-modules: a hand-object-conditional grasping primitive called Grasping Gradient Field (GraspGF), and a history-conditional residual policy. GraspGF learns 'how' to grasp by estimating the gradient from a set of successful grasping examples, while the residual policy determines 'when' and at what speed the grasping action should be executed based on the trajectory history. Experimental results demonstrate the superiority of our proposed method compared to baselines, highlighting its user-awareness and practicality in real-world applications. The code and demonstrations can be viewed at https://sites.google.com/view/graspgf

    Design of Automatic Cutting and Welding Machine for Brake Beam-Axle

    The brake beam is one of the important components in the braking process of railway vehicles, and it directly affects the safety and stability of railway vehicles running at high speed. Based on research into the cutting and welding technology for the 209T straight plate type brake beam, this paper presents an advanced maintenance method, designs an automatic cutting and welding machine for the brake beam-axle, and studies the basic structure and working principle of the machine in detail. The automatic cutting and welding machine has been operating properly since it was put into use in the maintenance workshop. Moreover, the maintenance of each brake beam takes only about 15 minutes, so this maintenance method can improve maintenance efficiency and ensure the reliability of maintenance quality.

    Learning Gradient Fields for Scalable and Generalizable Irregular Packing

    The packing problem, also known as cutting or nesting, has diverse applications in logistics, manufacturing, layout design, and atlas generation. It involves arranging irregularly shaped pieces to minimize waste while avoiding overlap. Recent advances in machine learning, particularly reinforcement learning, have shown promise in addressing the packing problem. In this work, we explore a novel machine learning-based approach that formulates the packing problem as conditional generative modeling. To tackle the challenges of irregular packing, including object validity constraints and collision avoidance, our method employs a score-based diffusion model to learn a series of gradient fields. These gradient fields encode the correlations between constraint satisfaction and the spatial relationships of polygons, learned from teacher examples. During the testing phase, packing solutions are generated using a coarse-to-fine refinement mechanism guided by the learned gradient fields. To enhance packing feasibility and optimality, we introduce two key architectural designs: multi-scale feature extraction and coarse-to-fine relation extraction. We conduct experiments on two typical industrial packing domains, considering translations only. Empirically, our approach demonstrates spatial utilization rates comparable to, or even surpassing, those achieved by the teacher algorithm responsible for training data generation. Additionally, it exhibits some level of generalization to shape variations. We hope that this method could pave the way for new possibilities in solving the packing problem.

    Score-PA: Score-based 3D Part Assembly

    Autonomous 3D part assembly is a challenging task in the areas of robotics and 3D computer vision. This task aims to assemble individual components into a complete shape without relying on predefined instructions. In this paper, we formulate this task from a novel generative perspective, introducing the Score-based 3D Part Assembly framework (Score-PA) for 3D part assembly. Because score-based methods are typically time-consuming during the inference stage, we introduce a novel algorithm called the Fast Predictor-Corrector Sampler (FPC) that accelerates the sampling process within the framework. We employ various metrics to assess assembly quality and diversity, and our evaluation results demonstrate that our algorithm outperforms existing state-of-the-art approaches. We release our code at https://github.com/J-F-Cheng/Score-PA_Score-based-3D-Part-Assembly. Comment: BMVC 202
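
    For readers unfamiliar with predictor-corrector sampling, the sketch below shows the generic scheme on a toy problem. It is not the paper's FPC algorithm, whose details the abstract does not give. The assumptions are: a variance-exploding noise schedule, a reverse-diffusion predictor step, a Langevin corrector step with a simple step-size rule (`corrector_scale`), and an analytic Gaussian score (`gaussian_score`) standing in for a learned score network. All names are hypothetical.

```python
import numpy as np


def gaussian_score(x, sigma, mu, data_std):
    """Exact score of N(mu, data_std^2 I) after adding N(0, sigma^2 I) noise.

    Stand-in for a learned score network (assumption for this toy example).
    """
    return -(x - mu) / (data_std**2 + sigma**2)


def pc_sample(score_fn, sigmas, n_samples, dim, corrector_scale=0.1, seed=0):
    """Generic predictor-corrector sampler over a decreasing noise schedule."""
    rng = np.random.default_rng(seed)

    # Start from (approximately) the most heavily noised distribution.
    x = sigmas[0] * rng.standard_normal((n_samples, dim))

    for sigma_cur, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Predictor: one reverse-diffusion step from sigma_cur to sigma_next.
        delta = sigma_cur**2 - sigma_next**2
        noise = rng.standard_normal(x.shape)
        x = x + delta * score_fn(x, sigma_cur) + np.sqrt(delta) * noise

        # Corrector: one Langevin step at the new noise level.
        eps = corrector_scale * sigma_next**2
        noise = rng.standard_normal(x.shape)
        x = x + eps * score_fn(x, sigma_next) + np.sqrt(2.0 * eps) * noise

    return x


# Toy usage: sample a 2D Gaussian with mean (1, -2) and standard deviation 0.5.
mu = np.array([1.0, -2.0])
sigmas = np.geomspace(10.0, 0.01, 100)  # geometric schedule, high to low noise
samples = pc_sample(lambda x, s: gaussian_score(x, s, mu, 0.5), sigmas, 2000, 2)
```

    Each iteration calls the score function twice, which is why the number of noise levels dominates inference cost in schemes of this kind.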

    In-situ PLL-g-PEG Functionalized Nanopore for Enhancing Protein Characterization

    Single-molecule nanopore detection technology has revolutionized proteomics research by enabling highly sensitive and label-free detection of individual proteins. Herein, we designed a small, portable, and leak-free flowcell made of PMMA for nanopore experiments. In addition, we developed an in situ PLL-g-PEG coating approach to produce non-sticky nanopores for measuring the volume of disease-relevant biomarkers, such as the Alpha-1 antitrypsin (AAT) protein. The in situ coating method allows continuous monitoring, ensuring adequate coating, and the coated nanopores can be used directly for translocation experiments. The coated nanopores exhibit improved characteristics, including an increased nanopore lifetime and enhanced translocation events of the AAT proteins. Furthermore, we demonstrated a reduction in the dwell time of translocation events, along with an increase in current blockade amplitudes and translocation numbers under different voltage stimuli. The study also successfully measures the volume of a single AAT protein (253 nm³), which closely aligns with the previously reported hydrodynamic volume. The real-time in situ PLL-g-PEG coating method and the developed nanopore flowcell hold great promise for various nanopore applications involving non-sticky single-molecule characterization.

    Dexterous Functional Pre-Grasp Manipulation with Diffusion Policy

    In real-world scenarios, objects often require repositioning and reorientation before they can be grasped, a process known as pre-grasp manipulation. Learning universal dexterous functional pre-grasp manipulation requires precise control over the relative position, orientation, and contact between the hand and object while generalizing to diverse dynamic scenarios with varying objects and goal poses. To address this challenge, we propose a teacher-student learning approach that utilizes a novel mutual reward, incentivizing agents to optimize three key criteria jointly. Additionally, we introduce a pipeline that employs a mixture-of-experts strategy to learn diverse manipulation policies, followed by a diffusion policy to capture complex action distributions from these experts. Our method achieves a success rate of 72.6% across more than 30 object categories by leveraging extrinsic dexterity and adjusting from feedback.

    Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation

    Visual-audio navigation (VAN) is attracting more and more attention from the robotics community due to its broad applications, e.g., household robots and rescue robots. In this task, an embodied agent must search for and navigate to the sound source using egocentric visual and audio observations. However, existing methods are limited in two respects: 1) poor generalization to unheard sound categories; 2) sample inefficiency in training. Focusing on these two problems, we propose a brain-inspired plug-and-play method to learn a semantic-agnostic and spatial-aware representation for generalizable visual-audio navigation. We design two auxiliary tasks, one for each of these characteristics, to accelerate the learning of such representations. With these two auxiliary tasks, the agent learns a spatially correlated representation of visual and audio inputs that can be applied to environments with novel sounds and maps. Experimental results on realistic 3D scenes (Replica and Matterport3D) demonstrate that our method achieves better generalization performance when zero-shot transferred to scenes with unseen maps and unheard sound categories.