INSTA-BEEER: Explicit Error Estimation and Refinement for Fast and Accurate Unseen Object Instance Segmentation
Efficient and accurate segmentation of unseen objects is crucial for robotic
manipulation. However, it remains challenging due to over- or
under-segmentation. Although existing refinement methods can enhance the
segmentation quality, they fix only minor boundary errors or are not
sufficiently fast. In this work, we propose INSTAnce Boundary Explicit Error
Estimation and Refinement (INSTA-BEEER), a novel refinement model that allows
for adding and deleting instances and sharpening boundaries. Leveraging an
error-estimation-then-refinement scheme, the model first estimates the
pixel-wise boundary explicit errors: true positive, true negative, false
positive, and false negative pixels of the instance boundary in the initial
segmentation. It then refines the initial segmentation using these error
estimates as guidance. Experiments show that the proposed model significantly
enhances segmentation, achieving state-of-the-art performance. Furthermore,
with a fast runtime (less than 0.1 s), the model consistently improves
performance across various initial segmentation methods, making it highly
suitable for practical robotic applications. Comment: 8 pages, 5 figures
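The four boundary error classes the abstract names can be illustrated with a minimal sketch. Note that at inference time INSTA-BEEER *estimates* these maps without ground truth; the function below only shows how the TP/TN/FP/FN taxonomy partitions the pixels when a reference boundary is available (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def boundary_error_maps(pred_boundary, gt_boundary):
    """Classify each pixel of an initial boundary prediction against a
    reference boundary, yielding the four maps used as refinement guidance.

    pred_boundary, gt_boundary: boolean arrays of shape (H, W).
    Returns a dict of boolean maps: TP, TN, FP, FN.
    """
    return {
        "TP": pred_boundary & gt_boundary,    # boundary correctly predicted
        "TN": ~pred_boundary & ~gt_boundary,  # non-boundary correctly predicted
        "FP": pred_boundary & ~gt_boundary,   # spurious boundary (over-segmentation)
        "FN": ~pred_boundary & gt_boundary,   # missed boundary (under-segmentation)
    }
```

The four maps are mutually exclusive and jointly cover every pixel, which is what lets the refiner treat them as dense per-pixel guidance.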
Grasp Stability Assessment Through Attention-Guided Cross-Modality Fusion and Transfer Learning
Extensive research has been conducted on assessing grasp stability, a crucial
prerequisite for achieving optimal grasping strategies, including the minimum
force grasping policy. However, existing works rely on basic feature-level
fusion to combine visual and tactile modalities, which underuses
complementary information and fails to model interactions between unimodal
features. This work proposes an attention-guided
cross-modality fusion architecture to comprehensively integrate visual and
tactile features. This model mainly comprises convolutional neural networks
(CNNs), self-attention, and cross-attention mechanisms. In addition, most
existing methods collect datasets from real-world systems, which is
time-consuming and costly and yields comparatively small datasets.
This work establishes a robotic grasping system through
physics simulation to collect a multimodal dataset. To address the sim-to-real
transfer gap, we propose a migration strategy encompassing domain randomization
and domain adaptation techniques. The experimental results demonstrate that the
proposed fusion framework improves prediction performance by approximately
10% over other baselines. Moreover, our findings suggest
that the trained model can be reliably transferred to real robotic systems,
indicating its potential to address real-world challenges. Comment: Accepted by IROS 202
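The cross-attention step described above lets tokens from one modality weight and aggregate features of the other. A minimal single-head sketch in NumPy (shapes, feature dimension, and names are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, Wq, Wk, Wv):
    """Single-head cross-attention: tokens from one modality (query)
    attend over tokens of the other modality (context)."""
    Q = query_feats @ Wq              # (Nq, d) queries from modality A
    K = context_feats @ Wk            # (Nc, d) keys from modality B
    V = context_feats @ Wv            # (Nc, d) values from modality B
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    attn = softmax(scores, axis=-1)   # each query token weights all context tokens
    return attn @ V                   # (Nq, d) fused representation

rng = np.random.default_rng(0)
d = 16
visual = rng.normal(size=(8, d))     # e.g. CNN patch features from the camera
tactile = rng.normal(size=(4, d))    # e.g. CNN features from the tactile sensor
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
fused_visual = cross_attention(visual, tactile, Wq, Wk, Wv)
```

Running the fusion in both directions (visual attending to tactile, and vice versa) is a common way to model the unimodal interactions the abstract says plain feature concatenation misses.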
DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation
Object manipulation in cluttered scenes is a difficult and important problem
in robotics. To efficiently manipulate objects, it is crucial to understand
their surroundings, especially in cases where multiple objects are stacked one
on top of the other, preventing effective grasping. Here we present DUQIM-Net,
a decision-making approach for object manipulation in a setting of stacked
objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using
Adj-Net, a model that leverages existing Transformer Encoder-Decoder object
detectors by adding an adjacency head. The output of this head
probabilistically infers the underlying hierarchical structure of the objects
in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net to
perform decision making and assist with object-grasping tasks. Our experimental
results show that Adj-Net surpasses the state-of-the-art in object-relationship
inference on the Visual Manipulation Relationship Dataset (VMRD), and that
DUQIM-Net outperforms comparable approaches in bin clearing tasks. Comment: 8 pages, 6 figures, 3 tables. Accepted to the 2022 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS 2022).
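To illustrate how a probabilistic adjacency matrix can drive grasp ordering in a stacked scene, here is a minimal greedy sketch (the indexing convention and policy are assumptions for illustration, not DUQIM-Net's actual decision procedure):

```python
import numpy as np

def grasp_order(adj):
    """adj[i, j] = estimated probability that object i rests on object j.
    Greedy bin-clearing policy: repeatedly grasp the object least likely
    to have anything on top of it, then remove it from the scene."""
    remaining = list(range(adj.shape[0]))
    order = []
    while remaining:
        # For each candidate j, the strongest evidence that something lies on it.
        blocked = [max((adj[i, j] for i in remaining if i != j), default=0.0)
                   for j in remaining]
        pick = remaining[int(np.argmin(blocked))]
        order.append(pick)
        remaining.remove(pick)
    return order
```

For a stack where object 0 sits on object 1, which sits on object 2, the policy clears the scene top-down (0, then 1, then 2), which is exactly the behavior needed to avoid toppling a pile during bin clearing.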