435 research outputs found
LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching
The local reference frame (LRF) plays a critical role in 3D local shape
description and matching. However, most existing LRFs are hand-crafted and
suffer from limited repeatability and robustness. This paper presents the first
attempt to learn an LRF via a Siamese network that requires only weak
supervision. In particular, we argue that each neighboring point on the local
surface makes a unique contribution to LRF construction and measure such contributions via
learned weights. Extensive analysis and comparative experiments on three public
datasets addressing different application scenarios have demonstrated that
LRF-Net is more repeatable and robust than several state-of-the-art LRF methods
(even though LRF-Net is trained on only one dataset). In addition, LRF-Net can
significantly boost the local shape description and 6-DoF pose estimation
performance when matching 3D point clouds. Comment: 28 pages, 14 figures
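The idea of weighting each neighbor's contribution to the LRF can be sketched with a weighted-covariance construction. This is a minimal illustration, not LRF-Net itself: here the per-point weights are simply given as input, whereas in LRF-Net they would be predicted by the network, and the axis and sign conventions below are generic choices.

```python
import numpy as np

def weighted_lrf(neighbors, center, weights):
    """Build a local reference frame from a weighted neighborhood.

    neighbors: (N, 3) points around `center`; weights: (N,) per-point
    contributions (learned in LRF-Net; supplied directly here).
    Returns a 3x3 matrix whose rows are the x, y, z axes.
    """
    d = neighbors - center                      # local displacement vectors
    w = weights / weights.sum()                 # normalize contributions
    cov = (d * w[:, None]).T @ d                # weighted covariance (3x3)
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    z = eigvecs[:, 0]                           # normal: smallest-eigenvalue axis
    # Fix the sign of z with an (arbitrary) convention based on the
    # weighted displacement sum, so the frame is repeatable.
    if np.dot(z, (w[:, None] * d).sum(axis=0)) > 0:
        z = -z
    # x-axis: weighted sum of displacements projected onto the tangent plane
    proj = d - np.outer(d @ z, z)
    x = (w[:, None] * proj).sum(axis=0)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)                          # completes a right-handed frame
    return np.stack([x, y, z])
```

Because x is built in the plane orthogonal to z and y completes the frame by a cross product, the result is always a right-handed orthonormal basis.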
Learn from Incomplete Tactile Data: Tactile Representation Learning with Masked Autoencoders
Missing signals caused by occluded objects or unstable sensors are a common
challenge during data collection. Such missing signals adversely affect the
results obtained from the data, and this issue is observed
more frequently in robotic tactile perception. In tactile perception, due to
the limited working space and the dynamic environment, the contact between the
tactile sensor and the object is frequently insufficient and unstable, which
causes partial signal loss and thus yields incomplete tactile data.
The resulting data therefore contain fewer tactile cues with low information
density. In this paper, we propose a tactile representation learning method,
named TacMAE, based on Masked Autoencoder to address the problem of incomplete
tactile data in tactile perception. In our framework, a portion of the tactile
image is masked out to simulate the missing contact region. By reconstructing
the missing signals in the tactile image, the trained model can achieve a
high-level understanding of surface geometry and tactile properties from
limited tactile cues. The experimental results of tactile texture recognition
show that our proposed TacMAE can achieve a high recognition accuracy of 71.4%
in the zero-shot transfer and 85.8% after fine-tuning, which are 15.2% and 8.2%
higher than the results without using masked modeling. The extensive
experiments on YCB objects demonstrate the knowledge transferability of our
proposed method and the potential to improve efficiency in tactile exploration. Comment: This paper is accepted at IROS 202
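The masking step that simulates missing contact regions can be sketched as MAE-style random patch masking. The patch size, 75% mask ratio, and zero fill below are generic MAE defaults used for illustration, not necessarily TacMAE's settings.

```python
import numpy as np

def random_mask_patches(image, patch=16, mask_ratio=0.75, seed=0):
    """Mask out a fraction of non-overlapping patches of a 2D image,
    simulating missing contact regions in a tactile image.

    Returns the masked image and a (grid_h, grid_w) boolean mask of
    which patches were hidden.
    """
    h, w = image.shape[:2]
    gh, gw = h // patch, w // patch             # patch grid dimensions
    n = gh * gw
    rng = np.random.default_rng(seed)
    hidden = rng.permutation(n)[: int(n * mask_ratio)]  # patches to hide
    masked = image.astype(float).copy()
    mask = np.zeros(n, dtype=bool)
    mask[hidden] = True
    for i in hidden:
        r, c = divmod(i, gw)                    # patch -> pixel coordinates
        masked[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = 0.0
    return masked, mask.reshape(gh, gw)
```

During pretraining, only the visible patches would be encoded and the decoder would be trained to reconstruct the hidden ones, which is what pushes the model toward a global understanding of surface geometry from partial cues.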
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
The upscaling of Large Language Models (LLMs) has yielded impressive advances
in natural language processing, yet it also poses significant deployment
challenges. Weight quantization has emerged as a widely embraced solution to
reduce memory and computational demands. This paper introduces BitDistiller, a
framework that synergizes Quantization-Aware Training (QAT) with Knowledge
Distillation (KD) to boost the performance of LLMs at ultra-low precisions
(sub-4-bit). Specifically, BitDistiller first incorporates a tailored
asymmetric quantization and clipping technique to maximally preserve the
fidelity of quantized weights, and then proposes a novel Confidence-Aware
Kullback-Leibler Divergence (CAKLD) objective, which is employed in a
self-distillation manner to enable faster convergence and superior model
performance. Empirical evaluations demonstrate that BitDistiller significantly
surpasses existing methods in both 3-bit and 2-bit configurations on general
language understanding and complex reasoning benchmarks. Notably, BitDistiller
is shown to be more cost-effective, demanding less data and fewer training
resources. The code is available at https://github.com/DD-DuDa/BitDistiller
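The asymmetric-quantization-with-clipping step can be illustrated with a simple round-to-nearest scheme. This is only a generic sketch: BitDistiller's clipping is a tailored technique, whereas the fixed percentile threshold here is an illustrative stand-in.

```python
import numpy as np

def asym_quantize(w, bits=3, clip_pct=0.995):
    """Asymmetric low-bit quantization with percentile clipping.

    Clips outlier weights, maps the remaining range to 2**bits integer
    levels with a zero-point offset, and returns the dequantized
    (simulated-quantization) weights.
    """
    lo = np.quantile(w, 1 - clip_pct)           # clip extreme tails
    hi = np.quantile(w, clip_pct)
    wc = np.clip(w, lo, hi)
    levels = 2 ** bits - 1                      # e.g. 7 steps for 3-bit
    scale = (hi - lo) / levels                  # asymmetric range [lo, hi]
    zero = np.round(-lo / scale)                # integer zero-point
    q = np.clip(np.round(wc / scale) + zero, 0, levels)
    return (q - zero) * scale                   # dequantized weights
```

Clipping shrinks the quantization step `scale`, trading a little error on rare outliers for much finer resolution on the bulk of the weights, which is why it helps preserve fidelity at sub-4-bit precision.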