78 research outputs found
Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation
Many real-world manipulation tasks consist of a series of subtasks that are
significantly different from one another. Such long-horizon, complex tasks
highlight the potential of dexterous hands, which possess adaptability and
versatility, capable of seamlessly transitioning between different modes of
functionality without the need for re-grasping or external tools. However,
challenges arise from the high-dimensional action space of dexterous hands and
the complex compositional dynamics of long-horizon tasks. We present Sequential
Dexterity, a general system based on reinforcement learning (RL) that chains
multiple dexterous policies for achieving long-horizon task goals. The core of
the system is a transition feasibility function that progressively finetunes
the sub-policies to improve the chaining success rate, while also enabling
autonomous policy-switching for recovery from failures and bypassing redundant
stages. Despite being trained only in simulation with a few task objects, our
system demonstrates generalization capability to novel object shapes and is
able to zero-shot transfer to a real-world robot equipped with a dexterous
hand. More details and video results can be found at
https://sequential-dexterity.github.io
Comment: CoRL 202
Spiking PointNet: Spiking Neural Networks for Point Clouds
Recently, Spiking Neural Networks (SNNs), enjoying extreme energy efficiency,
have drawn much research attention on 2D visual recognition and shown gradually
increasing application potential. However, it remains underexplored
whether SNNs can be generalized to 3D recognition. To this end, we present
Spiking PointNet in this paper, the first spiking neural model for efficient
deep learning on point clouds. We find that the two major obstacles limiting
the application of SNNs in point clouds are: the intrinsic optimization
obstacle of SNNs that impedes the training of a big spiking model with large
time steps, and the expensive memory and computation cost of PointNet that
makes training a big spiking point model unrealistic. To solve the problems
simultaneously, we present a trained-less but learning-more paradigm for
Spiking PointNet with theoretical justifications and in-depth experimental
analysis. Specifically, our Spiking PointNet is trained with only a single time
step but obtains better performance with multi-time-step inference than a
model trained directly with multiple time steps. We conduct
various experiments on ModelNet10 and ModelNet40 to demonstrate the effectiveness
of Spiking PointNet. Notably, our Spiking PointNet can even outperform its ANN
counterpart, which is rare in the SNN field, thus providing a potential research
direction for future work. Moreover, Spiking PointNet shows impressive
speedup and storage savings in the training phase.
Comment: Accepted by NeurIP
Teacher-Students Knowledge Distillation for Siamese Trackers
In recent years, Siamese network based trackers have significantly advanced
the state-of-the-art in real-time tracking. However, state-of-the-art Siamese
trackers suffer from high memory cost, which restricts their applicability in
mobile applications with strict constraints on memory budget. To address this
issue, we propose a novel distilled Siamese tracking framework to learn small,
fast yet accurate trackers (students), which capture critical knowledge from
large Siamese trackers (teachers) by a teacher-students knowledge distillation
model. This model is inspired by a one-teacher, multi-student
learning mechanism, the most common teaching arrangement in schools. In
particular, it contains a single teacher-student distillation model and a
student-student knowledge sharing mechanism. The former is designed with a
tracking-specific distillation strategy to transfer knowledge from the teacher
to the students. The latter is used for mutual learning between students to
enable an in-depth knowledge understanding. To the best of our knowledge, we
are the first to investigate knowledge distillation for Siamese trackers and
propose a distilled Siamese tracking framework. We demonstrate the generality
and effectiveness of our framework by conducting a theoretical analysis and
extensive empirical evaluations on several popular Siamese trackers. The
results on five tracking benchmarks clearly show that the proposed distilled
trackers achieve compression rates up to 18 and frame-rates of
FPS with speedups of 3, while obtaining similar or even slightly
improved tracking accuracy
Research on the economic security application of energy economy in a low-carbon sustainable development society
Research on the economic security application of energy economy in a low-carbon sustainable development society is an important research field. Its purpose is to explore how to achieve the safe development of the national economy in the context of low-carbon sustainable development, including economic structural adjustment, green technology innovation, resource conservation and recycling, and environmental protection. This article explores how to ensure the green and sustainable development of energy security and how to assess the security risks of the green energy economy.
Membrane Potential Batch Normalization for Spiking Neural Networks
As one of the energy-efficient alternatives to conventional neural networks
(CNNs), spiking neural networks (SNNs) have gained increasing interest
recently. To train deep models, several effective batch normalization (BN)
techniques have been proposed for SNNs. All of these BNs are meant to be used
after the convolution layer, as is usual in CNNs. However, the spiking neuron is
much more complex because of its spatio-temporal dynamics. The regulated data flow
after the BN layer will be disturbed again by the membrane potential updating
operation before the firing function, i.e., the nonlinear activation.
Therefore, we advocate adding another BN layer before the firing function to
normalize the membrane potential again, called MPBN. To eliminate the induced
time cost of MPBN, we also propose a training-inference-decoupled
re-parameterization technique to fold the trained MPBN into the firing
threshold. With the re-parameterization technique, the MPBN does not introduce
any extra time burden at inference. Furthermore, the MPBN can also adopt
an element-wise form, while the BNs after the convolution layer can only
use the channel-wise form. Experimental results show that the proposed MPBN
performs well on both popular non-spiking static and neuromorphic datasets. Our
code is open-sourced at https://github.com/yfguo91/MPBN.
Comment: Accepted by ICCV202
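The re-parameterization described in this abstract can be illustrated with a small sketch. The abstract does not give the formula, so the following is an assumption: it uses a standard inference-time BN with frozen statistics and a positive scale `gamma`, and all function and parameter names (`fold_bn_into_threshold`, `v_th`, and so on) are hypothetical rather than taken from the authors' code. Under those assumptions, the test "BN(u) >= v_th" is equivalent to comparing the raw membrane potential `u` against a shifted per-channel threshold.

```python
import numpy as np

def fire_with_bn(u, gamma, beta, mean, var, v_th, eps=1e-5):
    """Normalize the membrane potential with frozen BN statistics, then fire."""
    y = gamma * (u - mean) / np.sqrt(var + eps) + beta
    return (y >= v_th).astype(np.float32)

def fold_bn_into_threshold(gamma, beta, mean, var, v_th, eps=1e-5):
    """Return a per-channel threshold equivalent to BN followed by v_th.

    Assumes gamma > 0; for a negative gamma the inequality would flip.
    """
    if np.any(gamma <= 0):
        raise ValueError("this sketch assumes a strictly positive BN scale")
    return mean + (v_th - beta) * np.sqrt(var + eps) / gamma

def fire_with_folded_threshold(u, folded_th):
    """Fire by comparing the raw membrane potential to the folded threshold."""
    return (u >= folded_th).astype(np.float32)

# Hypothetical shapes: a batch of 1000 samples with 8 channels.
rng = np.random.default_rng(0)
u = rng.normal(size=(1000, 8))
gamma = rng.uniform(0.5, 1.5, size=8)
beta = rng.normal(scale=0.1, size=8)
mean = rng.normal(size=8)
var = rng.uniform(0.5, 2.0, size=8)

folded = fold_bn_into_threshold(gamma, beta, mean, var, v_th=1.0)
spikes_bn = fire_with_bn(u, gamma, beta, mean, var, v_th=1.0)
spikes_folded = fire_with_folded_threshold(u, folded)
```

In this sketch the folded path needs one comparison per neuron at inference and no normalization arithmetic, which is the kind of saving the abstract attributes to the technique.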
Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation
The Segment Anything Model (SAM) has recently gained popularity in the field
of image segmentation. Thanks to its impressive capabilities in all-round
segmentation tasks and its prompt-based interface, SAM has sparked intensive
discussion within the community. Many prominent experts have even said
that the image segmentation task has been "finished" by SAM. However, medical image
segmentation, although an important branch of the image segmentation family,
seems not to be included in the scope of Segmenting "Anything". Many individual
experiments and recent studies have shown that SAM performs poorly in medical
image segmentation. A natural question is how to find the missing piece of the
puzzle to extend the strong segmentation capability of SAM to medical image
segmentation. In this paper, instead of fine-tuning the SAM model, we propose
Med SAM Adapter, which integrates medical-specific domain knowledge into the
segmentation model through a simple yet effective adaptation technique. Although
this work is still one of only a few to transfer the popular NLP technique Adapter
to computer vision, this simple implementation shows surprisingly good
performance on medical image segmentation. A medical-image-adapted SAM, which
we have dubbed Medical SAM Adapter (MSA), shows superior performance on 19
medical image segmentation tasks with various image modalities including CT,
MRI, ultrasound, fundus, and dermoscopic images. MSA outperforms a
wide range of state-of-the-art (SOTA) medical image segmentation methods, such
as nnUNet, TransUNet, UNetr, MedSegDiff, and also outperforms the fully
fine-tuned MedSAM with a considerable performance gap. Code will be released
at: https://github.com/WuJunde/Medical-SAM-Adapter
RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks
Spiking Neural Networks (SNNs), as one of the biology-inspired models, have
received much attention recently. They can significantly reduce energy
consumption because they quantize the real-valued membrane potentials to 0/1
spikes to transmit information, so the multiplications of activations and
weights can be replaced by additions when implemented on hardware. However,
this quantization mechanism inevitably introduces quantization error,
causing catastrophic information loss. To address the quantization error
problem, we propose a regularizing membrane potential loss (RMP-Loss) to adjust
the membrane potential distribution, which is directly related to the
quantization error, to a range close to the spikes. Our method is extremely
simple to implement and makes training an SNN straightforward. Furthermore, it
is shown to consistently outperform previous state-of-the-art methods over
different network architectures and datasets.
Comment: Accepted by ICCV202
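The claim that 0/1 spikes let multiplications be replaced by additions can be checked with a minimal sketch. This is an illustration only, not the authors' implementation, and the function names are hypothetical: with a binary spike vector, a weighted sum reduces to adding up the weight columns whose input neuron fired.

```python
import numpy as np

def synaptic_input_multiply(weights, spikes):
    """Dense reference: multiply every weight by its 0/1 activation."""
    return weights @ spikes

def synaptic_input_accumulate(weights, spikes):
    """Multiplication-free version: add only the columns of active inputs."""
    active = np.flatnonzero(spikes)
    return weights[:, active].sum(axis=1)

# Hypothetical layer: 4 output neurons, 6 input neurons, 3 of which fire.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 6))
spikes = np.array([1, 0, 0, 1, 1, 0])

dense = synaptic_input_multiply(weights, spikes)
accumulated = synaptic_input_accumulate(weights, spikes)
```

The two results agree up to floating-point rounding, and the second path touches only the weights of neurons that fired, which is where the energy saving on hardware comes from.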
Novel theranostic nanoporphyrins for photodynamic diagnosis and trimodal therapy for bladder cancer
The overall prognosis of bladder cancer has not improved over the last 30 years; therefore, there is a great medical need to develop novel diagnostic and therapeutic approaches for bladder cancer. We developed a multifunctional nanoporphyrin platform that was coated with a bladder cancer-specific ligand named PLZ4. PLZ4-nanoporphyrin (PNP) integrates photodynamic diagnosis, image-guided photodynamic therapy, photothermal therapy and targeted chemotherapy in a single procedure. PNPs are spherical, relatively small (around 23 nm), and have the ability to preferably emit fluorescence/heat/reactive oxygen species upon illumination with near infrared light. Doxorubicin (DOX) loaded PNPs possess slower drug release and dramatically longer systemic circulation time compared to free DOX. The fluorescence signal of PNPs efficiently and selectively increased in bladder cancer cells but not in normal urothelial cells, both in vitro and in orthotopic patient-derived bladder cancer xenograft (PDX) models, indicating their great potential for photodynamic diagnosis. Photodynamic therapy with PNPs was significantly more potent than 5-aminolevulinic acid, and eliminated orthotopic PDX bladder cancers after intravesical treatment. Image-guided photodynamic and photothermal therapies synergized with targeted chemotherapy of DOX and significantly prolonged overall survival of mice carrying PDXs. In conclusion, this uniquely engineered targeting PNP selectively targeted tumor cells for photodynamic diagnosis, and served as an effective triple-modality (photodynamic/photothermal/chemo) therapeutic agent against bladder cancers. This platform can be easily adapted to individualized medicine in a clinical setting and has tremendous potential to improve the management of bladder cancer in the clinic.