78 research outputs found

    Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation

    Many real-world manipulation tasks consist of a series of subtasks that are significantly different from one another. Such long-horizon, complex tasks highlight the potential of dexterous hands, which possess adaptability and versatility and can seamlessly transition between different modes of functionality without the need for re-grasping or external tools. However, challenges arise from the high-dimensional action space of dexterous hands and the complex compositional dynamics of long-horizon tasks. We present Sequential Dexterity, a general system based on reinforcement learning (RL) that chains multiple dexterous policies to achieve long-horizon task goals. The core of the system is a transition feasibility function that progressively finetunes the sub-policies to enhance the chaining success rate, while also enabling autonomous policy-switching for recovery from failures and bypassing redundant stages. Despite being trained only in simulation with a few task objects, our system demonstrates generalization capability to novel object shapes and is able to zero-shot transfer to a real-world robot equipped with a dexterous hand. More details and video results can be found at https://sequential-dexterity.github.io. Comment: CoRL 202

    Spiking PointNet: Spiking Neural Networks for Point Clouds

    Recently, Spiking Neural Networks (SNNs), which offer extreme energy efficiency, have drawn much research attention in 2D visual recognition and shown gradually increasing application potential. However, whether SNNs can be generalized to 3D recognition remains underexplored. To this end, we present Spiking PointNet, the first spiking neural model for efficient deep learning on point clouds. We find that two major obstacles limit the application of SNNs to point clouds: the intrinsic optimization obstacle of SNNs, which impedes the training of a big spiking model with large time steps, and the expensive memory and computation cost of PointNet, which makes training a big spiking point model unrealistic. To solve both problems simultaneously, we present a trained-less but learning-more paradigm for Spiking PointNet with theoretical justifications and in-depth experimental analysis. Specifically, our Spiking PointNet is trained with only a single time step but obtains better performance with multiple-time-step inference than a model trained directly with multiple time steps. We conduct various experiments on ModelNet10 and ModelNet40 to demonstrate the effectiveness of Spiking PointNet. Notably, our Spiking PointNet can even outperform its ANN counterpart, which is rare in the SNN field, thus providing a potential research direction for future work. Moreover, Spiking PointNet shows impressive speedup and storage saving in the training phase. Comment: Accepted by NeurIP

    Teacher-Students Knowledge Distillation for Siamese Trackers

    In recent years, Siamese network based trackers have significantly advanced the state-of-the-art in real-time tracking. However, state-of-the-art Siamese trackers suffer from high memory cost, which restricts their applicability in mobile applications with strict constraints on memory budget. To address this issue, we propose a novel distilled Siamese tracking framework to learn small, fast yet accurate trackers (students), which capture critical knowledge from large Siamese trackers (teachers) through a teacher-students knowledge distillation model. This model is inspired by the one-teacher, multi-student learning mechanism, which is the most common teaching method in schools. In particular, it contains a single teacher-student distillation model and a student-student knowledge sharing mechanism. The former uses a tracking-specific distillation strategy to transfer knowledge from the teacher to the students. The latter is used for mutual learning between students to enable an in-depth understanding of the knowledge. To the best of our knowledge, we are the first to investigate knowledge distillation for Siamese trackers and propose a distilled Siamese tracking framework. We demonstrate the generality and effectiveness of our framework by conducting a theoretical analysis and extensive empirical evaluations on several popular Siamese trackers. The results on five tracking benchmarks clearly show that the proposed distilled trackers achieve compression rates of up to 18× and frame rates of 265 FPS with speedups of 3×, while obtaining similar or even slightly improved tracking accuracy.

    Research on the economic security application of energy economy in a low-carbon sustainable development society

    The economic security application of energy economy in a low-carbon sustainable development society is an important research field. Its purpose is to explore how to achieve the safe development of the national economy in the context of low-carbon sustainable development, including economic structural adjustment, green technology innovation, resource conservation and recycling, and environmental protection. This article explores how to ensure the green and sustainable development of energy security, and the security risk assessment of the green energy economy.

    Membrane Potential Batch Normalization for Spiking Neural Networks

    As one of the energy-efficient alternatives to conventional neural networks (CNNs), spiking neural networks (SNNs) have gained more and more interest recently. To train deep models, some effective batch normalization (BN) techniques have been proposed for SNNs. All of these BNs are meant to be used after the convolution layer, as is usually done in CNNs. However, the spiking neuron is much more complex because of its spatio-temporal dynamics. The regulated data flow after the BN layer is disturbed again by the membrane potential updating operation before the firing function, i.e., the nonlinear activation. Therefore, we advocate adding another BN layer before the firing function to normalize the membrane potential again, called MPBN. To eliminate the induced time cost of MPBN, we also propose a training-inference-decoupled re-parameterization technique to fold the trained MPBN into the firing threshold. With the re-parameterization technique, MPBN does not introduce any extra time burden at inference. Furthermore, MPBN can also adopt the element-wise form, while the BNs after the convolution layer can only use the channel-wise form. Experimental results show that the proposed MPBN performs well on both popular non-spiking static and neuromorphic datasets. Our code is open-sourced at https://github.com/yfguo91/MPBN. Comment: Accepted by ICCV202
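
    The folding step mentioned in this abstract can be illustrated with a small sketch. It is not taken from the authors' repository: it assumes the usual BN affine form with a positive scale, and every name here (`bn`, `fold_bn_into_threshold`, `gamma`, `beta`, `mu`, `var`, `eps`, `v_th`) is hypothetical. Under those assumptions, thresholding the normalized potential is equivalent to thresholding the raw potential against an adjusted threshold, so no BN computation is needed at inference.

```python
import math


def bn(u, gamma, beta, mu, var, eps=1e-5):
    """Standard batch-norm affine transform applied to a membrane potential u."""
    return gamma * (u - mu) / math.sqrt(var + eps) + beta


def fold_bn_into_threshold(v_th, gamma, beta, mu, var, eps=1e-5):
    """Return a threshold on the raw potential equivalent to bn(u) >= v_th.

    Assumption of this sketch: gamma > 0. With a negative scale the
    inequality would flip, which is not handled here.
    """
    if gamma <= 0:
        raise ValueError("this sketch assumes a positive BN scale")
    return (v_th - beta) * math.sqrt(var + eps) / gamma + mu


def fires_training(u, v_th, **bn_params):
    """Training-time rule: normalize the potential, then compare to v_th."""
    return bn(u, **bn_params) >= v_th


def fires_inference(u, folded_v_th):
    """Inference-time rule: compare the raw potential to the folded threshold."""
    return u >= folded_v_th
```

    Solving gamma * (u - mu) / sqrt(var + eps) + beta >= v_th for u gives the folded threshold directly, which is why the two firing rules agree for any potential away from the exact boundary.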

    Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation

    The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation. Thanks to its impressive capabilities in all-round segmentation tasks and its prompt-based interface, SAM has sparked intensive discussion within the community. Many prestigious experts have even said that the image segmentation task has been "finished" by SAM. However, medical image segmentation, although an important branch of the image segmentation family, seems not to be included in the scope of Segmenting "Anything". Many individual experiments and recent studies have shown that SAM performs subpar in medical image segmentation. A natural question is how to find the missing piece of the puzzle to extend the strong segmentation capability of SAM to medical image segmentation. In this paper, instead of fine-tuning the SAM model, we propose Med SAM Adapter, which integrates medical-specific domain knowledge into the segmentation model through a simple yet effective adaptation technique. Although this work is still one of only a few to transfer the popular NLP technique Adapter to computer vision cases, this simple implementation shows surprisingly good performance on medical image segmentation. The medical-image-adapted SAM, which we have dubbed Medical SAM Adapter (MSA), shows superior performance on 19 medical image segmentation tasks with various image modalities, including CT, MRI, ultrasound, fundus, and dermoscopic images. MSA outperforms a wide range of state-of-the-art (SOTA) medical image segmentation methods, such as nnUNet, TransUNet, UNetr, and MedSegDiff, and also outperforms the fully fine-tuned MedSAM by a considerable performance gap. Code will be released at: https://github.com/WuJunde/Medical-SAM-Adapter

    RMP-Loss: Regularizing Membrane Potential Distribution for Spiking Neural Networks

    Spiking Neural Networks (SNNs), as one family of biology-inspired models, have received much attention recently. They can significantly reduce energy consumption because they quantize the real-valued membrane potentials to 0/1 spikes to transmit information, so the multiplications of activations and weights can be replaced by additions when implemented on hardware. However, this quantization mechanism inevitably introduces quantization error, thus causing catastrophic information loss. To address the quantization error problem, we propose a regularizing membrane potential loss (RMP-Loss) to adjust the membrane potential distribution, which is directly related to the quantization error, to a range close to the spikes. Our method is extremely simple to implement and makes training an SNN straightforward. Furthermore, it is shown to consistently outperform previous state-of-the-art methods over different network architectures and datasets. Comment: Accepted by ICCV202
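
    The claim that 0/1 spikes let multiplications be replaced by additions can be shown with a toy example. This is only an illustrative sketch, not the RMP-Loss method itself; the function names and the threshold value `v_th=1.0` are assumptions made for the example.

```python
def quantize(membrane_potentials, v_th=1.0):
    """Quantize real-valued membrane potentials to 0/1 spikes (hypothetical threshold)."""
    return [1 if u >= v_th else 0 for u in membrane_potentials]


def dense_mac(weights, activations):
    """Weighted sum with general activations: one multiplication per input."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def spike_accumulate(weights, spikes):
    """Same weighted sum for 0/1 spikes: only add the weights of inputs that fired."""
    return [sum(w for w, s in zip(row, spikes) if s) for row in weights]
```

    Because each spike is either 0 or 1, the product of a weight and a spike is either zero or the weight itself, so the accumulate-only version produces the same result as the multiply-accumulate version without any multiplications.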

    Novel theranostic nanoporphyrins for photodynamic diagnosis and trimodal therapy for bladder cancer

    The overall prognosis of bladder cancer has not improved over the last 30 years, and therefore there is a great medical need to develop novel diagnosis and therapy approaches for bladder cancer. We developed a multifunctional nanoporphyrin platform that was coated with a bladder cancer-specific ligand named PLZ4. PLZ4-nanoporphyrin (PNP) integrates photodynamic diagnosis, image-guided photodynamic therapy, photothermal therapy and targeted chemotherapy in a single procedure. PNPs are spherical, relatively small (around 23 nm), and have the ability to preferably emit fluorescence/heat/reactive oxygen species upon illumination with near infrared light. Doxorubicin (DOX) loaded PNPs possess slower drug release and dramatically longer systemic circulation time compared to free DOX. The fluorescence signal of PNPs efficiently and selectively increased in bladder cancer cells but not normal urothelial cells, both in vitro and in orthotopic patient-derived bladder cancer xenograft (PDX) models, indicating their great potential for photodynamic diagnosis. Photodynamic therapy with PNPs was significantly more potent than 5-aminolevulinic acid, and eliminated orthotopic PDX bladder cancers after intravesical treatment. Image-guided photodynamic and photothermal therapies synergized with targeted chemotherapy of DOX and significantly prolonged overall survival of mice carrying PDXs. In conclusion, this uniquely engineered targeting PNP selectively targeted tumor cells for photodynamic diagnosis, and served as an effective triple-modality (photodynamic/photothermal/chemo) therapeutic agent against bladder cancers. This platform can be easily adapted to individualized medicine in a clinical setting and has tremendous potential to improve the management of bladder cancer in the clinic.