    TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers

    CutMix is a popular augmentation technique commonly used for training modern convolutional and transformer vision networks. It was originally designed to encourage Convolutional Neural Networks (CNNs) to focus more on an image's global context instead of local information, which greatly improves the performance of CNNs. However, we found it to have limited benefits for transformer-based architectures, which naturally have a global receptive field. In this paper, we propose a novel data augmentation technique, TokenMix, to improve the performance of vision transformers. TokenMix mixes two images at the token level by partitioning the mixing region into multiple separated parts. In addition, we show that the mixed learning target in CutMix, a linear combination of a pair of ground truth labels, might be inaccurate and sometimes counter-intuitive. To obtain a more suitable target, we propose to assign the target score according to the content-based neural activation maps of the two images from a pre-trained teacher model, which does not need to have high performance. With extensive experiments on various vision transformer architectures, we show that TokenMix helps vision transformers focus on the foreground area to infer the classes and enhances their robustness to occlusion, with consistent performance gains. Notably, we improve DeiT-T/S/B by +1% ImageNet top-1 accuracy. TokenMix also benefits from longer training, achieving 81.2% top-1 accuracy on ImageNet with DeiT-S trained for 400 epochs. Code is available at https://github.com/Sense-X/TokenMix. Comment: ECCV 2022.
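    The token-level mixing and activation-based target described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `tokenmix`, its parameters, the uniformly random choice of tokens (the abstract only says the mixing region is split into multiple separated parts), and the renormalisation of the two scores are all assumptions made for illustration. The activation maps stand in for the output of a pre-trained teacher model and are assumed to be strictly positive.

```python
import numpy as np

def tokenmix(img_a, img_b, act_a, act_b, patch=4, ratio=0.5, rng=None):
    """Illustrative token-level mix of two images (hypothetical helper).

    img_a, img_b: (H, W, C) arrays, with H and W divisible by `patch`.
    act_a, act_b: (H // patch, W // patch) positive activation maps,
                  assumed to come from a pre-trained teacher.
    Returns the mixed image, the boolean token mask (True = token taken
    from img_b), and the label weight assigned to img_a.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img_a.shape[0] // patch, img_a.shape[1] // patch

    # Assumption: pick tokens uniformly at random, which yields a
    # scattered mask rather than a single CutMix-style rectangle.
    n_b = int(round(ratio * h * w))
    mask = np.zeros(h * w, dtype=bool)
    mask[rng.choice(h * w, size=n_b, replace=False)] = True
    mask = mask.reshape(h, w)

    # Expand the token mask to pixel resolution and mix the images.
    pixel_mask = np.repeat(np.repeat(mask, patch, axis=0), patch, axis=1)
    mixed = np.where(pixel_mask[..., None], img_b, img_a)

    # Score each image by the share of its activation that survives
    # in the mixed image, instead of by the area ratio used in CutMix.
    score_a = act_a[~mask].sum() / act_a.sum()
    score_b = act_b[mask].sum() / act_b.sum()
    lam_a = score_a / (score_a + score_b)  # assumption: renormalise to sum to 1
    return mixed, mask, lam_a
```

    With uniform activation maps the weight reduces to the CutMix area ratio; when the teacher's activation is concentrated on tokens that remain visible, the corresponding label receives more weight.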

    Representation Disparity-aware Distillation for 3D Object Detection

    In this paper, we focus on developing knowledge distillation (KD) for compact 3D detectors. We observe that off-the-shelf KD methods are effective only when the teacher model and its student counterpart share similar intermediate feature representations. This might explain why they are less effective in building extremely compact 3D detectors, where significant representation disparity arises primarily from the intrinsic sparsity and irregularity of 3D point clouds. This paper presents a novel representation disparity-aware distillation (RDD) method to address the representation disparity issue and reduce the performance gap between compact students and over-parameterized teachers. This is accomplished by building RDD from the perspective of the information bottleneck (IB), which can effectively minimize the disparity of proposal region pairs from student and teacher in features and logits. Extensive experiments demonstrate the superiority of RDD over existing KD methods. For example, RDD increases the mAP of CP-Voxel-S to 57.1% on the nuScenes dataset, which even surpasses teacher performance while using only 42% of the FLOPs. Comment: Accepted by ICCV 2023. arXiv admin note: text overlap with arXiv:2205.15156 by other authors

    Construction and Performance of Quantum Burst Error Correction Codes for Correlated Errors

    © 2018 IEEE. In practical communication and computation systems, errors occur predominantly in adjacent positions rather than in a random manner. In this paper, we develop a stabilizer formalism for quantum burst error correction codes (QBECCs) to combat such error patterns in the quantum regime. Our contributions are as follows. Firstly, we derive an upper bound on the correctable burst errors of QBECCs, the quantum Reiger bound (QRB). Secondly, we propose two constructions of QBECCs: one by heuristic computer search and the other by concatenating two quantum tensor product codes (QTPCs). We obtain several new QBECCs with better parameters than existing codes of the same length. Moreover, some of the constructed codes saturate the quantum Reiger bound. Finally, we perform numerical experiments for our constructed codes over Markovian correlated depolarizing quantum memory channels, and show that QBECCs indeed outperform standard QECCs in this scenario.

    DocStormer: Revitalizing Multi-Degraded Colored Document Images to Pristine PDF

    When capturing colored document images, e.g. posters and magazines, multiple degradations such as shadows and wrinkles are commonly introduced at the same time by external factors. Restoring multi-degraded colored document images is a great yet overlooked challenge, as most existing algorithms focus on enhancing document images via binarization, which ignores color. Thus, we propose DocStormer, a novel algorithm designed to restore multi-degraded colored documents to their potential pristine PDF. The contributions are as follows. Firstly, we propose a "Perceive-then-Restore" paradigm with a reinforced transformer block, which more effectively encodes and utilizes the distribution of degradations. Secondly, we are the first to utilize a GAN and pristine PDF magazine images to narrow the distribution gap between the enhanced results and PDF images, in pursuit of less degradation and better visual quality. Thirdly, we propose a non-parametric strategy, PFILI, which enables a smaller training scale and larger testing resolutions with an acceptable detail trade-off, while saving memory and inference time. Fourthly, we are the first to propose a Multi-Degraded Colored Document image Enhancing dataset, named MD-CDE, for both training and evaluation. Experimental results show that DocStormer exhibits superior performance and is capable of revitalizing multi-degraded colored documents into their potential pristine digital versions, which fills the current academic gap from the perspective of method, data, and task.

    CapsuleBot: A Novel Compact Hybrid Aerial-Ground Robot with Two Actuated-wheel-rotors

    This paper presents the design, modeling, and experimental validation of CapsuleBot, a compact hybrid aerial-ground vehicle designed for long-term covert reconnaissance. CapsuleBot combines the manoeuvrability of a bicopter in the air with the energy efficiency and noise reduction of ground vehicles on the ground. To accomplish this, a structure named the actuated-wheel-rotor has been designed, which uses a single motor for both the unilateral rotor tilting in the bicopter configuration and the wheel movement in ground mode. CapsuleBot is equipped with two of these structures, enabling it to attain hybrid aerial-ground propulsion with just four motors. Importantly, the decoupling of motion modes is achieved without the need for additional drivers, enhancing the versatility and robustness of the system. Furthermore, we have designed the full dynamics and control for aerial and ground locomotion based on the bicopter model and the two-wheeled self-balancing vehicle model. The performance of CapsuleBot has been validated through experiments. The results demonstrate that CapsuleBot produces 40.53% less noise in ground mode and consumes 99.35% less energy, highlighting its potential for long-term covert reconnaissance applications. Comment: 7 pages, 10 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA). This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Hydrogen jet diffusion modeling by using physics-informed graph neural network and sparsely-distributed sensor data

    Efficient modeling of jet diffusion during accidental release is critical for the operation and maintenance management of hydrogen facilities. Deep learning has proven effective for concentration prediction in gas jet diffusion scenarios. Nonetheless, its reliance on extensive simulations as training data and its potential disregard for physical laws limit its applicability to unseen accidental scenarios. Recently, physics-informed neural networks (PINNs) have emerged to reconstruct spatial information from sparsely-distributed sensor data, which are easily collected in real-world applications. However, prevailing approaches use a fully-connected neural network as the backbone without considering the spatial dependency of sensor data, which reduces the accuracy of concentration prediction. This study introduces a physics-informed graph deep learning approach (Physic_GNN) for efficient and accurate hydrogen jet diffusion prediction using sparsely-distributed sensor data. A graph neural network (GNN) models the spatial dependency of the sensor data through graph nodes, at which the governing equations describing the physical law of hydrogen jet diffusion are solved directly. The computed residuals are then applied to constrain the training process. Public experimental data of a hydrogen jet are used to compare the accuracy and efficiency of the proposed Physic_GNN with a state-of-the-art PINN. The results demonstrate that, given sparse concentration data, Physic_GNN predicts centerline concentration with higher accuracy and physical consistency than the PINN and is more efficient than OpenFOAM. The proposed approach enables accurate and robust real-time spatial consequence reconstruction and analysis of the underlying physical mechanisms using sparse sensor data.

    Effect of potassium simplex optimization medium (KSOM) and embryo screening on the production of human lactoferrin transgenic cloned dairy goats

    In this study, we produced cloned transgenic dairy goats by nuclear transfer (NT), using dairy goat ear skin fibroblasts modified with the human lactoferrin (hLF) gene as donor cells. The developmental competence of NT embryos was compared between two embryo culture media, potassium simplex optimization medium (KSOM) and tissue culture medium (TCM 199), and between different classes of NT embryos (48 h after fusion). In the first experiment, we cultured NT embryos to the cleavage stage (48 h after fusion) in either TCM 199 supplemented with 1 mg/ml bovine serum albumin (BSA) or KSOM, then used TCM 199 supplemented with 10% FBS to culture them to the blastula stage. The results show that NT embryos in KSOM (19.5%) were superior to those in TCM 199 (10.6%) in blastulation. In the second experiment, we found that the growth rate of NT embryos (48 h after fusion) varied, so we divided them under a stereo microscope into four groups (2-cell, 3- to 4-cell, 5- to 8-cell and >8-cell) and cultured each group separately in vitro. The results show that day-2 embryos at the 3- to 4-cell and 5- to 8-cell stages (31.9 and 28.2%, P < 0.05) had higher blastocyst formation rates than those at the 2-cell (9.1%) and >8-cell (8.3%) stages, and three healthy cloned transgenic goats were finally produced using 3- to 8-cell embryos at day 2 (82%). Using Hoechst 33342 staining, we also found that the >8-cell embryos at day 2 showed a higher frequency of fragmentation, which may be one cause of their low blastocyst formation rate.
    This study therefore demonstrates that KSOM could be selected as the early embryo culture medium, and that 3- to 8-cell embryos at day 2 (48 h after fusion) may be the most suitable embryos for transplantation. This selection could reduce nuclear fragmentation and yield good-quality blastocysts, which may in turn improve the efficiency of transgenic cloned dairy goat production and decrease the economic loss due to embryonic mortality when embryos are transferred to synchronized recipients.
    Key words: Nuclear transfer, KSOM, transgenic, human lactoferrin, dairy goat