
    Direct Acyclic Graph based Ledger for Internet of Things: Performance and Security Analysis

    Direct Acyclic Graph (DAG)-based ledgers and their corresponding consensus algorithms have been identified as a promising technology for the Internet of Things (IoT). Compared with Proof-of-Work (PoW) and Proof-of-Stake (PoS), which have been widely used in blockchains, consensus mechanisms designed on a DAG structure (referred to simply as DAG consensus) can overcome shortcomings such as high resource consumption, high transaction fees, low transaction throughput and long confirmation delays. However, theoretical analysis of DAG consensus remains largely unexplored. To this end, based on one of the most typical DAG consensuses, Tangle, we investigate the impact of network load on the performance and security of the DAG-based ledger. Considering unsteady network load, we first propose a Markov chain model to capture the behaviour of the DAG consensus process under dynamic load conditions. The key performance metrics, i.e., cumulative weight and confirmation delay, are analysed based on the proposed model. Then, we leverage a stochastic model to analyse the probability of a successful double-spending attack in different network load regimes. The results provide an insightful understanding of the DAG consensus process, e.g., how the network load affects the confirmation delay and the probability of a successful attack. Meanwhile, we also demonstrate the trade-off between security level and confirmation delay, which can serve as guidance for the practical deployment of DAG-based ledgers. Comment: accepted by IEEE Transactions on Networking
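
    The load-dependence described above can be pictured with a small simulation. The sketch below is not the paper's Markov-chain or stochastic model; it is a toy Monte Carlo of a Tangle-like DAG under uniform random tip selection, with illustrative function names, showing how the mean time for a transaction to reach a target cumulative weight varies with the arrival rate.

```python
import random

def mean_confirmation_delay(lam, threshold, trials=200, seed=0):
    """Toy estimate of the time until a transaction reaches a target
    cumulative weight in a simplified Tangle: Poisson arrivals at rate
    `lam`, each new transaction approving two uniformly random tips."""
    rng = random.Random(seed)
    delays = []
    for _ in range(trials):
        t, weight, next_id = 0.0, 1, 1
        tips = [0]          # transaction 0 is the one we observe
        refs = {0: True}    # tx id -> approves tx 0 (directly or transitively)
        while weight < threshold:
            t += rng.expovariate(lam)            # next arrival
            a, b = rng.choice(tips), rng.choice(tips)
            refs[next_id] = refs[a] or refs[b]
            if refs[next_id]:
                weight += 1                      # cumulative weight grows
            for tip in {a, b}:                   # approved tips stop being tips
                if tip in tips:
                    tips.remove(tip)
            tips.append(next_id)
            next_id += 1
        delays.append(t)
    return sum(delays) / len(delays)

for lam in (1.0, 5.0, 25.0):                     # low / medium / high load
    print(f"rate {lam:5.1f} -> mean delay {mean_confirmation_delay(lam, 50):.2f}")
```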

    Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning

    Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning), which is not efficient, or only tune the last linear layer (linear probing), which suffers a significant accuracy drop compared to full fine-tuning. In this paper, we propose a new parameter-efficient fine-tuning method termed SSF, representing that one only needs to Scale and Shift the deep Features extracted by a pre-trained model to catch up with the performance of full fine-tuning. In this way, SSF surprisingly outperforms other parameter-efficient fine-tuning approaches even with a smaller number of tunable parameters. Furthermore, unlike some existing parameter-efficient fine-tuning methods (e.g., Adapter or VPT) that introduce extra parameters and computational cost in both the training and inference stages, SSF only adds learnable parameters during the training stage, and these additional parameters can be merged into the original pre-trained model weights via re-parameterization in the inference phase. With the proposed SSF, our model obtains 2.46% (90.72% vs. 88.54%) and 11.48% (73.10% vs. 65.57%) performance improvements on FGVC and VTAB-1k in terms of Top-1 accuracy compared to full fine-tuning, while fine-tuning only about 0.3M parameters. We also conduct extensive experiments across various model families (CNNs, Transformers, and MLPs) and datasets. Results on 26 image classification datasets in total and 3 robustness & out-of-distribution datasets show the effectiveness of SSF. Code is available at https://github.com/dongzelian/SSF. Comment: Accepted by NeurIPS 2022
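
    The re-parameterization claim is easy to verify in code. Below is a minimal PyTorch sketch (not the authors' released implementation; class and function names are ours, and the parameter initialization is illustrative) of an SSF-style scale-and-shift module and of folding its parameters into a preceding frozen linear layer at inference time.

```python
import torch
import torch.nn as nn

class SSFLayer(nn.Module):
    """Per-channel scale (gamma) and shift (beta) on frozen features;
    a sketch of the scale-and-shift idea, with illustrative init."""
    def __init__(self, dim):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):          # x: (..., dim)
        return x * self.gamma + self.beta

def merge_into_linear(linear: nn.Linear, ssf: SSFLayer) -> nn.Linear:
    """Fold SSF into a preceding linear layer:
    gamma*(W x + b) + beta == (gamma*W) x + (gamma*b + beta)."""
    merged = nn.Linear(linear.in_features, linear.out_features)
    with torch.no_grad():
        merged.weight.copy_(linear.weight * ssf.gamma[:, None])
        merged.bias.copy_(linear.bias * ssf.gamma + ssf.beta)
    return merged

x = torch.randn(4, 16)
lin, ssf = nn.Linear(16, 16), SSFLayer(16)
assert torch.allclose(ssf(lin(x)), merge_into_linear(lin, ssf)(x), atol=1e-6)
```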

    Shunted Self-Attention via Multi-Scale Token Aggregation

    Recent Vision Transformer (ViT) models have demonstrated encouraging results across various computer vision tasks, thanks to their competence in modeling long-range dependencies of image patches or tokens via self-attention. These models, however, usually designate similar receptive fields for each token feature within each layer. Such a constraint inevitably limits the ability of each self-attention layer to capture multi-scale features, leading to performance degradation when handling images with multiple objects of different scales. To address this issue, we propose a novel and generic strategy, termed shunted self-attention (SSA), that allows ViTs to model attention at hybrid scales per attention layer. The key idea of SSA is to inject heterogeneous receptive field sizes into tokens: before computing the self-attention matrix, it selectively merges tokens to represent larger object features while keeping certain tokens to preserve fine-grained features. This novel merging scheme enables the self-attention to learn relationships between objects of different sizes and simultaneously reduces the token count and computational cost. Extensive experiments across various tasks demonstrate the superiority of SSA. Specifically, the SSA-based transformer achieves 84.0% Top-1 accuracy and outperforms the state-of-the-art Focal Transformer on ImageNet with only half the model size and computation cost, and surpasses Focal Transformer by 1.3 mAP on COCO and 2.9 mIoU on ADE20K under similar parameter and computation costs. Code has been released at https://github.com/OliverRensu/Shunted-Transformer.
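
    A rough sketch of the multi-scale idea: half of the heads attend to full-resolution keys/values while the other half attend to tokens merged by average pooling. This is an illustrative simplification, not the released Shunted-Transformer code (which uses strided convolutions and learned merging); all names and the two-scale split are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShuntedAttentionSketch(nn.Module):
    """Two-scale attention: fine heads see all N tokens as keys/values,
    coarse heads see N//r tokens merged by average pooling."""
    def __init__(self, dim, num_heads=4, r=2):
        super().__init__()
        assert num_heads % 2 == 0 and dim % num_heads == 0
        self.hg, self.d, self.r = num_heads // 2, dim // num_heads, r
        self.q = nn.Linear(dim, dim)
        self.kv_fine = nn.Linear(dim, dim)    # K,V for full-resolution heads
        self.kv_coarse = nn.Linear(dim, dim)  # K,V for merged tokens
        self.proj = nn.Linear(dim, dim)

    def _split(self, t):                      # (B, N, hg*d) -> (B, hg, N, d)
        B, N, _ = t.shape
        return t.view(B, N, self.hg, self.d).transpose(1, 2)

    def _attend(self, q, k, v):
        a = (q @ k.transpose(-2, -1)) / self.d ** 0.5
        return a.softmax(dim=-1) @ v

    def forward(self, x):                     # x: (B, N, C), N divisible by r
        B, N, C = x.shape
        q = self.q(x).view(B, N, 2 * self.hg, self.d).transpose(1, 2)
        kf, vf = self.kv_fine(x).chunk(2, dim=-1)
        merged = F.avg_pool1d(x.transpose(1, 2), self.r).transpose(1, 2)
        kc, vc = self.kv_coarse(merged).chunk(2, dim=-1)
        out_f = self._attend(q[:, :self.hg], self._split(kf), self._split(vf))
        out_c = self._attend(q[:, self.hg:], self._split(kc), self._split(vc))
        out = torch.cat([out_f, out_c], dim=1)        # (B, 2*hg, N, d)
        return self.proj(out.transpose(1, 2).reshape(B, N, C))

x = torch.randn(2, 16, 64)
print(ShuntedAttentionSketch(64)(x).shape)    # torch.Size([2, 16, 64])
```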

    xURLLC-Aware Service Provisioning in Vehicular Networks: A Semantic Communication Perspective

    Semantic communication (SemCom), an emerging paradigm focusing on meaning delivery, has recently been considered a promising solution to the inevitable crisis of scarce communication resources. This trend stimulates us to explore the potential of applying SemCom to wireless vehicular networks, which normally consume a tremendous amount of resources to meet stringent reliability and latency requirements. Unfortunately, the unique background knowledge matching mechanism in SemCom makes it challenging to simultaneously realize efficient service provisioning for multiple users in vehicle-to-vehicle networks. To this end, this paper identifies and jointly addresses two fundamental problems inherent in SemCom-enabled vehicular networks, knowledge base construction (KBC) and vehicle service pairing (VSP), in alignment with next-generation ultra-reliable and low-latency communication (xURLLC) requirements. Concretely, we first derive the knowledge-matching-based queuing latency specific to semantic data packets, and then formulate a latency-minimization problem subject to several KBC- and VSP-related reliability constraints. Afterward, a SemCom-empowered Service Supplying Solution (S⁴) is proposed along with a theoretical analysis of its optimality guarantee and computational complexity. Numerical results demonstrate the superiority of S⁴ in terms of average queuing latency, semantic data packet throughput, user knowledge matching degree, and knowledge preference satisfaction compared with two benchmarks. Comment: This paper has been submitted to IEEE Transactions on Wireless Communications for the second round of peer review after a major revision.
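
    As a toy illustration of why knowledge matching and pairing interact, the sketch below treats each vehicle-server pair as an M/M/1 queue whose effective service rate is scaled by a knowledge matching degree, and brute-forces the pairing that minimizes total latency. The queuing model, the scaling rule, and all names are illustrative assumptions, not the paper's formulation or the S⁴ algorithm.

```python
import itertools

def mm1_latency(lam, mu):
    """Average sojourn time of an M/M/1 queue (stable iff lam < mu);
    a stand-in for the paper's knowledge-matching-based queuing model."""
    return float("inf") if lam >= mu else 1.0 / (mu - lam)

def best_pairing(vehicles, servers, match):
    """Brute-force the vehicle-service pairing minimizing total latency.
    match[v][s] in (0, 1] scales the effective service rate: a better
    background-knowledge match means faster semantic decoding."""
    best, best_cost = None, float("inf")
    for perm in itertools.permutations(servers):
        cost = sum(mm1_latency(v["rate"], s["mu"] * match[v["id"]][s["id"]])
                   for v, s in zip(vehicles, perm))
        if cost < best_cost:
            best_cost = cost
            best = [(v["id"], s["id"]) for v, s in zip(vehicles, perm)]
    return best, best_cost

vehicles = [{"id": "v1", "rate": 2.0}, {"id": "v2", "rate": 1.0}]
servers = [{"id": "s1", "mu": 5.0}, {"id": "s2", "mu": 4.0}]
match = {"v1": {"s1": 0.9, "s2": 0.5}, "v2": {"s1": 0.6, "s2": 0.8}}
print(best_pairing(vehicles, servers, match))
```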

    Expanding Small-Scale Datasets with Guided Imagination

    The power of DNNs relies heavily on the quantity and quality of training data. However, collecting and annotating data on a large scale is often expensive and time-consuming. To address this issue, we explore a new task, termed dataset expansion, aimed at expanding a ready-to-use small dataset by automatically creating new labeled samples. To this end, we present a Guided Imagination Framework (GIF) that leverages cutting-edge generative models like DALL-E2 and Stable Diffusion (SD) to "imagine" and create informative new data from the input seed data. Specifically, GIF conducts data imagination by optimizing the latent features of the seed data in the semantically meaningful space of the prior model, resulting in photo-realistic images with new content. To guide the imagination towards creating informative samples for model training, we introduce two key criteria, i.e., class-maintained information boosting and sample diversity promotion. These criteria are verified to be essential for effective dataset expansion: GIF-SD obtains 13.5% higher model accuracy on natural image datasets than unguided expansion with SD. With these essential criteria, GIF successfully expands small datasets in various scenarios, boosting model accuracy by 36.9% on average over six natural image datasets and by 13.5% on average over three medical datasets. The source code is available at https://github.com/Vanint/DatasetExpansion. Comment: NeurIPS 2023. Source code: https://github.com/Vanint/DatasetExpansion
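
    The two guidance criteria can be sketched as a latent-space optimization: perturb a seed latent, then update it so a classifier still predicts the seed's class (class-maintained boosting) while its prediction distribution moves away from the seed's (diversity promotion). This PyTorch sketch is a loose illustration under our own loss weights and names, not GIF's actual objective.

```python
import torch
import torch.nn.functional as F

def guided_perturbation(z, label, classifier, steps=10, lr=0.1, noise=0.1):
    """Nudge a noisy copy of seed latent `z` so the classifier (a) still
    predicts the seed's label and (b) diverges from the seed's original
    prediction. Loss weights and the update rule are illustrative."""
    with torch.no_grad():
        p_seed = F.softmax(classifier(z), dim=-1)
    z_new = (z + noise * torch.randn_like(z)).detach().requires_grad_(True)
    opt = torch.optim.Adam([z_new], lr=lr)
    for _ in range(steps):
        logits = classifier(z_new)
        keep_class = F.cross_entropy(logits, label)      # class-maintained
        log_p_new = F.log_softmax(logits, dim=-1)
        diversity = -F.kl_div(log_p_new, p_seed, reduction="batchmean")
        loss = keep_class + 0.5 * diversity              # 0.5: arbitrary weight
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z_new.detach()   # would be decoded by the generative model

# Toy usage: a linear "classifier" over 8-dim latents, 3 classes.
clf = torch.nn.Linear(8, 3)
z = torch.randn(4, 8)
y = torch.tensor([0, 1, 2, 0])
print(guided_perturbation(z, y, clf).shape)   # torch.Size([4, 8])
```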

    MagicVideo: Efficient Video Generation With Latent Diffusion Models

    We present an efficient text-to-video generation framework based on latent diffusion models, termed MagicVideo. Given a text description, MagicVideo can generate photo-realistic video clips with high relevance to the text content. With the proposed efficient latent 3D U-Net design, MagicVideo can generate video clips at 256x256 spatial resolution on a single GPU card, 64x faster than the recent Video Diffusion Model (VDM). Unlike previous works that train video generation from scratch in RGB space, we propose to generate video clips in a low-dimensional latent space. We further utilize all the convolution operator weights of pre-trained text-to-image generative U-Net models for faster training. To achieve this, we introduce two new designs to adapt the U-Net decoder to video data: a frame-wise lightweight adaptor for the image-to-video distribution adjustment and a directed temporal attention module to capture temporal dependencies across frames. The whole generation process takes place within the low-dimensional latent space of a pre-trained variational auto-encoder. We demonstrate that MagicVideo can generate both realistic video content and imaginary content in a photo-realistic style, with a trade-off between quality and computational cost. Refer to https://magicvideo.github.io/# for more examples.
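
    The "directed" temporal attention can be pictured as causal attention along the frame axis: each spatial location attends only to the same location in the current and earlier frames. The PyTorch sketch below is a single-head stand-in under that assumption; the module name, masking choice, and shapes are ours, not MagicVideo's implementation.

```python
import torch
import torch.nn as nn

class DirectedTemporalAttention(nn.Module):
    """Causal attention over the frame axis of a latent video tensor:
    frame t only attends to frames <= t at the same spatial location."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):             # x: (B, T, H*W, C) latent video
        B, T, S, C = x.shape
        x = x.transpose(1, 2).reshape(B * S, T, C)   # attend over time
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        attn = attn.masked_fill(mask, float("-inf")).softmax(dim=-1)
        out = self.proj(attn @ v)
        return out.reshape(B, S, T, C).transpose(1, 2)

x = torch.randn(2, 8, 16, 64)         # batch, frames, tokens/frame, channels
print(DirectedTemporalAttention(64)(x).shape)   # torch.Size([2, 8, 16, 64])
```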