
    On Content-centric Wireless Delivery Networks

    The flux of social media and the convenience of mobile connectivity have created a mobile data phenomenon that is expected to overwhelm mobile cellular networks in the foreseeable future. Despite the advent of 4G/LTE, the growth rate of wireless data has far exceeded the capacity increase of the mobile networks. A fundamentally new design paradigm is required to tackle the ever-growing wireless data challenge. In this article, we investigate the problem of massive content delivery over wireless networks and present a systematic view of content-centric network design and its underlying challenges. To this end, we first review some of the recent advancements in Information Centric Networking (ICN), which provides the basis for how media content can be labeled, distributed, and placed across the networks. We then formulate the content delivery task as a content rate maximization problem over a shared wireless channel. In contrast to the conventional wisdom of increasing the bit rate of a unicast system, this formulation maximizes the content delivery capability with a fixed amount of wireless resources. This conceptually simple change enables us to exploit the "content diversity" and the "network diversity" by leveraging the abundant computation resources (through application-layer encoding, pushing and caching, etc.) within the existing wireless networks. A network architecture that enables wireless network crowdsourcing for content delivery is then described, followed by an exemplary campus wireless network that encompasses the above concepts. Comment: 20 pages, 7 figures, accepted by IEEE Wireless Communications, Sept. 201

    Asymmetric Polynomial Loss For Multi-Label Classification

    Various tasks are reformulated as multi-label classification problems, in which the binary cross-entropy (BCE) loss is frequently used to optimize well-designed models. However, the vanilla BCE loss cannot be tailored to diverse tasks, resulting in suboptimal performance across models. In addition, the imbalance between redundant negative samples and rare positive samples can degrade model performance. In this paper, we propose an effective Asymmetric Polynomial Loss (APL) to mitigate these issues. Specifically, we first perform a Taylor expansion of the BCE loss. We then adjust the coefficients of the resulting polynomial functions. We further employ the asymmetric focusing mechanism to decouple the gradient contributions of the negative and positive samples. Moreover, we validate that the polynomial coefficients can recalibrate the asymmetric focusing hyperparameters. Experiments on relation extraction, text classification, and image classification show that APL consistently improves performance without extra training burden. Comment: ICASSP 202
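    The ingredients named in this abstract (a Taylor expansion of BCE, adjustable polynomial coefficients, and asymmetric focusing) can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not say which coefficients are tuned or how the focusing terms are defined, so the function names, the choice to perturb only the first-order coefficient (`eps_pos`, `eps_neg`), and the focusing exponents (`gamma_pos`, `gamma_neg`) are all assumptions made for illustration. The one fact relied on is the series -log(p) = sum over k >= 1 of (1 - p)^k / k, valid for 0 < p <= 1.

```python
import math


def neg_log_taylor(p, n_terms):
    """Truncated Taylor series of -log(p) about p = 1: sum_{k=1..n} (1 - p)**k / k."""
    return sum((1.0 - p) ** k / k for k in range(1, n_terms + 1))


def asymmetric_poly_loss(probs, targets, eps_pos=0.0, eps_neg=0.0,
                         gamma_pos=0.0, gamma_neg=0.0):
    """Hypothetical asymmetric polynomial loss, averaged over all labels.

    probs   : predicted probabilities, one per label
    targets : 0/1 ground-truth labels
    eps_*   : extra weight on the first-order polynomial term (assumption)
    gamma_* : focusing exponents for positives and negatives (assumption)
    With every hyperparameter at 0 this reduces to plain BCE.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # keep log() finite
        if y == 1:
            # -log(p) has leading term (1 - p); eps_pos shifts its coefficient.
            total += (1.0 - p) ** gamma_pos * (-math.log(p) + eps_pos * (1.0 - p))
        else:
            # -log(1 - p) has leading term p; eps_neg shifts its coefficient.
            total += p ** gamma_neg * (-math.log(1.0 - p) + eps_neg * p)
    return total / len(probs)
```

    In this sketch, raising `gamma_neg` above `gamma_pos` shrinks the contribution of easy negatives (small `p`), which is one way the positive/negative imbalance mentioned in the abstract could be addressed.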

    Human-Readable Fingerprint for Large Language Models

    Protecting the copyright of large language models (LLMs) has become crucial due to their resource-intensive training and the carefully designed licenses that accompany them. However, identifying the original base model of an LLM is challenging due to potential parameter alterations. In this study, we introduce a human-readable fingerprint for LLMs that uniquely identifies the base model without exposing model parameters or interfering with training. We first observe that the vector direction of LLM parameters remains stable after the model has converged during pretraining, showing negligible perturbations through subsequent training steps, including continued pretraining, supervised fine-tuning (SFT), and RLHF, which makes it a sufficient condition for identifying the base model. Necessity is validated by continuing to train an LLM with an extra term that drives the parameter direction away, which damages the model. However, this direction is vulnerable to simple attacks like dimension permutation or matrix rotation, which significantly change it without affecting performance. To address this, leveraging the Transformer structure, we systematically analyze potential attacks and define three invariant terms that identify an LLM's base model. We make these invariant terms human-readable by mapping them to a Gaussian vector using a convolutional encoder and then converting it into a natural image with StyleGAN2. Our method generates a dog image as an identity fingerprint for an LLM, where the dog's appearance strongly indicates the LLM's base model. The fingerprint provides intuitive information for qualitative discrimination, while the invariant terms can be employed for quantitative and precise verification. Experimental results across various LLMs demonstrate the effectiveness of our method.
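    The permutation attack and the idea of an invariant term can be illustrated on a toy example. The abstract does not define the paper's three invariant terms, so the sketch below uses one illustrative candidate, the product W_q W_k^T of hypothetical query and key projection matrices. Applying the same column permutation P to both matrices gives (W_q P)(W_k P)^T = W_q P P^T W_k^T = W_q W_k^T, so the product is unchanged, while the flattened direction of W_q itself changes substantially. All names and sizes here (`w_q`, `w_k`, `d_model`, `d_head`, `perm`) are assumptions.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def permute_columns(a, perm):
    """Reorder the columns of `a` so that new column i is old column perm[i]."""
    return [[row[j] for j in perm] for row in a]


def cosine(a, b):
    """Cosine similarity between two matrices viewed as flattened vectors."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(fa, fb))
    return dot / (math.sqrt(sum(x * x for x in fa)) * math.sqrt(sum(x * x for x in fb)))


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


rng = random.Random(0)
d_model, d_head = 6, 4  # toy sizes, chosen only for illustration
w_q = [[rng.gauss(0.0, 1.0) for _ in range(d_head)] for _ in range(d_model)]
w_k = [[rng.gauss(0.0, 1.0) for _ in range(d_head)] for _ in range(d_model)]

# "Attack": apply the same permutation of the head dimension to both matrices.
perm = [1, 2, 3, 0]
w_q_attacked = permute_columns(w_q, perm)
w_k_attacked = permute_columns(w_k, perm)

# The raw parameter direction moves away from the original...
direction_similarity = cosine(w_q, w_q_attacked)

# ...while the product W_q W_k^T is unchanged up to floating-point rounding.
invariant_gap = max_abs_diff(
    matmul(w_q, transpose(w_k)),
    matmul(w_q_attacked, transpose(w_k_attacked)),
)
```

    Here `direction_similarity` measures how far the attack moves the raw weights, and `invariant_gap` measures how much the candidate invariant changes; a comparison based on the invariant would therefore survive this particular attack, while one based on the raw direction would not.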