20 research outputs found

    LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers

    Full text link
    Prior works have attempted to build private inference frameworks for transformer-based large language models (LLMs) in a server-client setting, where the server holds the model parameters and the client inputs the private data for inference. However, these frameworks impose significant overhead when the private inputs are forward propagated through the original LLMs. In this paper, we show that substituting the computation- and communication-heavy operators in the transformer architecture with privacy-computing friendly approximations can greatly reduce the private inference costs with minor impact on model performance. Compared to the state-of-the-art Iron (NeurIPS 2022), our privacy-computing friendly model inference pipeline achieves a 5×5\times acceleration in computation and an 80\% reduction in communication overhead, while retaining nearly identical accuracy

    Minimal deployable endpoint-driven network forwarding: principle, designs and applications

    Get PDF
    Networked systems now have significant impact on human lives: the Internet, connecting the world globally, is the foundation of our information age, the data centers, running hundreds of thousands of servers, drive the era of cloud computing, and even the Tor project, a networked system providing online anonymity, now serves millions of daily users. Guided by the end-to-end principle, many computer networks have been designed with a simple and flexible core offering general data transfer service, whereas the bulk of the application-level functionalities have been implemented on endpoints that are attached to the edge of the network. Although the end-to-end design principle gives these networked systems tremendous success, a number of new requirements have emerged for computer networks and their running applications, including untrustworthy of endpoints, privacy requirement of endpoints, more demanding applications, the rise of third-party Intermediaries and the asymmetric capability of endpoints and so on. These emerging requirements have created various challenges in different networked systems. To address these challenges, there are no obvious solutions without adding in-network functions to the network core. However, no design principle has ever been proposed for guiding the implementation of in-network functions. In this thesis, We propose the first such principle and apply this principle to propose four designs in three different networked systems to address four separate challenges. We demonstrate through detailed implementation and extensive evaluations that the proposed principle can live in harmony with the end-to-end principle, and a combination of the two principle offers more complete, effective and accurate guides for innovating the modern computer networks and their applications.Ope

    FlowPolice: enforcing congestion accountability to defend against DDoS attacks

    Get PDF
    Defending the Internet against distributed denial of service (DDoS) attacks is a fundamental problem. Despite over a decade of research, little progress has been made on the real-world deployment of proposed approaches due to the prohibitive deployment hurdles. This thesis presents FlowPolice, a new DDoS defense mechanism capable of thwarting millions of attack flows, while requiring very lightweight deployment. Specifically, FlowPolice can immediately benefit the first deployed autonomous system (AS) without further deployment at other ASs, and a single deployed router can protect all downstream links that implement a simple prioritization mechanism. The design of FlowPolice suppresses attack traffic by forcing attackers to be accountable for congestion via proper rate limiting. To learn users’ congestion accountability, FlowPolice leverages a capability feedback mechanism so that the deploying router can make rate limiting decisions based only on its self-generated capability tags. We use theoretical analysis, large scale simulation and Linux implementation to demonstrate the effectiveness of FlowPolice. Specifically, the the- oretical analysis proves that FlowPolice ensures per-flow fair share at the bottleneck link. Our implementation shows that FlowPolice can scale up to handle very large scale DDoS attacks and introduces little packet process- ing overhead. We also perform detailed packet-level simulation to show that FlowPolice is effective to mitigate DDoS attacks

    martFL: Enabling Utility-Driven Data Marketplace with a Robust and Verifiable Federated Learning Architecture

    Full text link
    The development of machine learning models requires a large amount of training data. Data marketplaces are essential for trading high-quality, private-domain data not publicly available online. However, due to growing data privacy concerns, direct data exchange is inappropriate. Federated Learning (FL) is a distributed machine learning paradigm that exchanges data utilities (in form of local models or gradients) among multiple parties without directly sharing the raw data. However, several challenges exist when applying existing FL architectures to construct a data marketplace: (i) In existing FL architectures, Data Acquirers (DAs) cannot privately evaluate local models from Data Providers (DPs) prior to trading; (ii) Model aggregation protocols in existing FL designs struggle to exclude malicious DPs without "overfitting" to the DA's (possibly biased) root dataset; (iii) Prior FL designs lack a proper billing mechanism to enforce the DA to fairly allocate the reward according to contributions made by different DPs. To address above challenges, we propose martFL, the first federated learning architecture that is specifically designed to enable a secure utility-driven data marketplace. At a high level, martFL is powered by two innovative designs: (i) a quality-aware model aggregation protocol that achieves robust local model aggregation even when the DA's root dataset is biased; (ii) a verifiable data transaction protocol that enables the DA to prove, both succinctly and in zero-knowledge, that it has faithfully aggregates the local models submitted by different DPs according to the committed aggregation weights, based on which the DPs can unambiguously claim the corresponding reward. We implement a prototype of martFL and evaluate it extensively over various tasks. The results show that martFL can improve the model accuracy by up to 25% while saving up to 64% data acquisition cost

    Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding

    Full text link
    Today's cloud networks are shared among many tenants. Bandwidth guarantees and work conservation are two key properties to ensure predictable performance for tenant applications and high network utilization for providers. Despite significant efforts, very little prior work can really achieve both properties simultaneously even some of them claimed so. In this paper, we present QShare, an in-network based solution to achieve bandwidth guarantees and work conservation simultaneously. QShare leverages weighted fair queuing on commodity switches to slice network bandwidth for tenants, and solves the challenge of queue scarcity through balanced tenant placement and dynamic tenant-queue binding. QShare is readily implementable with existing switching chips. We have implemented a QShare prototype and evaluated it via both testbed experiments and simulations. Our results show that QShare ensures bandwidth guarantees while driving network utilization to over 91% even under unpredictable traffic demands.Comment: The initial work is published in IEEE INFOCOM 201

    ResLT: Residual Learning for Long-tailed Recognition

    Full text link
    Deep learning algorithms face great challenges with long-tailed data distribution which, however, is quite a common case in real-world scenarios. Previous methods tackle the problem from either the aspect of input space (re-sampling classes with different frequencies) or loss space (re-weighting classes with different weights), suffering from heavy over-fitting to tail classes or hard optimization during training. To alleviate these issues, we propose a more fundamental perspective for long-tailed recognition, {i.e., from the aspect of parameter space, and aims to preserve specific capacity for classes with low frequencies. From this perspective, the trivial solution utilizes different branches for the head, medium, tail classes respectively, and then sums their outputs as the final results is not feasible. Instead, we design the effective residual fusion mechanism -- with one main branch optimized to recognize images from all classes, another two residual branches are gradually fused and optimized to enhance images from medium+tail classes and tail classes respectively. Then the branches are aggregated into final results by additive shortcuts. We test our method on several benchmarks, {i.e., long-tailed version of CIFAR-10, CIFAR-100, Places, ImageNet, and iNaturalist 2018. Experimental results manifest that our method achieves new state-of-the-art for long-tailed recognition. Code will be available at \url{https://github.com/FPNAS/ResLT}

    Generalized Parametric Contrastive Learning

    Full text link
    In this paper, we propose the Generalized Parametric Contrastive Learning (GPaCo/PaCo) which works well on both imbalanced and balanced data. Based on theoretical analysis, we observe that supervised contrastive loss tends to bias high-frequency classes and thus increases the difficulty of imbalanced learning. We introduce a set of parametric class-wise learnable centers to rebalance from an optimization perspective. Further, we analyze our GPaCo/PaCo loss under a balanced setting. Our analysis demonstrates that GPaCo/PaCo can adaptively enhance the intensity of pushing samples of the same class close as more samples are pulled together with their corresponding centers and benefit hard example learning. Experiments on long-tailed benchmarks manifest the new state-of-the-art for long-tailed recognition. On full ImageNet, models from CNNs to vision transformers trained with GPaCo loss show better generalization performance and stronger robustness compared with MAE models. Moreover, GPaCo can be applied to the semantic segmentation task and obvious improvements are observed on the 4 most popular benchmarks. Our code is available at https://github.com/dvlab-research/Parametric-Contrastive-Learning.Comment: TPAMI 2023. arXiv admin note: substantial text overlap with arXiv:2107.1202
    corecore