

    NTK-approximating MLP Fusion for Efficient Language Model Fine-tuning

    Fine-tuning a pre-trained language model (PLM) has emerged as the predominant strategy in many natural language processing applications. However, fine-tuning PLMs and running inference with them remain expensive, especially on edge devices with low computing power. Some general approaches (e.g., quantization and distillation) have been widely studied to reduce the compute and memory cost of PLM fine-tuning, while very few one-shot compression techniques have been explored. In this paper, we investigate the neural tangent kernel (NTK), which reveals the gradient descent dynamics of neural networks, of the multilayer perceptron (MLP) modules in a PLM and propose to create a lightweight PLM through NTK-approximating MLP fusion. To achieve this, we reconsider the MLP as a bundle of sub-MLPs and cluster them into a given number of centroids, which can then be restored as a compressed MLP that, surprisingly, is shown to approximate the NTK of the original PLM well. Extensive experiments on PLM fine-tuning for both natural language understanding (NLU) and natural language generation (NLG) tasks verify the effectiveness of the proposed method, MLP fusion. Our code is available at https://github.com/weitianxin/MLP_Fusion. Comment: ICML 202
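The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a one-hidden-layer ReLU MLP, treats each hidden neuron (its input weights, bias, and output weights) as one sub-MLP, clusters those vectors with a plain k-means, and scales each centroid's activation by its cluster size. The function names, the naive k-means initialization, and the count-based scaling are all assumptions made for illustration, and nothing here measures how well the NTK is approximated.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def mlp_forward(x, W1, b1, W2, scale=None):
    """One-hidden-layer MLP. W1[i], b1[i], W2[i] describe hidden neuron i."""
    hidden = len(W1)
    if scale is None:
        scale = [1.0] * hidden
    out = [0.0] * len(W2[0])
    for i in range(hidden):
        pre = sum(w * xj for w, xj in zip(W1[i], x)) + b1[i]
        act = scale[i] * relu(pre)
        for j, w in enumerate(W2[i]):
            out[j] += act * w
    return out

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, initialized from the first k distinct points."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(len(centroids)),
                      key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign

def fuse_mlp(W1, b1, W2, k):
    """Cluster hidden neurons (sub-MLPs) into at most k centroid neurons."""
    d_in = len(W1[0])
    points = [list(W1[i]) + [b1[i]] + list(W2[i]) for i in range(len(W1))]
    centroids, assign = kmeans(points, k)
    # Assumption: each centroid stands in for every neuron in its cluster,
    # so its activation is scaled by the cluster size.
    counts = [float(assign.count(c)) for c in range(len(centroids))]
    W1c = [c[:d_in] for c in centroids]
    b1c = [c[d_in] for c in centroids]
    W2c = [c[d_in + 1:] for c in centroids]
    return W1c, b1c, W2c, counts
```

With these assumptions, an MLP whose hidden neurons are exact duplicates of k distinct neurons is reproduced exactly by the fused MLP when it is called as `mlp_forward(x, W1c, b1c, W2c, scale=counts)`. For neurons that are merely similar, the fused output is only an approximation.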

    Distributionally Robust Circuit Design Optimization under Variation Shifts

    Due to significant process variations, designers have to optimize the statistical performance distribution of nano-scale IC designs in most cases. This problem has been investigated for decades under the formulation of stochastic optimization, which minimizes the expected value of a performance metric while assuming that the distribution of process variations is exactly given. This paper rethinks variation-aware circuit design optimization from a new perspective. First, we discuss the variation shift problem: the actual density function of process variations almost always differs from the given model and is often unknown. Consequently, we propose to formulate variation-aware circuit design optimization as a distributionally robust optimization problem, which does not require the exact distribution of process variations. By selecting an appropriate uncertainty set for the probability density function of process variations, we solve the shift-aware circuit optimization problem using distributionally robust Bayesian optimization. This method is validated with both a photonic IC and an electronic IC. Our optimized circuits show excellent robustness against variation shifts: they perform well under many possible distributions of process variations that differ from the given statistical model. This work has the potential to enable a new research direction and inspire subsequent research at different levels of the EDA flow under the setting of variation shift. Comment: accepted by ICCAD 2023, 8 page
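The contrast this abstract draws between stochastic optimization under a fixed variation model and distributionally robust optimization can be shown with a toy sketch. It assumes a finite set of candidate designs, a finite set of variation scenarios, and an uncertainty set given as an explicit list of probability vectors. These discretizations and all names are illustrative assumptions, and the distributionally robust Bayesian optimization used in the paper is not shown here.

```python
def expected_cost(costs, probs):
    """Expected performance metric (lower is better) under one distribution."""
    return sum(c * p for c, p in zip(costs, probs))

def worst_case_cost(costs, uncertainty_set):
    """Largest expected cost over all distributions in the uncertainty set."""
    return max(expected_cost(costs, probs) for probs in uncertainty_set)

def nominal_choice(cost_table, nominal):
    """Stochastic optimization: trust the given variation model exactly."""
    return min(cost_table, key=lambda d: expected_cost(cost_table[d], nominal))

def robust_choice(cost_table, uncertainty_set):
    """Distributionally robust choice: minimize the worst-case expected cost."""
    return min(cost_table,
               key=lambda d: worst_case_cost(cost_table[d], uncertainty_set))

# Hypothetical data: cost of each design under three variation scenarios.
cost_table = {"A": [1.0, 1.0, 10.0], "B": [3.0, 3.0, 3.0]}
nominal = [0.5, 0.4, 0.1]                    # the given statistical model
uncertainty_set = [nominal, [0.3, 0.3, 0.4]]  # includes a shifted model

print(nominal_choice(cost_table, nominal))         # prints "A"
print(robust_choice(cost_table, uncertainty_set))  # prints "B"
```

Design A is best under the nominal model but degrades sharply when probability mass shifts toward the third scenario, so the robust criterion prefers design B.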

    The Rational Agent Benchmark for Data Visualization

    Understanding how helpful a visualization is from experimental results is difficult because the observed performance is confounded with aspects of the study design, such as how useful the visualized information is for the task. We develop a rational agent framework for designing and interpreting visualization experiments. Our framework conceives two experiments with the same setup: one with behavioral agents (human subjects) and the other with a hypothetical rational agent. A visualization is evaluated by comparing the expected performance of behavioral agents to that of a rational agent under different assumptions. Using recent visualization decision studies from the literature, we demonstrate how the framework can be used pre-experimentally to evaluate the experiment design by bounding the expected improvement in performance from having access to visualizations, and post-experimentally to deconfound errors of information extraction from errors of optimization, among other analyses.