196 research outputs found

    Entanglement and chaos in warped conformal field theories

    Various aspects of warped conformal field theories (WCFTs) are studied, including entanglement entropy on excited states, the Rényi entropy after a local quench, and out-of-time-order four-point functions. Assuming a large central charge and dominance of the vacuum block in the conformal block expansion, (i) we calculate the single-interval entanglement entropy on an excited state, matching previous finite-temperature results by changing the ensemble; and (ii) we show that WCFTs are maximally chaotic, a result that is compatible with the existence of black holes in the holographic duals. Finally, we relax the aforementioned assumptions and study the time evolution of the Rényi entropy after a local quench. We find that the change in the Rényi entropy is topological: it vanishes at early and late times, and is nonvanishing in between only for charged states in spectrally-flowed WCFTs.
    Comment: 31 pages; v2: corrected typos, matches published version
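
    For context, "maximally chaotic" is conventionally meant in the Maldacena-Shenker-Stanford sense: the Lyapunov exponent read off from the out-of-time-order correlator (OTOC) saturates the chaos bound. A schematic statement (the normalization and the form of the suppression are generic large-central-charge assumptions, not details taken from this paper):

```latex
% Schematic OTOC decay defining the Lyapunov exponent \lambda_L at
% inverse temperature \beta; "maximally chaotic" means the bound below
% is saturated, \lambda_L = 2\pi/\beta.
\[
  \frac{\langle V(0)\, W(t)\, V(0)\, W(t) \rangle_\beta}
       {\langle V V \rangle_\beta \, \langle W W \rangle_\beta}
  \;\sim\; 1 - \#\, e^{\lambda_L t},
  \qquad
  \lambda_L \le \frac{2\pi}{\beta}.
\]
```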

    Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models

    Large pre-trained language models (PLMs) have demonstrated strong performance on natural language understanding (NLU) tasks through fine-tuning. However, fine-tuned models still suffer from overconfident predictions, especially in out-of-domain settings. In this paper, we tackle the problem of calibrating fine-tuned language models. We demonstrate that PLMs are well-calibrated on the masked language modeling task, with robust predictive confidence under domain shift, yet fine-tuned models fail to retain this property due to catastrophic forgetting, which impairs calibration on the downstream classification task. In light of these observations, we evaluate the calibration of several methods that preserve pre-trained features and show that preserving pre-trained features can improve the calibration of fine-tuned language models. Among these methods, our proposed method, which encourages the fine-tuned model to learn generative representations with an auxiliary language modeling objective, achieves competitive accuracy and the lowest expected calibration error compared to several strong baselines under both in-domain and out-of-domain settings on three downstream NLU tasks.
    Comment: ICLR 2023
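
    The paper's headline metric is expected calibration error (ECE). A minimal sketch of the standard binned ECE, assuming numpy arrays of confidences, predicted labels, and gold labels (the paper's exact binning settings may differ):

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=10):
    """Standard binned ECE: average |accuracy - confidence| per confidence
    bin, weighted by the fraction of samples falling in that bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.sum() == 0:
            continue
        acc = (predictions[mask] == labels[mask]).mean()
        conf = confidences[mask].mean()
        ece += mask.mean() * abs(acc - conf)
    return ece
```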

    Joint Source Channel Rate-Distortion Analysis for Adaptive Mode Selection and Rate Control in Wireless Video Coding

    DOI: 10.1109/TCSVT.2002.800313
    In this paper, we first develop a rate-distortion (R-D) model for DCT-based video coding that incorporates the macroblock (MB) intra refreshing rate. For any given bit rate and intra refreshing rate, this model can estimate the corresponding coding distortion even before a video frame is coded. We then present a theoretical analysis of the picture distortion caused by channel errors and the subsequent inter-frame propagation. Based on this analysis, we develop a statistical model to estimate the channel-error-induced distortion for different channel conditions and encoder settings. The proposed analytic model mathematically describes the complex behavior of channel errors in a video coding and transmission system. Unlike the experimental approaches to distortion estimation reported in the literature, this analytic model has very low computational complexity and implementation cost, which are highly desirable in wireless video applications. Simulation results show that the model accurately estimates the channel-error-induced distortion with minimal processing delay. Based on the proposed source-coding R-D model and the analytic channel-distortion estimate, we derive an analytic solution for adaptive intra mode selection and joint source-channel rate control under time-varying wireless channel conditions. Extensive experimental results demonstrate that this scheme significantly improves end-to-end video quality in wireless video coding and transmission.
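
    To make the setup concrete, here is an illustrative sketch of a joint source-channel distortion model in the spirit of the abstract's decomposition. The functional forms and constants are assumptions for illustration only, not the paper's actual model:

```python
def total_distortion(rate_bits, intra_rate, loss_prob,
                     theta=1.0e5, r0=1.0e3, d0=2.0, leak=0.5):
    """Illustrative total distortion = source term + channel term.
    intra_rate in [0, 1] is the fraction of MBs intra-refreshed per frame."""
    # Source distortion: a classic hyperbolic R-D shape, D(R) = theta/(R - R0) + D0.
    d_source = theta / (rate_bits - r0) + d0
    # Channel distortion: error energy injected by packet losses, with
    # inter-frame propagation damped as the intra refreshing rate grows.
    d_channel = loss_prob * d0 / (intra_rate + leak * (1.0 - intra_rate))
    return d_source + d_channel
```

    Given such a model, an encoder can search over the intra refreshing rate and the source/channel bit split to minimize the predicted total distortion before coding a frame, which is the kind of adaptive mode selection and rate control the paper derives analytically.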

    End-to-end One-shot Human Parsing

    Previous human parsing methods are limited to parsing humans into pre-defined classes, which is inflexible for practical fashion applications that frequently introduce new fashion item classes. In this paper, we define a novel one-shot human parsing (OSHP) task that requires parsing humans into an open set of classes defined by any test example. During training, only base classes are exposed, and these overlap with only part of the test-time classes. To address the three main challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we devise an End-to-end One-shot human Parsing Network (EOP-Net). First, an end-to-end human parsing framework is proposed to parse the query image into both coarse-grained and fine-grained human classes, embedding rich semantic information that is shared across different granularities to identify small-sized human classes. Then, we gradually smooth the training-time static prototypes to obtain robust class representations. Moreover, we employ a dynamic objective that encourages the network to enhance the features' representational capability in the early training phase and to improve their transferability in the late training phase. Our method can therefore quickly adapt to novel classes and mitigate the testing bias issue. In addition, we add a contrastive loss at the prototype level to enforce inter-class distances, thereby discriminating similar parts. For comprehensive evaluation of the new task, we tailor three existing popular human parsing benchmarks to the OSHP task. Experiments demonstrate that EOP-Net outperforms representative one-shot segmentation models by large margins and serves as a strong baseline for further research. The source code is available at https://github.com/Charleshhy/One-shot-Human-Parsing.
    Comment: Accepted to IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI)
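
    One plausible reading of "gradually smooth the training-time static prototypes" is a momentum (EMA) update of per-class prototypes; a minimal PyTorch sketch under that assumption (EOP-Net's actual update rule may differ):

```python
import torch

def update_prototypes(prototypes, features, labels, momentum=0.9):
    """prototypes: (C, D) running class prototypes.
    features: (N, D) embeddings of labeled pixels/parts.
    labels: (N,) class indices. Returns smoothed prototypes."""
    for c in labels.unique():
        class_mean = features[labels == c].mean(dim=0)
        # Exponential moving average keeps prototypes stable across batches.
        prototypes[c] = momentum * prototypes[c] + (1 - momentum) * class_mean
    return prototypes
```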

    Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning

    Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative to full fine-tuning for adapting pre-trained vision models to downstream tasks: it tunes only a small number of parameters while freezing the vast majority, easing both the storage burden and the optimization difficulty. However, existing PEFT methods introduce trainable parameters at the same positions across different tasks, relying solely on human heuristics and neglecting domain gaps. To this end, we study where to introduce and how to allocate trainable parameters by proposing a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme, which adaptively allocates trainable parameters to task-specific important positions given a desired tunable parameter budget. Specifically, SPT first quickly identifies the sensitive parameters that require tuning for a given task in a data-dependent way. Next, SPT further boosts the representational capability of the weight matrices whose number of sensitive parameters exceeds a pre-defined threshold by utilizing existing structured tuning methods, e.g., LoRA [23] or Adapter [22], in place of directly tuning the selected sensitive parameters (unstructured tuning) under the budget. Extensive experiments on a wide range of downstream recognition tasks show that SPT is complementary to existing PEFT methods and largely boosts their performance, e.g., SPT improves Adapter with a supervised pre-trained ViT-B/16 backbone by 4.2% and 1.4% mean Top-1 accuracy, reaching SOTA performance on the FGVC and VTAB-1k benchmarks, respectively. Source code is at https://github.com/ziplab/SPT
    Comment: ICCV 2023 Oral
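
    A hedged sketch of the sensitivity idea as described: score each parameter with a first-order saliency |g * w| on a few batches, then spend the tunable budget on the top-scoring positions. The criterion and helper names here are assumptions for illustration, not SPT's exact procedure:

```python
import torch

def parameter_sensitivity(model, loss_fn, data_loader, n_batches=4):
    """Accumulate a first-order saliency score |grad * weight| per parameter."""
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for i, (x, y) in enumerate(data_loader):
        if i >= n_batches:
            break
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                with torch.no_grad():
                    scores[n] += (p.grad * p).abs()
    return scores

def select_budget(scores, budget):
    """Return boolean masks marking the `budget` highest-scoring parameters."""
    flat = torch.cat([s.flatten() for s in scores.values()])
    threshold = flat.topk(budget).values.min()
    return {n: s >= threshold for n, s in scores.items()}
```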

    Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting

    Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in their output answers compared to the corresponding pre-trained LMs. In this work, we systematically evaluate the impact of the alignment process on the logit-based uncertainty calibration of LMs in the multiple-choice setting. We first conduct a thorough empirical study of how aligned LMs differ in calibration from their pre-trained counterparts. Experimental results reveal two distinct uncertainties in LMs under the multiple-choice setting, responsible for the answer decision and the format preference of the LMs, respectively. We then investigate the role of these two uncertainties in aligned LMs' calibration through fine-tuning in simple synthetic alignment schemes, and conclude that one reason for aligned LMs' overconfidence is the conflation of these two types of uncertainty. Furthermore, we examine the utility of common post-hoc calibration methods for aligned LMs and propose an easy-to-implement and sample-efficient method to calibrate aligned LMs. We hope our findings provide insights into the design of more reliable alignment processes for LMs.
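
    Among the "common post-hoc calibration methods" the abstract alludes to, temperature scaling is the standard logit-based baseline; a minimal PyTorch sketch (this is the generic method, not the paper's proposed calibrator):

```python
import torch

def fit_temperature(logits, labels, lr=0.01, steps=200):
    """Fit a scalar T > 0 minimizing the NLL of softmax(logits / T)
    on a held-out set. Parameterize T = exp(log_t) to keep it positive."""
    log_t = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()  # divide test logits by this temperature
```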

    Efficient Stitchable Task Adaptation

    The paradigm of pre-training and fine-tuning has laid the foundation for deploying deep learning models. However, most fine-tuning methods are designed to meet a specific resource budget. Recently, to cover diverse deployment scenarios with various resource budgets, the stitchable neural network (SN-Net) was introduced to quickly obtain numerous new networks (stitches) from pre-trained models (anchors) in a model family via model stitching. Although promising, SN-Net faces new challenges when adapted to new target domains, including huge memory and storage requirements and a long, sub-optimal multistage adaptation process. In this work, we present a novel framework, Efficient Stitchable Task Adaptation (ESTA), to efficiently produce a palette of fine-tuned models that adhere to diverse resource constraints. Specifically, we first tailor parameter-efficient fine-tuning to share low-rank updates among the stitches while maintaining independent bias terms. In this way, we largely reduce the fine-tuning memory burden and mitigate the interference among stitches that arises in task adaptation. Furthermore, we streamline the deployment into a simple yet effective one-stage pipeline, which estimates the important stitches to deploy using training-time gradient statistics. By assigning higher sampling probabilities to important stitches, we also obtain a boosted Pareto frontier. Extensive experiments on 25 downstream visual recognition tasks demonstrate that ESTA is capable of generating stitches with smooth accuracy-efficiency trade-offs and surpasses direct SN-Net adaptation by remarkable margins, with significantly lower training time and fewer trainable parameters. Furthermore, we demonstrate the flexibility and scalability of the ESTA framework by stitching LLMs from the LLaMA family, obtaining chatbot stitches of assorted sizes.
    Comment: Source code will be released at https://github.com/ziplab/Stitched_LLaM
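
    A minimal PyTorch sketch of the stated parameter-sharing scheme: stitches share one LoRA-style low-rank update on a frozen weight while keeping independent per-stitch biases. Names and shapes are assumptions for illustration:

```python
import torch
import torch.nn as nn

class SharedLoRALinear(nn.Module):
    """Frozen pre-trained weight + one shared low-rank update,
    with an independent bias vector per stitch."""
    def __init__(self, in_dim, out_dim, rank=8, num_stitches=4):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_dim, in_dim), requires_grad=False)
        nn.init.kaiming_uniform_(self.weight)  # stands in for pre-trained weights
        # Standard LoRA init: A random, B zero, so the update starts at zero.
        self.lora_a = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_dim, rank))
        # Only the biases differ across stitches; everything else is shared.
        self.biases = nn.Parameter(torch.zeros(num_stitches, out_dim))

    def forward(self, x, stitch_id):
        w = self.weight + self.lora_b @ self.lora_a  # shared low-rank update
        return x @ w.t() + self.biases[stitch_id]
```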