The asymptotic distribution and Berry--Esseen bound of a new test for independence in high dimension with an application to stochastic optimization
Let $X_1, \ldots, X_n$ be a random sample from a $p$-dimensional
population distribution. Assume that $c_1 n^{\alpha} \le p \le c_2 n^{\alpha}$
for some positive constants $c_1$, $c_2$ and $\alpha$. In this paper we introduce
a new statistic for testing independence of the -variates of the population
and prove that the limiting distribution is the extreme distribution of type I
with a rate of convergence $O((\log n)^{5/2}/\sqrt{n})$. This is much faster
than $O(1/\log n)$, a typical convergence rate for this type of extreme
distribution. A simulation study and application to stochastic optimization are
discussed. Comment: Published at http://dx.doi.org/10.1214/08-AAP527 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
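As a rough numerical illustration (a sketch only: the paper's actual test statistic is not spelled out above, so the max-correlation form below is an assumption), one can compute the maximum absolute off-diagonal sample correlation of an i.i.d. sample; under independence, suitably normalized maxima of this kind converge to the type I extreme-value (Gumbel) distribution.

```python
import numpy as np

def max_abs_correlation(X):
    """Maximum absolute off-diagonal sample correlation of an n x p sample.

    Illustrative only; the paper's actual statistic may differ in detail.
    """
    R = np.corrcoef(X, rowvar=False)   # p x p sample correlation matrix
    np.fill_diagonal(R, 0.0)           # ignore the trivial diagonal entries
    return np.abs(R).max()

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))        # independent coordinates
L = max_abs_correlation(X)
print(round(L, 3))                     # small when the coordinates are independent
```

Repeating this over many samples and applying the appropriate normalization would trace out the Gumbel limit studied in the paper.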
CPET: Effective Parameter-Efficient Tuning for Compressed Large Language Models
Parameter-efficient tuning (PET) has been widely explored in recent years
because it tunes much fewer parameters (PET modules) than full-parameter
fine-tuning (FT) while still stimulating sufficient knowledge from large
language models (LLMs) for downstream tasks. Moreover, when PET is employed to
serve multiple tasks, different task-specific PET modules can be built on a
frozen LLM, avoiding redundant LLM deployments. Although PET significantly
reduces the cost of tuning and deploying LLMs, its inference still suffers from
the computational bottleneck of LLMs. To address the above issue, we propose an
effective PET framework based on compressed LLMs, named "CPET". In CPET, we
evaluate the impact of mainstream LLM compression techniques on PET performance
and then introduce knowledge inheritance and recovery strategies to restore the
knowledge loss caused by these compression techniques. Our experimental results
demonstrate that, owing to the restoring strategies of CPET, collaborating
task-specific PET modules with a compressed LLM can achieve comparable
performance to collaborating PET modules with the original version of the
compressed LLM and outperform directly applying vanilla PET methods to the
compressed LLM.
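The frozen-backbone PET setup can be sketched with a LoRA-style low-rank module (a common PET method; that CPET uses exactly this module is an assumption here): the base weight never changes and only two small matrices are trained, so many task-specific modules can share one backbone.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (LoRA-style PET).

    Illustrative sketch only; CPET's exact PET modules may differ.
    """

    def __init__(self, W, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                               # frozen base weight (out x in)
        self.A = rng.standard_normal((rank, W.shape[1])) * 0.01  # trainable down-projection
        self.B = np.zeros((W.shape[0], rank))                    # trainable up-projection, zero-init

    def forward(self, x):
        # Base output plus low-rank correction; only A and B would be tuned.
        return x @ self.W.T + x @ self.A.T @ self.B.T

W = np.random.default_rng(1).standard_normal((8, 16))
layer = LoRALinear(W)
x = np.ones((2, 16))
y = layer.forward(x)
print(y.shape)   # (2, 8)
```

Because `B` is zero-initialized, the module starts out exactly equal to the frozen layer, which is the usual LoRA design choice for stable tuning.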
READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises
For many real-world applications, user-generated inputs usually contain
various noises, due to speech recognition errors caused by linguistic
variation or to typographical errors (typos). Thus, it is crucial to test model
performance on data with realistic input noises to ensure robustness and
fairness. However, little work has been done to construct such benchmarks for
Chinese, where various language-specific input noises happen in the real world.
In order to fill this important gap, we construct READIN: a Chinese multi-task
benchmark with REalistic And Diverse Input Noises. READIN contains four diverse
tasks and requests annotators to re-enter the original test data with two
commonly used Chinese input methods: Pinyin input and speech input. We designed
our annotation pipeline to maximize diversity, for example by instructing the
annotators to use diverse input method editors (IMEs) for keyboard noises and
recruiting speakers from diverse dialectical groups for speech noises. We
experiment with a series of strong pretrained language models as well as robust
training methods, and find that these models often suffer significant
performance drops on READIN even with robustness methods like data
augmentation. As the first large-scale attempt in creating a benchmark with
noises geared towards user-generated inputs, we believe that READIN serves as
an important complement to existing Chinese NLP benchmarks. The source code and
dataset can be obtained from https://github.com/thunlp/READIN. Comment: Preprint
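As a toy stand-in for such noise (READIN itself collects human re-entered Pinyin and speech inputs, not rule-based corruption, so this synthetic generator is purely illustrative), one can perturb test strings at the character level:

```python
import random

def inject_typos(text, rate=0.1, seed=0):
    """Randomly swap adjacent characters to simulate keyboard typos.

    A crude synthetic stand-in; READIN uses human re-entered inputs
    rather than rule-based noise like this.
    """
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2          # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)

print(inject_typos("robustness evaluation", rate=0.3))
```

Evaluating a model on both clean and perturbed copies of a test set gives a first, much weaker approximation of the robustness gap READIN measures with realistic human noise.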
Effective Few-Shot Named Entity Linking by Meta-Learning
Entity linking aims to link ambiguous mentions to their corresponding
entities in a knowledge base, which is significant and fundamental for various
downstream applications, e.g., knowledge base completion, question answering,
and information extraction. While great efforts have been devoted to this task,
most of these studies follow the assumption that large-scale labeled data is
available. However, when the labeled data is insufficient for specific domains
due to labor-intensive annotation work, the performance of existing algorithms
will suffer an intolerable decline. In this paper, we endeavor to solve the
problem of few-shot entity linking, which only requires a minimal amount of
in-domain labeled data and is more practical in real situations. Specifically,
we first propose a novel weak supervision strategy to generate non-trivial
synthetic entity-mention pairs based on mention rewriting. Since the quality of
the synthetic data has a critical impact on effective model training, we
further design a meta-learning mechanism to assign different weights to each
synthetic entity-mention pair automatically. In this way, we can
fully exploit rich semantic information to derive a
well-trained entity linking model under the few-shot setting. The experiments
on real-world datasets show that the proposed method can substantially improve
the state-of-the-art few-shot entity linking model and achieve impressive
performance when only a small amount of labeled data is available. Moreover, we
also demonstrate the model's outstanding transferability. Comment: 14 pages, 4 figures. Accepted at IEEE ICDE 2022.
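The reweighting idea can be sketched in a learning-to-reweight style (an assumption; the paper's meta-learning mechanism is more elaborate): each synthetic entity-mention pair is weighted by how well its gradient agrees with the average gradient on a small trusted set.

```python
import numpy as np

def example_weights(X_syn, y_syn, X_val, y_val, w):
    """Weight synthetic examples by gradient agreement with a clean set.

    Learning-to-reweight-style sketch on a logistic model; the paper's
    actual meta-learning mechanism is more elaborate.
    """
    def grad(x, y):
        p = 1.0 / (1.0 + np.exp(-x @ w))   # logistic prediction
        return (p - y) * x                 # per-example gradient

    g_val = np.mean([grad(x, y) for x, y in zip(X_val, y_val)], axis=0)
    scores = np.array([max(0.0, grad(x, y) @ g_val)     # keep aligned gradients
                       for x, y in zip(X_syn, y_syn)])
    total = scores.sum()
    return scores / total if total > 0 else np.full(len(X_syn), 1.0 / len(X_syn))

rng = np.random.default_rng(0)
w = rng.standard_normal(5)
X_syn, y_syn = rng.standard_normal((6, 5)), rng.integers(0, 2, 6)
X_val, y_val = rng.standard_normal((4, 5)), rng.integers(0, 2, 4)
weights = example_weights(X_syn, y_syn, X_val, y_val, w)
print(weights.round(3))
```

Noisy synthetic pairs whose gradients conflict with the trusted set receive weight zero, which is the intuition behind down-weighting low-quality synthetic data.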
Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models
Prompting, which casts downstream applications as language modeling tasks,
has been shown to be sample-efficient compared to standard fine-tuning with
pre-trained models. However, one pitfall of prompting is the need for
manually designed patterns, whose outcomes can be unintuitive and require large
validation sets to tune. To tackle the challenge, we propose AutoSeq, a fully
automatic prompting method: (1) We adopt natural language prompts on
sequence-to-sequence models, enabling free-form generation and larger label
search space; (2) We propose label sequences -- phrases with indefinite lengths
to verbalize the labels -- which eliminate the need for manual templates and are
more expressive than single label words; (3) We use beam search to
automatically generate a large amount of label sequence candidates and propose
contrastive re-ranking to get the best combinations. AutoSeq significantly
outperforms other no-manual-design methods, such as soft prompt tuning, adapter
tuning, and automatic search on single label words; the generated label
sequences are even better than curated manual ones on a variety of tasks. Our
method reveals the potential of sequence-to-sequence models in few-shot
learning and sheds light on a path to generic and automatic prompting. The
source code of this paper can be obtained from
https://github.com/thunlp/Seq2Seq-Prompt. Comment: Accepted to COLING 2022.
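The candidate-generation step can be sketched with a generic beam search over a toy additive scorer (the vocabulary and scoring function below are hypothetical; AutoSeq scores label-sequence candidates with a sequence-to-sequence model instead).

```python
def beam_search(score_token, vocab, length, beam_size):
    """Keep the beam_size highest-scoring token sequences at each step.

    Toy sketch with an additive per-token scorer; AutoSeq scores
    candidates with a seq2seq language model.
    """
    beams = [([], 0.0)]                          # (sequence, cumulative score)
    for _ in range(length):
        candidates = [(seq + [tok], s + score_token(seq, tok))
                      for seq, s in beams for tok in vocab]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]           # prune to the best beam_size
    return beams

# Hypothetical scorer: prefer longer tokens, with a mild length penalty.
vocab = ["good", "great", "terrible"]
score = lambda seq, tok: len(tok) * 0.1 - 0.05 * len(seq)
best = beam_search(score, vocab, length=2, beam_size=2)
print(best[0][0])
```

In AutoSeq the resulting candidate pool is then re-ranked contrastively; here the top beam is simply the highest-scoring sequence under the toy scorer.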
- …