Revisiting k-NN for Pre-trained Language Models
Pre-trained Language Models (PLMs), as parametric eager learners, have become the de facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as lazy learners, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with textual representations of
PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the
training process. (2) Linearly interpolate the probability distribution
predicted by k-NN with that of the PLMs' classifier. At the heart of our
approach is the implementation of k-NN-calibrated training, which treats
predicted results as indicators for easy versus hard examples during the
training process. To cover diverse application scenarios, we conduct extensive experiments on the fine-tuning and prompt-tuning paradigms under zero-shot, few-shot, and fully-supervised settings across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP. Code and datasets are available at https://github.com/zjunlp/Revisit-KNN. Comment: Work in progress.
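The interpolation step can be made concrete with a short sketch. The datastore layout, similarity measure, temperature, and variable names below are illustrative assumptions, not the paper's released code:

```python
import numpy as np

def knn_probs(query_vec, store_vecs, store_labels, num_classes, k=8, temperature=1.0):
    """Soft label distribution from the k nearest stored PLM representations."""
    # Cosine similarity between the query representation and every stored one.
    sims = store_vecs @ query_vec / (
        np.linalg.norm(store_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8
    )
    top_k = np.argsort(-sims)[:k]
    weights = np.exp(sims[top_k] / temperature)
    probs = np.zeros(num_classes)
    for w, idx in zip(weights, top_k):
        probs[store_labels[idx]] += w          # weighted vote for the neighbour's label
    return probs / probs.sum()

def interpolate(p_plm, p_knn, lam=0.5):
    """Step (2): linear interpolation of the k-NN and PLM classifier distributions."""
    return lam * p_knn + (1.0 - lam) * p_plm
```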
Editing Language Model-based Knowledge Graph Embeddings
Recent decades have witnessed the empirical success of framing Knowledge
Graph (KG) embeddings via language models. However, language model-based KG
embeddings are usually deployed as static artifacts, which are challenging to
modify without re-training after deployment. To address this issue, we propose
a new task of editing language model-based KG embeddings in this paper. The
proposed task aims to enable data-efficient and fast updates to KG embeddings
without degrading performance on the rest of the knowledge. We build four new datasets:
E-FB15k237, A-FB15k237, E-WN18RR, and A-WN18RR, and evaluate several knowledge
editing baselines, demonstrating the limited ability of previous models to handle the proposed challenging task. We further propose a simple yet strong baseline dubbed KGEditor, which utilizes additional parametric layers of a hypernetwork to edit or add facts. Comprehensive experimental results demonstrate that KGEditor performs better when updating specific facts while not affecting the rest, even with low training resources. Code and datasets will be available at https://github.com/zjunlp/PromptKG/tree/main/deltaKG. Comment: Work in progress; the project website is https://zjunlp.github.io/project/KGE_Editing.
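The hypernetwork idea can be sketched as follows. The layer choice, the low-rank parameterization, and all dimensions are assumptions made for illustration, not KGEditor's actual architecture:

```python
import torch
import torch.nn as nn

class HyperEditor(nn.Module):
    """Maps the representation of a fact to an additive update for one frozen layer."""

    def __init__(self, fact_dim: int, hidden_dim: int, target_in: int, target_out: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(fact_dim, hidden_dim), nn.ReLU())
        # Low-rank factors keep the predicted weight delta cheap to generate.
        self.to_u = nn.Linear(hidden_dim, target_out)
        self.to_v = nn.Linear(hidden_dim, target_in)

    def forward(self, fact_repr: torch.Tensor) -> torch.Tensor:
        h = self.encoder(fact_repr)
        # Rank-1 update: delta_W = u v^T, applied on top of the frozen layer weight.
        return torch.einsum("bo,bi->boi", self.to_u(h), self.to_v(h))

# Usage sketch: edited_weight = frozen_weight + hyper_editor(fact_repr).mean(dim=0)
```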
Editing Large Language Models: Problems, Methods, and Opportunities
Despite the ability to train capable LLMs, the methodology for maintaining
their relevancy and rectifying errors remains elusive. To this end, the past
few years have witnessed a surge in techniques for editing LLMs, the objective
of which is to efficiently alter the behavior of LLMs within a specific domain
without negatively impacting performance across other inputs. This paper
embarks on a deep exploration of the problems, methods, and opportunities
related to model editing for LLMs. In particular, we provide an exhaustive
overview of the task definition and challenges associated with model editing,
along with an in-depth empirical analysis of the most progressive methods
currently at our disposal. We also build a new benchmark dataset to facilitate
a more robust evaluation and pinpoint enduring issues intrinsic to existing
techniques. Our objective is to provide valuable insights into the
effectiveness and feasibility of each editing technique, thereby assisting the
community in making informed decisions on the selection of the most appropriate
method for a specific task or context. Code and datasets are available at
https://github.com/zjunlp/EasyEdit. Comment: EMNLP 2023; updated with new experiments.
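In generic notation (stated here for orientation, not verbatim from the paper), an edit supplies a pair (x_e, y_e); the edited model f_{\theta'} should satisfy f_{\theta'}(x_e) = y_e (reliability), produce y_e on paraphrases of x_e (generalization), and remain unchanged, f_{\theta'}(x) = f_\theta(x), for inputs x outside the edit scope (locality).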
MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing
Multimodal knowledge editing represents a critical advancement in enhancing
the capabilities of Multimodal Large Language Models (MLLMs). Despite its
potential, current benchmarks predominantly focus on coarse-grained knowledge,
leaving the intricacies of fine-grained (FG) multimodal entity knowledge
largely unexplored. This gap presents a notable challenge, as FG entity
recognition is pivotal for the practical deployment and effectiveness of MLLMs
in diverse real-world scenarios. To bridge this gap, we introduce MIKE, a
comprehensive benchmark and dataset specifically designed for FG multimodal
entity knowledge editing. MIKE encompasses a suite of tasks tailored to assess
different perspectives, including Vanilla Name Answering, Entity-Level Caption,
and Complex-Scenario Recognition. In addition, a new form of knowledge editing,
Multi-step Editing, is introduced to evaluate editing efficiency. Through
our extensive evaluations, we demonstrate that the current state-of-the-art
methods face significant challenges in tackling our proposed benchmark,
underscoring the complexity of FG knowledge editing in MLLMs. Our findings
spotlight the urgent need for novel approaches in this domain, setting a clear
agenda for future research and development efforts within the community. Comment: 8 pages.
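For orientation, a fine-grained edit instance could look roughly like the record below; the field names and phrasings are hypothetical and do not reflect MIKE's actual schema:

```python
# Hypothetical fine-grained multimodal edit record (illustrative only).
fg_edit_example = {
    "image": "entity_0042.jpg",                  # image depicting the FG entity
    "edit_prompt": "What is the name of the entity in the image?",
    "edit_target": "<new entity name>",          # knowledge to inject into the MLLM
    "eval_tasks": [
        "vanilla_name_answering",                # ask the entity's name directly
        "entity_level_caption",                  # caption mentioning the edited entity
        "complex_scenario_recognition",          # recognize the entity in a cluttered scene
    ],
    "multi_step": True,                          # evaluated under Multi-step Editing
}
```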
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy
issues, which means they are unaware of unseen events or generate text with
incorrect facts owing to outdated or noisy data. To this end, many knowledge
editing approaches for LLMs have emerged -- aiming to subtly inject/edit
updated knowledge or adjust undesired behavior while minimizing the impact on
unrelated inputs. Nevertheless, due to significant differences among various
knowledge editing methods and the variations in task setups, there is no
standard implementation framework available for the community, which hinders practitioners from applying knowledge editing to applications. To address these
issues, we propose EasyEdit, an easy-to-use knowledge editing framework for
LLMs. It supports various cutting-edge knowledge editing approaches and can be
readily applied to many well-known LLMs such as T5, GPT-J, LLaMA, etc. Empirically, we report knowledge editing results on LLaMA-2 with EasyEdit,
demonstrating that knowledge editing surpasses traditional fine-tuning in terms
of reliability and generalization. We have released the source code on GitHub
at https://github.com/zjunlp/EasyEdit, along with Google Colab tutorials and
comprehensive documentation for beginners to get started. In addition, we present an online system for real-time knowledge editing and a demo video at http://knowlm.zjukg.cn/easyedit.mp4. Comment: The project website is https://github.com/zjunlp/EasyEdit.
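A minimal usage sketch follows. The class names and edit() arguments mirror the repository README as best recalled, but the exact signature and the hyperparameter file path should be treated as assumptions; check https://github.com/zjunlp/EasyEdit for the current API:

```python
# Hedged sketch: apply a single ROME-style edit with EasyEdit (argument names
# and the YAML path are assumptions; consult the EasyEdit docs before use).
from easyeditor import BaseEditor, ROMEHyperParams

hparams = ROMEHyperParams.from_hparams("./hparams/ROME/llama-7b.yaml")
editor = BaseEditor.from_hparams(hparams)

metrics, edited_model, _ = editor.edit(
    prompts=["Who is the president of the US?"],
    target_new=["Joe Biden"],
    subject=["president of the US"],
)
print(metrics)  # per-edit reliability / generalization / locality scores
```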
A Comprehensive Study of Knowledge Editing for Large Language Models
Large Language Models (LLMs) have shown extraordinary capabilities in
understanding and generating text that closely mirrors human communication.
However, a primary limitation lies in the significant computational demands
during training, arising from their extensive parameterization. This challenge
is further intensified by the dynamic nature of the world, necessitating
frequent updates to LLMs to correct outdated information or integrate new
knowledge, thereby ensuring their continued relevance. Note that many
applications demand continual model adjustments post-training to address
deficiencies or undesirable behaviors. There is an increasing interest in
efficient, lightweight methods for on-the-fly model modifications. To this end, recent years have seen a burgeoning of knowledge editing techniques for LLMs, which aim to efficiently modify LLMs' behavior within specific domains
while preserving overall performance across various inputs. In this paper, we
first define the knowledge editing problem and then provide a comprehensive
review of cutting-edge approaches. Drawing inspiration from educational and
cognitive research theories, we propose a unified categorization criterion that
classifies knowledge editing methods into three groups: resorting to external
knowledge, merging knowledge into the model, and editing intrinsic knowledge.
Furthermore, we introduce a new benchmark, KnowEdit, for a comprehensive
empirical evaluation of representative knowledge editing approaches.
Additionally, we provide an in-depth analysis of knowledge location, which can
give a deeper understanding of the knowledge structures inherent within LLMs.
Finally, we discuss several potential applications of knowledge editing,
outlining its broad and impactful implications. Comment: Ongoing work; 52 pages; 282 citations. The benchmark is available at https://huggingface.co/datasets/zjunlp/KnowEdit, the code is available at https://github.com/zjunlp/EasyEdit, and the paper list is available at https://github.com/zjunlp/KnowledgeEditingPaper.
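To obtain the benchmark files, a straightforward route is to pull the dataset repository from the Hugging Face Hub; the snippet below only downloads the files and makes no assumption about their internal layout:

```python
# Fetch the KnowEdit benchmark files locally (split layout not assumed here).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="zjunlp/KnowEdit", repo_type="dataset")
print(local_dir)  # inspect the downloaded splits, then feed them to an editing framework
```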
The Jiao Tong University Spectroscopic Telescope Project
The Jiao Tong University Spectroscopic Telescope (JUST) is a 4.4-meter f/6.0
segmented-mirror telescope dedicated to spectroscopic observations. The JUST
primary mirror is composed of 18 hexagonal segments, each with a diameter of
1.1 m. JUST provides two Nasmyth platforms for science instruments. One Nasmyth focus offers a field of view of 10 arcmin, and the other has an extended
field of view of 1.2 deg with correction optics. A tertiary mirror is used to
switch between the two Nasmyth foci. JUST will be installed at a site at Lenghu
in Qinghai Province, China, and will conduct spectroscopic observations with
three types of instruments to explore the dark universe, trace the dynamic
universe, and search for exoplanets: (1) a multi-fiber (2000 fibers)
medium-resolution spectrometer (R=4000-5000) to spectroscopically map galaxies
and large-scale structure; (2) an integral field unit (IFU) array of 500
optical fibers and/or a long-slit spectrograph dedicated to fast follow-ups of
transient sources for multimessenger astronomy; (3) a high-resolution
spectrometer (R~100000) designed to identify Jupiter analogs and Earth-like
planets, with the capability to characterize the atmospheres of hot exoplanets. Comment: 28 pages, 6 figures.
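As a quick consistency check (a derived figure, not one stated in the abstract), the aperture and focal ratio imply an effective focal length of f = N \times D = 6.0 \times 4.4\,\mathrm{m} \approx 26.4\,\mathrm{m}.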
Exoplanets in the Antarctic Sky I. The first data release of AST3-II (CHESPA) and new found variables within the southern CVZ of TESS
Located at Dome A, the highest point of the Antarctic plateau, the Chinese Kunlun station is considered to be one of the best ground-based photometric sites because of its extremely cold, dry, and stable atmosphere. A target can be monitored from there for over 40 days without diurnal interruption during a polar winter. This makes Kunlun station a perfect site to search for short-period transiting exoplanets. An observatory has existed at Kunlun station since 2008, and three telescopes are working there. Using these telescopes, the AST3 project has been carried out over the last six years, with a search for transiting exoplanets (CHESPA) as one of its key programs. In the austral winters of 2016 and 2017, a set of target fields in the southern continuous viewing zone (CVZ) of TESS were monitored by the AST3-II telescope. In this paper, we introduce CHESPA and present the first data release, containing photometry of 26,578 bright stars (m(i) <= 15). The best photometric precision at the optimum magnitude for the survey is around 2 mmag. To demonstrate the data quality, we also present a catalog of 221 variables with a brightness variation greater than 5 mmag from the 2016 data. Among these variables, 179 are newly identified periodic variables not listed in the AAVSO database (https://www.aavso.org/), and 67 are listed in the Candidate Target List. These variables will require careful attention to avoid false-positive signals when searching for transiting exoplanets. Dozens of new transiting exoplanet candidates will be released in a subsequent paper.
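To put the 2 mmag precision in context (a standard conversion, not a figure quoted in the abstract), a magnitude scatter maps to a relative flux scatter via \Delta m \approx 1.0857\,\Delta F/F, so 2 mmag corresponds to \Delta F/F \approx 0.18\%. Since a central transit has depth \approx (R_p/R_\star)^2, this sits comfortably below the \sim 1\% depth of a Jupiter-sized planet around a Sun-like star.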
Exoplanets in the Antarctic Sky. II. 116 Transiting Exoplanet Candidates Found by AST3-II (CHESPA) within the Southern CVZ of TESS
We report first results from the CHinese Exoplanet Searching Program from Antarctica (CHESPA), a wide-field, high-resolution photometric survey for transiting exoplanets carried out using telescopes of the AST3 (Antarctic Survey Telescopes × 3) project. There are now three telescopes (AST3-I, AST3-II, and CSTAR-II) operating at Dome A, the highest point on the Antarctic Plateau, in a fully automatic and remote mode to exploit the superb observing conditions of the site and its long, uninterrupted polar nights. The search for transiting exoplanets is one of the key projects for AST3. During the austral winters of 2016 and 2017 we used the AST3-II telescope to survey a set of target fields near the southern ecliptic pole, falling within the continuous viewing zone of the TESS mission. The first data release of the 2016 data, including images, catalogs, and light curves of 26,578 bright stars (7.5 <= m(i) <= 15), was presented in Zhang et al. The best precision, as measured by the rms of the light curves at the optimum magnitude of the survey (m(i) = 10), is around 2 mmag. We detect 222 objects with plausible transit signals from these data, 116 of which are plausible transiting exoplanet candidates according to their stellar properties as given by the TESS Input Catalog, Gaia DR2, and TESS-HERMES spectroscopy. With the first data release from TESS expected in late 2018, this candidate list will be timely for improving the rejection of potential false positives.
Differential detection scheme for compact CPT atomic clocks
A scheme is investigated for a coherent population trapping (CPT) atomic clock, wherein the beam produced by a vertical-cavity surface-emitting laser is converted to an elliptically polarized beam that interacts with alkali atoms, and the CPT signal is extracted by differentially detecting the magneto-optically rotated light within the transmitted beam. The scheme eliminates the spin-polarized trap state of the atoms and the unwanted background signal, and suppresses the noise in the CPT signal that is converted from laser noise. This result reveals the promise of the scheme for realizing a compact CPT atomic clock with significantly improved frequency stability compared to current compact CPT atomic clock devices, while maintaining similar power consumption, volume, and cost.
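A hedged sketch of why the differential readout helps (the notation is illustrative, not taken from the paper): splitting the transmitted light into two orthogonal polarization components with intensities I_x and I_y, the rotation signal is S = I_x - I_y. A common-mode laser intensity fluctuation \delta I(t) enters both components nearly equally and largely cancels in the difference, whereas the magneto-optical rotation induced by the CPT resonance transfers power between the two components and therefore survives in S.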