64 research outputs found

Revisiting k-NN for Pre-trained Language Models

Author: Chen Jing
Li Lei
Tian Bozhong
Zhang Ningyu
Publication date: 18/04/2023

Pre-trained Language Models (PLMs), as parametric-based eager learners, have become the de facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as the lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting the PLMs-based classifiers. From the methodological level, we propose to adopt k-NN with textual representations of PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the training process. (2) Linearly interpolate the probability distribution predicted by k-NN with that of the PLMs' classifier. At the heart of our approach is the implementation of k-NN-calibrated training, which treats predicted results as indicators for easy versus hard examples during the training process. From the perspective of the diversity of application scenarios, we conduct extensive experiments on fine-tuning, prompt-tuning paradigms and zero-shot, few-shot and fully-supervised settings, respectively, across eight diverse end-tasks. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP. Code and datasets are available at https://github.com/zjunlp/Revisit-KNN. Comment: Work in progress
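
Step (2) of the abstract above, linear interpolation of the k-NN distribution with the classifier's distribution, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the function names `knn_probs` and `interpolate`, the squared-L2 distance, the softmax weighting over neighbours, and the mixing weight `lam` are all hypothetical choices.

```python
import numpy as np

def knn_probs(query, keys, labels, num_classes, k=3, temperature=1.0):
    """Class distribution from the k nearest stored representations.

    Hypothetical helper: squared L2 distance and softmax weighting are
    illustrative assumptions, not the paper's confirmed choices.
    """
    dists = np.sum((keys - query) ** 2, axis=1)   # distance to every stored vector
    nearest = np.argsort(dists)[:k]               # indices of the k closest vectors
    shifted = dists[nearest] - dists[nearest].min()  # shift for numerical stability
    weights = np.exp(-shifted / temperature)
    weights /= weights.sum()
    probs = np.zeros(num_classes)
    np.add.at(probs, labels[nearest], weights)    # accumulate weight per class label
    return probs

def interpolate(p_knn, p_plm, lam=0.5):
    """Linear interpolation of two distributions; lam is an assumed mixing weight."""
    return lam * np.asarray(p_knn) + (1.0 - lam) * np.asarray(p_plm)

# Toy datastore: two 2-D representations per class.
keys = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 1, 1])

p_knn = knn_probs(np.array([0.1, 0.2]), keys, labels, num_classes=2, k=2)
p_plm = np.array([0.4, 0.6])                      # stand-in for a classifier's softmax output
p_final = interpolate(p_knn, p_plm, lam=0.5)
```

In this toy case both nearest neighbours carry label 0, so the k-NN distribution pulls the final prediction toward class 0 even though the stand-in classifier output favours class 1.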

arXiv.org e-Print Archive

Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network

Author: Gao Shenghua
Gu Yao
Li Wenqiao
Wu Yingna
Xu Xiaohao
Zheng Bozhong
Publication date: 29/11/2023

Recently, 3D anomaly detection, a crucial problem involving fine-grained geometry discrimination, is getting more attention. However, the lack of abundant real 3D anomaly data limits the scalability of current models. To enable scalable anomaly data collection, we propose a 3D anomaly synthesis pipeline to adapt existing large-scale 3D models for 3D anomaly detection. Specifically, we construct a synthetic dataset, i.e., Anomaly-ShapeNet, based on ShapeNet. Anomaly-ShapeNet consists of 1600 point cloud samples under 40 categories, which provides a rich and varied collection of data, enabling efficient training and enhancing adaptability to industrial scenarios. Meanwhile, to enable scalable representation learning for 3D anomaly localization, we propose a self-supervised method, i.e., Iterative Mask Reconstruction Network (IMRNet). During training, we propose a geometry-aware sample module to preserve potentially anomalous local regions during point cloud down-sampling. Then, we randomly mask out point patches and send the visible patches to a transformer for reconstruction-based self-supervision. During testing, the point cloud repeatedly goes through the Mask Reconstruction Network, with each iteration's output becoming the next input. By merging and contrasting the final reconstructed point cloud with the initial input, our method successfully locates anomalies. Experiments show that IMRNet outperforms previous state-of-the-art methods, achieving 66.1% in I-AUC on the Anomaly-ShapeNet dataset and 72.5% in I-AUC on the Real3D-AD dataset. Our dataset will be released at https://github.com/Chopper-233/Anomaly-ShapeNe

arXiv.org e-Print Archive

Editing Large Language Models: Problems, Methods, and Opportunities

Author: Chen Huajun
Cheng Siyuan
Deng Shumin
Li Zhoubo
Tian Bozhong
Wang Peng
Yao Yunzhi
Zhang Ningyu
Publication date: 30/11/2023

Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which is to efficiently alter the behavior of LLMs within a specific domain without negatively impacting performance across other inputs. This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. In particular, we provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal. We also build a new benchmark dataset to facilitate a more robust evaluation and pinpoint enduring issues intrinsic to existing techniques. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context. Code and datasets are available at https://github.com/zjunlp/EasyEdit. Comment: EMNLP 2023. Updated with new experiment

arXiv.org e-Print Archive

Data Release of the AST3-2 Automatic Survey from Dome A, Antarctica

Author: Ashley Michael C. B.
Cui Xiangqun
Du Fujia
Fu Jianning
Gong Xuefei
Gu Bozhong
Hu Yi
Jiang Peng
Li Xiaoyan
Li Zhengyang
Ma Bin
Shang Zhaohui
Tao Charling
Wang Lifan
Xu Lingzhe
Yang Shi-hai
Yang Xu
Yu Ce
Yuan Xiangyan
Zhou Ji-lin
Zhu Zhenxi
Publication venue: 'Oxford University Press (OUP)'
Publication date: 14/02/2023

AST3-2 is the second of the three Antarctic Survey Telescopes, aimed at wide-field time-domain optical astronomy. It is located at Dome A, Antarctica, which is by many measures the best optical astronomy site on the Earth's surface. Here we present the data from the AST3-2 automatic survey in 2016 and the photometry results. The median 5σ limiting magnitude in i-band is 17.8 mag and the light curve precision is 4 mmag for bright stars. The data release includes photometry for over 7 million stars, from which over 3,500 variable stars were detected, with 70 of them newly discovered. We classify these new variables into different types by combining their light curve features with stellar properties from surveys such as StarHorse. Comment: 16 pages, 20 figures, accepted for publication in MNRA

arXiv.org e-Print Archive

Exoplanets in the Antarctic Sky I. The first data release of AST3-II (CHESPA) and new found variables within the southern CVZ of TESS

Author: Ashley Michael C.B.
Cui Xiangqun
Du Fujia
Fu Jianning
Gong Xuefei
Gu Bozhong
Hu Yi
Jiang Peng
Lawrence Jon
Li Xiaoyan
Li Zhengyang
Liang Ensi
Liu Huigen
Liu Qiang
Ma Bin
Mould Jeremy
Shang Zhaohui
Suntzeff Nicholas B.
Tao Charling
Tian Qiguo
Tinney C. G.
Uddin Syed A.
Wang Lifan
Wang Songhu
Wang Xiaofeng
Wei Peng
Wittenmyer Robert A.
Wright Duncan
Wu Xuefeng
Xu Lingzhe
Yang Ming
Yang Shi-hai
Yu Ce
Yu Zhouyi
Yuan Xiangyan
Zhang Hui
Zheng Jessica
Zhou Hongyan
Zhou Ji-lin
Zhu Zhenxi
Publication venue: 'American Astronomical Society'
Publication date: 31/12/2018

Located at Dome A, the highest point of the Antarctic plateau, the Chinese Kunlun station is considered to be one of the best ground-based photometric sites because of its extremely cold, dry, and stable atmosphere. A target can be monitored from there for over 40 days without diurnal interruption during a polar winter. This makes Kunlun station a perfect site to search for short-period transiting exoplanets. Since 2008, an observatory has existed at Kunlun station, and three telescopes are working there. Using these telescopes, the AST3 project has been carried out over the last 6 yr with a search for transiting exoplanets as one of its key programs (CHESPA). In the austral winters of 2016 and 2017, a set of target fields in the southern continuous viewing zone (CVZ) of TESS were monitored by the AST3-II telescope. In this paper, we introduce the CHESPA and present the first data release containing photometry of 26,578 bright stars (m(i) <= 15). The best photometric precision at the optimum magnitude for the survey is around 2 mmag. To demonstrate the data quality, we also present a catalog of 221 variables with a brightness variation greater than 5 mmag from the 2016 data. Among these variables, 179 are newly identified periodic variables not listed in the AAVSO database (https://www.aavso.org/), and 67 are listed in the Candidate Target List. These variables will require careful attention to avoid false-positive signals when searching for transiting exoplanets. Dozens of new transiting exoplanet candidates will be released in a subsequent paper

arXiv.org e-Print Archive

HAL-IN2P3

HAL AMU

eScholarship - University of California

University of Southern Queensland ePrints

Exoplanets in the Antarctic Sky. II. 116 Transiting Exoplanet Candidates Found by AST3-II (CHESPA) within the Southern CVZ of TESS

Author: Ashley Michael C. B.
Cui Xiangqun
Du Fujia
Fu Jianning
Gong Xuefei
Gu Bozhong
Hu Yi
Jiang Peng
Lawrence Jon
Li Xiaoyan
Li Zhengyang
Liang Ensi
Liu Huigen
Liu Qiang
Ma Bin
Mould Jeremy
Shang Zhaohui
Suntzeff Nicholas B.
Tao Charling
Tian Qiguo
Tinney C. G.
Uddin Syed A.
Wang Lifan
Wang Songhu
Wang Xiaofeng
Wei Peng
Wittenmyer Robert A.
Wright Duncan
Wu Xuefeng
Xu Lingzhe
Yang Ming
Yang Shi-hai
Yu Ce
Yu Zhouyi
Yuan Xiangyan
Zhang Hui
Zheng Jessica
Zhou Hongyan
Zhou Ji-lin
Zhu Zhenxi
Publication venue: 'American Astronomical Society'
Publication date: 01/01/2019

We report first results from the CHinese Exoplanet Searching Program from Antarctica (CHESPA), a wide-field high-resolution photometric survey for transiting exoplanets carried out using telescopes of the AST3 (Antarctic Survey Telescopes times 3) project. There are now three telescopes (AST3-I, AST3-II, and CSTAR-II) operating at Dome A, the highest point on the Antarctic Plateau, in a fully automatic and remote mode to exploit the superb observing conditions of the site, and its long and uninterrupted polar nights. The search for transiting exoplanets is one of the key projects for AST3. During the austral winters of 2016 and 2017 we used the AST3-II telescope to survey a set of target fields near the southern ecliptic pole, falling within the continuous viewing zone of the TESS mission. The first data release of the 2016 data, including images, catalogs, and light curves of 26,578 bright stars (7.5 <= m(i) <= 15), was presented in Zhang et al. The best precision, as measured by the rms of the light curves at the optimum magnitude of the survey (m(i) = 10), is around 2 mmag. We detect 222 objects with plausible transit signals from these data, 116 of which are plausible transiting exoplanet candidates according to their stellar properties as given by the TESS Input Catalog, Gaia DR2, and TESS-HERMES spectroscopy. With the first data release from TESS expected in late 2018, this candidate list will be timely for improving the rejection of potential false-positives
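
The abstract above quotes precision as the rms of a light curve in mmag. As a rough illustration of that quantity only, the sketch below computes the rms scatter of a magnitude series about its mean. The function name `lightcurve_rms_mmag` and the sample values are hypothetical, and whether the survey pipeline uses the mean, the median, outlier clipping, or detrending before measuring the scatter is not stated in the abstract.

```python
import numpy as np

def lightcurve_rms_mmag(mags):
    """RMS scatter of a light curve about its mean, in millimagnitudes.

    Illustrative assumption: plain rms about the mean, with no clipping
    or detrending applied beforehand.
    """
    mags = np.asarray(mags, dtype=float)
    residuals = mags - np.mean(mags)                  # deviation from the mean magnitude
    return 1000.0 * np.sqrt(np.mean(residuals ** 2))  # convert mag to mmag

# Made-up magnitudes for a star near m(i) = 10 that scatter by +/- 0.002 mag.
sample_mags = [10.002, 9.998, 10.002, 9.998]
rms_mmag = lightcurve_rms_mmag(sample_mags)
```

For these made-up values the scatter works out to about 2 mmag, the same order as the best precision quoted in the abstract.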

arXiv.org e-Print Archive

HAL-IN2P3

HAL AMU

eScholarship - University of California

University of Southern Queensland ePrints

Challenging National Narratives: On the Origins of Sweet Potato in China as Global Commodity During the Early Modern Period

Author: A Maddisson
A Madisson
Bozhong Li
D Ma
D Northrop
Debin Ma
Deduo Wu
DH Perkins
Fangzhong Liang
Gang Deng
Gang Zhang
H Zurndorfer
Joanna Waley-Cohen
Kenneth Pomeranz
LC Goodrich
Maxine Berg
Nan Zheng
OA Westad
Peer Vries
Ping-Ti Ho
R Bin Wong
R Glahn Von
Robert B Marks
TG Rawski
WM Tu
Yuanhe Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

The introduction of American cereal crops is probably one of the most important events in China's agricultural history, having a great effect on agricultural production, national life, the transformation of consumer behaviour and, to some extent, the nationalization of consumption. The sweet potato (Ipomoea batatas L.), in Chinese gānshǔ 甘薯, was a staple food crop in ancient Chinese society. Today it still plays an important role in Chinese daily life, as well as guaranteeing national food security. GECEM Project, Global Encounters between China and Europe: Trade Networks, Consumption and Cultural Exchanges in Macau and Marseille (1680-1840), ERC (European Research Council) Starting Grant, Horizon 2020 programme, ref. no. 679371, www.gecem.eu. Publisher's version

Repositorio Institucional Olavide

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas