64 research outputs found
Revisiting k-NN for Pre-trained Language Models
Pre-trained Language Models (PLMs), as parametric-based eager learners, have
become the de-facto choice for current paradigms of Natural Language Processing
(NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as the lazy learning
paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we
revisit k-NN classifiers for augmenting the PLMs-based classifiers. From the
methodological level, we propose to adopt k-NN with textual representations of
PLMs in two steps: (1) Utilize k-NN as prior knowledge to calibrate the
training process. (2) Linearly interpolate the probability distribution
predicted by k-NN with that of the PLMs' classifier. At the heart of our
approach is the implementation of k-NN-calibrated training, which treats
predicted results as indicators for easy versus hard examples during the
training process. From the perspective of the diversity of application
scenarios, we conduct extensive experiments on fine-tuning, prompt-tuning
paradigms and zero-shot, few-shot and fully-supervised settings, respectively,
across eight diverse end-tasks. We hope our exploration will encourage the
community to revisit the power of classical methods for efficient
NLP. Code and datasets are available at
https://github.com/zjunlp/Revisit-KNN. Comment: Work in progress
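The second step described in the abstract above, linearly interpolating the k-NN distribution with the classifier's distribution, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Euclidean distance, the exponential distance weighting, and the parameters `k`, `temperature`, and `lam` are all assumptions made for illustration.

```python
import numpy as np

def knn_distribution(query, keys, labels, num_classes, k=3, temperature=1.0):
    """Class distribution from the k nearest stored representations.

    Assumption: Euclidean distance with exp(-d / temperature) weights;
    the abstract does not specify the distance or weighting scheme.
    """
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]              # indices of the k closest keys
    weights = np.exp(-dists[nearest] / temperature)
    probs = np.zeros(num_classes)
    np.add.at(probs, labels[nearest], weights)   # accumulate weight per class
    return probs / probs.sum()

def interpolate(p_knn, p_model, lam=0.5):
    """Linear interpolation: lam * p_knn + (1 - lam) * p_model."""
    return lam * np.asarray(p_knn) + (1.0 - lam) * np.asarray(p_model)

# Toy example with hypothetical 2-D "representations".
keys = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 1, 1])
p_knn = knn_distribution(np.array([0.0, 0.2]), keys, labels, num_classes=2, k=2)
p_final = interpolate(p_knn, np.array([0.4, 0.6]), lam=0.5)
```

Here `lam` controls how much the final prediction trusts the retrieved neighbors relative to the classifier; in practice it would be tuned on held-out data.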
Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network
3D anomaly detection, a crucial problem involving fine-grained
geometry discrimination, has recently attracted growing attention. However, the lack of
abundant real 3D anomaly data limits the scalability of current models. To
enable scalable anomaly data collection, we propose a 3D anomaly synthesis
pipeline to adapt existing large-scale 3D models for 3D anomaly detection.
Specifically, we construct a synthetic dataset, i.e., Anomaly-ShapeNet, based on
ShapeNet. Anomaly-ShapeNet consists of 1600 point cloud samples under 40
categories, which provides a rich and varied collection of data, enabling
efficient training and enhancing adaptability to industrial scenarios.
Meanwhile, to enable scalable representation learning for 3D anomaly
localization, we propose a self-supervised method, i.e., Iterative Mask
Reconstruction Network (IMRNet). During training, we propose a geometry-aware
sample module to preserve potentially anomalous local regions during point
cloud down-sampling. Then, we randomly mask out point patches and send the
visible patches to a transformer for reconstruction-based self-supervision.
During testing, the point cloud repeatedly goes through the Mask Reconstruction
Network, with each iteration's output becoming the next input. By merging and
contrasting the final reconstructed point cloud with the initial input, our
method successfully locates anomalies. Experiments show that IMRNet outperforms
previous state-of-the-art methods, achieving 66.1% in I-AUC on the Anomaly-ShapeNet
dataset and 72.5% in I-AUC on the Real3D-AD dataset. Our dataset will be released
at https://github.com/Chopper-233/Anomaly-ShapeNe
Editing Large Language Models: Problems, Methods, and Opportunities
Despite the ability to train capable LLMs, the methodology for maintaining
their relevancy and rectifying errors remains elusive. To this end, the past
few years have witnessed a surge in techniques for editing LLMs, the objective
of which is to efficiently alter the behavior of LLMs within a specific domain
without negatively impacting performance across other inputs. This paper
embarks on a deep exploration of the problems, methods, and opportunities
related to model editing for LLMs. In particular, we provide an exhaustive
overview of the task definition and challenges associated with model editing,
along with an in-depth empirical analysis of the most progressive methods
currently at our disposal. We also build a new benchmark dataset to facilitate
a more robust evaluation and pinpoint enduring issues intrinsic to existing
techniques. Our objective is to provide valuable insights into the
effectiveness and feasibility of each editing technique, thereby assisting the
community in making informed decisions on the selection of the most appropriate
method for a specific task or context. Code and datasets are available at
https://github.com/zjunlp/EasyEdit. Comment: EMNLP 2023. Updated with new experiments
Data Release of the AST3-2 Automatic Survey from Dome A, Antarctica
AST3-2 is the second of the three Antarctic Survey Telescopes, aimed at
wide-field time-domain optical astronomy. It is located at Dome A, Antarctica,
which is by many measures the best optical astronomy site on the Earth's
surface. Here we present the data from the AST3-2 automatic survey in 2016 and
the photometry results. The median 5σ limiting magnitude in the i-band is
17.8 mag and the light curve precision is 4 mmag for bright stars. The data
release includes photometry for over 7 million stars, from which over 3,500
variable stars were detected, with 70 of them newly discovered. We classify
these new variables into different types by combining their light curve
features with stellar properties from surveys such as StarHorse. Comment: 16 pages, 20 figures, accepted for publication in MNRAS
Exoplanets in the Antarctic Sky I. The first data release of AST3-II (CHESPA) and new found variables within the southern CVZ of TESS
Located at Dome A, the highest point of the Antarctic plateau, the Chinese Kunlun station is considered to be one of the best ground-based photometric sites because of its extremely cold, dry, and stable atmosphere. A target can be monitored from there for over 40 days without diurnal interruption during a polar winter. This makes Kunlun station a perfect site to search for short-period transiting exoplanets. Since 2008, an observatory has existed at Kunlun station, and three telescopes are working there. Using these telescopes, the AST3 project has been carried out over the last 6 yr with a search for transiting exoplanets as one of its key programs (CHESPA). In the austral winters of 2016 and 2017, a set of target fields in the southern continuous viewing zone (CVZ) of TESS were monitored by the AST3-II telescope. In this paper, we introduce the CHESPA and present the first data release containing photometry of 26,578 bright stars (m(i) <= 15). The best photometric precision at the optimum magnitude for the survey is around 2 mmag. To demonstrate the data quality, we also present a catalog of 221 variables with a brightness variation greater than 5 mmag from the 2016 data. Among these variables, 179 are newly identified periodic variables not listed in the AAVSO database (https://www.aavso.org/), and 67 are listed in the Candidate Target List. These variables will require careful attention to avoid false-positive signals when searching for transiting exoplanets. Dozens of new transiting exoplanet candidates will be released in a subsequent paper
Exoplanets in the Antarctic Sky. II. 116 Transiting Exoplanet Candidates Found by AST3-II (CHESPA) within the Southern CVZ of TESS
We report first results from the CHinese Exoplanet Searching Program from Antarctica (CHESPA), a wide-field high-resolution photometric survey for transiting exoplanets carried out using telescopes of the AST3 (Antarctic Survey Telescopes times 3) project. There are now three telescopes (AST3-I, AST3-II, and CSTAR-II) operating at Dome A, the highest point on the Antarctic Plateau, in a fully automatic and remote mode to exploit the superb observing conditions of the site, and its long and uninterrupted polar nights. The search for transiting exoplanets is one of the key projects for AST3. During the austral winters of 2016 and 2017 we used the AST3-II telescope to survey a set of target fields near the southern ecliptic pole, falling within the continuous viewing zone of the TESS mission. The first data release of the 2016 data, including images, catalogs, and light curves of 26,578 bright stars (7.5 <= m(i) <= 15), was presented in Zhang et al. The best precision, as measured by the rms of the light curves at the optimum magnitude of the survey (m(i) = 10), is around 2 mmag. We detect 222 objects with plausible transit signals from these data, 116 of which are plausible transiting exoplanet candidates according to their stellar properties as given by the TESS Input Catalog, Gaia DR2, and TESS-HERMES spectroscopy. With the first data release from TESS expected in late 2018, this candidate list will be timely for improving the rejection of potential false-positives.
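The abstract above quotes photometric precision as the rms of a light curve in millimagnitudes. A minimal sketch of that quantity follows, assuming the plain rms scatter about the mean magnitude with no detrending or outlier rejection; the survey's actual pipeline may differ, and the function name and sample values are hypothetical.

```python
import numpy as np

def lightcurve_rms_mmag(mags):
    """RMS scatter of a light curve about its mean, in millimagnitudes.

    Assumption: simple rms about the mean, with no detrending or clipping.
    """
    mags = np.asarray(mags, dtype=float)
    residuals = mags - mags.mean()
    return 1000.0 * np.sqrt(np.mean(residuals ** 2))

# Hypothetical magnitudes for a star near m(i) = 10.
rms = lightcurve_rms_mmag([10.000, 10.002, 9.998, 10.000])
```

For these four sample points the scatter comes out at roughly 1.4 mmag, the same order as the 2 mmag figure quoted for the survey's optimum magnitude.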
Challenging National Narratives: On the Origins of Sweet Potato in China as Global Commodity During the Early Modern Period
The introduction of American cereal crops is probably one of the most
important events in China's agricultural history, having a great effect on
agricultural production, national life, the transformation
of consumer behaviour and, to some extent, the nationalization
of consumption. The sweet potato (Ipomoea batatas L.), in Chinese
gānshǔ 甘薯, is a staple food crop for ancient Chinese society. Today it
still plays an important role in Chinese daily life, as well as guaranteeing
national food security. GECEM Project, Global Encounters between China and Europe: Trade Networks, Consumption and Cultural Exchanges in Macau and Marseille (1680-1840), ERC (European Research Council) Starting Grant, Horizon 2020 programme, ref. no. 679371, www.gecem.eu. Publisher's version
From the “Great Divergence” to the “Great Convergence”: The Modern Transformation of the Yangzi Delta’s Economy in a New Perspective
journal article