65 research outputs found
Bridge the Gap Between CV and NLP! An Optimization-based Textual Adversarial Attack Framework
Despite recent success on various tasks, deep learning techniques still
perform poorly on adversarial examples with small perturbations. While
optimization-based methods for adversarial attacks are well-explored in the
field of computer vision, it is impractical to directly apply them in natural
language processing due to the discrete nature of the text. To address the
problem, we propose a unified framework to extend the existing
optimization-based adversarial attack methods in the vision domain to craft
textual adversarial samples. In this framework, continuously optimized
perturbations are added to the embedding layer and amplified in the forward
propagation process. Then the final perturbed latent representations are
decoded with a masked language model head to obtain potential adversarial
samples. In this paper, we instantiate our framework with an attack algorithm
named Textual Projected Gradient Descent (T-PGD). We find our algorithm
effective even using proxy gradient information. Therefore, we perform the more
challenging transfer black-box attack and conduct comprehensive experiments to
evaluate our attack algorithm with several models on three benchmark datasets.
Experimental results demonstrate that our method achieves an overall better
performance and produces more fluent and grammatical adversarial samples
compared to strong baseline methods. All the code and data will be made public.
Comment: Code is available at: https://github.com/Phantivia/T-PG
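The abstract above describes continuously optimized perturbations applied at the embedding layer. As a rough illustration of the optimization step only, here is a minimal sketch of generic L-infinity projected gradient descent on a single continuous vector. It is not the paper's T-PGD: the forward propagation through a model and the masked-language-model decoding are omitted, and `pgd_perturb`, `toy_loss`, `toy_grad`, and all parameter values are hypothetical names and numbers chosen for illustration.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pgd_perturb(embedding, grad_fn, epsilon=0.1, step_size=0.02, steps=10):
    """Generic L-infinity PGD on one continuous vector.

    Textbook PGD only, not the paper's T-PGD: there is no model forward
    pass and no decoding of the perturbed representation back to tokens.
    """
    delta = [0.0] * len(embedding)
    for _ in range(steps):
        perturbed = [e + d for e, d in zip(embedding, delta)]
        grad = grad_fn(perturbed)
        # Ascent step along the sign of the loss gradient.
        delta = [d + step_size * sign(g) for d, g in zip(delta, grad)]
        # Project back onto the L-infinity ball of radius epsilon.
        delta = [max(-epsilon, min(epsilon, d)) for d in delta]
    return [e + d for e, d in zip(embedding, delta)]


# Hypothetical stand-in for a model loss: squared distance from the origin.
def toy_loss(x):
    return sum(v * v for v in x)


def toy_grad(x):
    return [2.0 * v for v in x]


embedding = [0.5, -0.2, 0.1]
adversarial = pgd_perturb(embedding, toy_grad)
```

In this toy setting the perturbation stays within `epsilon` of the original vector in every coordinate while the stand-in loss increases, which is the property the projection step is meant to guarantee.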
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in
the field of NLP. We find that the distribution shift settings in previous
studies commonly lack adequate challenges, hindering the accurate evaluation of
OOD robustness. To address these issues, we propose a benchmark construction
protocol that ensures clear differentiation and challenging distribution
shifts. Then we introduce BOSS, a Benchmark suite for Out-of-distribution
robustneSS evaluation covering 5 tasks and 20 datasets. Based on BOSS, we
conduct a series of experiments on pre-trained language models for analysis and
evaluation of OOD robustness. First, for vanilla fine-tuning, we examine the
relationship between in-distribution (ID) and OOD performance. We identify
three typical types that unveil the inner learning mechanism, which could
potentially facilitate the forecasting of OOD robustness, correlating with the
advancements on ID datasets. Then, we evaluate 5 classic methods on BOSS and
find that, despite exhibiting some effectiveness in specific cases, they do not
offer significant improvement compared to vanilla fine-tuning. Further, we
evaluate 5 LLMs with various adaptation paradigms and find that when sufficient
ID data is available, fine-tuned domain-specific models outperform LLMs on ID
examples significantly. However, in the case of OOD instances, prioritizing
LLMs with in-context learning yields better results. We identify that both
fine-tuned small models and LLMs face challenges in effectively addressing
downstream tasks. The code is public at
\url{https://github.com/lifan-yuan/OOD_NLP}.
Comment: Accepted to NeurIPS 2023 Dataset and Benchmark Track. Code is
available at \url{https://github.com/lifan-yuan/OOD_NLP}
From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework
Textual adversarial attacks can discover models' weaknesses by adding
semantic-preserved but misleading perturbations to the inputs. The long-lasting
adversarial attack-and-defense arms race in Natural Language Processing (NLP)
is algorithm-centric, providing valuable techniques for automatic robustness
evaluation. However, the existing practice of robustness evaluation may exhibit
issues of incomprehensive evaluation, impractical evaluation protocol, and
invalid adversarial samples. In this paper, we aim to set up a unified
automatic robustness evaluation framework, shifting towards model-centric
evaluation to further exploit the advantages of adversarial attacks. To address
the above challenges, we first determine robustness evaluation dimensions based
on model capabilities and specify the reasonable algorithm to generate
adversarial samples for each dimension. Then we establish the evaluation
protocol, including evaluation settings and metrics, under realistic demands.
Finally, we use the perturbation degree of adversarial samples to control the
sample validity. We implement a toolkit RobTest that realizes our automatic
robustness evaluation framework. In our experiments, we conduct a robustness
evaluation of RoBERTa models to demonstrate the effectiveness of our evaluation
framework, and further show the rationality of each component in the framework.
The code will be made public at \url{https://github.com/thunlp/RobTest}.
Comment: Accepted to Findings of ACL 202
Photometry of Variable Stars from Dome A, Antarctica
Dome A on the Antarctic plateau is likely one of the best observing sites on
Earth thanks to the excellent atmospheric conditions present at the site during
the long polar winter night. We present high-cadence time-series aperture
photometry of 10,000 stars with i<14.5 mag located in a 23 square-degree region
centered on the south celestial pole. The photometry was obtained with one of
the CSTAR telescopes during 128 days of the 2008 Antarctic winter.
We used this photometric data set to derive site statistics for Dome A and to
search for variable stars. Thanks to the nearly-uninterrupted synoptic
coverage, we find 6 times as many variables as previous surveys with similar
magnitude limits. We detected 157 variable stars, of which 55% are
unclassified, 27% are likely binaries and 17% are likely pulsating stars. The
latter category includes delta Scuti, gamma Doradus and RR Lyrae variables. One
variable may be a transiting exoplanet.
Comment: Accepted for publication in the Astronomical Journal. PDF version
with high-resolution figures available at
http://faculty.physics.tamu.edu/lmacri/papers/wang11.pd
Photometric Variability in the CSTAR Field: Results From the 2008 Data Set
The Chinese Small Telescope ARray (CSTAR) is the first telescope facility
built at Dome A, Antarctica. During the 2008 observing season, the installation
provided long-baseline and high-cadence photometric observations in the i-band
for 18,145 targets within the 20 deg^2 CSTAR field around the South Celestial Pole
for the purpose of monitoring the astronomical observing quality of Dome A and
detecting various types of photometric variability. Using sensitive and robust
detection methods, we discover 274 potential variables from this data set, 83
of which are new discoveries. We characterize most of them, providing the
periods, amplitudes and classes of variability. The catalog of all these
variables is presented along with the discussion of their statistical
properties.
Comment: 38 pages, 11 figures, 4 tables; Accepted for publication in ApJ
The First Release of the CSTAR Point Source Catalog from Dome A, Antarctica
In 2008 January the 24th Chinese expedition team successfully deployed the
Chinese Small Telescope ARray (CSTAR) to Dome A, the highest point on the
Antarctic plateau. CSTAR consists of four 14.5 cm optical telescopes, each with
a different filter (g, r, i and open) and a 4.5 degree x 4.5 degree field of
view (FOV). It operates robotically as part of the Plateau Observatory, PLATO,
with each telescope taking an image every 30 seconds throughout the year
whenever it is dark. During 2008, CSTAR #1 performed almost flawlessly,
acquiring more than 0.3 million i-band images for a total integration time of
1728 hours during 158 days of observations. For each image taken under good sky
conditions, more than 10,000 sources down to 16 mag could be detected. We
performed aperture photometry on all the sources in the field to create the
catalog described herein. Since CSTAR has a fixed pointing centered on the
South Celestial Pole (Dec = -90 degrees), all the sources within the FOV of CSTAR
were monitored continuously for several months. The photometric catalog can be
used for studying any variability in these sources, and for the discovery of
transient sources such as supernovae, gamma-ray bursts and minor planets.
Comment: 1 LaTeX file and 9 figures. The paper is accepted by PAS
Eclipsing Binaries From the CSTAR Project at Dome A, Antarctica
The Chinese Small Telescope ARray (CSTAR) has observed an area around the
Celestial South Pole at Dome A since 2008. About light curves in the i
band were obtained lasting from March to July, 2008. The photometric precision
achieves about 4 mmag at i = 7.5 and 20 mmag at i = 12 within a 30 s exposure
time. These light curves are analyzed using Lomb-Scargle, Phase Dispersion
Minimization, and Box Least Squares methods to search for periodic signals.
False positives may appear as a variable signature caused by contaminating
stars and the observation mode of CSTAR. Therefore the period and position of
each variable candidate are checked to eliminate false positives. Eclipsing
binaries are removed by visual inspection, frequency spectrum analysis and
locally linear embedding technique. We identify 53 eclipsing binaries in the
field of view of CSTAR, containing 24 detached binaries, 8 semi-detached
binaries, 18 contact binaries, and 3 ellipsoidal variables. To derive the
parameters of these binaries, we use the Eclipsing Binaries via Artificial
Intelligence (EBAI) method. The primary and the secondary eclipse timing
variations (ETVs) for semi-detached and contact systems are analyzed.
Correlated primary and secondary ETVs confirmed by false alarm tests may
indicate an unseen perturbing companion. Through ETV analysis, we identify two
triple systems (CSTAR J084612.64-883342.9 and CSTAR J220502.55-895206.7). The
orbital parameters of the third body in CSTAR J220502.55-895206.7 are derived
using a simple dynamical model.
Comment: 41 pages, 12 figures; published online in ApJ
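The abstract above names Lomb-Scargle as one of the period-search methods applied to the light curves. The following is a minimal sketch of the classical Lomb-Scargle periodogram for unevenly sampled data, written in plain Python. The light curve is synthetic and every number in it (sampling pattern, 2.5-day period, amplitude, frequency grid) is an assumption made for illustration; it is not CSTAR data and not the authors' pipeline.

```python
import math


def lomb_scargle_power(times, values, frequency):
    """Classical (unnormalized) Lomb-Scargle power at one frequency,
    with frequency given in cycles per unit of time."""
    mean = sum(values) / len(values)
    resid = [v - mean for v in values]
    omega = 2.0 * math.pi * frequency
    # The time offset tau makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2.0 * omega * t) for t in times)
    c2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2.0 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(r * c for r, c in zip(resid, cos_terms))
    ys = sum(r * s for r, s in zip(resid, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


# Synthetic, unevenly sampled light curve (hypothetical values).
true_period = 2.5  # days
times = [0.37 * i + 0.1 * math.sin(i) for i in range(200)]
values = [12.0 + 0.05 * math.sin(2.0 * math.pi * t / true_period) for t in times]

# Scan a grid of trial frequencies and keep the one with the highest power.
grid = [0.05 + 0.005 * k for k in range(191)]
best_frequency = max(grid, key=lambda f: lomb_scargle_power(times, values, f))
best_period = 1.0 / best_frequency
```

On real survey data the highest peak is only a candidate: as the abstract notes, contaminating stars and the observing mode can produce false positives, so candidate periods still need to be checked.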
The sky brightness and transparency in i-band at Dome A, Antarctica
The i-band observing conditions at Dome A on the Antarctic plateau have been
investigated using data acquired during 2008 with the Chinese Small Telescope
ARray. The sky brightness, variations in atmospheric transparency, cloud cover,
and the presence of aurorae are obtained from these images. The median sky
brightness of moonless clear nights is 20.5 mag arcsec^{-2} in the SDSS i
band at the South Celestial Pole (which includes a contribution of about 0.06
mag from diffuse Galactic light). The median over all Moon phases in the
Antarctic winter is about 19.8 mag arcsec^{-2}. There were no thick clouds in
2008. We model contributions of the Sun and the Moon to the sky background to
obtain the relationship between the sky brightness and transparency. Aurorae
are identified by comparing the observed sky brightness to the sky brightness
expected from this model. About 2% of the images are affected by relatively
strong aurorae.
Comment: There are 1 LaTeX file and 14 figures accepted by A
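To put the magnitudes quoted above on a linear scale, the short sketch below applies the standard Pogson relation, in which a difference of dm magnitudes corresponds to a flux ratio of 10^(0.4 dm). The function and variable names are illustrative, and reading the 0.06 mag Galactic contribution as a fraction of the total sky flux is an interpretation of the abstract's wording, not a figure stated in it.

```python
def flux_ratio(mag_faint, mag_bright):
    """Flux ratio (bright over faint) implied by two magnitudes,
    using the Pogson relation: ratio = 10 ** (0.4 * (m_faint - m_bright))."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))


moonless_median = 20.5    # mag arcsec^-2, moonless clear nights (from the abstract)
all_phases_median = 19.8  # mag arcsec^-2, median over all Moon phases (from the abstract)

# How much brighter the median sky is over all Moon phases than on moonless nights.
ratio = flux_ratio(moonless_median, all_phases_median)

# Assumed interpretation: removing the diffuse Galactic light would make
# the sky 0.06 mag fainter, so its share of the total flux is:
galactic_fraction = 1.0 - 10.0 ** (-0.4 * 0.06)
```

Under these assumptions the 0.7 mag difference corresponds to a sky roughly 1.9 times brighter when all Moon phases are included, and the Galactic term amounts to a few percent of the moonless sky flux.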