A Closer Look at the Adversarial Robustness of Deep Equilibrium Models
Deep equilibrium models (DEQs) depart from the traditional layer-stacking
paradigm and instead solve for the fixed point of a single layer. DEQs have
achieved promising performance across applications with notable memory
efficiency. At the same time, the adversarial vulnerability of DEQs raises
concerns. Several works propose to certify robustness for monotone DEQs.
However, little effort has been devoted to studying the empirical robustness of
general DEQs. To this end, we observe that an adversarially trained DEQ
requires more forward steps to arrive at the equilibrium state, or even
violates its fixed-point structure. Moreover, the forward and backward passes of
DEQs are misaligned because of the black-box solvers. These facts cause gradient
obfuscation when applying off-the-shelf attacks to evaluate or adversarially
train DEQs. Given this, we develop approaches to estimate the intermediate
gradients of DEQs and integrate them into the attacking pipelines. Our
approaches facilitate fully white-box evaluations and lead to effective
adversarial defense for DEQs. Extensive experiments on CIFAR-10 validate that
the adversarial robustness of DEQs is competitive with that of deep networks of
similar sizes.
Comment: Accepted at NeurIPS 2022. Our code is available at
https://github.com/minicheshire/DEQ-White-Box-Robustnes
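As a brief illustration of the fixed-point computation this abstract refers to, the following minimal PyTorch sketch runs a naive forward iteration and reports how many steps were needed to converge; the layer f, the tolerance, and the step budget are illustrative assumptions, not the authors' implementation.

    import torch

    def deq_forward(f, x, max_steps=50, tol=1e-4):
        # Solve z* = f(z*, x) by naive fixed-point iteration.
        z = torch.zeros_like(x)
        for step in range(max_steps):
            z_next = f(z, x)
            # Relative residual as a convergence check.
            if torch.norm(z_next - z) / (torch.norm(z) + 1e-8) < tol:
                return z_next, step + 1  # equilibrium reached
            z = z_next
        return z, max_steps  # budget exhausted without convergence

The returned step count makes the abstract's observation concrete: an adversarially trained DEQ tends to need more iterations before the residual falls below tolerance, or never converges at all.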
Improving Adversarial Robustness of DEQs with Explicit Regulations Along the Neural Dynamics
Deep equilibrium (DEQ) models replace the multiple-layer stacking of
conventional deep networks with a fixed-point iteration of a single-layer
transformation. As DEQ models have proven competitive in a variety of
real-world scenarios, their adversarial robustness is becoming increasingly
crucial for reliable deployment. Existing works improve the
robustness of general DEQ models with the widely-used adversarial training (AT)
framework, but they fail to exploit the unique structure of DEQ models.
To this end, we interpret DEQs through the lens of neural dynamics and find
that AT under-regulates intermediate states. Moreover, the intermediate states
typically yield predictions with high entropy. Informed by the
correlation between the entropy of dynamical systems and their stability
properties, we propose reducing prediction entropy by progressively updating
inputs along the neural dynamics. During AT, we also use random intermediate
states to compute the loss function. In this manner, our methods regulate the
neural dynamics of DEQ models. Extensive experiments demonstrate
that our methods substantially increase the robustness of DEQ models and even
outperform strong deep network baselines.
Comment: Accepted at ICML 2023. Our code is available at
https://github.com/minicheshire/DEQ-Regulating-Neural-Dynamic
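One plausible reading of the random-intermediate-state idea above is the sketch below, where the adversarial training loss is computed on a state sampled from the solver trajectory rather than only on the final equilibrium; f, head, and the fixed step budget are hypothetical stand-ins, not the released code.

    import random
    import torch
    import torch.nn.functional as F

    def at_loss_with_random_state(f, head, x_adv, y, max_steps=30):
        # Run the fixed-point iteration and record intermediate states.
        z = torch.zeros_like(x_adv)
        states = []
        for _ in range(max_steps):
            z = f(z, x_adv)
            states.append(z)
        # Compute the loss on a randomly chosen intermediate state,
        # regulating the whole trajectory instead of only its endpoint.
        z_rand = random.choice(states)
        return F.cross_entropy(head(z_rand), y)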
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Warning: this paper contains model outputs exhibiting offensiveness and
biases. Recently, pre-trained language models (PLMs) have prospered in various
natural language generation (NLG) tasks thanks to their ability to generate
fairly fluent text. Nevertheless, these models have been observed to capture
and reproduce harmful content from their training corpora, notably toxic
language and social biases, raising serious ethical concerns. Prior works on ethical NLG tackle
detoxifying and debiasing separately, which is problematic since we find
debiased models still exhibit toxicity while detoxified ones even exacerbate
biases. To address such a challenge, we propose the first unified framework of
detoxifying and debiasing called UDDIA, which jointly formalizes these two
problems as rectifying the output space. We theoretically interpret our
framework as learning a text distribution that mixes weighted attributes. Moreover,
UDDIA conducts adaptive optimization of only a few parameters during decoding
based on a parameter-efficient tuning scheme, without any training data. This
leads to minimal generation quality loss and improved rectification performance
with acceptable computational cost. Experimental results demonstrate that
compared to several strong baselines, UDDIA achieves debiasing and detoxifying
simultaneously and better balances efficiency and effectiveness, taking a
further step towards practical ethical NLG.
Comment: Work in progress. Preprint
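As a hedged sketch of what "adaptive optimization of only a few parameters during decoding" could look like, the snippet below takes a few gradient steps on a small set of tunable parameters at decoding time, pushing the next-token distribution away from outputs an attribute scorer flags; logits_fn, bias_params, and attr_loss_fn are hypothetical names, and UDDIA's actual objective and parameter choice may differ.

    import torch

    def rectify_step(logits_fn, bias_params, context, attr_loss_fn,
                     lr=0.1, steps=3):
        # Adapt a handful of parameters online, with no training data:
        # attr_loss_fn scores toxicity/bias risk from the logits.
        opt = torch.optim.Adam(bias_params, lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = attr_loss_fn(logits_fn(context, bias_params))
            loss.backward()
            opt.step()
        return logits_fn(context, bias_params)  # rectified logits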
Adversarial Robust Memory-Based Continual Learner
Despite the remarkable advances that have been made in continual learning,
the adversarial vulnerability of such methods has not been fully discussed. We
delve into the adversarial robustness of memory-based continual learning
algorithms and observe limited robustness improvement by directly applying
adversarial training techniques. Preliminary studies reveal the twin challenges
for building adversarially robust continual learners: accelerated forgetting in
continual learning and gradient obfuscation in adversarial robustness. In this
study, we put forward a novel adversarially robust memory-based continual
learner that adjusts data logits to mitigate the forgetting of past knowledge caused by
adversarial samples. Furthermore, we devise a gradient-based data selection
mechanism to overcome the gradient obfuscation caused by limited stored data.
The proposed approach can be broadly integrated with existing memory-based
continual learning and adversarial training algorithms in a plug-and-play manner.
Extensive experiments on Split-CIFAR10/100 and Split-Tiny-ImageNet demonstrate
the effectiveness of our approach, which achieves up to 8.13% higher accuracy
on adversarial data.
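The gradient-based data selection mechanism is not specified in the abstract; one simple instantiation, assuming a replay buffer of (x, y) pairs and an off-the-shelf PyTorch model, is to prefer stored examples with the largest gradient norms, as sketched below with illustrative names.

    import torch

    def select_memory_batch(model, loss_fn, memory, k):
        # Score each stored example by the norm of the gradient it
        # induces, then keep the k highest-scoring examples.
        scores = []
        for x, y in memory:
            model.zero_grad()
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
            g = torch.cat([p.grad.flatten() for p in model.parameters()
                           if p.grad is not None])
            scores.append(g.norm().item())
        idx = sorted(range(len(memory)), key=lambda i: -scores[i])[:k]
        return [memory[i] for i in idx]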
Elucidating the surface geometric design of hydrophobic Australian Eucalyptus leaves: experimental and modeling studies
Three Australian native Eucalyptus species, i.e., Eucalyptus woodwardii, Eucalyptus pachyphylla and Eucalyptus dolorosa, were investigated, for the first time, with respect to the hydrophobicity of their leaves. It is well established that these leaves exhibit exceptionally high water repellency, in addition to an extraordinary ability to retain water, although their specific wetting mechanisms are still poorly understood. To identify the critical factors underlying this phenomenon, the surface topography of these leaves was examined by scanning electron microscopy (SEM). Micro- and nanometer-scale surface roughness was revealed, resembling that of the quintessential "lotus effect". Surface free energy analysis was performed on two models based on the surface topographies of the studied Eucalyptus species and lotus, in order to study wetting transitions on these specific microscopic surface features. The influence of surface geometrical parameters, such as edge-to-edge distance, base radius and cylindrical height, on surface free energy at different liquid penetration depths was studied with these two models. Larger energy barriers and smaller liquid-solid contact areas were more influential in the calculations for the lotus than for Eucalyptus. The information obtained from these two models may be useful for guiding the design of novel artificial surfaces for the collection and transport of micro-volume liquids.
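The paper's surface free energy models are not reproduced here, but the textbook Wenzel and Cassie-Baxter relations give a feel for how the quoted geometric parameters enter such calculations; the sketch below assumes a square array of cylindrical pillars (pitch p equals the edge-to-edge distance plus 2R) and is standard wetting theory, not the authors' specific model.

    import math

    def apparent_contact_angles(theta_deg, R, h, p):
        # Wenzel and Cassie-Baxter estimates for cylindrical pillars of
        # base radius R and height h on a square lattice of pitch p.
        theta = math.radians(theta_deg)
        r = 1 + 2 * math.pi * R * h / p**2    # Wenzel roughness ratio
        f = math.pi * R**2 / p**2             # Cassie solid fraction
        cos_w = max(-1.0, min(1.0, r * math.cos(theta)))
        cos_cb = f * (math.cos(theta) + 1) - 1
        return (math.degrees(math.acos(cos_w)),
                math.degrees(math.acos(cos_cb)))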
Arbitrary Few Parameters are Good Enough for Adapting Large-scale Pre-trained Language Models
Parameter-efficient tuning (PET) methods can effectively drive extremely
large pre-trained language models (PLMs) by only training minimal parameters.
Different PET methods utilize different manually designed modules. In a small
PLM, there are usually noticeable performance differences among PET methods.
Nevertheless, when a PLM's scale grows to tens of billions of parameters,
all PET methods achieve almost the same performance and even perform on par
with the full-parameter fine-tuning method. Hence, we hypothesize that model
scaling can mitigate the design differences (the module structures and the
number of trainable parameters) among PET methods. To study this hypothesis, we
introduce a more flexible PET method, the arbitrary PET (APET) method, which is
compatible with arbitrary module structures and any number of trainable
parameters. Then, we experiment on NLP tasks of various types and
representative PLMs. From our investigations, we find that model scaling
(1) mitigates the effects of the arbitrary module structure on the performance
of tuning methods, and (2) enables the tuning methods to optimize fewer
parameters to achieve the full-parameter fine-tuning performance. Intriguingly,
we also observe that all tuning methods require almost the same number of
trainable parameters to drive PLMs. We discuss this phenomenon and the above
two findings collectively from an optimization perspective to understand the
mechanisms behind them. These conclusions not only demonstrate the positive
impact of model scaling on tuning methods but also shed light on its
mechanisms, which can help us design more effective and efficient tuning
methods for larger-scale PLMs.
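To make the APET idea concrete, a minimal sketch, assuming tensor-level granularity for simplicity, is to freeze the whole PLM and then re-enable gradients for an arbitrary subset of parameter tensors; make_apet and its selection rule are illustrative, not the paper's exact procedure.

    import random
    import torch

    def make_apet(model, num_trainable_tensors, seed=0):
        # Freeze everything, then mark an arbitrary subset of parameter
        # tensors as trainable; pass the returned list to the optimizer.
        for p in model.parameters():
            p.requires_grad_(False)
        params = list(model.parameters())
        random.seed(seed)
        chosen = random.sample(params,
                               min(num_trainable_tensors, len(params)))
        for p in chosen:
            p.requires_grad_(True)
        return chosen

    # Usage: opt = torch.optim.AdamW(make_apet(model, 8), lr=1e-4)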
Severe Acute Respiratory Syndrome, Beijing, 2003
The largest outbreak of severe acute respiratory syndrome (SARS) struck Beijing in spring 2003. Multiple importations of SARS to Beijing initiated transmission in several healthcare facilities. Beijing's outbreak began March 5; by late April, daily hospital admissions for SARS exceeded 100 for several days; 2,521 cases of probable SARS occurred. Attack rates were highest in those 20–39 years of age; 1% of cases occurred in children <10 years. The case-fatality rate was highest among patients >65 years (27.7% vs. 4.8% for those 20–64 years, p < 0.001). Healthcare workers accounted for 16% of probable cases. The proportion of case-patients without known contact with a SARS patient increased significantly in May. Implementation of early detection, isolation, contact tracing, quarantine, triage of case-patients to designated SARS hospitals, and community mobilization ended the outbreak.
The small molecule raptinal can simultaneously induce apoptosis and inhibit PANX1 activity
Discovery of new small molecules that can activate distinct programmed cell death pathways is of significant interest, both as research tools and for the development of novel therapeutics for pathological conditions such as cancer and infectious diseases. The small molecule raptinal was discovered as a pro-apoptotic compound that can rapidly trigger apoptosis by promoting the release of cytochrome c from the mitochondria and subsequently activating the intrinsic apoptotic pathway. As raptinal is very effective at inducing apoptosis in a variety of different cell types in vitro and in vivo, it has been used in many studies investigating cell death as well as the clearance of dying cells. While examining raptinal as an apoptosis inducer, we unexpectedly found that, in addition to its pro-apoptotic activities, raptinal can also inhibit the activity of caspase-activated Pannexin 1 (PANX1), a ubiquitously expressed transmembrane channel that regulates many cell death-associated processes. By applying numerous biochemical, cell biological and electrophysiological approaches, we discovered that raptinal can simultaneously induce apoptosis and inhibit PANX1 activity. Surprisingly, raptinal was found to inhibit cleavage-activated PANX1 via a mechanism distinct from that of other well-described PANX1 inhibitors such as carbenoxolone and trovafloxacin. Furthermore, raptinal also interfered with PANX1-regulated apoptotic processes, including the release of the 'find-me' signal ATP, the formation of apoptotic cell-derived extracellular vesicles, and NLRP3 inflammasome activation. Taken together, these data identify raptinal as the first compound that can simultaneously induce apoptosis and inhibit PANX1 channels. This has broad implications for the use of raptinal in cell death studies as well as for the development of new PANX1 inhibitors.