Revisiting adapters with adversarial training
While adversarial training is generally used as a defense mechanism, recent
works show that it can also act as a regularizer. By co-training a neural
network on clean and adversarial inputs, it is possible to improve
classification accuracy on the clean, non-adversarial inputs. We demonstrate
that, contrary to previous findings, it is not necessary to separate batch
statistics when co-training on clean and adversarial inputs, and that it is
sufficient to use adapters with few domain-specific parameters for each type of
input. We establish that using the classification token of a Vision Transformer
(ViT) as an adapter is enough to match the classification performance of dual
normalization layers, while using significantly fewer additional parameters.
First, we improve upon the top-1 accuracy of a non-adversarially trained
ViT-B16 model by +1.12% on ImageNet (reaching 83.76% top-1 accuracy). Second,
and more importantly, we show that training with adapters enables model soups
through linear combinations of the clean and adversarial tokens. These model
soups, which we call adversarial model soups, allow us to trade-off between
clean and robust accuracy without sacrificing efficiency. Finally, we show that
we can easily adapt the resulting models in the face of distribution shifts.
Our ViT-B16 obtains top-1 accuracies on ImageNet variants that are on average
+4.00% better than those obtained with Masked Autoencoders
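The "adversarial model soup" described above amounts to a convex combination of the clean and adversarial adapter parameters, with all other weights shared. A minimal sketch, with toy token values and a hypothetical helper name (real ViT-B/16 class tokens have 768 dimensions):

```python
import numpy as np

def adversarial_model_soup(token_clean, token_adv, alpha):
    """Linearly interpolate the clean and adversarial classification tokens.

    alpha=1.0 recovers the clean-trained token (best clean accuracy),
    alpha=0.0 the adversarially trained token (best robust accuracy);
    intermediate values trade off between the two.
    """
    token_clean = np.asarray(token_clean, dtype=float)
    token_adv = np.asarray(token_adv, dtype=float)
    return alpha * token_clean + (1.0 - alpha) * token_adv

# Toy 4-dimensional class tokens, for illustration only.
clean = np.array([1.0, 0.0, 2.0, -1.0])
adv = np.array([0.0, 2.0, 0.0, 1.0])
soup = adversarial_model_soup(clean, adv, alpha=0.5)
```

Because only the tokens differ, sweeping alpha yields a family of models at essentially no extra inference cost.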
Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models
While adversarial training has been extensively studied for ResNet
architectures and low resolution datasets like CIFAR, much less is known for
ImageNet. Given the recent debate about whether transformers are more robust
than convnets, we revisit adversarial training on ImageNet comparing ViTs and
ConvNeXts. Extensive experiments show that minor changes in architecture, most
notably replacing PatchStem with ConvStem, and training scheme have a
significant impact on the achieved robustness. These changes not only increase
robustness in the seen ℓ∞-threat model, but even more so improve
generalization to unseen ℓ1/ℓ2-attacks. Our modified ConvNeXt,
ConvNeXt + ConvStem, yields the most robust ℓ∞-models across
different ranges of model parameters and FLOPs, while our ViT + ConvStem yields
the best generalization to unseen threat models.
Comment: Accepted at NeurIPS 202
Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models
While a large amount of work has focused on designing adversarial attacks
against image classifiers, only a few methods exist to attack semantic
segmentation models. We show that attacking segmentation models presents
task-specific challenges, for which we propose novel solutions. Our final
evaluation protocol outperforms existing methods and shows that they can
overestimate the robustness of the models. Additionally, adversarial training,
the most successful method for obtaining robust image classifiers, has so far
not been successfully applied to semantic segmentation. We argue that this is
because the task to be learned is more challenging, and requires significantly
higher computational effort than for image classification. As a remedy, we show
that by taking advantage of recent advances in robust ImageNet classifiers, one
can train adversarially robust segmentation models at limited computational
cost by fine-tuning robust backbones
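The attacks referred to above are typically built on projected gradient descent within an ℓ∞-ball. A minimal NumPy sketch of one such inner loop, using a toy linear loss as a stand-in for a segmentation model (all names and values are illustrative, not the paper's method):

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=0.03, step=0.01, iters=10):
    """Gradient-ascent steps on the loss, projected onto the
    l_inf ball of radius eps around the original input x."""
    x0 = np.asarray(x, dtype=float)
    x_adv = x0.copy()
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))  # ascend the loss
        x_adv = np.clip(x_adv, x0 - eps, x0 + eps)      # project to the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                # keep a valid image
    return x_adv

# Toy loss L(x) = w.x, so the gradient is the constant vector w.
w = np.array([1.0, -1.0, 0.5])
x = np.array([0.5, 0.5, 0.5])
adv = pgd_linf(x, lambda z: w)
```

For adversarial training, `x_adv` would replace (or accompany) the clean input in each training step; for segmentation the loss is summed over pixels, which is part of what makes the attack and the training costly.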
Shape of extremal functions for weighted Sobolev-type inequalities
We study the shape of solutions to some variational problems in Sobolev
spaces with weights that are powers of |x|. In particular, we detect situations
when the extremal functions lack symmetry properties such as radial symmetry
and antisymmetry. We also prove an isoperimetric inequality for the first
non-zero eigenvalue of a weighted Neumann problem
MicroRNAs in melanoma development and resistance to target therapy
microRNAs constitute a complex class of pleiotropic post-transcriptional regulators of gene expression involved in the control of several physiological and pathological processes. Their mechanism of action is primarily based on the imperfect matching of a seed region, located at the 5' end of a 21-23 nt sequence, with a partially complementary sequence in the 3' untranslated region of target mRNAs. This leads to inhibition of mRNA translation and eventually to its degradation. Individual miRNAs are capable of binding several mRNAs, and several miRNAs can influence the function of the same mRNA. In recent years, networks of miRNAs have emerged as controllers of key signaling pathways responsible for the growth and propagation of cancer cells. Furthermore, several examples have highlighted the involvement of miRNAs in the development of resistance to targeted drug therapies. In this review we provide an updated overview of the role of miRNAs in the development of melanoma and identify the main downstream pathways controlled by these miRNAs. Furthermore, we discuss a group of miRNAs capable, through their respective up- or down-modulation, of influencing the development of resistance to BRAF and MEK inhibitors
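The seed-matching mechanism described in the abstract can be illustrated with a short sketch. It assumes a canonical 7mer seed (miRNA nucleotides 2-8) with perfect Watson-Crick pairing, a simplification: real target prediction also weighs site context, conservation, and imperfect pairing. The toy UTR below is constructed for the example; the miRNA is the let-7a sequence:

```python
# Complement table for RNA bases (Watson-Crick pairing).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_match_sites(mirna, utr):
    """Return 0-based positions in the 3' UTR (written 5'->3') whose
    7-nt window pairs with the miRNA seed (positions 2-8 from the
    miRNA 5' end). Pairing is antiparallel, so the target site is the
    reverse complement of the seed."""
    seed = mirna[1:8]  # nucleotides 2-8 of the miRNA
    site = "".join(COMPLEMENT[b] for b in reversed(seed))
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]

mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a
utr = "AAACUAUACAACCUACUACCUCAAA"  # toy UTR containing one let-7 site
sites = seed_match_sites(mirna, utr)
```

Each matched site marks a candidate repression target; in vivo, binding then triggers translational inhibition and eventual mRNA degradation, as the abstract describes.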
A modern look at the relationship between sharpness and generalization
Sharpness of minima is a promising quantity that can correlate with
generalization in deep networks and, when optimized during training, can
improve generalization. However, standard sharpness is not invariant under
reparametrizations of neural networks, and, to fix this,
reparametrization-invariant sharpness definitions have been proposed, most
prominently adaptive sharpness (Kwon et al., 2021). But does it really capture
generalization in modern practical settings? We comprehensively explore this
question in a detailed study of various definitions of adaptive sharpness in
settings ranging from training from scratch on ImageNet and CIFAR-10 to
fine-tuning CLIP on ImageNet and BERT on MNLI. We focus mostly on transformers
for which little is known in terms of sharpness despite their widespread usage.
Overall, we observe that sharpness does not correlate well with generalization
but rather with some training parameters like the learning rate that can be
positively or negatively correlated with generalization depending on the setup.
Interestingly, in multiple cases, we observe a consistent negative correlation
of sharpness with out-of-distribution error implying that sharper minima can
generalize better. Finally, we illustrate on a simple model that the right
sharpness measure is highly data-dependent, and that this aspect is not yet
well understood for realistic data distributions. The code of our experiments is
available at https://github.com/tml-epfl/sharpness-vs-generalization
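Adaptive sharpness (Kwon et al., 2021), as studied above, rescales the perturbation ball elementwise by |w|, which is what makes it invariant to layer-wise reparametrizations. A minimal numerical sketch under a one-step ℓ∞ approximation, with a toy quadratic loss (the function names and constants here are illustrative, not the paper's protocol):

```python
import numpy as np

def adaptive_sharpness(loss, w, rho=0.05, eps=1e-6):
    """Worst-case loss increase over perturbations delta with
    |delta_i| <= rho * |w_i|. For an l_inf ball and a locally linear
    loss, the maximizer is a scaled sign-gradient step, used here as
    a one-step approximation."""
    w = np.asarray(w, dtype=float)
    # Central-difference numerical gradient of the loss at w.
    grad = np.array([
        (loss(w + eps * np.eye(len(w))[i]) - loss(w - eps * np.eye(len(w))[i]))
        / (2 * eps)
        for i in range(len(w))
    ])
    delta = rho * np.abs(w) * np.sign(grad)  # adaptive l_inf step
    return loss(w + delta) - loss(w)

# Toy quadratic loss: much sharper curvature in the first coordinate.
loss = lambda w: 10 * w[0] ** 2 + 0.1 * w[1] ** 2
value = adaptive_sharpness(loss, [1.0, 1.0])
```

Note that scaling a coordinate of `w` up while scaling the loss's sensitivity down leaves `delta`'s relative size unchanged, which is the invariance property standard sharpness lacks.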
Correlation of AC-impedance and in situ X-ray spectra of LiCoO2
In situ X-ray and AC-impedance spectra have been obtained simultaneously during the deintercalation of lithium from LiCoO2 using a specially designed electrochemical cell. The AC dispersions have been correlated with the cell parameters obtained from the X-ray spectra. The correlation confirms a previous hypothesis on the interpretation of the AC dispersions in terms of an equivalent circuit comprising an element that relates the change of the intrinsic electronic conductivity, occurring at the early stages of deintercalation, to the semiconductor-to-metal transition caused by the change of the cell parameters
Intensifying Air Separation Units
This research activity shows the possibility of further intensifying air separation units (ASUs). Several modifications to the traditional process layout are proposed and simulated by means of well-established simulation suites. The proposed modifications deal with upgrading oxygen purity, recycling the rich argon stream, and the possibility of generating energy. The novel process solutions are compared with the traditional process, and a techno-economic assessment is provided. The payback time on the revamping investment is estimated to be on the order of 5 years for Terni’s ASU, owned by Linde Gas Italia S.r.l., which has been selected as the test case
Knowledge discovery in databases of biomechanical variables: application to the sit to stand motor task
BACKGROUND: The interpretation of data obtained in a movement analysis laboratory is a crucial issue in clinical contexts. Collecting such data in large databases might encourage the use of modern data-mining techniques to discover additional knowledge with automated methods. In order to maximise the size of the database, simple and low-cost experimental set-ups are preferable. The aim of this study was to extract knowledge inherent in the sit-to-stand task as performed by healthy adults, by searching for relationships among measured and estimated biomechanical quantities. An automated method was applied to a large amount of data stored in a database. The sit-to-stand motor task has already been shown to be adequate for determining the level of individual motor ability. METHODS: The technique of searching for association rules was chosen to discover patterns as part of a Knowledge Discovery in Databases (KDD) process applied to a sit-to-stand motor task observed with a simple experimental set-up and analysed by means of a minimum measured input model. Selected parameters and variables of a database containing data from 110 healthy adults of both genders, spanning a wide age range, were considered in the analysis. RESULTS: A set of rules and definitions was found characterising the patterns shared by the investigated subjects. Time events of the task turned out to be highly interdependent, at least in their average values, showing a high level of repeatability in the timing of the performance of the task. CONCLUSIONS: The distinctive patterns of the sit-to-stand task found in this study, together with those that could be found in similar studies focusing on subjects with pathologies, could be used as a reference for the functional evaluation of specific subjects performing the sit-to-stand motor task
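The association-rule search the study relies on can be sketched with the standard support/confidence formulation. Item names, thresholds, and the toy trials below are illustrative stand-ins for the discretized biomechanical variables, not the paper's actual data:

```python
from itertools import combinations

def association_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Mine single-item rules {A} -> {B}, returning tuples of
    (antecedent, consequent, support, confidence)."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    # Support of every single item and every item pair.
    support = {frozenset(s): sum(1 for t in transactions if set(s) <= t) / n
               for k in (1, 2) for s in combinations(items, k)}
    rules = []
    for a, b in combinations(items, 2):
        supp = support[frozenset((a, b))]
        for ant, con in ((a, b), (b, a)):
            ant_supp = support[frozenset((ant,))]
            conf = supp / ant_supp if ant_supp else 0.0
            if supp >= min_support and conf >= min_confidence:
                rules.append((ant, con, supp, conf))
    return rules

# Toy trials: each set holds discretized features of one sit-to-stand trial.
trials = [
    {"fast_rise", "high_peak_force"},
    {"fast_rise", "high_peak_force"},
    {"fast_rise", "high_peak_force", "young"},
    {"slow_rise"},
]
rules = association_rules(trials)
```

This brute-force version is quadratic in the number of items; on a real database one would prune candidates Apriori-style, but the support/confidence criteria are the same.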