
    Revisiting adapters with adversarial training

    While adversarial training is generally used as a defense mechanism, recent works show that it can also act as a regularizer. By co-training a neural network on clean and adversarial inputs, it is possible to improve classification accuracy on the clean, non-adversarial inputs. We demonstrate that, contrary to previous findings, it is not necessary to separate batch statistics when co-training on clean and adversarial inputs, and that it is sufficient to use adapters with few domain-specific parameters for each type of input. We establish that using the classification token of a Vision Transformer (ViT) as an adapter is enough to match the classification performance of dual normalization layers, while using significantly fewer additional parameters. First, we improve upon the top-1 accuracy of a non-adversarially trained ViT-B16 model by +1.12% on ImageNet (reaching 83.76% top-1 accuracy). Second, and more importantly, we show that training with adapters enables model soups through linear combinations of the clean and adversarial tokens. These model soups, which we call adversarial model soups, allow us to trade off between clean and robust accuracy without sacrificing efficiency. Finally, we show that we can easily adapt the resulting models in the face of distribution shifts. Our ViT-B16 obtains top-1 accuracies on ImageNet variants that are on average +4.00% better than those obtained with Masked Autoencoders.
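
    The "linear combinations of the clean and adversarial tokens" mentioned in this abstract can be pictured with a minimal sketch. It assumes a convex combination controlled by a single weight `alpha`; the function name, the weighting convention, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from typing import List


def soup_token(clean: List[float], adv: List[float], alpha: float) -> List[float]:
    """Linearly interpolate two token vectors.

    alpha = 0 returns the clean token, alpha = 1 the adversarial token.
    The convex weighting is an assumption made for illustration.
    """
    if len(clean) != len(adv):
        raise ValueError("tokens must have the same length")
    return [(1.0 - alpha) * c + alpha * a for c, a in zip(clean, adv)]


# Hypothetical 3-dimensional tokens (real classification tokens are much larger).
clean_token = [1.0, 0.0, 2.0]
adv_token = [0.0, 1.0, 4.0]

# A soup that leans towards the clean token.
mixed = soup_token(clean_token, adv_token, 0.25)
print(mixed)
```

    Under this assumption, only the token is interpolated while the rest of the network is shared, so sweeping `alpha` would not require storing a second full model.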

    Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models

    While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing PatchStem with ConvStem, and training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-attacks. Our modified ConvNeXt, ConvNeXt + ConvStem, yields the most robust $\ell_\infty$-models across different ranges of model parameters and FLOPs, while our ViT + ConvStem yields the best generalization to unseen threat models. Comment: Accepted at NeurIPS 202

    Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models

    While a large amount of work has focused on designing adversarial attacks against image classifiers, only a few methods exist to attack semantic segmentation models. We show that attacking segmentation models presents task-specific challenges, for which we propose novel solutions. Our final evaluation protocol outperforms existing methods, and shows that those methods can overestimate the robustness of the models. Additionally, adversarial training, the most successful way of obtaining robust image classifiers, could so far not be successfully applied to semantic segmentation. We argue that this is because the task to be learned is more challenging, and requires significantly higher computational effort than image classification. As a remedy, we show that by taking advantage of recent advances in robust ImageNet classifiers, one can train adversarially robust segmentation models at limited computational cost by fine-tuning robust backbones.

    Shape of extremal functions for weighted Sobolev-type inequalities

    We study the shape of solutions to some variational problems in Sobolev spaces with weights that are powers of |x|. In particular, we detect situations when the extremal functions lack symmetry properties such as radial symmetry and antisymmetry. We also prove an isoperimetric inequality for the first non-zero eigenvalue of a weighted Neumann problem.

    MicroRNAs in melanoma development and resistance to target therapy

    MicroRNAs (miRNAs) constitute a complex class of pleiotropic post-transcriptional regulators of gene expression involved in the control of several physiologic and pathologic processes. Their mechanism of action is primarily based on the imperfect matching of a seed region located at the 5' end of a 21-23 nt sequence with a partially complementary sequence located in the 3' untranslated region of target mRNAs. This leads to inhibition of mRNA translation and eventually to its degradation. Individual miRNAs are capable of binding to several mRNAs, and several miRNAs are capable of influencing the function of the same mRNA. In recent years, networks of miRNAs have emerged as capable of controlling key signaling pathways responsible for the growth and propagation of cancer cells. Furthermore, several examples have been provided which highlight the involvement of miRNAs in the development of resistance to targeted drug therapies. In this review we provide an updated overview of the role of miRNAs in the development of melanoma and identify the main downstream pathways controlled by these miRNAs. We also discuss a group of miRNAs capable of influencing, through their respective up- or down-modulation, the development of resistance to BRAF and MEK inhibitors.
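
    The seed-based targeting described in this abstract can be illustrated with a small sketch. It assumes the seed is nucleotides 2 to 8 counted from the 5' end (a common convention, not stated in the abstract) and looks only for exact seed complementarity, which is a simplification of the imperfect pairing the abstract describes. Both sequences below are illustrative and do not come from the review.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna: str, utr: str, start: int = 1, end: int = 8) -> list:
    """Find 0-based positions in `utr` that pair perfectly with the miRNA seed.

    The default slice [1:8] selects nucleotides 2..8 of the miRNA
    (an assumed seed definition).
    """
    site = reverse_complement(mirna[start:end])
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative 22-nt miRNA, 5'->3'
utr = "AAGCUACCUCAUU"             # hypothetical 3' UTR fragment, 5'->3'

print(reverse_complement(mirna[1:8]))  # the site searched for in the UTR
print(seed_sites(mirna, utr))
```

    Because the seed is so short, one miRNA can match sites in many different transcripts, which is consistent with the abstract's remark that individual miRNAs bind several mRNAs.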

    A modern look at the relationship between sharpness and generalization

    Sharpness of minima is a promising quantity that can correlate with generalization in deep networks and, when optimized during training, can improve generalization. However, standard sharpness is not invariant under reparametrizations of neural networks, and, to fix this, reparametrization-invariant sharpness definitions have been proposed, most prominently adaptive sharpness (Kwon et al., 2021). But does it really capture generalization in modern practical settings? We comprehensively explore this question in a detailed study of various definitions of adaptive sharpness in settings ranging from training from scratch on ImageNet and CIFAR-10 to fine-tuning CLIP on ImageNet and BERT on MNLI. We focus mostly on transformers, for which little is known in terms of sharpness despite their widespread usage. Overall, we observe that sharpness does not correlate well with generalization but rather with some training parameters like the learning rate that can be positively or negatively correlated with generalization depending on the setup. Interestingly, in multiple cases, we observe a consistent negative correlation of sharpness with out-of-distribution error, implying that sharper minima can generalize better. Finally, we illustrate on a simple model that the right sharpness measure is highly data-dependent, and that we do not understand this aspect well for realistic data distributions. The code of our experiments is available at https://github.com/tml-epfl/sharpness-vs-generalization
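
    The reparametrization issue raised in this abstract can be seen on a toy problem. The sketch below is not the paper's code: it assumes a two-parameter loss `(w1 * w2 - 1)^2`, an L2 perturbation ball, and an "adaptive" variant that scales each perturbation coordinate by `|w_i|`, in the spirit of adaptive sharpness. The worst case is approximated crudely by sampling only the boundary of the ball.

```python
import math


def loss(w1: float, w2: float) -> float:
    """Toy loss that depends only on the product w1 * w2."""
    return (w1 * w2 - 1.0) ** 2


def worst_case_sharpness(w, rho: float, adaptive: bool, n_angles: int = 360) -> float:
    """Approximate max of loss(w + d) - loss(w) over perturbations of radius rho.

    If `adaptive` is True, coordinate i of the perturbation is multiplied
    by |w_i| (an assumed elementwise scaling). Only the ball's boundary
    is sampled, so this is a rough approximation.
    """
    w1, w2 = w
    base = loss(w1, w2)
    worst = 0.0
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        d1, d2 = rho * math.cos(theta), rho * math.sin(theta)
        if adaptive:
            d1, d2 = d1 * abs(w1), d2 * abs(w2)
        worst = max(worst, loss(w1 + d1, w2 + d2) - base)
    return worst


# Two parametrizations of the same function value: w1 * w2 = 1 in both cases.
w_a = (1.0, 1.0)
w_b = (10.0, 0.1)

std_a = worst_case_sharpness(w_a, 0.1, adaptive=False)
std_b = worst_case_sharpness(w_b, 0.1, adaptive=False)
ada_a = worst_case_sharpness(w_a, 0.1, adaptive=True)
ada_b = worst_case_sharpness(w_b, 0.1, adaptive=True)

print("standard:", std_a, std_b)
print("adaptive:", ada_a, ada_b)
```

    In this toy setting the standard measure changes substantially between the two equivalent parametrizations, while the scaled variant agrees up to floating-point error. Whether either measure tracks generalization is the separate empirical question the paper studies.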

    Correlation of Ac-impedance and in situ X-ray spectra of LiCoO2

    In-situ X-ray and AC-impedance spectra have been obtained simultaneously during the deintercalation of lithium from LiCoO2 using a specially designed electrochemical cell. The AC-dispersions have been correlated with the cell parameters obtained from the X-ray spectra. The correlation confirms a previous hypothesis on the interpretation of the AC-dispersions in terms of an equivalent circuit comprising an element that relates the change of the intrinsic electronic conductivity, occurring at the early stages of deintercalation, to the semiconductor-to-metal transition caused by the change of the cell parameters.

    Intensifying Air Separation Units

    This research activity shows that air separation units (ASUs) can be further intensified. Several modifications to the traditional process layout are proposed and simulated by means of well-established simulation suites. The proposed modifications deal with the upgrading of oxygen purity, the recycling of the argon-rich stream, and the possibility of generating energy. The novel process solutions are compared with the traditional process and a techno-economic assessment is provided. The payback time on the revamping investment is estimated to be on the order of 5 years for Terni’s ASU, owned by Linde Gas Italia S.r.l., which has been selected as the test case.

    Knowledge discovery in databases of biomechanical variables: application to the sit to stand motor task

    BACKGROUND: The interpretation of data obtained in a movement analysis laboratory is a crucial issue in clinical contexts. Collection of such data in large databases might encourage the use of modern techniques of data mining to discover additional knowledge with automated methods. In order to maximise the size of the database, simple and low-cost experimental set-ups are preferable. The aim of this study was to extract knowledge inherent in the sit-to-stand task as performed by healthy adults, by searching for relationships among measured and estimated biomechanical quantities. An automated method was applied to a large amount of data stored in a database. The sit-to-stand motor task had already been shown to be adequate for determining the level of individual motor ability. METHODS: The technique of searching for association rules was chosen to discover patterns as part of a Knowledge Discovery in Databases (KDD) process applied to a sit-to-stand motor task observed with a simple experimental set-up and analysed by means of a minimum measured input model. Selected parameters and variables of a database containing data from 110 healthy adults, of both genders and of a wide range of ages, performing the task were considered in the analysis. RESULTS: A set of rules and definitions was found characterising the patterns shared by the investigated subjects. Time events of the task turned out to be highly interdependent, at least in their average values, showing a high level of repeatability in the timing of the performance of the task. CONCLUSIONS: The distinctive patterns of the sit-to-stand task found in this study, together with those that could be found in similar studies focusing on subjects with pathologies, could be used as a reference for the functional evaluation of specific subjects performing the sit-to-stand motor task.
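
    The association-rule search named in the METHODS can be sketched with the two standard rule metrics, support and confidence. The records and item labels below are hypothetical discretised attributes invented for illustration; they are not the study's variables or data, and the study's actual mining procedure is not described in the abstract.

```python
def support(transactions, items) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(wanted <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent) -> float:
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical records: one set of discretised attributes per subject.
records = [
    {"age=young", "duration=short"},
    {"age=young", "duration=short"},
    {"age=old", "duration=long"},
    {"age=old", "duration=short"},
]

# Rule: age=young -> duration=short
print(support(records, {"age=young", "duration=short"}))
print(confidence(records, {"age=young"}, {"duration=short"}))
```

    A rule is usually kept only when both metrics exceed chosen thresholds. The reverse rule (`duration=short -> age=young`) has lower confidence on this toy data, which shows that rules are directional.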