The Power of Linear Combinations: Learning with Random Convolutions
Following the traditional paradigm of convolutional neural networks (CNNs),
modern CNNs manage to keep pace with more recent, for example
transformer-based, models by increasing not only model depth and width but also
kernel size. This results in a large number of learnable model parameters
that need to be handled during training. While following the convolutional
paradigm with the according spatial inductive bias, we question the
significance of \emph{learned} convolution filters. In fact, our findings
demonstrate that many contemporary CNN architectures can achieve high test
accuracies without ever updating randomly initialized (spatial) convolution
filters. Instead, simple linear combinations (implemented through efficient
$1\times 1$ convolutions) suffice to effectively recombine even random filters
into expressive network operators. Furthermore, these combinations of random
filters can implicitly regularize the resulting operations, mitigating
overfitting and enhancing overall performance and robustness. Conversely,
retaining the ability to learn filter updates can impair network performance.
Lastly, although we observe only relatively small gains from learning
convolutions, these gains increase with kernel size, owing to the
non-idealities of the independent and identically distributed
(\textit{i.i.d.}) nature of default initialization techniques.
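To make the core idea concrete, the following is a minimal sketch (an illustrative assumption, not the authors' released code) of a PyTorch layer whose randomly initialized spatial filters stay frozen and are only recombined by a learnable $1\times 1$ (pointwise) convolution:

import torch
import torch.nn as nn

class LinearCombinationConv2d(nn.Module):
    """Frozen random spatial filters recombined by a learnable 1x1 convolution."""

    def __init__(self, in_channels, out_channels, kernel_size, num_random_filters=None, stride=1):
        super().__init__()
        num_random_filters = num_random_filters or out_channels
        # Spatial filters: randomly initialized once and never updated.
        self.random_conv = nn.Conv2d(in_channels, num_random_filters, kernel_size,
                                     stride=stride, padding=kernel_size // 2, bias=False)
        self.random_conv.weight.requires_grad_(False)
        # Learnable pointwise convolution: a linear combination of the random responses.
        self.combine = nn.Conv2d(num_random_filters, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.combine(self.random_conv(x))

# Usage: drop-in replacement for a learned 3x3 convolution layer.
layer = LinearCombinationConv2d(in_channels=64, out_channels=128, kernel_size=3)
out = layer(torch.randn(1, 64, 32, 32))  # -> shape (1, 128, 32, 32)

Only the pointwise weights receive gradient updates; the spatial filters keep their i.i.d. random initialization throughout training.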
Don't Look into the Sun: Adversarial Solarization Attacks on Image Classifiers
Assessing the robustness of deep neural networks against out-of-distribution
inputs is crucial, especially in safety-critical domains like autonomous
driving, but also in safety systems where malicious actors can digitally alter
inputs to circumvent safety guards. However, designing effective
out-of-distribution tests that encompass all possible scenarios while
preserving accurate label information is a challenging task. Existing
methodologies often entail a compromise on either attack variety or constraint
levels, and sometimes on both. In a first step towards a more holistic
robustness evaluation of image classification models, we introduce an attack
method based on image solarization that is conceptually straightforward yet
avoids jeopardizing the global structure of natural images independent of the
intensity. Through comprehensive evaluations of multiple ImageNet models, we
demonstrate the attack's capacity to degrade accuracy significantly, provided
it is not integrated into the training augmentations. Interestingly, even when
it is, full immunity to accuracy deterioration is not achieved. In other settings, the
attack can often be simplified into a black-box attack with model-independent
parameters. Defenses against other corruptions do not consistently generalize
to our specific attack.
Project website: https://github.com/paulgavrikov/adversarial_solarizatio
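As an illustration of how such an attack can be run in a black-box fashion, the sketch below (hypothetical; the threshold grid and the true-class-confidence criterion are assumptions, not the authors' implementation) sweeps a small set of model-independent solarization thresholds and keeps, per image, the variant that most reduces the classifier's confidence in the correct label:

import torch

def solarize(images, threshold):
    """Invert all pixel values at or above `threshold` (images in [0, 1])."""
    return torch.where(images >= threshold, 1.0 - images, images)

@torch.no_grad()
def solarization_attack(model, images, labels, thresholds=None):
    """Black-box search over a fixed, model-independent threshold grid."""
    if thresholds is None:
        thresholds = torch.linspace(0.0, 1.0, steps=16)
    best = images.clone()
    best_conf = torch.softmax(model(images), dim=1).gather(1, labels[:, None]).squeeze(1)
    for t in thresholds:
        candidate = solarize(images, t.item())
        conf = torch.softmax(model(candidate), dim=1).gather(1, labels[:, None]).squeeze(1)
        stronger = conf < best_conf  # lower true-class confidence = stronger attack
        best[stronger] = candidate[stronger]
        best_conf = torch.minimum(best_conf, conf)
    return best

Because the transform and its parameter grid do not depend on model gradients, the same search can be reused across architectures, which is what makes a black-box variant with model-independent parameters practical.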