The Power of Linear Combinations: Learning with Random Convolutions
Following the traditional paradigm of convolutional neural networks (CNNs),
modern CNNs manage to keep pace with more recent, for example
transformer-based, models by not only increasing model depth and width but also
the kernel size. This results in large amounts of learnable model parameters
that need to be handled during training. While following the convolutional
paradigm with the according spatial inductive bias, we question the
significance of learned convolution filters. In fact, our findings
demonstrate that many contemporary CNN architectures can achieve high test
accuracies without ever updating randomly initialized (spatial) convolution
filters. Instead, simple linear combinations (implemented through efficient
convolutions) suffice to effectively recombine even random filters
into expressive network operators. Furthermore, these combinations of random
filters can implicitly regularize the resulting operations, mitigating
overfitting and enhancing overall performance and robustness. Conversely,
retaining the ability to learn filter updates can impair network performance.
Lastly, although we only observe relatively small gains from learning
convolutions, the learning gains increase proportionally with kernel size,
owing to the non-idealities of the independent and identically distributed
(i.i.d.) nature of default initialization techniques.
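To make the idea of recombining random filters concrete, here is a minimal NumPy sketch. It assumes the simplest possible setting, which is not taken from the paper: one input channel, four frozen random 3x3 filters, and one scalar mixing coefficient per filter. The names `random_filters`, `coefficients`, and `correlate2d_valid` are hypothetical. Because filtering is linear, mixing the filter responses is equivalent to filtering once with a linearly combined kernel.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation (the operation CNN layers call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
random_filters = rng.normal(size=(4, 3, 3))  # frozen, never updated (assumed setup)
coefficients = rng.normal(size=4)            # the only trainable part in this sketch

# Route 1: filter with each random kernel, then mix the responses linearly.
responses = np.stack([correlate2d_valid(image, f) for f in random_filters])
mixed_responses = np.tensordot(coefficients, responses, axes=1)

# Route 2: mix the kernels first, then filter once with the combined kernel.
combined_filter = np.tensordot(coefficients, random_filters, axes=1)
single_response = correlate2d_valid(image, combined_filter)

# Both routes give the same output, so the coefficients alone shape the operator.
routes_agree = np.allclose(mixed_responses, single_response)
```

In this toy setting, training only `coefficients` still changes the effective 3x3 operator, even though the random filters themselves stay fixed.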
Don't Look into the Sun: Adversarial Solarization Attacks on Image Classifiers
Assessing the robustness of deep neural networks against out-of-distribution
inputs is crucial, especially in safety-critical domains like autonomous
driving, but also in safety systems where malicious actors can digitally alter
inputs to circumvent safety guards. However, designing effective
out-of-distribution tests that encompass all possible scenarios while
preserving accurate label information is a challenging task. Existing
methodologies often entail a compromise between variety and constraint levels
for attacks and sometimes even both. In a first step towards a more holistic
robustness evaluation of image classification models, we introduce an attack
method based on image solarization that is conceptually straightforward yet
avoids jeopardizing the global structure of natural images independent of the
intensity. Through comprehensive evaluations of multiple ImageNet models, we
demonstrate the attack's capacity to degrade accuracy significantly, provided
it is not integrated into the training augmentations. Interestingly, even then,
no full immunity to accuracy deterioration is achieved. In other settings, the
attack can often be simplified into a black-box attack with model-independent
parameters. Defenses against other corruptions do not consistently extend to be
effective against our specific attack.
Project website: https://github.com/paulgavrikov/adversarial_solarizatio
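For readers unfamiliar with solarization, the following is a minimal sketch of the common image-processing definition: every pixel value at or above a threshold is inverted. Whether the paper's attack uses exactly this variant, and how it selects the threshold, is not stated in the abstract, so the function `solarize` and its `threshold` parameter are illustrative assumptions.

```python
import numpy as np


def solarize(image, threshold):
    """Invert all 8-bit pixel values at or above `threshold`; leave the rest unchanged."""
    image = np.asarray(image, dtype=np.uint8)
    return np.where(image >= threshold, 255 - image, image)


# Tiny hypothetical example: one row of four grayscale pixels.
pixels = np.array([[0, 100, 200, 255]], dtype=np.uint8)
solarized = solarize(pixels, threshold=128)
# Values below 128 are kept; 200 becomes 55 and 255 becomes 0.
```

Pixels below the threshold are untouched, which is consistent with the abstract's remark that the global structure of the image is preserved.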
On the Interplay of Convolutional Padding and Adversarial Robustness
It is common practice to apply padding prior to convolution operations to
preserve the resolution of feature-maps in Convolutional Neural Networks (CNN).
While many alternatives exist, this is often achieved by adding a border of
zeros around the inputs. In this work, we show that adversarial attacks often
result in perturbation anomalies at the image boundaries, which are the areas
where padding is used. Consequently, we aim to provide an analysis of the
interplay between padding and adversarial attacks and seek an answer to the
question of how different padding modes (or their absence) affect adversarial
robustness in various scenarios.
Comment: Accepted as full paper at ICCV-W 2023 BRAV
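As an illustration of what "padding modes" means in practice, the sketch below pads a small 3x3 feature map by one pixel using four modes available in NumPy's `np.pad`. The mode labels in the dictionary (`zeros`, `reflect`, `replicate`, `circular`) are chosen for illustration; which modes the paper actually compares is not stated in the abstract.

```python
import numpy as np

# A tiny 3x3 "feature map" with values 1..9, so the padded border is easy to read.
feature_map = np.arange(1, 10).reshape(3, 3)

padded = {
    "zeros": np.pad(feature_map, 1, mode="constant", constant_values=0),
    "reflect": np.pad(feature_map, 1, mode="reflect"),  # mirror without repeating the edge
    "replicate": np.pad(feature_map, 1, mode="edge"),   # repeat the border value
    "circular": np.pad(feature_map, 1, mode="wrap"),    # wrap around to the opposite side
}

# The top-left corner shows how strongly the modes differ at the image boundary.
corners = {name: int(array[0, 0]) for name, array in padded.items()}
```

Every mode yields a 5x5 array, but the border values differ, and the border is exactly where the abstract reports perturbation anomalies.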
Adversarial Robustness through the Lens of Convolutional Filters
Deep learning models are intrinsically sensitive to distribution shifts in
the input data. In particular, small, barely perceivable perturbations to the
input data can force models to make wrong predictions with high confidence. A
common defense mechanism is regularization through adversarial training, which
injects worst-case perturbations back into training to strengthen the decision
boundaries, and to reduce overfitting. In this context, we perform an
investigation of 3x3 convolution filters that form in adversarially-trained
models. Filters are extracted from 71 public models of the linf-RobustBench
CIFAR-10/100 and ImageNet1k leaderboard and compared to filters extracted from
models built on the same architectures but trained without robust
regularization. We observe that adversarially-robust models appear to form more
diverse, less sparse, and more orthogonal convolution filters than their normal
counterparts. The largest differences between robust and normal models are
found in the deepest layers, and the very first convolution layer, which
consistently and predominantly forms filters that can partially eliminate
perturbations, irrespective of the architecture. Data & Project website:
https://github.com/paulgavrikov/cvpr22w_RobustnessThroughTheLens
Comment: Accepted at the CVPR 2022 "The Art of Robustness" Workshop
CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
Currently, many theoretical as well as practically relevant questions about
the transferability and robustness of Convolutional Neural Networks (CNNs)
remain unsolved. While ongoing research efforts are engaging these problems
from various angles, in most computer vision related cases these approaches can
be generalized to investigations of the effects of distribution shifts in image
data. In this context, we propose to study the shifts in the learned weights of
trained CNN models. Here we focus on the properties of the distributions of
dominantly used 3x3 convolution filter kernels. We collected and publicly
provide a dataset with over 1.4 billion filters from hundreds of trained CNNs,
using a wide range of datasets, architectures, and vision tasks. In a first use
case of the proposed dataset, we can show highly relevant properties of many
publicly available pre-trained models for practical applications: I) We analyze
distribution shifts (or the lack thereof) between trained filters along
different axes of meta-parameters, like visual category of the dataset, task,
architecture, or layer depth. Based on these results, we conclude that model
pre-training can succeed on arbitrary datasets if they meet size and variance
conditions. II) We show that many pre-trained models contain degenerated
filters which make them less robust and less suitable for fine-tuning on target
applications.
Data & Project website: https://github.com/paulgavrikov/cnn-filter-db
Comment: significantly reduced PDF size in v2; Accepted as ORAL at IEEE/CVF
Conference on Computer Vision and Pattern Recognition 2022 (CVPR)
Are Vision Language Models Texture or Shape Biased and Can We Steer Them?
Vision language models (VLMs) have drastically changed the computer vision
model landscape in only a few years, opening an exciting array of new
applications from zero-shot image classification to image captioning and
visual question answering. Unlike pure vision models, they offer an intuitive
way to access visual content through language prompting. The wide applicability
of such models encourages us to ask whether they also align with human vision -
specifically, how far they adopt human-induced visual biases through multimodal
fusion, or whether they simply inherit biases from pure vision models. One
important visual bias is the texture vs. shape bias, or the dominance of local
over global information. In this paper, we study this bias in a wide range of
popular VLMs. Interestingly, we find that VLMs are often more shape-biased than
their vision encoders, indicating that visual biases are modulated to some
extent through text in multimodal models. If text does indeed influence visual
biases, this suggests that we may be able to steer visual biases not just
through visual input but also through language: a hypothesis that we confirm
through extensive experiments. For instance, we are able to steer shape bias
from as low as 49% to as high as 72% through prompting alone. For now, the
strong human bias towards shape (96%) remains out of reach for all tested VLMs.
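The percentages above can be read with the usual definition of shape bias from the cue-conflict literature: among decisions that match either the shape label or the texture label of a conflicting image, the fraction that match the shape label. The abstract does not spell out its exact protocol, so the sketch below is an assumption, and the counts used are hypothetical rather than data from the paper.

```python
def shape_bias(shape_decisions, texture_decisions):
    """Fraction of shape decisions among all shape-or-texture decisions."""
    total = shape_decisions + texture_decisions
    if total == 0:
        raise ValueError("need at least one shape or texture decision")
    return shape_decisions / total


# Hypothetical counts: 72 shape-consistent and 28 texture-consistent answers.
example_bias = shape_bias(72, 28)
```

Under this definition a value of 0.5 means no preference, and values closer to 1 indicate a stronger preference for shape.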
Performance of the CMS Cathode Strip Chambers with Cosmic Rays
The Cathode Strip Chambers (CSCs) constitute the primary muon tracking device
in the CMS endcaps. Their performance has been evaluated using data taken
during a cosmic ray run in fall 2008. Measured noise levels are low, with the
number of noisy channels well below 1%. Coordinate resolution was measured for
all types of chambers, and falls in the range 47 microns to 243 microns. The
efficiencies for local charged track triggers, for hit reconstruction, and for
segment reconstruction were measured, and are above 99%. The timing resolution
per layer is approximately 5 ns.
Performance and Operation of the CMS Electromagnetic Calorimeter
The operation and general performance of the CMS electromagnetic calorimeter
using cosmic-ray muons are described. These muons were recorded after the
closure of the CMS detector in late 2008. The calorimeter is made of lead
tungstate crystals and the overall status of the 75848 channels corresponding
to the barrel and endcap detectors is reported. The stability of crucial
operational parameters, such as high voltage, temperature and electronic noise,
is summarised and the performance of the light monitoring system is presented.
Potential of Core-Collapse Supernova Neutrino Detection at JUNO
JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses a 20 kton liquid scintillator as target, which enables it to detect supernova burst neutrinos with large statistics from the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, interactions on 12C nuclei, etc. This retains the possibility for JUNO to reconstruct the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development in JUNO, which allow prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can give fast characterizations, such as directionality and light curve.
Detection of the Diffuse Supernova Neutrino Background with JUNO
As an underground multi-purpose neutrino detector with 20 kton liquid scintillator, Jiangmen Underground Neutrino Observatory (JUNO) is competitive with and complementary to the water-Cherenkov detectors in the search for the diffuse supernova neutrino background (DSNB). Typical supernova models predict 2-4 events per year within the optimal observation window in the JUNO detector. The dominant background is from the neutral-current (NC) interaction of atmospheric neutrinos with 12C nuclei, which surpasses the DSNB by more than one order of magnitude. We evaluated the systematic uncertainty of NC background from the spread of a variety of data-driven models and further developed a method to determine NC background within 15% with in situ measurements after ten years of running. Besides, the NC-like backgrounds can be effectively suppressed by the intrinsic pulse-shape discrimination (PSD) capabilities of liquid scintillators. In this talk, I will present in detail the improvements on NC background uncertainty evaluation, PSD discriminator development, and finally, the potential of DSNB sensitivity in JUNO.