Scale Alone Does not Improve Mechanistic Interpretability in Vision Models
In light of the recent widespread adoption of AI systems, understanding the
internal information processing of neural networks has become increasingly
critical. Most recently, machine vision has seen remarkable progress by scaling
neural networks to unprecedented levels in dataset and model size. We here ask
whether this extraordinary increase in scale also positively impacts the field
of mechanistic interpretability. In other words, has our understanding of the
inner workings of scaled neural networks improved as well? We use a
psychophysical paradigm to quantify one form of mechanistic interpretability
for a diverse suite of nine models and find no scaling effect for
interpretability - neither for model nor dataset size. Specifically, none of
the investigated state-of-the-art models are easier to interpret than the
GoogLeNet model from almost a decade ago. Latest-generation vision models
appear even less interpretable than older architectures, hinting at a
regression rather than improvement, with modern models sacrificing
interpretability for accuracy. These results highlight the need for models
explicitly designed to be mechanistically interpretable and the need for more
helpful interpretability methods to increase our understanding of networks at
an atomic level. We release a dataset containing more than 130'000 human
responses from our psychophysical evaluation of 767 units across nine models.
This dataset facilitates research on automated instead of human-based
interpretability evaluations, which can ultimately be leveraged to directly
optimize the mechanistic interpretability of models.
Comment: Spotlight at NeurIPS 2023. The first two authors contributed equally. Code available at https://brendel-group.github.io/imi
Analysis of the exciton-exciton interaction in semiconductor quantum wells
The exciton-exciton interaction is investigated for quasi-two-dimensional
quantum structures. A bosonization scheme is applied including the full spin
structure. For generating the effective interaction potentials, the
Hartree-Fock and Heitler-London approaches are improved by a full two-exciton
calculation which includes the van der Waals effect. With these potentials the
biexciton formation in bilayer systems is investigated. For coupled quantum
wells the two-body scattering matrix is calculated and employed to give a
modified relation between exciton density and blue shift. Such a relation is of
central importance for gauging exciton densities in experiments which pave the
way toward Bose-Einstein condensation of excitons.
Don't trust your eyes: on the (un)reliability of feature visualizations
How do neural networks extract patterns from pixels? Feature visualizations
attempt to answer this important question by visualizing highly activating
patterns through optimization. Today, visualization methods form the foundation
of our knowledge about the internal workings of neural networks, as a type of
mechanistic interpretability. Here we ask: How reliable are feature
visualizations? We start our investigation by developing network circuits that
trick feature visualizations into showing arbitrary patterns that are
completely disconnected from normal network behavior on natural input. We then
provide evidence for a similar phenomenon occurring in standard, unmanipulated
networks: feature visualizations are processed very differently from standard
input, casting doubt on their ability to "explain" how neural networks process
natural images. We underpin this empirical finding by theory proving that the
set of functions that can be reliably understood by feature visualization is
extremely small and does not include general black-box neural networks.
Therefore, a promising way forward could be the development of networks that
enforce certain structures in order to ensure more reliable feature
visualizations.
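Feature visualizations of the kind discussed above are typically produced by gradient ascent on the input. As a rough illustration only (not the authors' setup), the following minimal numpy sketch optimizes an input to maximally activate a single toy linear unit; the weight vector `w`, the normalization step, and all hyperparameters are invented for the example:

```python
import numpy as np

def feature_visualization(weight, steps=200, lr=0.1, seed=0):
    """Gradient-ascent sketch: find a unit-norm input that maximally
    activates a single linear 'unit' with the given weight vector."""
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=0.01, size=weight.shape)
    for _ in range(steps):
        # activation a = w . x, so the gradient of a w.r.t. x is w itself
        x += lr * weight
        x /= max(np.linalg.norm(x), 1e-8)  # keep the "image" bounded
    return x

w = np.array([1.0, -2.0, 0.5])  # toy stand-in for a real unit's weights
vis = feature_visualization(w)
# the optimized input converges to the unit's weight direction w / ||w||
```

For a real network the analytic gradient is replaced by backpropagation through the model, and the unreliability result above concerns exactly the gap between such optimized inputs and natural images.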
A multiphase model for the cross-linking of ultra-high viscous alginate hydrogels
In this study, a model for the cross-linking of ultra-high viscous alginate hydrogels is provided. The model consists of four
kinetic equations describing the process, including the local accumulation and the depletion of mobile alginate, cross-linked
alginate and cross-linking cations. For an efficient simulation, finite difference schemes with predictor-corrector algorithms
were implemented.
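The predictor-corrector idea can be sketched with Heun's method, a standard explicit predictor-corrector scheme. The abstract does not reproduce the model's four coupled kinetic equations, so this minimal example applies the scheme to a single illustrative first-order depletion equation with an invented rate constant `k`:

```python
def heun_step(f, y, t, dt):
    """One predictor-corrector (Heun) step for dy/dt = f(t, y)."""
    y_pred = y + dt * f(t, y)                            # predictor: explicit Euler
    return y + 0.5 * dt * (f(t, y) + f(t + dt, y_pred))  # corrector: trapezoidal average

# Toy stand-in for one kinetic equation: first-order depletion of
# mobile cross-linking cations, dc/dt = -k * c (k chosen arbitrarily).
k = 2.0
f = lambda t, c: -k * c
c, t, dt = 1.0, 0.0, 0.01
for _ in range(100):
    c = heun_step(f, c, t, dt)
    t += dt
# after t = 1 the numerical solution is close to exp(-k) ~ 0.1353
```

The real model would integrate four such equations simultaneously, with spatial coupling handled by the finite-difference discretization.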
Sensitivity of Slot-Based Object-Centric Models to their Number of Slots
Self-supervised methods for learning object-centric representations have
recently been applied successfully to various datasets. This progress is
largely fueled by slot-based methods, whose ability to cluster visual scenes
into meaningful objects holds great promise for compositional generalization
and downstream learning. In these methods, the number of slots (clusters) K
is typically chosen to match the number of ground-truth objects in the data,
even though this quantity is unknown in real-world settings. Indeed, the
sensitivity of slot-based methods to K, and how it affects their learned
correspondence to objects in the data, has largely been ignored in the
literature. In this work, we address this issue through a systematic study of
slot-based methods. We propose using analogs to precision and recall based on
the Adjusted Rand Index to accurately quantify model behavior over a large
range of K. We find that, especially during training, incorrect choices of K
do not yield the desired object decomposition and, in fact, cause
substantial oversegmentation or merging of separate objects
(undersegmentation). We demonstrate that the choice of the objective function
and incorporating instance-level annotations can moderately mitigate this
behavior while still falling short of fully resolving this issue. Indeed, we
show how this issue persists across multiple methods and datasets and stress
its importance for future slot-based models.
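The Adjusted Rand Index underlying the proposed precision/recall analogs compares two clusterings via pair counting on their contingency table. Below is a minimal self-contained sketch of the plain ARI (the paper's precision- and recall-style variants build on it but are not reproduced here); the example labelings are invented:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same items."""
    n = len(labels_a)
    pairs = Counter(zip(labels_a, labels_b))           # contingency table cells
    sum_ij = sum(comb(v, 2) for v in pairs.values())   # agreeing pairs
    sum_a = sum(comb(v, 2) for v in Counter(labels_a).values())
    sum_b = sum(comb(v, 2) for v in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)              # chance correction
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_ij - expected) / (max_index - expected)

# identical clusterings (up to relabeling) score 1.0
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # → 1.0
# full oversegmentation of the second clustering scores 0.0 here
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 2, 3]))  # → 0.0
```

Because ARI is corrected for chance, it is well suited to comparing a model's slot assignment against ground-truth object masks across different slot counts.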