
    Scale Alone Does not Improve Mechanistic Interpretability in Vision Models

    In light of the recent widespread adoption of AI systems, understanding the internal information processing of neural networks has become increasingly critical. Most recently, machine vision has seen remarkable progress by scaling neural networks to unprecedented levels in dataset and model size. We here ask whether this extraordinary increase in scale also positively impacts the field of mechanistic interpretability. In other words, has our understanding of the inner workings of scaled neural networks improved as well? We use a psychophysical paradigm to quantify one form of mechanistic interpretability for a diverse suite of nine models and find no scaling effect for interpretability - neither for model nor dataset size. Specifically, none of the investigated state-of-the-art models are easier to interpret than the GoogLeNet model from almost a decade ago. Latest-generation vision models appear even less interpretable than older architectures, hinting at a regression rather than improvement, with modern models sacrificing interpretability for accuracy. These results highlight the need for models explicitly designed to be mechanistically interpretable and the need for more helpful interpretability methods to increase our understanding of networks at an atomic level. We release a dataset containing more than 130,000 human responses from our psychophysical evaluation of 767 units across nine models. This dataset facilitates research on automated instead of human-based interpretability evaluations, which can ultimately be leveraged to directly optimize the mechanistic interpretability of models.

    Comment: Spotlight at NeurIPS 2023. The first two authors contributed equally. Code available at https://brendel-group.github.io/imi

    Analysis of the exciton-exciton interaction in semiconductor quantum wells

    The exciton-exciton interaction is investigated for quasi-two-dimensional quantum structures. A bosonization scheme is applied including the full spin structure. For generating the effective interaction potentials, the Hartree-Fock and Heitler-London approaches are improved by a full two-exciton calculation which includes the van der Waals effect. With these potentials the biexciton formation in bilayer systems is investigated. For coupled quantum wells the two-body scattering matrix is calculated and employed to give a modified relation between exciton density and blue shift. Such a relation is of central importance for gauging exciton densities in experiments which pave the way toward Bose-Einstein condensation of excitons.

    Don't trust your eyes: on the (un)reliability of feature visualizations

    How do neural networks extract patterns from pixels? Feature visualizations attempt to answer this important question by visualizing highly activating patterns through optimization. Today, visualization methods form the foundation of our knowledge about the internal workings of neural networks, as a type of mechanistic interpretability. Here we ask: How reliable are feature visualizations? We start our investigation by developing network circuits that trick feature visualizations into showing arbitrary patterns that are completely disconnected from normal network behavior on natural input. We then provide evidence for a similar phenomenon occurring in standard, unmanipulated networks: feature visualizations are processed very differently from standard input, casting doubt on their ability to "explain" how neural networks process natural images. We underpin this empirical finding with theory, proving that the set of functions that can be reliably understood by feature visualization is extremely small and does not include general black-box neural networks. Therefore, a promising way forward could be the development of networks that enforce certain structures in order to ensure more reliable feature visualizations.
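    The "optimization" the abstract refers to is activation maximization: starting from a random input and ascending the gradient of a unit's activation. The following is a toy sketch of that idea (not the paper's code) for a single linear unit f(x) = w·x with a norm constraint, where the known optimum is the direction of w itself; in practice the unit would be inside a deep vision model and the input an image.

    ```python
    import numpy as np

    def feature_visualization(weight, steps=200, lr=0.1):
        """Gradient-ascent 'feature visualization' for a single linear
        unit f(x) = w.x: find a norm-constrained input that maximally
        activates it. Toy stand-in for optimizing an image for a CNN unit."""
        rng = np.random.default_rng(0)
        x = rng.normal(size=weight.shape)
        for _ in range(steps):
            x = x + lr * weight          # gradient of w.x w.r.t. x is w
            x = x / np.linalg.norm(x)    # project back onto the unit sphere
        return x

    w = np.array([3.0, -1.0, 2.0])
    x_star = feature_visualization(w)
    ```

    For this linear toy case the optimized input aligns with the weight vector; the paper's point is that for real networks such optimized inputs need not reflect how natural inputs are processed.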

    A multiphase model for the cross-linking of ultra-high viscous alginate hydrogels

    In this study, a model for the cross-linking of ultra-high viscous alginate hydrogels is provided. The model consists of four kinetic equations describing the process, including the local accumulation and the depletion of mobile alginate, cross-linked alginate and cross-linking cations. For an efficient simulation, finite difference schemes with predictor-corrector algorithms were implemented.
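    A predictor-corrector scheme, as mentioned in the abstract, first takes an explicit (predictor) step and then corrects it with an average of the slopes at both ends. This is a minimal sketch of that pattern (Heun's method) applied to a hypothetical first-order depletion equation dc/dt = -k·c, standing in for one of the paper's kinetic equations, whose exact form is not given here.

    ```python
    def heun_decay(c0, k, dt, steps):
        """Predictor-corrector (Heun) integration of dc/dt = -k*c,
        a toy stand-in for one kinetic equation of the model."""
        f = lambda c: -k * c
        c = c0
        for _ in range(steps):
            c_pred = c + dt * f(c)                  # predictor: explicit Euler
            c = c + 0.5 * dt * (f(c) + f(c_pred))   # corrector: trapezoidal average
        return c

    # Integrate from t=0 to t=1 with 100 steps; exact answer is c0*exp(-k).
    c_final = heun_decay(c0=1.0, k=1.0, dt=0.01, steps=100)
    ```

    The corrector step gives second-order accuracy, so the numerical result tracks the exponential decay closely even with a modest step size.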

    Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

    Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets. This progress is largely fueled by slot-based methods, whose ability to cluster visual scenes into meaningful objects holds great promise for compositional generalization and downstream learning. In these methods, the number of slots (clusters) K is typically chosen to match the number of ground-truth objects in the data, even though this quantity is unknown in real-world settings. Indeed, the sensitivity of slot-based methods to K, and how this affects their learned correspondence to objects in the data, has largely been ignored in the literature. In this work, we address this issue through a systematic study of slot-based methods. We propose using analogs to precision and recall based on the Adjusted Rand Index to accurately quantify model behavior over a large range of K. We find that, especially during training, incorrect choices of K do not yield the desired object decomposition and, in fact, cause substantial oversegmentation or merging of separate objects (undersegmentation). We demonstrate that the choice of the objective function and incorporating instance-level annotations can moderately mitigate this behavior while still falling short of fully resolving this issue. Indeed, we show how this issue persists across multiple methods and datasets and stress its importance for future slot-based models.
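    The Adjusted Rand Index underlying the paper's precision/recall analogs is a chance-corrected measure of agreement between two clusterings, here between predicted slot assignments and ground-truth object masks. A self-contained sketch of the standard ARI (the paper's directional analogs are variants not reproduced here):

    ```python
    from collections import Counter
    from math import comb

    def adjusted_rand_index(labels_true, labels_pred):
        """Chance-corrected agreement between two clusterings of the
        same items, e.g. slot assignments vs. ground-truth object IDs."""
        n = len(labels_true)
        # Contingency counts: how often each (true, pred) label pair co-occurs.
        contingency = Counter(zip(labels_true, labels_pred))
        row_sums = Counter(labels_true)
        col_sums = Counter(labels_pred)
        sum_pairs = sum(comb(v, 2) for v in contingency.values())
        sum_rows = sum(comb(v, 2) for v in row_sums.values())
        sum_cols = sum(comb(v, 2) for v in col_sums.values())
        expected = sum_rows * sum_cols / comb(n, 2)  # chance agreement
        max_index = (sum_rows + sum_cols) / 2
        return (sum_pairs - expected) / (max_index - expected)
    ```

    ARI is 1 for clusterings identical up to label permutation and near 0 for random assignments, which makes it insensitive to how slot indices happen to be numbered.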
