HARD: Hard Augmentations for Robust Distillation
Knowledge distillation (KD) is a simple and successful method to transfer
knowledge from a teacher to a student model solely based on functional
activity. However, current KD has a few shortcomings: it has recently been
shown that this method is unsuitable for transferring simple inductive biases
like shift equivariance, that it struggles to transfer out-of-domain
generalization, and that optimization takes orders of magnitude longer than
standard non-KD model training. To improve these aspects of KD, we propose
Hard Augmentations for Robust Distillation (HARD), a generally applicable data
augmentation framework that generates synthetic data points on which the
teacher and the student
disagree. We show in a simple toy example that our augmentation framework
solves the problem of transferring simple equivariances with KD. We then apply
our framework to real-world tasks with a variety of augmentation models, ranging
from simple spatial transformations to unconstrained image manipulations with a
pretrained variational autoencoder. We find that our learned augmentations
significantly improve KD performance on in-domain and out-of-domain evaluation.
Moreover, our method outperforms even state-of-the-art data augmentations, and
since the augmented training inputs can be visualized, they offer qualitative
insight into the properties that are transferred from the teacher to the
student. Thus, HARD represents a generally applicable, dynamically optimized
data augmentation technique tailored to improve the generalization and
convergence speed of models trained with KD.
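The alternating objective described above can be written as a compact training loop. The following is a minimal PyTorch-style sketch, assuming a `teacher`, a `student`, and a parameterized `augmenter` module with their own optimizers are given; the KL-based disagreement measure and the two-step update are illustrative choices, not necessarily the authors' exact implementation.

```python
# Minimal sketch of the HARD idea: the augmenter is trained to produce inputs
# on which teacher and student disagree, and the student is then distilled on
# exactly those hard inputs. All names and losses are illustrative assumptions.
import torch
import torch.nn.functional as F

def disagreement(teacher_logits, student_logits):
    # KL(teacher || student), averaged over the batch.
    return F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )

def hard_step(x, teacher, student, augmenter, aug_opt, student_opt):
    # 1) Augmenter step: maximize teacher-student disagreement on synthetic
    #    inputs (in practice the teacher's parameters would be frozen).
    x_aug = augmenter(x)
    aug_loss = -disagreement(teacher(x_aug), student(x_aug))
    aug_opt.zero_grad()
    aug_loss.backward()
    aug_opt.step()

    # 2) Student step: standard distillation loss on the hard inputs.
    with torch.no_grad():
        x_hard = augmenter(x)            # regenerate with the updated augmenter
        t_logits = teacher(x_hard)
    kd_loss = disagreement(t_logits, student(x_hard))
    student_opt.zero_grad()
    kd_loss.backward()
    student_opt.step()
    return kd_loss.item()
```

In practice such a step would be interleaved with ordinary supervised or distillation batches; the sketch only captures the disagreement-driven augmentation idea.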
Image retrieval outperforms diffusion models on data augmentation
Many approaches have been proposed to use diffusion models to augment
training datasets for downstream tasks, such as classification. However,
diffusion models are themselves trained on large datasets, often with noisy
annotations, and it remains an open question to what extent these models
contribute to downstream classification performance. In particular, it remains
unclear if they generalize enough to improve over directly using the additional
data of their pre-training process for augmentation. We systematically evaluate
a range of existing methods to generate images from diffusion models and study
new extensions to assess their benefit for data augmentation. Personalizing
diffusion models towards the target data outperforms simpler prompting
strategies. However, using the pre-training data of the diffusion model alone,
via a simple nearest-neighbor retrieval procedure, leads to even stronger
downstream performance. Our study explores the potential of diffusion models in
generating new training data, and surprisingly finds that these sophisticated
models are not yet able to beat a simple and strong image retrieval baseline on
simple downstream vision tasks.
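As a concrete illustration of the retrieval baseline, the sketch below augments a target dataset with nearest neighbors from the diffusion model's pre-training corpus in an embedding space. The precomputed, L2-normalized embedding arrays (e.g. CLIP features) and the function name are assumptions for illustration, not the exact procedure used in the study.

```python
# Minimal sketch of nearest-neighbor retrieval for data augmentation:
# for each target image, fetch its k most similar images from the
# diffusion model's pre-training corpus and add them to the training set.
import numpy as np

def retrieve_augmentations(target_emb, corpus_emb, k=5):
    """Return indices of the k nearest corpus images for each target image.

    target_emb: (n_target, d) array of L2-normalized target embeddings.
    corpus_emb: (n_corpus, d) array of L2-normalized corpus embeddings.
    """
    # Cosine similarity reduces to a dot product for normalized vectors.
    sims = target_emb @ corpus_emb.T                 # (n_target, n_corpus)
    # Top-k most similar corpus images per target image.
    return np.argsort(-sims, axis=1)[:, :k]

# Usage: the retrieved corpus images (labeled, e.g., with the label of the
# matched target image) are simply appended to the downstream training set.
```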
Most discriminative stimuli for functional cell type clustering
Identifying cell types and understanding their functional properties is
crucial for unraveling the mechanisms underlying perception and cognition. In
the retina, functional types can be identified by carefully selected stimuli,
but this requires expert domain knowledge and biases the procedure towards
previously known cell types. In the visual cortex, it is still unknown what
functional types exist and how to identify them. Thus, for unbiased
identification of the functional cell types in retina and visual cortex, new
approaches are needed. Here we propose an optimization-based clustering
approach using deep predictive models to obtain functional clusters of neurons
using Most Discriminative Stimuli (MDS). Our approach alternates between
stimulus optimization and cluster reassignment, akin to an
expectation-maximization algorithm. The algorithm recovers functional clusters
in mouse retina, marmoset retina and macaque visual area V4. This demonstrates
that our approach can successfully find discriminative stimuli across species,
stages of the visual system and recording techniques. The resulting most
discriminative stimuli can be used to assign functional cell types quickly and
on the fly, without the need to train complex predictive models or show a large
natural scene dataset, paving the way for experiments that were previously
limited by experimental time. Crucially, MDS are interpretable: they visualize
the distinctive stimulus patterns that most unambiguously identify a specific
type of neuron.
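Schematically, the alternation can be written as a short optimization loop. The sketch below assumes a differentiable predictive model `model(stims)` that returns predicted responses of shape (n_clusters, n_neurons); the cross-entropy objective is one plausible way to make stimuli "discriminative" and is not necessarily the objective used in the paper.

```python
# Schematic sketch of MDS clustering: alternate between optimizing one
# stimulus per cluster to be maximally discriminative for its assigned
# neurons (M-step) and reassigning each neuron to the stimulus that drives
# it most strongly (E-step). Names and the exact objective are illustrative.
import torch
import torch.nn.functional as F

def mds_clustering(model, n_neurons, n_clusters, stim_shape,
                   n_outer=50, n_inner=100, lr=0.05):
    stims = torch.randn(n_clusters, *stim_shape, requires_grad=True)
    assign = torch.randint(n_clusters, (n_neurons,))
    for _ in range(n_outer):
        # M-step: optimize the stimuli so each neuron responds most strongly
        # to its own cluster's stimulus (responses treated as logits).
        opt = torch.optim.Adam([stims], lr=lr)
        for _ in range(n_inner):
            resp = model(stims)                    # (n_clusters, n_neurons)
            loss = F.cross_entropy(resp.T, assign)
            opt.zero_grad()
            loss.backward()
            opt.step()
        # E-step: reassign each neuron to its preferred stimulus.
        with torch.no_grad():
            assign = model(stims).argmax(dim=0)
    return stims.detach(), assign
```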
Frequency, prognostic impact, and subtype association of 8p12, 8q24, 11q13, 12p13, 17q12, and 20q13 amplifications in breast cancers
BACKGROUND: Oncogene amplification and overexpression occur in tumor cells. Amplification status may provide diagnostic and prognostic information and may lead to new treatment strategies. Chromosomal regions 8p12, 8q24, 11q13, 17q12 and 20q13 are recurrently amplified in breast cancers. METHODS: To assess the frequencies and clinical impact of amplifications, we analyzed 547 invasive breast tumors organized in a tissue microarray (TMA) by fluorescence in situ hybridization (FISH) and calculated correlations with histoclinical features and prognosis. BAC probes were designed for: (i) two 8p12 subregions centered on the RAB11FIP1 and FGFR1 loci, respectively; (ii) the 11q13 region centered on CCND1; (iii) the 12p13 region spanning NOL1; and (iv) three 20q13 subregions centered on MYBL2, ZNF217 and AURKA, respectively. Regions 8q24 and 17q12 were analyzed with commercial MYC and ERBB2 probes, respectively. RESULTS: We observed amplification of 8p12 (amplified at RAB11FIP1 and/or FGFR1) in 22.8%, 8q24 in 6.1%, 11q13 in 19.6%, 12p13 in 4.1%, 17q12 in 9.9%, 20q13(Z) (amplified at ZNF217 only) in 9.9%, and 20q13(Co) (co-amplification of two or three 20q13 loci) in 8.5% of cases. The 8q24, 12p13, and 17q12 amplifications were correlated with high grade. The most frequent single amplifications involved the 8p12 (9.8%), 8q24 (3.3%), 12p13 (3.3%), 20q13(Z) and 20q13(Co) (1.6%) regions. The 17q12 and 11q13 regions were never found amplified alone. The most frequent co-amplification was 8p12/11q13. Amplifications of 8p12 and 17q12 were associated with poor outcome. Amplification of 12p13 was associated with the basal molecular subtype. CONCLUSION: Our results establish the frequencies, prognostic impacts and subtype associations of various amplifications and co-amplifications in breast cancers.
Leading by example: Guiding knowledge transfer with adversarial data augmentation
Knowledge distillation (KD) is a simple and successful method to transfer knowledge from a teacher to a student model solely based on functional activity. However, it has recently been shown that this method is unable to transfer simple inductive biases like shift equivariance. To extend existing functional transfer methods like KD, we propose a general data augmentation framework that generates synthetic
data points where the teacher and the student disagree. We generate new input data through a learned distribution of spatial transformations of the original images. Through these synthetic inputs, our augmentation framework solves the problem of transferring simple equivariances with KD, leading to better generalization. Additionally, we generate new data points with a fine-tuned Very Deep Variational Autoencoder model, allowing for more abstract augmentations. Our learned augmentations significantly improve KD performance, even when compared to classical data augmentations. In addition, the augmented inputs are interpretable and offer unique insight into the properties that are transferred to the student.
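To make the "learned distribution of spatial transformations" concrete, the following sketch parameterizes a Gaussian over rotation and translation and applies sampled affine transforms differentiably; its parameters could then be trained to maximize teacher-student disagreement, as in the loop sketched earlier. The parameterization is an assumption for illustration, not the authors' exact augmentation model.

```python
# Illustrative learned distribution over spatial transformations: a Gaussian
# over (rotation angle, translation x, translation y), applied to images via
# a differentiable affine warp so its parameters can be optimized end-to-end.
import torch
import torch.nn.functional as F

class SpatialAugmenter(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Mean and log-std of (angle, tx, ty); hypothetical parameterization.
        self.mu = torch.nn.Parameter(torch.zeros(3))
        self.log_sigma = torch.nn.Parameter(torch.zeros(3) - 2.0)

    def forward(self, x):                      # x: (b, c, h, w)
        b = x.size(0)
        eps = torch.randn(b, 3, device=x.device)
        # Reparameterized sample of one transform per image.
        angle, tx, ty = (self.mu + eps * self.log_sigma.exp()).unbind(-1)
        cos, sin = torch.cos(angle), torch.sin(angle)
        theta = torch.stack(
            [torch.stack([cos, -sin, tx], dim=-1),
             torch.stack([sin,  cos, ty], dim=-1)], dim=1)   # (b, 2, 3)
        grid = F.affine_grid(theta, x.shape, align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```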