    HARD: Hard Augmentations for Robust Distillation

    Knowledge distillation (KD) is a simple and successful method to transfer knowledge from a teacher to a student model solely based on functional activity. However, current KD has several shortcomings: it has recently been shown that this method fails to transfer simple inductive biases like shift equivariance, struggles to transfer out-of-domain generalization, and requires optimization times that are orders of magnitude longer than default non-KD model training. To improve these aspects of KD, we propose Hard Augmentations for Robust Distillation (HARD), a generally applicable data augmentation framework that generates synthetic data points on which the teacher and the student disagree. We show in a simple toy example that our augmentation framework solves the problem of transferring simple equivariances with KD. We then apply our framework to real-world tasks with a variety of augmentation models, ranging from simple spatial transformations to unconstrained image manipulations with a pretrained variational autoencoder. We find that our learned augmentations significantly improve KD performance on in-domain and out-of-domain evaluation. Moreover, our method outperforms even state-of-the-art data augmentations, and since the augmented training inputs can be visualized, they offer qualitative insight into the properties that are transferred from the teacher to the student. Thus, HARD represents a generally applicable, dynamically optimized data augmentation technique tailored to improve the generalization and convergence speed of models trained with KD.
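    The alternating objective described in the abstract can be summarized in a short training loop. The following is a minimal sketch, assuming PyTorch; `teacher`, `student`, and `augmenter` are hypothetical modules standing in for the models above, and the loop only illustrates the essential idea (the augmenter maximizes teacher-student disagreement, the student minimizes the distillation loss on the augmented inputs), not the paper's exact implementation.

```python
# Minimal sketch of disagreement-driven augmentation for knowledge distillation.
# Assumes PyTorch; `teacher`, `student`, and `augmenter` are hypothetical nn.Modules.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard KD objective: match softened teacher and student distributions."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T

def training_step(x, teacher, student, augmenter, opt_student, opt_augmenter):
    # 1) Augmenter step: synthesize inputs on which teacher and student disagree.
    x_aug = augmenter(x)
    with torch.no_grad():
        t_logits = teacher(x_aug)
    s_logits = student(x_aug)
    disagreement = kd_loss(s_logits, t_logits)
    opt_augmenter.zero_grad()
    (-disagreement).backward()          # ascend the disagreement w.r.t. the augmenter
    opt_augmenter.step()

    # 2) Student step: distill on both original and freshly augmented inputs.
    x_all = torch.cat([x, augmenter(x).detach()], dim=0)
    with torch.no_grad():
        t_logits = teacher(x_all)
    s_logits = student(x_all)
    loss = kd_loss(s_logits, t_logits)
    opt_student.zero_grad()
    loss.backward()
    opt_student.step()
    return loss.item()
```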

    Image retrieval outperforms diffusion models on data augmentation

    Many approaches have been proposed to use diffusion models to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large datasets, often with noisy annotations, and it remains an open question to what extent these models contribute to downstream classification performance. In particular, it remains unclear whether they generalize enough to improve over directly using the additional data of their pre-training process for augmentation. We systematically evaluate a range of existing methods for generating images from diffusion models and study new extensions to assess their benefit for data augmentation. Personalizing diffusion models towards the target data outperforms simpler prompting strategies. However, using the pre-training data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure, leads to even stronger downstream performance. Our study explores the potential of diffusion models in generating new training data and, surprisingly, finds that these sophisticated models are not yet able to beat a simple and strong image retrieval baseline on simple downstream vision tasks.
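    The retrieval baseline mentioned above can be sketched in a few lines. This is a minimal illustration, assuming precomputed image embeddings (for example, from a generic feature extractor such as CLIP) for both the target training set and the diffusion model's pre-training pool; the function name, the embedding choice, and the value of k are assumptions for illustration.

```python
# Minimal sketch of retrieval-based augmentation: for each target image, fetch its
# nearest neighbors from the large retrieval pool and add them to the training set.
# Assumes precomputed, row-wise image embeddings; names are illustrative.
import numpy as np

def retrieve_augmentations(target_emb, pool_emb, k=8):
    """Return indices of the k nearest pool images for each target image
    (cosine similarity over embeddings)."""
    # L2-normalize so that a dot product equals cosine similarity.
    t = target_emb / np.linalg.norm(target_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = t @ p.T                              # (n_target, n_pool) similarity matrix
    nn_idx = np.argsort(-sims, axis=1)[:, :k]   # top-k neighbors per target image
    return nn_idx

# Usage sketch: label each retrieved image with the class of the query it was
# retrieved for, e.g. aug_labels = np.repeat(target_labels, k).
```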

    Most discriminative stimuli for functional cell type clustering

    Identifying cell types and understanding their functional properties is crucial for unraveling the mechanisms underlying perception and cognition. In the retina, functional types can be identified by carefully selected stimuli, but this requires expert domain knowledge and biases the procedure towards previously known cell types. In the visual cortex, it is still unknown what functional types exist and how to identify them. Thus, for unbiased identification of the functional cell types in retina and visual cortex, new approaches are needed. Here we propose an optimization-based clustering approach using deep predictive models to obtain functional clusters of neurons using Most Discriminative Stimuli (MDS). Our approach alternates between stimulus optimization and cluster reassignment, akin to an expectation-maximization algorithm. The algorithm recovers functional clusters in mouse retina, marmoset retina, and macaque visual area V4. This demonstrates that our approach can successfully find discriminative stimuli across species, stages of the visual system, and recording techniques. The resulting most discriminative stimuli can be used to assign functional cell types quickly and on the fly, without the need to train complex predictive models or show a large natural scene dataset, paving the way for experiments that were previously limited by experimental time. Crucially, MDS are interpretable: they visualize the distinctive stimulus patterns that most unambiguously identify a specific type of neuron.
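    A minimal sketch of the alternating scheme follows, assuming PyTorch and a differentiable predictive model that maps a batch of stimuli to predicted responses of all recorded neurons. The discriminability objective shown here (each neuron should respond most strongly to its own cluster's stimulus) is one plausible instantiation for illustration, not necessarily the paper's exact loss; all names and shapes are assumptions.

```python
# Minimal sketch of EM-like clustering with Most Discriminative Stimuli:
# optimize one stimulus per cluster, then reassign neurons to the stimulus that
# drives them best. Assumes `model(stimuli) -> (n_stimuli, n_neurons)` responses.
import torch

def mds_clustering(model, n_clusters, n_neurons, img_shape, steps=200, iters=10):
    stimuli = torch.randn(n_clusters, *img_shape, requires_grad=True)
    assign = torch.randint(n_clusters, (n_neurons,))           # initial cluster labels

    for _ in range(iters):
        # M-step analogue: make each cluster's stimulus maximally discriminative.
        opt = torch.optim.Adam([stimuli], lr=0.05)
        for _ in range(steps):
            resp = model(stimuli)                               # (n_clusters, n_neurons)
            logprob = torch.log_softmax(resp, dim=0)            # softmax over stimuli
            # Each neuron should respond most to the stimulus of its own cluster.
            loss = -logprob[assign, torch.arange(n_neurons)].mean()
            opt.zero_grad(); loss.backward(); opt.step()

        # E-step analogue: reassign each neuron to the stimulus that drives it best.
        with torch.no_grad():
            assign = model(stimuli).argmax(dim=0)
    return stimuli.detach(), assign
```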

    Frequency, prognostic impact, and subtype association of 8p12, 8q24, 11q13, 12p13, 17q12, and 20q13 amplifications in breast cancers

    BACKGROUND: Oncogene amplification and overexpression occur in tumor cells. Amplification status may provide diagnostic and prognostic information and may lead to new treatment strategies. Chromosomal regions 8p12, 8q24, 11q13, 17q12 and 20q13 are recurrently amplified in breast cancers. METHODS: To assess the frequencies and clinical impact of amplifications, we analyzed 547 invasive breast tumors organized in a tissue microarray (TMA) by fluorescence in situ hybridization (FISH) and calculated correlations with histoclinical features and prognosis. BAC probes were designed for: (i) two 8p12 subregions centered on the RAB11FIP1 and FGFR1 loci, respectively; (ii) the 11q13 region centered on CCND1; (iii) the 12p13 region spanning NOL1; and (iv) three 20q13 subregions centered on MYBL2, ZNF217 and AURKA, respectively. Regions 8q24 and 17q12 were analyzed with commercial MYC and ERBB2 probes, respectively. RESULTS: We observed amplification of 8p12 (amplified at RAB11FIP1 and/or FGFR1) in 22.8%, 8q24 in 6.1%, 11q13 in 19.6%, 12p13 in 4.1%, 17q12 in 9.9%, 20q13(Z) (amplified at ZNF217 only) in 9.9%, and 20q13(Co) (co-amplification of two or three 20q13 loci) in 8.5% of cases. The 8q24, 12p13, and 17q12 amplifications were correlated with high grade. The most frequent single amplifications involved the 8p12 (9.8%), 8q24 (3.3%), 12p13 (3.3%), 20q13(Z) (1.6%), and 20q13(Co) (1.6%) regions. The 17q12 and 11q13 regions were never found amplified alone. The most frequent co-amplification was 8p12/11q13. Amplifications of 8p12 and 17q12 were associated with poor outcome. Amplification of 12p13 was associated with the basal molecular subtype. CONCLUSION: Our results establish the frequencies, prognostic impacts, and subtype associations of various amplifications and co-amplifications in breast cancers.

    Leading by example: Guiding knowledge transfer with adversarial data augmentation

    Knowledge distillation (KD) is a simple and successful method to transfer knowledge from a teacher to a student model solely based on functional activity. However, it has recently been shown that this method is unable to transfer simple inductive biases like shift equivariance. To extend existing functional transfer methods like KD, we propose a general data augmentation framework that generates synthetic data points on which the teacher and the student disagree. We generate new input data through a learned distribution of spatial transformations of the original images. Through these synthetic inputs, our augmentation framework solves the problem of transferring simple equivariances with KD, leading to better generalization. Additionally, we generate new data points with a fine-tuned Very Deep Variational Autoencoder model, allowing for more abstract augmentations. Our learned augmentations significantly improve KD performance, even when compared to classical data augmentations. In addition, the augmented inputs are interpretable and offer a unique insight into the properties that are transferred to the student.
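    A learned distribution over spatial transformations, as described above, could look roughly as follows. This is a minimal sketch assuming PyTorch; the module, its Gaussian parameterization over affine parameters, and the parameter names are illustrative. It would be trained by ascending the teacher-student divergence on its outputs, in the same way as the disagreement step sketched for HARD earlier in this list.

```python
# Minimal sketch of a learnable distribution over affine (spatial) transformations,
# applied with affine_grid/grid_sample. Assumes PyTorch; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineAugmenter(nn.Module):
    def __init__(self):
        super().__init__()
        # Mean and log-std of a Gaussian over 2x3 affine parameters (identity init).
        self.mu = nn.Parameter(torch.tensor([[1., 0., 0.], [0., 1., 0.]]))
        self.log_std = nn.Parameter(torch.full((2, 3), -3.0))

    def forward(self, x):
        b = x.size(0)
        eps = torch.randn(b, 2, 3, device=x.device)
        theta = self.mu + eps * self.log_std.exp()        # sampled affine transforms
        grid = F.affine_grid(theta, x.shape, align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

# Training sketch: maximize the teacher-student KL divergence on the augmented
# images w.r.t. mu and log_std, so the distribution concentrates on the
# transformations the student has not yet learned to imitate from the teacher.
```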