LAVAE: Disentangling Location and Appearance
We propose a probabilistic generative model for unsupervised learning of
structured, interpretable, object-based representations of visual scenes. We
use amortized variational inference to train the generative model end-to-end.
The learned representations of object location and appearance are fully
disentangled, and objects are represented independently of each other in the
latent space. Unlike previous approaches that disentangle location and
appearance, ours generalizes seamlessly to scenes with many more objects than
encountered in the training regime. We evaluate the proposed model on
multi-MNIST and multi-dSprites data sets.
Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds
This paper introduces novel results for the score function gradient estimator
of the importance weighted variational bound (IWAE). We prove that in the limit
of large K (the number of importance samples) one can choose the control
variate such that the signal-to-noise ratio (SNR) of the estimator grows as
√K. This is in contrast to the standard pathwise gradient estimator, where the
SNR decreases as 1/√K. Based on our theoretical findings, we develop a novel
control variate that extends VIMCO. Empirically, for the training of both
continuous and discrete generative models, the proposed method yields superior
variance reduction, resulting in an SNR for IWAE that increases with K
without relying on the reparameterization trick. The novel estimator is
competitive with state-of-the-art reparameterization-free gradient estimators
such as Reweighted Wake-Sleep (RWS) and the thermodynamic variational objective
(TVO) when training generative models.
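The VIMCO-style leave-one-out baseline that the proposed control variate extends can be sketched in a few lines. This is a minimal NumPy illustration of the baseline construction for the score-function gradient of the IWAE bound, not the paper's optimal control variate:

```python
import numpy as np

def vimco_learning_signal(log_w):
    """Per-sample learning signals for the score-function gradient of the
    IWAE bound, using a VIMCO-style leave-one-out control variate.

    log_w: array of shape (K,) holding the log importance weights log w_k.
    Returns the bound estimate L_hat and per-sample signals of shape (K,).
    """
    K = log_w.shape[0]
    # IWAE bound estimate: log((1/K) sum_k w_k), computed stably in log space.
    L_hat = np.logaddexp.reduce(log_w) - np.log(K)
    signals = np.empty(K)
    for k in range(K):
        # Baseline: recompute the bound with log w_k replaced by the mean of
        # the other log-weights (the geometric mean of the other weights).
        others = np.delete(log_w, k)
        log_w_replaced = log_w.copy()
        log_w_replaced[k] = others.mean()
        baseline = np.logaddexp.reduce(log_w_replaced) - np.log(K)
        signals[k] = L_hat - baseline
    return L_hat, signals
```

When all importance weights are equal, each baseline coincides with the bound estimate and every learning signal is zero, which is exactly the variance-reduction effect the control variate is designed to exploit.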
Semi-Supervised Variational Autoencoder for Survival Prediction
In this paper we propose a semi-supervised variational autoencoder for
classification of overall survival groups from tumor segmentation masks. The
model can use the output of any tumor segmentation algorithm, removing all
assumptions on the scanning platform and the specific type of pulse sequences
used, thereby increasing its generalization properties. Due to its
semi-supervised nature, the method can learn to classify survival time by using
a relatively small number of labeled subjects. We validate our model on the
publicly available dataset from the Multimodal Brain Tumor Segmentation
Challenge (BraTS) 2019.
Comment: Published in the pre-conference proceedings of the 2019 International
MICCAI BraTS Challenge.
DiffEnc: Variational Diffusion with a Learned Encoder
Diffusion models may be viewed as hierarchical variational autoencoders
(VAEs) with two improvements: parameter sharing for the conditional
distributions in the generative process and efficient computation of the loss
as independent terms over the hierarchy. We consider two changes to the
diffusion model that retain these advantages while adding flexibility to the
model. Firstly, we introduce a data- and depth-dependent mean function in the
diffusion process, which leads to a modified diffusion loss. Our proposed
framework, DiffEnc, achieves state-of-the-art likelihood on CIFAR-10. Secondly,
we let the ratio of the noise variance of the reverse encoder process and the
generative process be a free weight parameter rather than being fixed to 1.
This leads to theoretical insights: For a finite depth hierarchy, the evidence
lower bound (ELBO) can be used as an objective for a weighted diffusion loss
approach and for optimizing the noise schedule specifically for inference. For
the infinite-depth hierarchy, on the other hand, the weight parameter has to be
1 to have a well-defined ELBO.
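The hierarchical-VAE view referenced above can be written schematically. Notation here is illustrative rather than the paper's exact symbols: for a finite depth T with latents z_t and s(t) = t-1, the ELBO splits into independent per-level terms,

```latex
\log p(\mathbf{x}) \;\ge\; \mathcal{L}
  = \underbrace{\mathbb{E}_q\!\left[\log p(\mathbf{x}\mid \mathbf{z}_0)\right]}_{\text{reconstruction}}
  \;-\; \underbrace{D_{\mathrm{KL}}\!\left(q(\mathbf{z}_T \mid \mathbf{x}) \,\middle\|\, p(\mathbf{z}_T)\right)}_{\text{prior term}}
  \;-\; \underbrace{\sum_{t=1}^{T} \mathbb{E}_q\, D_{\mathrm{KL}}\!\left(q(\mathbf{z}_{s(t)} \mid \mathbf{z}_t, \mathbf{x}) \,\middle\|\, p(\mathbf{z}_{s(t)} \mid \mathbf{z}_t)\right)}_{\text{diffusion loss (independent terms)}}
```

Weighting the terms of the sum, or changing the encoder mean as the paper proposes, modifies only the diffusion-loss part of this decomposition.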
Assessing Neural Network Robustness via Adversarial Pivotal Tuning
The robustness of image classifiers is essential to their deployment in the
real world. The ability to assess this resilience to manipulations or
deviations from the training data is thus crucial. These modifications have
traditionally consisted of minimal changes that still manage to fool
classifiers, and modern approaches are increasingly robust to them. Semantic
manipulations that modify elements of an image in meaningful ways have thus
gained traction for this purpose. However, they have primarily been limited to
style, color, or attribute changes. While expressive, these manipulations do
not make use of the full capabilities of a pretrained generative model. In this
work, we aim to bridge this gap. We show how a pretrained image generator can
be used to semantically manipulate images in a detailed, diverse, and
photorealistic way while still preserving the class of the original image.
Inspired by recent GAN-based image inversion methods, we propose a method
called Adversarial Pivotal Tuning (APT). Given an image, APT first finds a
pivot latent space input that reconstructs the image using a pretrained
generator. It then adjusts the generator's weights to create small yet semantic
manipulations in order to fool a pretrained classifier. APT preserves the full
expressive editing capabilities of the generative model. We demonstrate that
APT is capable of a wide range of class-preserving semantic image manipulations
that fool a variety of pretrained classifiers. Finally, we show that
classifiers that are robust to other benchmarks are not robust to APT
manipulations and suggest a method to improve them. Code available at:
https://captaine.github.io/apt/
Comment: Major changes include new experiments in Table 1 on page 5 and
Tables 2-4 on page 6, and a new Figure 5 on page 8. Paper accepted at WACV (oral).
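The two-stage structure of APT can be sketched on a toy problem. The sketch below substitutes linear maps for the pretrained StyleGAN-class generator and deep classifier, and all names and loss weights are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the pretrained models (linear maps for illustration).
theta = np.eye(8) + 0.1 * rng.normal(size=(8, 8))  # "generator weights"
C = rng.normal(size=(3, 8))                        # classifier: logits = C @ x
x_target = rng.normal(size=8)                      # image to manipulate
y_true, y_adv = 1, 0                               # preserved class / fooled-into class

def generate(w, theta_):
    return theta_ @ w

def apt_loss(theta_, lam):
    x = generate(w_pivot, theta_)
    logits = C @ x
    # Margin term pushes the classifier away from y_true toward y_adv;
    # the L2 term keeps the output near the pivotal reconstruction.
    return (logits[y_true] - logits[y_adv]) + lam * np.sum((x - x_target) ** 2)

# Stage 1 (pivotal inversion): find a latent that reconstructs x_target
# with the generator frozen (closed-form here; gradient-based in practice).
w_pivot = np.linalg.solve(theta, x_target)

# Stage 2 (adversarial tuning): adjust the generator's weights so the output
# fools the classifier while staying close to the reconstruction.
theta_apt, lam, lr = theta.copy(), 1.0, 0.01
initial_loss = apt_loss(theta_apt, lam)
for _ in range(300):
    x = generate(w_pivot, theta_apt)
    grad = np.outer(C[y_true] - C[y_adv] + 2 * lam * (x - x_target), w_pivot)
    theta_apt -= lr * grad
final_loss = apt_loss(theta_apt, lam)
```

Because the generator weights, not the latent, are tuned in the second stage, the edit retains access to the generator's full expressive capacity, which is the key difference from perturbation-based attacks.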
Generalization and Robustness Implications in Object-Centric Learning
The idea behind object-centric representation learning is that natural scenes
can better be modeled as compositions of objects and their relations as opposed
to distributed representations. This inductive bias can be injected into neural
networks to potentially improve systematic generalization and learning
efficiency of downstream tasks in scenes with multiple objects. In this paper,
we train state-of-the-art unsupervised models on five common multi-object
datasets and evaluate segmentation accuracy and downstream object property
prediction. In addition, we study systematic generalization and robustness by
investigating the settings where either single objects are out-of-distribution
-- e.g., having unseen colors, textures, and shapes -- or global properties of
the scene are altered -- e.g., by occlusions, cropping, or increasing the
number of objects. From our experimental study, we find object-centric
representations to be generally useful for downstream tasks and robust to
shifts in the data distribution, especially if shifts affect single objects.
On the Transfer of Disentangled Representations in Realistic Settings
Learning meaningful representations that disentangle the underlying structure
of the data generating process is considered to be of key importance in machine
learning. While disentangled representations were found to be useful for
diverse tasks such as abstract reasoning and fair classification, their
scalability and real-world impact remain questionable. We introduce a new
high-resolution dataset with 1M simulated images and over 1,800 annotated
real-world images of the same setup. In contrast to previous work, this new
dataset exhibits correlations, a complex underlying structure, and allows to
evaluate transfer to unseen simulated and real-world settings where the encoder
i) remains in distribution or ii) is out of distribution. We propose new
architectures in order to scale disentangled representation learning to
realistic high-resolution settings and conduct a large-scale empirical study of
disentangled representations on this dataset. We observe that disentanglement
is a good predictor for out-of-distribution (OOD) task performance.
Comment: Published at ICLR 2021.
Assaying Out-Of-Distribution Generalization in Transfer Learning
Since out-of-distribution generalization is a generally ill-posed problem,
various proxy targets (e.g., calibration, adversarial robustness, algorithmic
corruptions, invariance across shifts) were studied across different research
programs resulting in different recommendations. While sharing the same
aspirational goal, these approaches have never been tested under the same
experimental conditions on real data. In this paper, we take a unified view of
previous work, highlighting message discrepancies that we address empirically,
and providing recommendations on how to measure the robustness of a model and
how to improve it. To this end, we collect 172 publicly available dataset pairs
for training and out-of-distribution evaluation of accuracy, calibration error,
adversarial attacks, environment invariance, and synthetic corruptions. We
fine-tune over 31k networks, from nine different architectures in the many- and
few-shot setting. Our findings confirm that in- and out-of-distribution
accuracies tend to increase jointly, but show that their relation is largely
dataset-dependent, and in general more nuanced and more complex than posited by
previous, smaller-scale studies.
Use of high-sensitivity cardiac troponins in the emergency department for the early rule-in and rule-out of acute myocardial infarction without persistent ST-segment elevation (NSTEMI) in Italy
Serial measurements of cardiac troponin have been recommended by international guidelines to diagnose myocardial infarction (MI) since 2000. However, some relevant differences exist between the three international guidelines published between 2020 and 2021 for the management of patients with chest pain and no ST-segment elevation. In particular, there is no agreement on the cut-offs or absolute change values to diagnose non-ST-segment elevation MI (NSTEMI). Other controversial issues concern the diagnostic accuracy and cost-effectiveness of cut-off values for the most rapid algorithms (0 h/1 h or 0 h/2 h) to rule-in and rule-out NSTEMI. Finally, another important point is the possible differences between the demographic and clinical characteristics of patients enrolled in multicenter trials and those of patients routinely admitted to the Emergency Department in Italy. The Study Group of Cardiac Biomarkers, supported by the Italian scientific societies Società Italiana di Biochimica Clinica, Italian Society of the European Ligand Assay Society, and Società Italiana di Patologia Clinica e Medicina di Laboratorio, decided to revise the document previously published in 2013 about the management of patients with suspected NSTEMI, and to provide some suggestions for the use of these biomarkers in clinical practice, with a particular focus on the Italian setting.
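The rapid-algorithm logic discussed above can be sketched as a parameterized decision function. All cut-off values here are placeholders, not clinically validated thresholds; real 0 h/1 h algorithms use assay-specific, guideline-validated cut-offs and additional clinical criteria (e.g., time from symptom onset):

```python
def triage_0h_1h(ctn_0h, ctn_1h, *, very_low, low, no_rise, high, rise):
    """Schematic 0 h/1 h rule-in/rule-out triage for suspected NSTEMI.

    ctn_0h, ctn_1h: hs-cTn concentrations at presentation and after 1 hour.
    The five keyword cut-offs are assay-specific parameters (placeholders
    here); this sketch is for illustrating the algorithm structure only.
    """
    delta = abs(ctn_1h - ctn_0h)
    # Rule-in: very high initial value, or a large 1-hour change.
    if ctn_0h >= high or delta >= rise:
        return "rule-in"
    # Rule-out: very low initial value, or low value with no relevant rise.
    if ctn_0h < very_low or (ctn_0h < low and delta < no_rise):
        return "rule-out"
    # Everything else remains in an observation zone for further testing.
    return "observe"
```

The abstract's point that the guidelines disagree on cut-offs corresponds here to disagreement over the parameter values, not over the branching structure of the algorithm itself.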