Learning Invariant Representations for Deep Latent Variable Models
Deep latent variable models introduce a new class of generative models which are able to handle unstructured data and encode non-linear dependencies. Despite their known flexibility, these models are frequently not invariant against target-specific transformations. As a result, they suffer from model mismatches and are challenging to interpret or control. We employ the concept of symmetry transformations from physics to formally describe these invariances. In this thesis, we investigate how to model invariances when a symmetry transformation is either known or unknown. In doing so, we make contributions to variable compression under side information and to generative modelling. In our first contribution, we investigate the problem where a symmetry transformation is known yet not implicitly learned by the model. Specifically, we consider the task of estimating mutual information in the context of the deep information bottleneck, which is not invariant against monotone transformations. To address this limitation, we extend the deep information bottleneck with a copula construction. In our second contribution, we address the problem of learning target-invariant subspaces for generative models. In this case, the symmetry transformation is unknown and has to be learned from data. We achieve this by formulating a deep information bottleneck with a target and a target-invariant subspace. To ensure invariance, we provide a continuous mutual information regulariser based on adversarial training. In our last contribution, we introduce an improved method for learning unknown symmetry transformations with cycle-consistency. To do so, we employ the same deep information bottleneck formulation with a partitioned latent space, but ensure target-invariance with a cycle-consistency loss in the latent space. As a result, we overcome potential convergence issues introduced by adversarial training and are able to deal with mixed data.
In summary, each of our presented models provides an attempt to better control and understand deep latent variable models by learning symmetry transformations. We demonstrate the effectiveness of our contributions with an extensive evaluation on both artificial and real-world experiments.
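The cycle-consistency idea from the last contribution can be sketched in a few lines. The linear encoder and decoder below are hypothetical stand-ins for the deep networks, and the partition sizes are arbitrary; the loss penalises drift of the target-invariant partition through a decode/re-encode cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear encoder/decoder standing in for deep networks.
W_enc = rng.normal(size=(4, 6))   # data dim 6 -> latent dim 4
W_dec = rng.normal(size=(6, 4))

def encode(x):
    z = W_enc @ x
    return z[:2], z[2:]           # (target subspace, target-invariant subspace)

def decode(z_t, z_inv):
    return W_dec @ np.concatenate([z_t, z_inv])

def cycle_consistency_loss(x):
    z_t, z_inv = encode(x)
    x_hat = decode(z_t, z_inv)
    _, z_inv2 = encode(x_hat)
    # Penalise drift of the invariant partition through the decode/encode cycle.
    return float(np.sum((z_inv - z_inv2) ** 2))

x = rng.normal(size=6)
loss = cycle_consistency_loss(x)
```

Unlike an adversarial regulariser, this loss is an ordinary reconstruction-style term, which is why it sidesteps the convergence issues of adversarial training.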
Informed MCMC with Bayesian Neural Networks for Facial Image Analysis
Computer vision tasks are difficult because of the large variability in the
data that is induced by changes in light, background, partial occlusion as well
as the varying pose, texture, and shape of objects. Generative approaches to
computer vision allow us to overcome this difficulty by explicitly modeling the
physical image formation process. Using generative object models, the analysis
of an observed image is performed via Bayesian inference of the posterior
distribution. This conceptually simple approach tends to fail in practice
because of several difficulties stemming from sampling the posterior
distribution: high-dimensionality and multi-modality of the posterior
distribution as well as expensive simulation of the rendering process. The main
difficulty of sampling approaches in a computer vision context is choosing the
proposal distribution accurately so that maxima of the posterior are explored
early and the algorithm quickly converges to a valid image interpretation. In
this work, we propose to use a Bayesian Neural Network for estimating an image
dependent proposal distribution. Compared to a standard Gaussian random walk
proposal, this accelerates the sampler in finding regions of high posterior
value. In this way, we can significantly reduce the number of samples needed
to perform facial image analysis. Comment: Accepted to the Bayesian Deep
Learning Workshop at NeurIPS 201
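The contrast between a random-walk and an informed proposal can be illustrated with a toy Metropolis-Hastings sampler. The 2-D Gaussian posterior and the "BNN estimate" below are placeholders for illustration, not the paper's actual face model or network:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_posterior(theta):
    # Toy 2-D Gaussian standing in for the face-model posterior.
    return -0.5 * np.sum((theta - np.array([3.0, -2.0])) ** 2)

def mh(propose, log_q, n_steps=2000):
    """Metropolis-Hastings; log_q(a, b) is the log proposal density q(a | b)."""
    theta = np.zeros(2)
    out = []
    for _ in range(n_steps):
        cand = propose(theta)
        log_alpha = (log_posterior(cand) - log_posterior(theta)
                     + log_q(theta, cand) - log_q(cand, theta))
        if np.log(rng.random()) < log_alpha:
            theta = cand
        out.append(theta.copy())
    return np.array(out)

# Baseline: symmetric Gaussian random walk (Hastings correction cancels).
rw_propose = lambda cur: cur + 0.5 * rng.normal(size=2)
rw_logq = lambda a, b: 0.0

# "Informed" independence proposal centred on a hypothetical BNN estimate.
bnn_mean, bnn_sigma = np.array([2.8, -1.9]), 0.7
inf_propose = lambda cur: bnn_mean + bnn_sigma * rng.normal(size=2)
inf_logq = lambda a, b: -0.5 * np.sum((a - bnn_mean) ** 2) / bnn_sigma**2

rw_chain = mh(rw_propose, rw_logq)
informed_chain = mh(inf_propose, inf_logq)
```

Because the informed proposal already concentrates mass near the posterior mode, the chain needs far less burn-in than the random walk started far from the mode.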
Learning Channel Importance for High Content Imaging with Interpretable Deep Input Channel Mixing
Uncovering novel drug candidates for treating complex diseases remains one of
the most challenging tasks in early discovery research. To tackle this
challenge, biopharma research established a standardized high content imaging
protocol that tags different cellular compartments per image channel. In order
to judge the experimental outcome, the scientist requires knowledge about the
channel importance with respect to a certain phenotype for decoding the
underlying biology. In contrast to traditional image analysis approaches, such
experiments are nowadays preferably analyzed by deep learning based approaches
which, however, lack crucial information about the channel importance. To
overcome this limitation, we present a novel approach which utilizes
multi-spectral information of high content images to interpret a certain aspect
of cellular biology. To this end, we base our method on image blending concepts
with alpha compositing for an arbitrary number of channels. More specifically,
we introduce DCMIX, a lightweight, scalable and end-to-end trainable mixing
layer which enables interpretable predictions in high content imaging while
retaining the benefits of deep learning based methods. We employ an extensive
set of experiments on both MNIST and RXRX1 datasets, demonstrating that DCMIX
learns the biologically relevant channel importance without sacrificing
prediction performance. Comment: Accepted @ DAGM German Conference on Pattern Recognition (GCPR) 202
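As a rough illustration of the idea (not the published DCMIX architecture), a mixing layer with softmax-normalised alpha weights blends an arbitrary number of channels into one plane, and the learned weights can be read off directly as channel importances:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

class ChannelMix:
    """Hypothetical DCMIX-style layer: blends C image channels into one
    plane via learnable alpha weights, exposing channel importance."""
    def __init__(self, n_channels):
        self.alpha = np.zeros(n_channels)  # learnable logits (uniform init)

    def forward(self, x):                  # x: (C, H, W)
        w = softmax(self.alpha)            # convex weights = importances
        return np.tensordot(w, x, axes=1)  # (H, W) blended plane

imgs = np.random.default_rng(0).random((5, 8, 8))   # 5-channel toy image
layer = ChannelMix(5)
mixed = layer.forward(imgs)
importance = softmax(layer.alpha)          # read off channel importance
```

After training, the softmax weights form a convex combination over channels, so interpretation comes for free rather than requiring a post-hoc attribution method.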
Learning Sparse Latent Representations with the Deep Copula Information Bottleneck
Deep latent variable models are powerful tools for representation learning.
In this paper, we adopt the deep information bottleneck model, identify its
shortcomings and propose a model that circumvents them. To this end, we apply a
copula transformation which, by restoring the invariance properties of the
information bottleneck method, leads to disentanglement of the features in the
latent space. Building on that, we show how this transformation translates to
sparsity of the latent space in the new model. We evaluate our method on
artificial and real data. Comment: Published as a conference paper at ICLR 2018. Aleksander Wieczorek
and Mario Wieser contributed equally to this work.
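The invariance that the copula transformation restores can be demonstrated with the empirical-CDF (normal scores) construction. This is a minimal 1-D sketch of the general idea, not the paper's full model:

```python
import numpy as np
from statistics import NormalDist

def gaussian_copula_transform(x):
    """Map a 1-D sample to normal scores via its empirical CDF.
    Any strictly monotone transform of x yields the same output,
    which is the invariance the copula construction restores."""
    n = len(x)
    ranks = np.argsort(np.argsort(x)) + 1      # ranks 1..n
    u = ranks / (n + 1)                        # empirical CDF in (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(p) for p in u])

x = np.random.default_rng(0).exponential(size=100)
z1 = gaussian_copula_transform(x)
z2 = gaussian_copula_transform(np.log(x))      # strictly monotone transform
assert np.allclose(z1, z2)                     # identical normal scores
```

Because only the ranks of the data enter the transform, mutual-information estimates computed on the normal scores are unaffected by monotone marginal transformations.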
Learning Extremal Representations with Deep Archetypal Analysis
Archetypes are typical population representatives in an extremal sense, where
typicality is understood as the most extreme manifestation of a trait or
feature. In linear feature space, archetypes approximate the data convex hull
allowing all data points to be expressed as convex mixtures of archetypes.
However, it might not always be possible to identify meaningful archetypes in a
given feature space. Learning an appropriate feature space and identifying
suitable archetypes simultaneously addresses this problem. This paper
introduces a generative formulation of the linear archetype model,
parameterized by neural networks. By introducing the distance-dependent
archetype loss, the linear archetype model can be integrated into the latent
space of a variational autoencoder, and an optimal representation with respect
to the unknown archetypes can be learned end-to-end. The reformulation of
linear Archetypal Analysis as a deep variational information bottleneck allows
the incorporation of arbitrarily complex side information during training.
Furthermore, an alternative prior, based on a modified Dirichlet distribution,
is proposed. The real-world applicability of the proposed method is
demonstrated by exploring archetypes of female facial expressions while using
multi-rater based emotion scores of these expressions as side information. A
second application illustrates the exploration of the chemical space of small
organic molecules. In this experiment, it is demonstrated that exchanging the
side information but keeping the same set of molecules, e.g. using as side
information the heat capacity of each molecule instead of the band gap energy,
will result in the identification of different archetypes. As an application,
these learned representations of chemical space might reveal distinct starting
points for de novo molecular design. Comment: Under review for publication at the International Journal of Computer
Vision (IJCV). Extended version of our GCPR2019 paper "Deep Archetypal
Analysis".
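The core of the linear archetype model, expressing each data point as a convex mixture of archetypes, can be sketched as follows (the archetypes here are toy values, not learned ones):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical archetypes at the extremes of a 2-D feature space.
Z = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

def convex_mixture(logits, archetypes):
    """Express a point as a convex combination of archetypes,
    as in the linear archetype model."""
    e = np.exp(logits - logits.max())
    a = e / e.sum()                     # mixture weights on the simplex
    return a, a @ archetypes

a, x = convex_mixture(rng.normal(size=3), Z)
assert abs(a.sum() - 1.0) < 1e-12 and np.all(a >= 0)
# x lies inside the triangle spanned by the three archetypes.
assert x[0] >= 0 and x[1] >= 0 and x.sum() <= 1.0
```

The deep variant learns the feature space in which such a simplex structure holds, instead of assuming it in the raw input space.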
Micropatterning of Plasma Membrane Proteins to Analyze Raft Localization in Living Cells
Burden of micronutrient deficiencies by socio-economic strata in children aged 6 months to 5 years in the Philippines
Background: Micronutrient deficiencies (MNDs) are a chronic lack of vitamins and minerals and constitute a huge public health problem. MNDs have severe health consequences and are particularly harmful during early childhood due to their impact on physical and cognitive development. We estimate the costs of illness due to iron deficiency anaemia (IDA), vitamin A deficiency (VAD) and zinc deficiency (ZnD) in 2 age groups (6-23 and 24-59 months) of Filipino children by socio-economic strata in 2008.
Methods: We build a health economic model simulating the consequences of MNDs in childhood over the entire lifetime. The model is based on a health survey and a nutrition survey carried out in 2008. The sample populations are first structured into 10 socio-economic strata (SES) and 2 age groups. Health consequences of MNDs are modelled based on information extracted from literature. Direct medical costs, production losses and intangible costs are computed and long term costs are discounted to present value.
Results: Total lifetime costs of IDA, VAD and ZnD amounted to direct medical costs of 30 million dollars, production losses of 618 million dollars and intangible costs of 122,138 disability adjusted life years (DALYs). These costs can be interpreted as the lifetime costs of a 1-year cohort affected by MNDs between the age of 6–59 months. Direct medical costs are dominated by costs due to ZnD (89% of total), production losses by losses in future lifetime (90% of total) and intangible costs by premature death (47% of total DALY losses) and losses in future lifetime (43%). Costs of MNDs differ considerably between SES as costs in the poorest third of the households are 5 times higher than in the wealthiest third.
Conclusions: MNDs lead to substantial costs in 6-59-month-old children in the Philippines. Costs are highly concentrated in the lower SES and in children 6-23 months old. These results may have important implications for the design, evaluation and choice of the most effective and cost-effective policies aimed at the reduction of MNDs.
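The discounting of long-term costs to present value mentioned in the methods can be illustrated with a small sketch; the 3% discount rate and the cost stream below are assumed for illustration, not taken from the study:

```python
def present_value(annual_costs, rate=0.03):
    """Discount a stream of future annual costs to present value,
    a standard step in cost-of-illness models (3% rate assumed)."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(annual_costs))

# Hypothetical production losses of $100/year over 40 working years.
pv = present_value([100.0] * 40)
# Discounting shrinks the nominal $4,000 total substantially.
assert pv < 4000
```

Because production losses accrue decades into the future, the choice of discount rate has a large effect on the lifetime cost estimates.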