Shakeout: A New Approach to Regularized Deep Neural Network Training
Recent years have witnessed the success of deep neural networks in dealing
with plenty of practical problems. Dropout has played an essential role in
many successful deep neural networks, by inducing regularization in the model
training. In this paper, we present a new regularized training approach:
Shakeout. Instead of randomly discarding units as Dropout does at the training
stage, Shakeout randomly chooses to enhance or reverse each unit's contribution
to the next layer. This minor modification of Dropout has the following statistical
trait: the regularizer induced by Shakeout adaptively combines L0, L1 and L2
regularization terms. Our classification experiments with representative
deep architectures on image datasets MNIST, CIFAR-10 and ImageNet show that
Shakeout deals with over-fitting effectively and outperforms Dropout. We
empirically demonstrate that Shakeout leads to sparser weights under both
unsupervised and supervised settings. Shakeout also leads to the grouping
effect of the input units in a layer. In terms of how well the weights reflect the
importance of connections, Shakeout is superior to Dropout, which is valuable
for deep model compression. Moreover, we demonstrate that Shakeout can
effectively reduce the instability of the training process of the deep
architecture.
Comment: Appears at T-PAMI 201
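The "enhance or reverse" behaviour described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the formula below is an assumption: each input unit draws a mask value r (0 with probability tau, otherwise 1 / (1 - tau)), and each outgoing weight w becomes r * w + c * (r - 1) * sign(w). The names `shakeout_weight`, `shakeout_layer`, `tau` and `c` are hypothetical and chosen for illustration only.

```python
import random


def sign(w):
    """Return -1, 0, or 1 according to the sign of w."""
    return (w > 0) - (w < 0)


def shakeout_weight(w, r, c):
    """Perturb one weight given its input unit's mask value r (assumed rule).

    r == 0 gives -c * sign(w): the unit's contribution is reversed.
    r == 1 / (1 - tau) pushes the weight further from zero: it is enhanced.
    """
    return r * w + c * (r - 1.0) * sign(w)


def shakeout_layer(x, W, tau, c, rng):
    """Apply a perturbed linear map; W[i][j] links input j to output i."""
    # One mask value per input unit, shared by all of its outgoing weights.
    r = [0.0 if rng.random() < tau else 1.0 / (1.0 - tau) for _ in x]
    return [
        sum(shakeout_weight(w, r[j], c) * x[j] for j, w in enumerate(row))
        for row in W
    ]
```

Under this assumed rule the perturbed weight equals the original weight in expectation, and setting c = 0 reduces the sketch to ordinary inverted Dropout on the input units.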
High-Fidelity Image Compression with Score-based Generative Models
Despite the tremendous success of diffusion generative models in
text-to-image generation, replicating this success in the domain of image
compression has proven difficult. In this paper, we demonstrate that diffusion
can significantly improve perceptual quality at a given bit-rate, outperforming
state-of-the-art approaches PO-ELIC and HiFiC as measured by FID score. This is
achieved using a simple but theoretically motivated two-stage approach
combining an autoencoder targeting MSE followed by a further score-based
decoder. However, as we will show, implementation details matter and the
optimal design decisions can differ greatly from typical text-to-image models.
Learning to sample from noise with deep generative models
Machine learning, and deep learning in particular, has established itself in recent
years as a way to solve a wide variety of tasks. One of the most remarkable
applications is computer vision. Detection and classification systems have seen major advances thanks to deep learning. However, many
obstacles remain before machines can understand the world the way living beings do. Living beings
do not need labels to classify or to extract features from the real world.
Unsupervised learning is one of the research directions focused on solving
this problem.
In this thesis, I present a new way to train neural networks in an
unsupervised manner. I present a method for sampling iteratively
from noise in order to generate data close to the training
data. This iterative procedure is called infusion training, a
new approach to learning the transition operator of a Markov chain.
In the first chapter, I introduce the basics of machine learning and
probability theory. In the second chapter, I present the generative models that
inspired this work. In the third and final chapter, I show how to improve
sampling in generative models with infusion training.
Machine learning, and deep learning in particular, has made significant breakthroughs in recent
years across a range of tasks. One well-known application of deep learning is computer vision. Tasks such as detection or classification are considered nearly solved by the community.
However, training state-of-the-art models for such tasks requires labels associated
with the data we want to classify. A more general goal is to design algorithms that, like
animal brains, can extract meaningful features from unlabeled data.
Unsupervised learning is one of the research directions that tries to address this problem.
In this thesis, I present a new way to train a neural network as a generative model capable of
generating quality samples (a task akin to imagining). I explain how by starting from noise,
it is possible to get samples which are close to the training data. This iterative procedure
is called Infusion training and is a novel approach to learning the transition operator of a
generative Markov chain.
In the first chapter, I present some background about machine learning and probabilistic
models. The second chapter presents generative models that inspired this work. The third
and last chapter presents and investigates our novel approach to learning a generative model
with Infusion training.
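The generation procedure described in the thesis abstract above, starting from noise and repeatedly applying a transition operator, can be sketched as follows. This is a toy illustration only: `toy_transition` is a hand-written stand-in for the learned neural transition operator, and its `target`, `rate` and `noise` parameters are assumptions made for the example. The sketch shows sampling from the chain, not the Infusion training procedure itself.

```python
import random


def sample_chain(transition, z0, steps, rng):
    """Run a Markov chain: apply the transition operator `steps` times."""
    z = z0
    for _ in range(steps):
        z = transition(z, rng)
    return z


def toy_transition(z, rng, target=3.0, rate=0.5, noise=0.01):
    """Hypothetical stand-in for a learned operator.

    Each coordinate moves part of the way toward `target` and then
    receives a small Gaussian perturbation, so the step is stochastic.
    """
    return [zi + rate * (target - zi) + rng.gauss(0.0, noise) for zi in z]


rng = random.Random(0)
z0 = [rng.gauss(0.0, 1.0) for _ in range(4)]  # initial sample: pure noise
sample = sample_chain(toy_transition, z0, 30, rng)
```

After enough steps the sample depends almost entirely on the transition operator rather than on the initial noise, which is why learning a good operator is the central problem.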
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Inverse problems arise in a multitude of applications, where the goal is to
recover a clean signal from noisy and possibly (non)linear observations. The
difficulty of a reconstruction problem depends on multiple factors, such as the
structure of the ground truth signal, the severity of the degradation, the
implicit bias of the reconstruction model and the complex interactions between
the above factors. This results in natural sample-by-sample variation in the
difficulty of a reconstruction task, which is often overlooked by contemporary
techniques. Recently, diffusion-based inverse problem solvers have established
a new state of the art in various reconstruction tasks. However, they have the
drawback of being computationally prohibitive. Our key observation in this
paper is that most existing solvers lack the ability to adapt their compute
power to the difficulty of the reconstruction task, resulting in long inference
times, subpar performance and wasteful resource allocation. We propose a novel
method that we call severity encoding, to estimate the degradation severity of
noisy, degraded signals in the latent space of an autoencoder. We show that the
estimated severity has strong correlation with the true corruption level and
can give useful hints at the difficulty of reconstruction problems on a
sample-by-sample basis. Furthermore, we propose a reconstruction method based
on latent diffusion models that leverages the predicted degradation severities
to fine-tune the reverse diffusion sampling trajectory and thus achieve
sample-adaptive inference times. We utilize latent diffusion posterior sampling
to maintain data consistency with observations. We perform experiments on both
linear and nonlinear inverse problems and demonstrate that our technique
achieves performance comparable to state-of-the-art diffusion-based techniques,
with significant improvements in computational efficiency.
Comment: 14 pages, 6 figures, preliminary version
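The idea of sample-adaptive inference time in the abstract above can be illustrated with a small sketch in which a predicted severity decides how many reverse diffusion steps are run. The linear mapping, the [0, 1] severity range, and the names `steps_for_severity` and `reverse_schedule` are all assumptions made for illustration; the abstract does not state how severity is converted into a sampling trajectory.

```python
def steps_for_severity(severity, min_steps=10, max_steps=1000):
    """Map a predicted severity in [0, 1] to a number of reverse steps.

    Assumed linear rule: lightly degraded samples get few steps,
    heavily degraded samples get up to max_steps.
    """
    s = min(1.0, max(0.0, severity))  # clamp out-of-range predictions
    return int(round(min_steps + s * (max_steps - min_steps)))


def reverse_schedule(severity, min_steps=10, max_steps=1000):
    """Timesteps visited by the reverse process, from the start step down to 1."""
    start = steps_for_severity(severity, min_steps, max_steps)
    return list(range(start, 0, -1))
```

With these hypothetical defaults, a sample with predicted severity 0.1 would run roughly a tenth of the steps needed by a fully degraded one, which is where the compute saving would come from.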
Structured Dropout for Weak Label and Multi-Instance Learning and Its Application to Score-Informed Source Separation
Many success stories involving deep neural networks are instances of
supervised learning, where available labels power gradient-based learning
methods. Creating such labels, however, can be expensive and thus there is
increasing interest in weak labels which only provide coarse information, with
uncertainty regarding time, location or value. Using such labels often leads to
considerable challenges for the learning process. Current methods for
weak-label training often employ standard supervised approaches that
additionally reassign or prune labels during the learning process. The
information gain, however, is often limited as only the importance of labels
where the network already yields reasonable results is boosted. We propose
treating weak-label training as an unsupervised problem and use the labels to
guide the representation learning to induce structure. To this end, we propose
two autoencoder extensions: class activity penalties and structured dropout. We
demonstrate the capabilities of our approach in the context of score-informed
source separation of music.
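The abstract above does not define structured dropout, so the following is only one plausible reading, stated as an assumption: each latent unit of the autoencoder is assigned to a class, and units belonging to classes that the weak labels mark as inactive are zeroed. The function name `structured_dropout` and the `unit_class` mapping are hypothetical.

```python
def structured_dropout(latent, unit_class, active_classes):
    """Zero latent activations whose class is inactive (assumed reading).

    latent:         list of activations, one per latent unit
    unit_class:     unit_class[k] is the class assigned to unit k
    active_classes: set of classes the weak labels mark as present
    """
    return [
        a if unit_class[k] in active_classes else 0.0
        for k, a in enumerate(latent)
    ]


# Four latent units: two for class 0, one each for classes 1 and 2.
masked = structured_dropout([0.3, 1.2, 0.7, 0.9], [0, 0, 1, 2], {0, 2})
```

Unlike ordinary Dropout, the mask here is deterministic given the labels, so the labels shape which units can explain which inputs.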