SGAN: An Alternative Training of Generative Adversarial Networks
Generative Adversarial Networks (GANs) have demonstrated impressive
performance for data synthesis and are now used in a wide range of computer
vision tasks. Despite this success, they have gained a reputation for being
difficult to train, which results in a time-consuming and human-involved
development process to use them.
We consider an alternative training process, named SGAN, in which several
adversarial "local" pairs of networks are trained independently so that a
"global" supervising pair of networks can be trained against them. The goal is
to train the global pair against the corresponding ensemble opponent for
improved performance in terms of mode coverage. This approach aims to increase
the chances that learning will not stop for the global pair, preventing it
both from being trapped in an unsatisfactory local minimum and from facing the
oscillations often observed in practice. To guarantee the latter, the global
pair never affects the local ones.
The rules of SGAN training are thus as follows: the global generator and
discriminator are trained using the local discriminators and generators,
respectively, whereas the local networks are trained with their fixed local
opponent.
Experimental results on both toy and real-world problems demonstrate that
this approach outperforms standard training by better mitigating mode
collapse and by improving stability during convergence, and that,
surprisingly, it also increases the convergence speed.
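The training rules above can be sketched as a simple schedule. Everything here (the toy Net class, the step counts, and sampling a single local pair per global update) is an illustrative assumption, not the paper's implementation; the sketch only captures who trains against whom, and that the global pair never updates the local networks.

```python
import random

class Net:
    """Toy stand-in for a generator or discriminator (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.updates = 0  # number of gradient steps this network received

    def step(self):
        self.updates += 1

def train_sgan(n_local_pairs=3, n_iters=10):
    """Sketch of the SGAN schedule: local pairs train independently against
    their fixed local opponent, while the global generator/discriminator
    train against local discriminators/generators, respectively."""
    local_pairs = [(Net(f"G{i}"), Net(f"D{i}")) for i in range(n_local_pairs)]
    G, D = Net("G_global"), Net("D_global")

    for _ in range(n_iters):
        # 1) Each local pair trains only against its own fixed opponent.
        for g, d in local_pairs:
            d.step()  # update local discriminator vs. its local generator
            g.step()  # update local generator vs. its local discriminator
        # 2) The global networks are trained against local opponents;
        #    crucially, the local networks are never updated here.
        g_i, d_i = random.choice(local_pairs)
        G.step()  # gradient signal from local discriminator d_i; only G moves
        D.step()  # gradient signal from local generator g_i; only D moves
    return local_pairs, (G, D)

local_pairs, (G, D) = train_sgan()
```

The invariant worth noting is in step 2: the global pair reads gradients through local opponents but writes only to its own parameters, which is exactly the "global pair never affects the local ones" rule.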
Continuous-time Analysis for Variational Inequalities: An Overview and Desiderata
Algorithms that solve zero-sum games, multi-objective agent objectives, or,
more generally, variational inequality (VI) problems are notoriously unstable
on general problems. Owing to the increasing need for solving such problems in
machine learning, this instability has been highlighted in recent years as a
significant research challenge. In this paper, we provide an overview of recent
progress in the use of continuous-time perspectives in the analysis and design
of methods targeting the broad VI problem class. Our presentation draws
parallels between single-objective problems and multi-objective problems,
highlighting the challenges of the latter. We also formulate various desiderata
for algorithms that apply to general VIs and we argue that achieving these
desiderata may profit from an understanding of the associated continuous-time
dynamics.
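To make the problem class concrete, the VI problem discussed above and its associated continuous-time dynamics can be written as follows; the notation is illustrative and follows common conventions rather than this paper's exact formulation.

```latex
% Variational inequality problem VI(F, X): find x* in X such that
\langle F(x^{\star}),\, x - x^{\star} \rangle \ge 0
\quad \text{for all } x \in \mathcal{X}.

% Single-objective minimization of f is the special case F = \nabla f,
% whose associated continuous-time dynamics are the gradient flow
\dot{x}(t) = -\nabla f\big(x(t)\big).

% A two-player zero-sum game \min_\theta \max_\phi f(\theta, \phi) is a
% multi-objective special case with the joint operator
F(\theta, \phi) =
\big(\nabla_{\theta} f(\theta, \phi),\, -\nabla_{\phi} f(\theta, \phi)\big),
% whose flow \dot{z}(t) = -F(z(t)) can rotate around solutions rather than
% descend toward them, one source of the instability highlighted above.
```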
Taming GANs with Lookahead
Generative Adversarial Networks are notoriously challenging to train. The
underlying minimax optimization is highly susceptible to the variance of the
stochastic gradient and the rotational component of the associated game vector
field. We empirically demonstrate the effectiveness of the Lookahead
meta-optimization method for optimizing games, originally proposed for standard
minimization. The backtracking step of Lookahead naturally handles the
rotational game dynamics, which in turn enables the gradient descent ascent
method to converge on challenging toy games often analyzed in the literature.
Moreover, it implicitly handles high variance without using large mini-batches,
which are known to be essential for reaching state-of-the-art performance.
Experimental results on MNIST, SVHN, and CIFAR-10 demonstrate a clear advantage
of combining Lookahead with Adam or extragradient in terms of performance,
memory footprint, and stability. Using 30-fold fewer parameters and 16-fold
smaller minibatches, we outperform the reported performance of the
class-dependent BigGAN on CIFAR-10 by obtaining a lower FID without using the
class labels, bringing state-of-the-art GAN training within reach of common
computational resources.
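A minimal, self-contained sketch of the mechanism: joint Lookahead wrapped around gradient descent ascent (GDA) on the bilinear toy game min_x max_y x*y, a standard example of rotational dynamics. The hyperparameters and the toy game are illustrative assumptions, not the paper's experimental setup.

```python
# Toy bilinear game f(x, y) = x * y: simultaneous GDA rotates around the
# equilibrium (0, 0) and slowly spirals outward.
def gda_step(x, y, lr=0.1):
    # simultaneous update: x descends on f, y ascends on f
    return x - lr * y, y + lr * x

def lookahead_gda(x, y, slow_steps=200, k=5, alpha=0.5, lr=0.1):
    """Joint Lookahead around GDA: run k fast GDA steps, then move the
    slow weights a fraction alpha toward the fast weights (the
    'backtracking' step) and restart the fast weights there."""
    for _ in range(slow_steps):
        fx, fy = x, y
        for _ in range(k):
            fx, fy = gda_step(fx, fy, lr)
        x = x + alpha * (fx - x)   # interpolate slow weights toward fast
        y = y + alpha * (fy - y)
    return x, y

# Plain GDA spirals away from the equilibrium ...
gx, gy = 1.0, 1.0
for _ in range(1000):
    gx, gy = gda_step(gx, gy)

# ... while Lookahead-GDA contracts toward it.
lx, ly = lookahead_gda(1.0, 1.0)
```

The backtracking interpolation averages out the rotational component of the fast trajectory, which is why the composed slow-weight map contracts here even though each individual GDA step expands the iterates.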
Revisiting the ACVI Method for Constrained Variational Inequalities
ACVI is a recently proposed first-order method for solving variational
inequalities (VIs) with general constraints. Yang et al. (2022) showed that
the gap function of the last iterate decreases at a sublinear rate when the
operator is Lipschitz and monotone and at least one constraint is active.
In this work, we show that the same guarantee holds when only assuming that
the operator is monotone.
To our knowledge, this is the first analytically derived last-iterate
convergence rate for general monotone VIs, and overall the only one that does
not rely on the operator being Lipschitz.
Furthermore, when the sub-problems of ACVI are solved approximately, we show
that, by using a standard warm-start technique, the convergence rate stays the
same, provided that the errors decrease at appropriate rates. We further
provide empirical analyses and insights on its implementation for the latter
case.
Deep Generative Models and Applications
Over the past few years, there have been fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. The amount of annotated data drastically increased and supervised deep discriminative models exceeded human-level performances in certain object detection tasks. The increasing availability in quantity and complexity of unlabelled data also opens up exciting possibilities for the development of unsupervised learning methods.
Among the family of unsupervised methods, deep generative models find numerous applications. Moreover, as real-world applications include high dimensional data, the ability of generative models to automatically learn semantically meaningful subspaces makes their advancement an essential step toward developing more efficient algorithms.
Generative Adversarial Networks (GANs) are a family of unsupervised generative algorithms that have demonstrated impressive performance for data synthesis and are now used in a wide range of computer vision tasks. Despite this success, they have gained a reputation for being difficult to train, which results in a time-consuming and human-involved development process. In the first part of this thesis, we focus on improving the stability and performance of GANs.
Foremost, we consider an alternative training process to the standard one, named SGAN, in which several adversarial "local" pairs of networks are trained independently so that a "global" supervising pair of networks can be trained against them. Experimental results on both toy and real-world problems demonstrate that this approach outperforms standard training by better mitigating mode collapse and by improving stability during convergence, and that, surprisingly, it also increases the convergence speed.
To further reduce the computational footprint while maintaining the stability and performance advantages of SGAN, we focus on training a single pair of adversarial networks using variance-reduced gradients. More precisely, we study the effect of stochastic gradient noise on the training of generative adversarial networks (GANs) and show that it can prevent the convergence of standard game optimization methods, while the batch version converges. We address this issue with two stochastic variance-reduced optimization algorithms for GANs, based on gradient and extragradient updates and named SVRG-GAN and SVRE, respectively. We observe empirically that SVRE performs similarly to a batch method on the MNIST dataset while being computationally cheaper, and that SVRE yields more stable GAN training on standard datasets.
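The variance-reduction mechanism underlying SVRE-style methods can be sketched on a toy finite-sum bilinear game. The game, the coefficients, and all function names here are illustrative assumptions, not the thesis implementation; the sketch shows the SVRG-style operator estimate (exact at the snapshot, unbiased elsewhere) plugged into an extragradient step, alongside the batch extragradient method that, as noted above, converges.

```python
import numpy as np

# Toy finite-sum bilinear game f(x, y) = (1/n) * sum_i a_i * x * y,
# with per-sample operator F_i(x, y) = (a_i * y, -a_i * x).
A = np.array([0.8, 0.9, 1.1, 1.2])  # illustrative per-sample coefficients

def op(a, w):
    return np.array([a * w[1], -a * w[0]])

def full_op(w):
    # the operator is linear in a, so the full operator uses the mean
    return op(A.mean(), w)

def svre_estimate(a_i, w, w_snap):
    """SVRG-style variance-reduced operator estimate: unbiased for
    full_op(w), and exact (zero variance) at the snapshot w_snap."""
    return op(a_i, w) - op(a_i, w_snap) + full_op(w_snap)

def extragradient(v, w, lr=0.1):
    """Extragradient step: extrapolate with the operator, then update
    from the original point using the operator at the extrapolated one."""
    w_half = w - lr * v(w)
    return w - lr * v(w_half)

# The (full-)batch extragradient method converges on this game toward the
# equilibrium (0, 0), in line with the observation that the batch version
# converges while naive stochastic updates need not.
w = np.array([1.0, 1.0])
for _ in range(200):
    w = extragradient(full_op, w)
```

Because the correction term vanishes at the snapshot, the estimate's variance shrinks as the iterate approaches the snapshot point, which is what lets the stochastic method inherit the batch method's convergence behavior.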
In the second part of the thesis we present our work on people detection. People detection methods are highly sensitive to occlusions between pedestrians, and using joint visual information from multiple synchronized cameras offers the opportunity to improve detection performance. We address the problem of multi-view people occupancy map estimation using an end-to-end deep learning algorithm called DeepMCD that jointly utilizes the correlated streams of visual information. DeepMCD empirically outperformed the classical approaches by a large margin. Finally, we present a new large-scale and high-resolution dataset, named WILDTRACK. We provide an accurate joint calibration, as well as a series of benchmark results using baseline algorithms published in recent months for multi-view detection with deep neural networks, and for trajectory estimation using a non-Markovian model.
Parallel Architecture Prototype for 60 GHz High Data Rate Wireless Single Carrier Receiver
Nowadays, much attention in academia and research teams is directed toward the potential of the 60 GHz frequency band in wireless communications. The use of the 60 GHz band offers great possibilities for a wide variety of applications that are yet to be implemented, but these applications also imply substantial implementation challenges. One such example is building a high-data-rate transceiver that at the same time has very low power consumption. In this paper, we present a prototype of a Single Carrier (SC) transceiver system, giving a brief overview of the baseband design and emphasizing the most important decisions that need to be made. A brief overview of the possible approaches to implementing the equalizer, the most complex module in the SC transceiver, is also presented. The main focus of this paper is to suggest a parallel architecture for the receiver in a Single Carrier communication system. This increases the data rates that the communication system can achieve, at the price of higher power consumption. The suggested architecture of such a receiver is illustrated in this paper, and the results of its implementation are compared with those of the corresponding serial implementation.