Linking Image and Text with 2-Way Nets
Linking two data sources is a basic building block in numerous computer
vision problems. Canonical Correlation Analysis (CCA) achieves this by
utilizing a linear optimizer in order to maximize the correlation between the
two views. Recent work makes use of non-linear models, including deep learning
techniques, that optimize the CCA loss in some feature space. In this paper, we
introduce a novel, bi-directional neural network architecture for the task of
matching vectors from two data sources. Our approach employs two tied neural
network channels that project the two views into a common, maximally correlated
space using the Euclidean loss. We show a direct link between the
correlation-based loss and Euclidean loss, enabling the use of Euclidean loss
for correlation maximization. To overcome common Euclidean regression
optimization problems, we adapt well-known techniques to our problem,
including batch normalization and dropout. We show state-of-the-art results on
a number of computer vision matching tasks, including MNIST image matching and
sentence-image matching on the Flickr8k, Flickr30k and COCO datasets.
Comment: 14 pages, 2 figures, 6 table
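The link between a correlation-based loss and the Euclidean loss mentioned in this abstract can be illustrated with a standard identity: for two variables standardized to zero mean and unit variance, the mean squared difference equals 2(1 - ρ), where ρ is their Pearson correlation, so minimizing one maximizes the other. The sketch below checks this numerically on synthetic data. It is only an illustration under that standardization assumption, not the paper's own derivation, and the helper names (`standardize`, `pearson`, `mean_squared_distance`) are hypothetical.

```python
import numpy as np


def standardize(v):
    """Center to zero mean and scale to unit (population) variance."""
    return (v - v.mean()) / v.std()


def pearson(x, y):
    """Pearson correlation as the mean product of standardized values."""
    return float(np.mean(standardize(x) * standardize(y)))


def mean_squared_distance(x, y):
    """Mean squared (Euclidean) difference between standardized values."""
    return float(np.mean((standardize(x) - standardize(y)) ** 2))


# Synthetic, correlated 1-D "views" (illustrative data, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.7 * x + 0.3 * rng.normal(size=1000)

rho = pearson(x, y)
mse = mean_squared_distance(x, y)

# For standardized variables: mean((x - y)^2) = 2 * (1 - rho).
assert np.isclose(mse, 2.0 * (1.0 - rho))
```

The identity holds exactly only when both variables are standardized, which may be one reason normalization matters when a Euclidean loss is used as a proxy for correlation.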
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that has recently gained significant attention. GANs learn complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exist major challenges in training GANs, i.e., mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, use of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GAN design and optimization solutions proposed to handle GAN
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion of the different GAN variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present promising
research directions in this rapidly growing field.
Comment: 42 pages, Figure 13, Table