36,132 research outputs found
Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) is a novel class of deep generative
models which has recently gained significant attention. GANs learns complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exists major challenges in training of GANs, i.e., mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, use of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GANs design and optimization solutions proposed to handle GANs
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion on different GANs variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present the promising
research directions in this rapidly growing field.Comment: 42 pages, Figure 13, Table
Word Recognition with Deep Conditional Random Fields
Recognition of handwritten words continues to be an important problem in
document analysis and recognition. Existing approaches extract hand-engineered
features from word images--which can perform poorly with new data sets.
Recently, deep learning has attracted great attention because of the ability to
learn features from raw data. Moreover they have yielded state-of-the-art
results in classification tasks including character recognition and scene
recognition. On the other hand, word recognition is a sequential problem where
we need to model the correlation between characters. In this paper, we propose
using deep Conditional Random Fields (deep CRFs) for word recognition.
Basically, we combine CRFs with deep learning, in which deep features are
learned and sequences are labeled in a unified framework. We pre-train the deep
structure with stacked restricted Boltzmann machines (RBMs) for feature
learning and optimize the entire network with an online learning algorithm. The
proposed model was evaluated on two datasets, and seen to perform significantly
better than competitive baseline models. The source code is available at
https://github.com/ganggit/deepCRFs.Comment: 5 pages, published in ICIP 2016. arXiv admin note: substantial text
overlap with arXiv:1412.339
- …