Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions
Generative Adversarial Networks (GANs) are a novel class of deep generative
models that have recently gained significant attention. GANs learn complex and
high-dimensional distributions implicitly over images, audio, and data.
However, there exist major challenges in training GANs, i.e., mode
collapse, non-convergence and instability, due to inappropriate design of
network architecture, use of objective function and selection of optimization
algorithm. Recently, to address these challenges, several solutions for better
design and optimization of GANs have been investigated based on techniques of
re-engineered network architectures, new objective functions and alternative
optimization algorithms. To the best of our knowledge, there is no existing
survey that has particularly focused on broad and systematic developments of
these solutions. In this study, we perform a comprehensive survey of the
advancements in GANs design and optimization solutions proposed to handle GANs
challenges. We first identify key research issues within each design and
optimization technique and then propose a new taxonomy to structure solutions
by key research issues. In accordance with the taxonomy, we provide a detailed
discussion on different GANs variants proposed within each solution and their
relationships. Finally, based on the insights gained, we present the promising
research directions in this rapidly growing field.
Comment: 42 pages, Figure 13, Table
Joint-SRVDNet: Joint Super Resolution and Vehicle Detection Network
In many domestic and military applications, aerial vehicle detection and
super-resolution algorithms are frequently developed and applied independently.
However, aerial vehicle detection on super-resolved images remains a
challenging task due to the lack of discriminative information in the
super-resolved images. To address this problem, we propose a Joint
Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that tries to
generate discriminative, high-resolution images of vehicles from low-resolution
aerial images. First, aerial images are up-scaled by a factor of 4x using a
Multi-scale Generative Adversarial Network (MsGAN), which has multiple
intermediate outputs with increasing resolutions. Second, a detector is trained
on super-resolved images that are upscaled by factor 4x using MsGAN architecture
and finally, the detection loss is minimized jointly with the super-resolution
loss to encourage the target detector to be sensitive to the subsequent
super-resolution training. The network jointly learns hierarchical and
discriminative features of targets and produces optimal super-resolution
results. We perform both quantitative and qualitative evaluation of our proposed
network on VEDAI, xView and DOTA datasets. The experimental results show that
our proposed framework achieves better visual quality than the state-of-the-art
methods for aerial super-resolution with 4x up-scaling factor and improves the
accuracy of aerial vehicle detection
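One way to read the joint training described in this abstract is as minimizing a weighted sum of the two losses. The form below is only a sketch: the symbols and the weight $\lambda$ are assumptions introduced for illustration, and the abstract does not state how the two terms are actually combined or balanced.

```latex
% Hypothetical notation (not from the abstract):
%   \mathcal{L}_{\mathrm{SR}}  : super-resolution loss of the MsGAN generator
%   \mathcal{L}_{\mathrm{det}} : detection loss computed on the super-resolved images
%   \lambda \ge 0              : assumed trade-off weight
\[
  \mathcal{L}_{\mathrm{joint}}
    = \mathcal{L}_{\mathrm{SR}} + \lambda \, \mathcal{L}_{\mathrm{det}}
\]
```

Under this reading, gradients of the detection term also reach the super-resolution network, which is what would push it toward producing discriminative images.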
Research on Symbolic Inference in Computational Vision
This paper provides an overview of ongoing research in the GRASP laboratory on the general problem of symbolic inference in computational vision. We describe a conceptual framework for this research and our current research programs in the component areas that support it.
Hyperspectral Data Acquisition and Its Application for Face Recognition
Current face recognition systems face serious challenges in uncontrolled conditions, e.g., unrestrained lighting, pose variations, and accessories. Hyperspectral imaging (HI) is typically employed to counter many of those challenges by incorporating the spectral information within different bands. Although numerous methods based on hyperspectral imaging have been developed for face recognition with promising results, three fundamental challenges remain: 1) low signal-to-noise ratios and low intensity values in the bands of the hyperspectral image, especially near the blue bands; 2) high dimensionality of hyperspectral data; and 3) inter-band misalignment (IBM) correlated with subject motion during data acquisition.
This dissertation concentrates mainly on addressing the aforementioned challenges in HI. First, to address low quality of the bands of the hyperspectral image, we utilize a custom light source that has more radiant power at shorter wavelengths and properly adjust camera exposure times corresponding to lower transmittance of the filter and lower radiant power of our light source.
Second, the high dimensionality of spectral data imposes limitations on numerical analysis. As such, there is an emerging demand for robust data compression techniques, with loss of less relevant information, to manage real spectral data. To cope with these challenging problems, we describe a reduced-order data modeling technique based on local proper orthogonal decomposition in order to compute low-dimensional models by projecting high-dimensional clusters onto subspaces spanned by local reduced-order bases.
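As a rough illustration of local proper orthogonal decomposition, the sketch below clusters the samples, fits a truncated SVD basis per cluster, and projects a sample onto the basis of its cluster. It is not the dissertation's implementation: the plain k-means step, the function names (`local_pod`, `project`, `reconstruct`), and every parameter are assumptions chosen for illustration.

```python
import numpy as np


def local_pod(X, n_clusters=2, rank=2, n_iter=20, seed=0):
    """Cluster the rows of X, then fit one POD (truncated SVD) basis per cluster.

    Returns (labels, models), where models[k] = (mean, basis) and the rows
    of `basis` are orthonormal directions spanning the local subspace.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed clustering step: plain k-means started from random samples.
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    # Final assignment against the converged centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    models = []
    for k in range(n_clusters):
        members = X[labels == k]
        if len(members) == 0:
            # Empty cluster: keep its center and an empty basis.
            models.append((centers[k].copy(), np.zeros((0, X.shape[1]))))
            continue
        mean = members.mean(axis=0)
        # Rows of vt are the POD modes of the centered cluster.
        _, _, vt = np.linalg.svd(members - mean, full_matrices=False)
        models.append((mean, vt[:rank]))
    return labels, models


def project(x, model):
    """Low-dimensional coordinates of x in a cluster's local subspace."""
    mean, basis = model
    return basis @ (x - mean)


def reconstruct(z, model):
    """Map local coordinates back to the original high-dimensional space."""
    mean, basis = model
    return mean + basis.T @ z
```

A sample `x` assigned to cluster `k` is compressed with `project(x, models[k])` and approximately recovered with `reconstruct(...)`; the `rank` parameter trades compression against reconstruction error.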
Third, we investigate 11 leading alignment approaches to address IBM correlated with subject motion during data acquisition. To overcome the limitations of the considered alignment approaches, we propose an accurate alignment approach (A3) by incorporating the strengths of point correspondence and a low-rank model. In addition, we develop two qualitative prediction models to assess the alignment quality of hyperspectral images and identify the best alignment among the evaluated approaches. Finally, we show that the proposed alignment approach leads to promising improvement on face recognition performance of a probabilistic linear discriminant analysis approach.
Supervised Learning in Time-dependent Environments with Performance Guarantees
In practical scenarios, it is common to learn from a sequence of related problems (tasks).
Such tasks are usually time-dependent in the sense that consecutive tasks are often
significantly more similar. Time-dependency is common in multiple applications such
as load forecasting, spam mail filtering, and face emotion recognition. For instance, in
the problem of load forecasting, the consumption patterns in consecutive time periods
are significantly more similar since human habits and weather factors change gradually
over time. Learning from a sequence of tasks holds promise to enable accurate performance
even with few samples per task by leveraging information from different tasks. However,
harnessing the benefits of learning from a sequence of tasks is challenging since tasks
are characterized by different underlying distributions.
Most existing techniques are designed for situations where the tasks' similarities
do not depend on their order in the sequence. Existing techniques designed for time-dependent
tasks adapt to changes between consecutive tasks accounting for a scalar
rate of change by using a carefully chosen parameter such as a learning rate or a weight
factor. However, the tasks' changes are commonly multidimensional, i.e., the time-dependency
often varies across different statistical characteristics describing the tasks.
For instance, in the problem of load forecasting, the statistical characteristics related
to weather factors often change differently from those related to generation.
In this dissertation, we establish methodologies for supervised learning from a sequence
of time-dependent tasks that effectively exploit information from all tasks,
provide multidimensional adaptation to tasks' changes, and provide computable tight
performance guarantees. We develop methods for supervised learning settings where
tasks arrive over time including techniques for supervised classification under concept
drift (SCD) and techniques for continual learning (CL). In addition, we present techniques
for load forecasting that can adapt to time changes in consumption patterns
and assess intrinsic uncertainties in load demand. The numerical results show that the
proposed methodologies can significantly improve the performance of existing methods
using multiple benchmark datasets. This dissertation makes theoretical contributions
leading to efficient algorithms for multiple machine learning scenarios that provide computable
performance guarantees and better performance than state-of-the-art techniques