1,543 research outputs found

    Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

    Generative Adversarial Networks (GANs) are a novel class of deep generative models that has recently gained significant attention. GANs learn complex, high-dimensional distributions implicitly over images, audio, and data. However, there exist major challenges in training GANs, i.e., mode collapse, non-convergence, and instability, due to inappropriate design of the network architecture, use of the objective function, and selection of the optimization algorithm. Recently, to address these challenges, several solutions for better design and optimization of GANs have been investigated, based on re-engineered network architectures, new objective functions, and alternative optimization algorithms. To the best of our knowledge, no existing survey has focused specifically on the broad and systematic development of these solutions. In this study, we perform a comprehensive survey of the advancements in GAN design and optimization solutions proposed to handle GAN challenges. We first identify key research issues within each design and optimization technique and then propose a new taxonomy to structure solutions by key research issues. In accordance with the taxonomy, we provide a detailed discussion of the different GAN variants proposed within each solution and their relationships. Finally, based on the insights gained, we present promising research directions in this rapidly growing field. Comment: 42 pages, Figure 13, Table

    Joint-SRVDNet: Joint Super Resolution and Vehicle Detection Network

    In many domestic and military applications, aerial vehicle detection and super-resolution algorithms are frequently developed and applied independently. However, aerial vehicle detection on super-resolved images remains a challenging task due to the lack of discriminative information in the super-resolved images. To address this problem, we propose a Joint Super-Resolution and Vehicle Detection Network (Joint-SRVDNet) that tries to generate discriminative, high-resolution images of vehicles from low-resolution aerial images. First, aerial images are up-scaled by a factor of 4x using a Multi-scale Generative Adversarial Network (MsGAN), which has multiple intermediate outputs with increasing resolutions. Second, a detector is trained on super-resolved images that are up-scaled by a factor of 4x using the MsGAN architecture. Finally, the detection loss is minimized jointly with the super-resolution loss to encourage the target detector to be sensitive to the subsequent super-resolution training. The network jointly learns hierarchical and discriminative features of targets and produces optimal super-resolution results. We perform both quantitative and qualitative evaluation of our proposed network on the VEDAI, xView, and DOTA datasets. The experimental results show that our proposed framework achieves better visual quality than the state-of-the-art methods for aerial super-resolution with a 4x up-scaling factor and improves the accuracy of aerial vehicle detection.

    Research on Symbolic Inference in Computational Vision

    This paper provides an overview of ongoing research in the GRASP laboratory which focuses on the general problem of symbolic inference in computational vision. In this report we describe a conceptual framework for this research, and describe our current research programs in the component areas which support this work

    Hyperspectral Data Acquisition and Its Application for Face Recognition

    Current face recognition systems are rife with serious challenges in uncontrolled conditions, e.g., unrestrained lighting, pose variations, and accessories. Hyperspectral imaging (HI) is typically employed to counter many of those challenges by incorporating the spectral information within different bands. Although numerous methods based on hyperspectral imaging have been developed for face recognition with promising results, three fundamental challenges remain: 1) low signal-to-noise ratios and low intensity values in the bands of the hyperspectral image, specifically near the blue bands; 2) high dimensionality of hyperspectral data; and 3) inter-band misalignment (IBM) correlated with subject motion during data acquisition. This dissertation concentrates mainly on addressing the aforementioned challenges in HI. First, to address the low quality of the bands of the hyperspectral image, we utilize a custom light source that has more radiant power at shorter wavelengths and properly adjust camera exposure times corresponding to the lower transmittance of the filter and lower radiant power of our light source. Second, the high dimensionality of spectral data imposes limitations on numerical analysis. As such, there is an emerging demand for robust data compression techniques, with loss of less relevant information, to manage real spectral data. To cope with these challenging problems, we describe a reduced-order data modeling technique based on local proper orthogonal decomposition in order to compute low-dimensional models by projecting high-dimensional clusters onto subspaces spanned by local reduced-order bases. Third, we investigate 11 leading alignment approaches to address IBM correlated with subject motion during data acquisition. To overcome the limitations of the considered alignment approaches, we propose an accurate alignment approach (A3) by incorporating the strengths of point correspondence and a low-rank model.
    In addition, we develop two qualitative prediction models to assess the alignment quality of hyperspectral images in determining improved alignment among the conducted alignment approaches. Finally, we show that the proposed alignment approach leads to promising improvement on face recognition performance of a probabilistic linear discriminant analysis approach.
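
    The idea of projecting clusters onto local reduced-order bases can be sketched roughly as follows. This is only an illustration, not the dissertation's implementation: it assumes cluster labels are already given, obtains each local basis from a truncated SVD of the centered cluster, and uses hypothetical names (`local_pod_bases`, `project`, `reconstruct`, `rank`) that do not come from the source.

```python
import numpy as np


def local_pod_bases(X, labels, rank):
    """Compute one reduced-order basis per cluster via truncated SVD.

    X: (n_samples, n_features) array; labels: (n_samples,) cluster ids.
    Returns {cluster_id: (mean, basis)}, basis of shape (n_features, r).
    """
    models = {}
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean = Xc.mean(axis=0)
        # Right singular vectors of the centered cluster span its
        # dominant subspace (assumed choice of local basis).
        _, _, vt = np.linalg.svd(Xc - mean, full_matrices=False)
        r = min(rank, vt.shape[0])
        models[c.item()] = (mean, vt[:r].T)
    return models


def project(x, mean, basis):
    """Low-dimensional coordinates of x in a cluster's local basis."""
    return basis.T @ (x - mean)


def reconstruct(z, mean, basis):
    """Map low-dimensional coordinates back to the original space."""
    return mean + basis @ z
```

    A sample is compressed with `project` using the basis of its own cluster, and `reconstruct` gives its approximation in the original space; the discarded singular directions are the "less relevant information" that is lost.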

    Supervised Learning in Time-dependent Environments with Performance Guarantees

    In practical scenarios, it is common to learn from a sequence of related problems (tasks). Such tasks are usually time-dependent in the sense that consecutive tasks are often significantly more similar. Time-dependency is common in multiple applications such as load forecasting, spam mail filtering, and face emotion recognition. For instance, in the problem of load forecasting, the consumption patterns in consecutive time periods are significantly more similar since human habits and weather factors change gradually over time. Learning from a sequence of tasks holds promise to enable accurate performance even with few samples per task by leveraging information from different tasks. However, harnessing the benefits of learning from a sequence of tasks is challenging since tasks are characterized by different underlying distributions. Most existing techniques are designed for situations where the tasks' similarities do not depend on their order in the sequence. Existing techniques designed for time-dependent tasks adapt to changes between consecutive tasks accounting for a scalar rate of change by using a carefully chosen parameter such as a learning rate or a weight factor. However, the tasks' changes are commonly multidimensional, i.e., the time-dependency often varies across different statistical characteristics describing the tasks. For instance, in the problem of load forecasting, the statistical characteristics related to weather factors often change differently from those related to generation. In this dissertation, we establish methodologies for supervised learning from a sequence of time-dependent tasks that effectively exploit information from all tasks, provide multidimensional adaptation to tasks' changes, and provide computable tight performance guarantees.
    We develop methods for supervised learning settings where tasks arrive over time, including techniques for supervised classification under concept drift (SCD) and techniques for continual learning (CL). In addition, we present techniques for load forecasting that can adapt to time changes in consumption patterns and assess intrinsic uncertainties in load demand. The numerical results show that the proposed methodologies can significantly improve the performance of existing methods using multiple benchmark datasets. This dissertation makes theoretical contributions leading to efficient algorithms for multiple machine learning scenarios that provide computable performance guarantees and better performance than state-of-the-art techniques.
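
    The contrast between a scalar rate of change and multidimensional adaptation can be illustrated with a toy example. The sketch below is not the dissertation's method: it assumes a simple exponentially weighted running mean, and the function name `track_mean` and the `forgetting` factors are hypothetical. One forgetting factor per feature lets the estimate adapt quickly only where drift actually occurs.

```python
def track_mean(stream, forgetting):
    """Exponentially weighted running mean, one forgetting factor per feature.

    stream: iterable of equal-length feature vectors (lists of floats).
    forgetting: list of factors in [0, 1); smaller values adapt faster.
    """
    estimate = None
    for x in stream:
        if estimate is None:
            # Initialize with the first observation.
            estimate = list(x)
            continue
        estimate = [f * e + (1.0 - f) * xi
                    for f, e, xi in zip(forgetting, estimate, x)]
    return estimate


# Feature 0 jumps from 0 to 10 halfway through; feature 1 stays at 1.
stream = [[0.0, 1.0]] * 50 + [[10.0, 1.0]] * 50

scalar = track_mean(stream, [0.99, 0.99])      # one shared (scalar) rate
per_feature = track_mean(stream, [0.5, 0.99])  # fast rate only for feature 0
```

    With the shared rate, the estimate of the drifting feature lags well behind its new value, while the per-feature rates track the change in feature 0 and still average slowly over the stable feature 1.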