    ON LEARNING COMPOSABLE AND DECOMPOSABLE GENERATIVE MODELS USING PRIOR INFORMATION

    Within the field of machine learning, supervised learning has seen much recent success, and the research focus is moving towards unsupervised learning. A generative model is a powerful approach to unsupervised learning that models the data distribution. Deep generative models, such as generative adversarial networks (GANs), can generate high-quality samples for various applications. However, these generative models are not easy to understand: while it is easy to generate samples from them, the breadth of the samples that can be generated is difficult to ascertain. Further, most existing models are trained from scratch and do not take advantage of the compositional nature of the data. To address these deficiencies, I propose a composition and decomposition framework for generative models. This framework includes three types of components: part generators, a composition operation, and a decomposition operation. In the framework, a generative model can have multiple part generators that generate different parts of a sample independently. What a part generator should generate is explicitly defined by the user. This explicit "division of responsibility" provides more modularity to the whole system. As in software design, this modular modeling makes each module (part generator) more reusable and allows users to build increasingly complex generative models from simpler ones. The composition operation composes the parts from the part generators into a whole sample, whereas the decomposition operation is the inverse of composition. However, given only the composed data, the components of the framework are not necessarily identifiable. Inspired by other signal decomposition methods, we incorporate prior information into the model to solve this problem. We show that we can identify all of the components by incorporating prior information about one or more of them, and we show both theoretically and experimentally how much prior information is needed to identify the components of the model. Concerning applications of this framework, we apply it to sparse dictionary learning (SDL) and offer our dictionary learning method, MOLDL. With MOLDL, we can easily include prior information about the part generators and thus learn a generative model that yields a better signal decomposition operation. The experiments show our method decomposes ion mass signals more accurately than other signal decomposition methods. Further, we apply the framework to generative adversarial networks (GANs). Our composition/decomposition GAN learns foreground and background part generators that are responsible for different parts of the data. The resulting generators are easier to control and understand. We also show, both theoretically and experimentally, how much prior information is needed to identify the different components of the framework; in particular, we show that we can learn a reasonable part generator given only the composed data and the composition operation. Moreover, we show that the composable generators have better performance than their non-composable generative counterparts. Lastly, we propose two use cases that show transfer learning is feasible under this framework.
    Doctor of Philosophy
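    The abstract describes the composition and decomposition operations only abstractly. As a concrete illustration (not taken from the thesis), the minimal sketch below assumes a mask-based composition of a foreground part and a background part, matching the foreground/background GAN application mentioned above; the function names and the mask convention are hypothetical.

    import numpy as np

    def compose(foreground, background, mask):
        # Composition operation: assemble a whole sample from parts that were
        # generated independently by two part generators. The mask is 1 where
        # the foreground part is responsible for a pixel and 0 elsewhere.
        return mask * foreground + (1.0 - mask) * background

    def decompose(sample, mask):
        # Decomposition operation (inverse of composition): recover the visible
        # portion of each part. Without prior information, the occluded pixels
        # of each part are not identifiable, so they are simply left at zero.
        return mask * sample, (1.0 - mask) * sample

    # Toy usage with two "part generators" emitting 4x4 patches.
    rng = np.random.default_rng(0)
    fg = rng.random((4, 4))      # output of the foreground part generator
    bg = rng.random((4, 4))      # output of the background part generator
    mask = (rng.random((4, 4)) > 0.5).astype(float)

    sample = compose(fg, bg, mask)
    fg_hat, bg_hat = decompose(sample, mask)
    assert np.allclose(fg_hat, mask * fg) and np.allclose(bg_hat, (1.0 - mask) * bg)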

    Hierarchy Composition GAN for High-fidelity Image Synthesis

    Despite the rapid progress of generative adversarial networks (GANs) in image synthesis in recent years, existing image synthesis approaches work in either the geometry domain or the appearance domain alone, which often introduces various synthesis artifacts. This paper presents an innovative Hierarchical Composition GAN (HIC-GAN) that incorporates image synthesis in the geometry and appearance domains into an end-to-end trainable network and achieves superior synthesis realism in both domains simultaneously. We design an innovative hierarchical composition mechanism that is capable of learning realistic composition geometry and handling occlusions when multiple foreground objects are involved in image composition. In addition, we introduce a novel attention mask mechanism that guides the adaptation of foreground object appearance, which also helps to provide a better training reference for learning in the geometry domain. Extensive experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance both qualitatively and quantitatively.
    Comment: 11 pages, 8 figures
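    As a rough illustration of what composing in both domains can look like (a sketch under assumptions, not the paper's architecture), the snippet below warps a foreground object with an affine transform predicted by a geometry branch and then alpha-blends it into the background with a soft attention mask from an appearance branch; all names, the affine parameterization, and the sequential hierarchy are assumptions.

    import torch
    import torch.nn.functional as F

    def hierarchical_compose(background, foreground, theta, attention):
        # background: (N, 3, H, W) background image
        # foreground: (N, 3, H, W) foreground object on a neutral canvas
        # theta:      (N, 2, 3) affine transform from a geometry branch
        # attention:  (N, 1, H, W) soft mask in [0, 1] from an appearance branch

        # Geometry domain: warp the foreground (and its mask) into the
        # background frame.
        grid = F.affine_grid(theta, foreground.size(), align_corners=False)
        warped_fg = F.grid_sample(foreground, grid, align_corners=False)
        warped_att = F.grid_sample(attention, grid, align_corners=False)

        # Appearance domain: alpha-blend the warped foreground via the mask.
        return warped_att * warped_fg + (1.0 - warped_att) * background

    def compose_many(background, foregrounds, thetas, attentions):
        # Repeating the step over several foreground objects gives a simple
        # hierarchy; later objects occlude earlier ones where masks overlap.
        out = background
        for fg, th, att in zip(foregrounds, thetas, attentions):
            out = hierarchical_compose(out, fg, th, att)
        return out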