12 research outputs found

    Self Sparse Generative Adversarial Networks

    Generative Adversarial Networks (GANs) are unsupervised generative models that learn a data distribution through adversarial training. However, recent experiments indicate that GANs are difficult to train because of the need to optimize in a high-dimensional parameter space and because of the zero gradient problem. In this work, we propose a Self Sparse Generative Adversarial Network (Self-Sparse GAN) that reduces the parameter space and alleviates the zero gradient problem. In the Self-Sparse GAN, we design a Self-Adaptive Sparse Transform Module (SASTM) comprising sparsity decomposition and feature-map recombination, which can be applied to multi-channel feature maps to obtain sparse feature maps. The key idea of Self-Sparse GAN is to add an SASTM after every deconvolution layer in the generator, which adaptively reduces the parameter space by exploiting the sparsity in multi-channel feature maps. We theoretically prove that the SASTM not only reduces the search space of the convolution kernel weights of the generator but also alleviates the zero gradient problem by maintaining meaningful features in the Batch Normalization layer and driving the weights of deconvolution layers away from being negative. The experimental results show that our method achieves better FID scores than WGAN-GP for image generation on MNIST, Fashion-MNIST, CIFAR-10, STL-10, mini-ImageNet, CELEBA-HQ, and LSUN bedrooms, with a relative decrease in FID of 4.76% to 21.84%.

    Learning Generalizable Visual Patterns Without Human Supervision

    Owing to the existence of large labeled datasets, Deep Convolutional Neural Networks have ushered in a renaissance in computer vision. However, almost all of the visual data we generate daily, several human lives' worth of it, remains unlabeled and thus out of reach of today's dominant supervised learning paradigm. This thesis focuses on techniques that steer deep models towards learning generalizable visual patterns without human supervision. Our primary tool in this endeavor is the design of Self-Supervised Learning tasks, i.e., pretext tasks for which labels do not involve human labor. Besides enabling learning from large amounts of unlabeled data, we demonstrate how self-supervision can capture relevant patterns that supervised learning largely misses. For example, we design learning tasks that learn deep representations capturing shape from images, motion from video, and 3D pose features from multi-view data. Notably, the design of these tasks follows a common principle: the recognition of data transformations. The strong performance of the learned representations on downstream vision tasks such as classification, segmentation, action recognition, and pose estimation validates this pretext-task design. This thesis also explores the use of Generative Adversarial Networks (GANs) for unsupervised representation learning. Besides leveraging generative adversarial learning to define image transformations for self-supervised learning tasks, we also address training instabilities of GANs through the use of noise. While unsupervised techniques can significantly reduce the burden of supervision, in the end, we still rely on some annotated examples to fine-tune learned representations towards a target task. To improve learning from scarce or noisy labels, we describe a supervised learning algorithm with improved generalization in these challenging settings.
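The principle named in this abstract, recognizing which transformation was applied to the data, can be illustrated with a toy labeling routine. The sketch below uses rotation prediction, one well-known pretext task of this kind; it is an assumption chosen for illustration and is not claimed to be one of the tasks designed in the thesis. The names `rotate90` and `make_rotation_example` are hypothetical.

```python
import random

def rotate90(image):
    """Rotate a 2D list-of-lists image 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]

def make_rotation_example(image, rng):
    """Return (rotated_image, label); label k means k quarter-turns were applied.

    The label comes from the transformation itself, so no human annotation
    is needed. A model would then be trained to predict k from the image.
    """
    k = rng.randrange(4)
    rotated = image
    for _ in range(k):
        rotated = rotate90(rotated)
    return rotated, k

rng = random.Random(0)          # fixed seed so the toy dataset is reproducible
image = [[1, 2], [3, 4]]        # stand-in for an unlabeled image
dataset = [make_rotation_example(image, rng) for _ in range(8)]
```

Each pair in `dataset` is a training example whose label was generated for free, which is the sense in which such pretext tasks avoid human labor.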

    Synthesis of normal and abnormal heart sounds using Generative Adversarial Networks

    This doctoral thesis presents several methods for the analysis and synthesis of normal and abnormal heart sounds, making the following contributions to the state of the art: i) An algorithm based on the empirical wavelet transform (EWT) and the normalized average Shannon energy (NASE) was implemented to improve the automatic segmentation stage of heart sounds; ii) Several feature extraction techniques for heart signals were implemented using Mel-frequency cepstral coefficients (MFCC), linear prediction coefficients (LPC), and power values. In addition, several Machine Learning models were tested for the automatic classification of normal and abnormal heart sounds; iii) A model based on Generative Adversarial Networks (GANs) was designed to generate normal synthetic heart sounds. In addition, a denoising algorithm using EWT was implemented, which reduces the number of epochs and the computational cost required by the GAN model; iv) Finally, a model based on the GAN architecture is proposed that refines synthetic heart signals obtained from a mathematical model using features of real heart signals. This model is called FeaturesGAN and does not require a large database to generate different types of heart sounds. Each of these contributions was validated with different objective methods and compared with published state-of-the-art work, obtaining favorable results. Doctorate: Doctor in Electrical and Electronic Engineering.
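As a rough illustration of the normalized average Shannon energy (NASE) mentioned in contribution i), the sketch below computes a frame-wise Shannon energy envelope and standardizes it. It covers only the NASE step, not the EWT stage or the thesis's segmentation algorithm. The function name, the window and hop sizes, the toy signal, and the use of the population standard deviation are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev

def shannon_energy_envelope(signal, window=4, hop=2):
    """Frame-wise average Shannon energy, standardized to zero mean and unit variance."""
    peak = max(abs(s) for s in signal)
    if peak == 0:
        raise ValueError("signal is all zeros")
    x = [s / peak for s in signal]            # amplitude-normalize to [-1, 1]
    energies = []
    for start in range(0, len(x) - window + 1, hop):
        frame = x[start:start + window]
        # Average Shannon energy: -(1/N) * sum(x^2 * log(x^2)); zero samples contribute 0.
        squares = [v * v for v in frame]
        energies.append(-sum(p * math.log(p) for p in squares if p > 0) / window)
    mu, sigma = mean(energies), pstdev(energies)
    if sigma == 0:
        raise ValueError("envelope is constant; cannot normalize")
    return [(e - mu) / sigma for e in energies]

# Hypothetical toy samples standing in for a heart sound recording.
toy_signal = [0.0, 0.1, 0.8, -0.6, 0.05, 0.0, 0.3, -0.9, 0.2, 0.0]
envelope = shannon_energy_envelope(toy_signal)
```

Shannon energy emphasizes medium-intensity samples over very low or very high ones, which is why envelopes of this kind are commonly used to locate heart sound components before segmentation.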

    Review: Deep learning in electron microscopy

    Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.