67 research outputs found

    Knowledge-Guided Data-Centric AI in Healthcare: Progress, Shortcomings, and Future Directions

    Full text link
    The success of deep learning is largely due to the availability of large amounts of training data covering a wide range of examples of a particular concept. In medicine, a diverse set of training data on a particular disease can enable models that accurately predict that disease. Despite this potential, however, progress in image-based diagnosis has been limited by a lack of high-quality annotated data. This article highlights the importance of a data-centric approach to improving the quality of data representations, particularly where the available data is limited. To address this "small-data" issue, we discuss four methods for generating and aggregating training data: data augmentation, transfer learning, federated learning, and generative adversarial networks (GANs). We also propose knowledge-guided GANs, which incorporate domain knowledge into the training-data generation process. Given recent progress in large pre-trained language models, we believe it is possible to acquire high-quality knowledge with which to improve the effectiveness of knowledge-guided generative methods. Comment: 21 pages, 13 figures, 4 tables.
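
    Of the four data-generation strategies above, data augmentation is the simplest to make concrete. Below is a minimal sketch in PyTorch/torchvision; the specific transforms and parameter values are illustrative choices of ours, not prescriptions from the article, and for medical images each transform should be checked for label preservation:

        # Minimal data-augmentation sketch for a small imaging dataset.
        # Transform choices and parameters are illustrative only.
        import torchvision.transforms as T

        augment = T.Compose([
            T.RandomHorizontalFlip(p=0.5),                 # only if anatomically plausible
            T.RandomRotation(degrees=10),                  # small rotations preserve labels
            T.ColorJitter(brightness=0.2, contrast=0.2),   # scanner/stain variation
            T.RandomResizedCrop(224, scale=(0.8, 1.0)),
            T.ToTensor(),
        ])
        # Each epoch then sees a different random view of every image,
        # effectively enlarging a small annotated dataset.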

    Improving Model Robustness with Latent Distribution Locally and Globally

    Full text link
    In this work, we consider the robustness of deep neural networks against adversarial attacks from a global manifold perspective. Leveraging both local and global latent information, we propose a novel adversarial training method through robust optimization, and a tractable way to generate Latent Manifold Adversarial Examples (LMAEs) via an adversarial game between a discriminator and a classifier. The proposed Adversarial Training with Latent Distribution (ATLD) method defends against adversarial attacks by crafting LMAEs on the latent manifold in an unsupervised manner. ATLD preserves the local and global information of the latent manifold and promises improved robustness against adversarial attacks. To verify the effectiveness of the proposed method, we conduct extensive experiments over different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) under different adversarial attacks (e.g., PGD, CW), and show that our method outperforms the state of the art (e.g., Feature Scattering) in adversarial robustness by a large accuracy margin. The source code is available at https://github.com/LitterQ/ATLD-pytorch
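
    The LMAE construction itself lives in the linked repository; as a reference point, PGD, one of the attacks used in the evaluation, can be sketched in a few lines of PyTorch (the hyperparameter values below are common defaults, not necessarily those used in the paper):

        import torch
        import torch.nn.functional as F

        def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
            """Standard L-infinity PGD; hyperparameters are illustrative."""
            x = x.detach()
            x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
            for _ in range(steps):
                x_adv.requires_grad_(True)
                loss = F.cross_entropy(model(x_adv), y)
                grad = torch.autograd.grad(loss, x_adv)[0]
                # Ascend the loss, then project back into the eps-ball around x.
                x_adv = x_adv.detach() + alpha * grad.sign()
                x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
            return x_adv.detach()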

    Generative models for natural images

    Full text link
    We discuss modern generative modelling of natural images based on neural networks.
Three varieties of such models are particularly predominant at the time of writing: latent variable models such as variational autoencoders (VAEs), autoregressive models such as pixel recurrent neural networks (PixelRNN), and generative adversarial networks (GANs), which are noise-transformation models trained with an adversary. This thesis touches on all three kinds. The first chapter covers background on generative models, along with relevant discussion of deep neural networks, which are currently the dominant technology for implementing powerful statistical models. In the second chapter, we implement variational autoencoders with autoregressive decoders. This removes the strong assumption that output dimensions are conditionally independent in variational autoencoders, instead tractably modelling a joint distribution, while also endowing autoregressive models with a latent code. Additionally, this model has significantly reduced computational cost compared to a purely autoregressive model with similar modelling assumptions and performance. We express the latent space as a hierarchy, and qualitatively demonstrate the semantic decomposition of latent causes induced by this design. Finally, we present results on standard datasets that demonstrate strongly competitive performance. In the third chapter, we present an improved training procedure for a recent variant of generative adversarial networks. Wasserstein GANs minimize the Earth-Mover's distance between the real and generated distributions, and have been shown to be much easier to train than GANs with the standard minimax objective. However, they still exhibit some failure modes in training for some settings. We identify weight clipping as a culprit and replace it with a penalty on the gradient norm. This further improves and stabilizes training, and we demonstrate stability in a wide variety of settings (including language models over discrete data) and samples of high quality on the CIFAR-10 and LSUN bedrooms datasets. Finally, in the fourth chapter, we present work in development, in which we consider the use of modern generative models as normality models in a zero-shot out-of-distribution detection setting. We evaluate some of the models discussed earlier in the thesis, and find that VAEs are the most promising, although their overall performance leaves considerable room for improvement. We conclude by reiterating the significance of generative modelling in the development of artificial intelligence, and mention some of the challenges ahead.
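
    The gradient-norm penalty described in the third chapter (widely known as WGAN-GP) can be rendered in a few lines of PyTorch. The sketch below is a minimal illustration under common conventions, assuming image-shaped inputs and the frequently used coefficient lam=10, rather than the thesis code itself:

        import torch

        def gradient_penalty(critic, real, fake, lam=10.0):
            """Penalize deviation of the critic's gradient norm from 1,
            evaluated at random interpolates of real and fake samples."""
            # Assumes NCHW image batches of equal size.
            alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
            interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
            scores = critic(interp)
            grads = torch.autograd.grad(
                outputs=scores, inputs=interp,
                grad_outputs=torch.ones_like(scores),
                create_graph=True,  # the penalty itself is differentiated in training
            )[0]
            grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
            return lam * ((grad_norm - 1) ** 2).mean()

    Adding this term to the critic loss replaces weight clipping entirely, enforcing the 1-Lipschitz constraint softly along lines between real and generated samples.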

    Handbook of Digital Face Manipulation and Detection

    Get PDF
    This open access book provides the first comprehensive collection of studies on the hot topic of digital face manipulation, such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics, including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic for readers wishing to gain a brief overview of the state of the art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing to further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.

    Choice cell architecture and stable neighbor match training to increase interpretability and stability of deep generative modeling

    Get PDF
    Although Generative Adversarial Networks (GANs) have achieved much success in various unsupervised learning tasks, their training is unstable. Another limitation of GANs, and of deep neural networks in general, is their lack of interpretability. To help address these gaps, we aim to improve the training stability of GANs and the interpretability of deep learning models. To improve stability, we propose Stable Neighbor Match (SNM) training. SNM searches for a stable match between generated and real samples, and then approximates a Wasserstein distance based on that match. Our experimental results show that SNM is a stable and effective training method for unsupervised learning. To develop more explainable neural components, we propose an interpretable architecture called the Choice Cell (CC). An advantage of the CC is that its hidden representation can be interpreted intuitively as a probability distribution. We then combine the CC with other subgenerators to build the Choice Generator (CG). Experimental results indicate that the CG is not only more explainable but also achieves performance comparable to other popular generators. In addition, to help the subgenerators of the CG learn more homogeneous representations, we apply within- and between-subgenerator regularization during training. We find that this regularization improves the performance of the CG on imbalanced data. Finally, we extend the CC to an interpretable conditional model called the Conditional Choice Cell (CCC). The results indicate the potential of the CCC as an effective conditional model with the advantage of being more explainable.
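
    The abstract describes the stable match only at a high level, so the following is a hypothetical reconstruction, not the thesis code: Gale-Shapley stable matching between equal-sized batches of generated and real feature vectors, ranked by pairwise distance, with the mean matched distance serving as the Wasserstein-style training signal.

        import numpy as np

        def stable_match_distance(gen, real):
            """Hypothetical sketch of an SNM-style objective. gen and real are
            (n, d) arrays of equal batch size; returns the mean distance over
            a Gale-Shapley stable matching of the two batches."""
            d = np.linalg.norm(gen[:, None, :] - real[None, :, :], axis=-1)
            n = d.shape[0]
            prefs = np.argsort(d, axis=1)            # each gen sample prefers nearer reals
            next_choice = np.zeros(n, dtype=int)
            match_of_real = -np.ones(n, dtype=int)   # real index -> matched gen index
            free = list(range(n))
            while free:
                g = free.pop()
                r = prefs[g, next_choice[g]]
                next_choice[g] += 1
                if match_of_real[r] < 0:
                    match_of_real[r] = g             # real sample was unmatched
                elif d[g, r] < d[match_of_real[r], r]:
                    free.append(match_of_real[r])    # real sample prefers g; old match freed
                    match_of_real[r] = g
                else:
                    free.append(g)                   # g rejected, will try its next choice
            return d[match_of_real, np.arange(n)].mean()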

    Neural distribution estimation as a two-part problem

    Get PDF
    Given a dataset of examples, distribution estimation is the task of approximating the underlying probability distribution from which those examples are assumed to have been drawn. Neural distribution estimation relies on the powerful function approximation capabilities of deep neural networks to build models for this purpose, and excels when data is high-dimensional and exhibits complex, nonlinear dependencies. In this thesis, we explore several approaches to neural distribution estimation, and present a unified perspective on these methods based on a two-part design principle. In particular, we examine how many of these models iteratively break the task of distribution estimation down into a series of tractable sub-tasks, before fitting a multi-step generative process that combines solutions to these sub-tasks in order to approximate the data distribution of interest. Framing distribution estimation as a two-part problem provides a shared language in which to compare and contrast prevalent models in the literature, and also allows for discussion of alternative approaches which do not follow this structure. We first present the Autoregressive Energy Machine, an energy-based model trained by approximate maximum likelihood through an autoregressive decomposition. The method demonstrates the flexibility of an energy-based model over an explicitly normalized model, and the novel application of autoregressive importance sampling highlights the benefit of an autoregressive approach to distribution estimation, which recursively transforms the problem into a series of univariate tasks. Next, we present Neural Spline Flows, a class of normalizing flow models based on monotonic spline transformations which admit both an explicit inverse and a tractable Jacobian determinant. Normalizing flows tackle distribution estimation by searching for an invertible map between the data distribution and a more tractable base distribution, and this map is typically constructed as the composition of a series of invertible building blocks. We demonstrate that spline flows can be used to enhance density estimation of tabular data, variational inference in latent variable models, and generative modeling of natural images. The third chapter presents Maximum Likelihood Training of Score-Based Diffusion Models. Generative models based on estimating the gradient of the log probability density, or score function, have recently gained traction as a powerful modeling paradigm, in which the data distribution is gradually transformed toward a tractable base distribution by means of a stochastic process. The paper illustrates how this class of models can be trained by maximum likelihood, resulting in a model which is functionally equivalent to a continuous normalizing flow and which bridges the gap between two branches of the literature. We also discuss latent-variable generative models more broadly, of which diffusion models are a structured special case. Finally, we present On Contrastive Learning for Likelihood-Free Inference, a unifying perspective on likelihood-free inference methods which perform Bayesian inference using either density estimation or density-ratio estimation. Likelihood-free inference focuses on inference in stochastic simulator models where the likelihood of parameters given observations is computationally intractable and traditional inference methods fall short. In addition to illustrating the power of normalizing flows as generic tools for density estimation, this chapter also gives us the opportunity to discuss likelihood-free models more broadly. These so-called implicit generative models form a large part of the distribution estimation literature under the umbrella of generative adversarial networks, and are distinct in how they treat distribution estimation as a one-part problem.
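
    The recurring technical device in the flow-based chapters is the change-of-variables formula, log p(x) = log p_Z(f(x)) + log |det df(x)/dx|. The sketch below is our own minimal illustration, not code from the thesis; the layer sizes and the tanh stabilizer are arbitrary choices. It uses a single affine coupling layer, a simpler cousin of the monotonic spline transforms described above, to show how both terms stay tractable:

        import math
        import torch
        import torch.nn as nn

        class AffineCoupling(nn.Module):
            """One affine coupling layer: z1 = x1, z2 = x2 * exp(s(x1)) + t(x1).
            The Jacobian is triangular, so log|det| = sum(s(x1)). Assumes even dim."""
            def __init__(self, dim):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(dim // 2, 64), nn.ReLU(),
                    nn.Linear(64, dim),      # outputs both scale and shift
                )
            def forward(self, x):
                x1, x2 = x.chunk(2, dim=1)
                s, t = self.net(x1).chunk(2, dim=1)
                s = torch.tanh(s)            # keep scales numerically well-behaved
                z2 = x2 * torch.exp(s) + t
                return torch.cat([x1, z2], dim=1), s.sum(dim=1)  # z, log|det J|

        def log_prob(x, layer):
            # Change of variables with a standard-normal base distribution.
            z, logdet = layer(x)
            base = -0.5 * (z ** 2).sum(dim=1) - 0.5 * z.size(1) * math.log(2 * math.pi)
            return base + logdet

        # Usage: ll = log_prob(torch.randn(16, 4), AffineCoupling(4))

    Stacking many such layers, with permutations between them, yields the composed invertible map the abstract describes; spline flows replace the affine transform with a more expressive monotonic spline.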