
    Learning with Limited Labeled Data in Biomedical Domain by Disentanglement and Semi-Supervised Learning

    In this dissertation, we are interested in improving the generalization of deep neural networks for biomedical data (e.g., electrocardiogram signals and X-ray images). Although deep neural networks have attained state-of-the-art performance and, consequently, deployment across a variety of domains, achieving similar performance in the clinical setting remains challenging due to their inability to generalize to unseen data (e.g., a new patient cohort). We address this challenge of generalization from two perspectives: 1) learning disentangled representations with the deep network, and 2) developing efficient semi-supervised learning (SSL) algorithms using the deep network. In the former, we design specific architectures and objective functions to learn representations in which the variations in the data are well separated, i.e., disentangled. In the latter, we design regularizers that push the underlying neural function toward a common inductive bias, to avoid over-fitting the function to small labeled data. Our end goal in both approaches is to improve the generalization of the deep network used as the diagnostic model. For disentangled representations, this translates to learning latent representations that capture the observed input's underlying explanatory factors in an independent and interpretable way. With the data's explanatory factors well separated, such a disentangled latent space is useful for a large variety of tasks and domains within the data distribution even with a small amount of labeled data, thus improving generalization. For efficient semi-supervised algorithms, this translates to utilizing a large volume of unlabeled data to assist learning from a limited labeled dataset, a situation commonly encountered in the biomedical domain. Drawing on ideas from different areas of deep learning, such as representation learning (e.g., autoencoders), variational inference (e.g., variational autoencoders), Bayesian nonparametrics (e.g., the beta-Bernoulli process), learning theory (e.g., analytical learning theory), and function smoothing (e.g., Lipschitz smoothness), we propose several learning algorithms to improve generalization on the associated tasks. We test our algorithms on real-world clinical data and show that our approach yields significant improvements over existing methods. Moreover, we demonstrate the efficacy of the proposed models on benchmark and simulated data to understand different aspects of the proposed learning methods. We conclude by identifying some limitations of the proposed methods, areas for further improvement, and broader future directions for the successful adoption of AI models in the clinical environment.
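
    A minimal sketch (in PyTorch, assuming a generic classifier `model`) of the kind of smoothness-based SSL regularizer the abstract alludes to: the supervised loss uses the small labeled set, while unlabeled inputs are penalized for unstable predictions under perturbation. This illustrates the general technique, not the dissertation's specific algorithms.

        import torch
        import torch.nn.functional as F

        def ssl_loss(model, x_lab, y_lab, x_unlab, noise_std=0.05, lam=1.0):
            # Supervised term on the limited labeled data.
            sup = F.cross_entropy(model(x_lab), y_lab)
            # Consistency term: predictions on unlabeled data should be stable
            # under small input noise -- a simple stand-in for the
            # Lipschitz-smoothness regularizers mentioned above.
            with torch.no_grad():
                target = F.softmax(model(x_unlab), dim=1)
            noisy = x_unlab + noise_std * torch.randn_like(x_unlab)
            cons = F.kl_div(F.log_softmax(model(noisy), dim=1), target,
                            reduction="batchmean")
            return sup + lam * cons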

    D³Net: Dual-Branch Disturbance Disentangling Network for Facial Expression Recognition

    One of the main challenges in facial expression recognition (FER) is to address the disturbance caused by various disturbing factors, including common ones (such as identity, pose, and illumination) and potential ones (such as hairstyle, accessories, and occlusion). Recently, a number of FER methods have been developed to explicitly or implicitly alleviate the disturbance involved in facial images. However, these methods either consider only a few common disturbing factors or neglect the prior information of these disturbing factors, resulting in inferior recognition performance. In this paper, we propose a novel Dual-branch Disturbance Disentangling Network (D³Net), consisting mainly of an expression branch and a disturbance branch, to perform effective FER. In the disturbance branch, a label-aware sub-branch (LAS) and a label-free sub-branch (LFS) are designed to cope with different types of disturbing factors. On the one hand, LAS explicitly captures the disturbance due to common disturbing factors by transfer learning on a pretrained model. On the other hand, LFS implicitly encodes the information of potential disturbing factors in an unsupervised manner. In particular, we introduce an Indian buffet process (IBP) prior to model the distribution of potential disturbing factors in LFS. Moreover, we leverage adversarial training to increase the differences between disturbance features and expression features, thereby enhancing the disentanglement of disturbing factors. By disentangling the disturbance from facial images, we are able to extract discriminative expression features. Extensive experiments demonstrate that our proposed method performs favorably against several state-of-the-art FER methods on both in-the-lab and in-the-wild databases.
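
    As an illustration of the adversarial disentanglement step described above (a hedged sketch, not the authors' code), a gradient-reversal layer can let a discriminator try to recover a disturbing factor, e.g., identity, from the expression features while the encoder learns to defeat it; the IBP-based LFS is omitted here.

        import torch
        import torch.nn as nn

        class GradReverse(torch.autograd.Function):
            @staticmethod
            def forward(ctx, x):
                return x.view_as(x)
            @staticmethod
            def backward(ctx, grad):
                return -grad  # reverse gradients flowing into the encoder

        class ExpressionEncoderWithAdversary(nn.Module):
            def __init__(self, feat_dim=256, n_disturb=10):
                super().__init__()
                self.encoder = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
                # Adversary tries to predict a disturbing factor (e.g., identity)
                # from expression features; gradient reversal makes the encoder
                # remove that information instead.
                self.adversary = nn.Linear(feat_dim, n_disturb)

            def forward(self, x):
                # x: flattened image features of shape [batch, features].
                feat = self.encoder(x)
                disturb_logits = self.adversary(GradReverse.apply(feat))
                return feat, disturb_logits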

    C²VAE: Gaussian Copula-based VAE Differing Disentangled from Coupled Representations with Contrastive Posterior

    We present a self-supervised variational autoencoder (VAE) to jointly learn disentangled and dependent hidden factors, and then enhance disentangled representation learning with a self-supervised classifier that eliminates coupled representations in a contrastive manner. To this end, we introduce a Contrastive Copula VAE (C²VAE) that neither relies on prior knowledge about the data in the probabilistic principle nor involves strong modeling assumptions on the posterior in the neural architecture. C²VAE simultaneously factorizes the posterior (evidence lower bound, ELBO) with a total correlation (TC)-driven decomposition to learn factorized disentangled representations, and extracts the dependencies between hidden features with a neural Gaussian copula for copula-coupled representations. A self-supervised contrastive classifier then differentiates the disentangled representations from the coupled representations, where a contrastive loss regularizes this contrastive classification together with the TC loss to eliminate entangled factors and strengthen disentangled representations. C²VAE demonstrates a strong effect in enhancing disentangled representation learning. C²VAE further improves optimization by addressing the instability of TC-based VAEs and the trade-off between reconstruction and representation.
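
    A minimal sketch of the total-correlation term that such a factorized objective penalizes, using the minibatch estimator popularized by beta-TC-VAE as a stand-in for the paper's decomposition; the Gaussian copula and the contrastive classifier are omitted.

        import math
        import torch

        def log_gaussian(z, mu, logvar):
            # Elementwise log N(z; mu, diag(exp(logvar))).
            return -0.5 * (logvar + (z - mu) ** 2 / logvar.exp()
                           + math.log(2 * math.pi))

        def total_correlation(z, mu, logvar):
            # z, mu, logvar: [B, D]. TC = KL(q(z) || prod_j q(z_j)),
            # with both densities estimated over the minibatch.
            B = z.shape[0]
            pairs = log_gaussian(z.unsqueeze(1), mu.unsqueeze(0),
                                 logvar.unsqueeze(0))              # [B, B, D]
            log_qz = torch.logsumexp(pairs.sum(dim=2), dim=1) - math.log(B)
            log_qz_prod = (torch.logsumexp(pairs, dim=1)
                           - math.log(B)).sum(dim=1)
            return (log_qz - log_qz_prod).mean()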

    Measuring axiomatic soundness of counterfactual image models

    We use the axiomatic definition of counterfactual to derive metrics that enable quantifying the correctness of approximate counterfactual inference models. We present a general framework for evaluating image counterfactuals. The power and flexibility of deep generative models make them valuable tools for learning mechanisms in structural causal models. However, their flexibility also makes counterfactual identifiability impossible in the general case. Motivated by these issues, we revisit Pearl's axiomatic definition of counterfactuals to determine the constraints that any counterfactual inference model must satisfy: composition, reversibility, and effectiveness. We frame counterfactuals as functions of an input variable, its parents, and counterfactual parents, and use the axiomatic constraints to restrict the set of functions that could represent the counterfactual, thus deriving distance metrics between the approximate and ideal functions. We demonstrate how these metrics can be used to compare and choose between different approximate counterfactual inference models and to provide insight into a model's shortcomings and trade-offs.
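
    A hedged sketch of how such axiom-derived metrics could be computed, assuming a hypothetical counterfactual function cf_model(x, pa, pa_cf), a hypothetical anti-causal parent_predictor, and an L2 distance; the paper's exact metric definitions may differ.

        import torch

        def composition_error(cf_model, x, pa):
            # Composition: a null intervention (pa_cf == pa) should return x.
            return (cf_model(x, pa, pa) - x).flatten(1).norm(dim=1).mean()

        def reversibility_error(cf_model, x, pa, pa_cf):
            # Reversibility: intervening and then reverting should recover x.
            x_cf = cf_model(x, pa, pa_cf)
            return (cf_model(x_cf, pa_cf, pa) - x).flatten(1).norm(dim=1).mean()

        def effectiveness_error(cf_model, parent_predictor, x, pa, pa_cf):
            # Effectiveness: the counterfactual should exhibit pa_cf, as judged
            # by a predictor of the parent variable.
            return (parent_predictor(cf_model(x, pa, pa_cf)) - pa_cf).abs().mean()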

    Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations

    The key idea behind the unsupervised learning of disentangled representations is that real-world data is generated by a few explanatory factors of variation which can be recovered by unsupervised learning algorithms. In this paper, we provide a sober look at recent progress in the field and challenge some common assumptions. We first show theoretically that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data. We then train more than 12,000 models covering the most prominent methods and evaluation metrics in a reproducible large-scale experimental study on seven different data sets. We observe that while the different methods successfully enforce the properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision. Furthermore, increased disentanglement does not seem to lead to a decreased sample complexity of learning for downstream tasks. Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.