137,559 research outputs found

    Concept-based explainability for an EEG transformer model

    Full text link
    Deep learning models are complex due to their size, structure, and inherent randomness in training procedures. Additional complexity arises from the selection of datasets and inductive biases. Addressing these challenges for explainability, Kim et al. (2018) introduced Concept Activation Vectors (CAVs), which aim to understand deep models' internal states in terms of human-aligned concepts. These concepts correspond to directions in latent space, identified using linear discriminants. Although this method was first applied to image classification, it was later adapted to other domains, including natural language processing. In this work, we attempt to apply the method to electroencephalogram (EEG) data for explainability in Kostas et al.'s BENDR (2021), a large-scale transformer model. A crucial part of this endeavor involves defining the explanatory concepts and selecting relevant datasets to ground concepts in the latent space. Our focus is on two mechanisms for EEG concept formation: the use of externally labeled EEG datasets, and the application of anatomically defined concepts. The former approach is a straightforward generalization of methods used in image classification, while the latter is novel and specific to EEG. We present evidence that both approaches to concept formation yield valuable insights into the representations learned by deep EEG models.Comment: To appear in proceedings of 2023 IEEE International workshop on Machine Learning for Signal Processin

    Disentangling Factors of Variation with Cycle-Consistent Variational Auto-Encoders

    Full text link
    Generative models that learn disentangled representations for different factors of variation in an image can be very useful for targeted data augmentation. By sampling from the disentangled latent subspace of interest, we can efficiently generate new data necessary for a particular task. Learning disentangled representations is a challenging problem, especially when certain factors of variation are difficult to label. In this paper, we introduce a novel architecture that disentangles the latent space into two complementary subspaces by using only weak supervision in form of pairwise similarity labels. Inspired by the recent success of cycle-consistent adversarial architectures, we use cycle-consistency in a variational auto-encoder framework. Our non-adversarial approach is in contrast with the recent works that combine adversarial training with auto-encoders to disentangle representations. We show compelling results of disentangled latent subspaces on three datasets and compare with recent works that leverage adversarial training

    Neural Face Editing with Intrinsic Image Disentangling

    Full text link
    Traditional face editing methods often require a number of sophisticated and task specific algorithms to be applied one after the other --- a process that is tedious, fragile, and computationally intensive. In this paper, we propose an end-to-end generative adversarial network that infers a face-specific disentangled representation of intrinsic face properties, including shape (i.e. normals), albedo, and lighting, and an alpha matte. We show that this network can be trained on "in-the-wild" images by incorporating an in-network physically-based image formation module and appropriate loss functions. Our disentangling latent representation allows for semantically relevant edits, where one aspect of facial appearance can be manipulated while keeping orthogonal properties fixed, and we demonstrate its use for a number of facial editing applications.Comment: CVPR 2017 ora

    Exploring galaxy evolution with generative models

    Full text link
    Context. Generative models open up the possibility to interrogate scientific data in a more data-driven way. Aims: We propose a method that uses generative models to explore hypotheses in astrophysics and other areas. We use a neural network to show how we can independently manipulate physical attributes by encoding objects in latent space. Methods: By learning a latent space representation of the data, we can use this network to forward model and explore hypotheses in a data-driven way. We train a neural network to generate artificial data to test hypotheses for the underlying physical processes. Results: We demonstrate this process using a well-studied process in astrophysics, the quenching of star formation in galaxies as they move from low-to high-density environments. This approach can help explore astrophysical and other phenomena in a way that is different from current methods based on simulations and observations.Comment: Published in A&A. For code and further details, see http://space.ml/proj/explor
    corecore