104 research outputs found
Sharing deep generative representation for perceived image reconstruction from human brain activity
Decoding human brain activities via functional magnetic resonance imaging
(fMRI) has gained increasing attention in recent years. While encouraging
results have been reported in brain state classification tasks, reconstructing
the details of human visual experience remains difficult. Two main
challenges that hinder the development of effective models are the perplexing
fMRI measurement noise and the high dimensionality of limited data instances.
Existing methods generally suffer from one or both of these issues and yield
unsatisfactory results. In this paper, we tackle this problem by casting the
reconstruction of visual stimulus as the Bayesian inference of missing view in
a multiview latent variable model. Sharing a common latent representation, our
joint generative model of external stimulus and brain response is not only
"deep" in extracting nonlinear features from visual images, but also powerful
in capturing correlations among voxel activities of fMRI recordings. The
nonlinearity and deep structure endow our model with strong representation
ability, while the correlations of voxel activities are critical for
suppressing noise and improving prediction. We devise an efficient variational
Bayesian method to infer the latent variables and the model parameters. To
further improve the reconstruction accuracy, the latent representations of
test instances are constrained to be close to those of their neighbours in the
training set via posterior regularization. Experiments on three fMRI recording
datasets demonstrate that our approach reconstructs visual stimuli more
accurately.
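The shared-latent-representation idea can be illustrated with a linear two-view toy model: both the image and the fMRI response are generated from one common latent vector, so the missing image view can be inferred from the observed fMRI view by Bayesian posterior inference. This is only a minimal sketch under Gaussian linear assumptions; all names, dimensions, and noise levels here are illustrative, not the paper's deep architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared-latent model: both views are linear maps of a common latent z.
d_z, d_img, d_fmri, n = 5, 20, 30, 200
W_img = rng.normal(size=(d_img, d_z))
W_fmri = rng.normal(size=(d_fmri, d_z))
Z = rng.normal(size=(n, d_z))
X_img = Z @ W_img.T + 0.1 * rng.normal(size=(n, d_img))
X_fmri = Z @ W_fmri.T + 0.1 * rng.normal(size=(n, d_fmri))

def infer_latent(x_fmri, W, noise_var=0.01):
    """Posterior mean of z given the fMRI view, under a N(0, I) prior on z."""
    precision = W.T @ W / noise_var + np.eye(W.shape[1])
    return np.linalg.solve(precision, W.T @ x_fmri / noise_var)

# Treat the image as the "missing view": infer z from fMRI, then decode it.
z_hat = infer_latent(X_fmri[0], W_fmri)
x_img_hat = W_img @ z_hat
```

In the linear Gaussian case the posterior over the latent is available in closed form; the paper's variational Bayesian method generalizes this inference to the nonlinear, deep setting.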
Double-Flow GAN model for the reconstruction of perceived faces from brain activities
Faces play an important role in human visual perception, and reconstructing
perceived faces from brain activity is challenging because it is difficult
to extract high-level features and maintain consistency across multiple face
attributes, such as expression, identity, and gender. In this study, we
proposed a novel reconstruction framework, called Double-Flow GAN,
that enhances the capability of the discriminator and handles imbalances in
images from domains that are too easy for the generator. We also designed
a pretraining process that uses features extracted from images as conditions,
making it possible to pretrain the conditional reconstruction model
on a larger image-only dataset before applying it to fMRI. Moreover, we developed
a simple pretrained model that performs fMRI alignment to alleviate the
cross-subject reconstruction problem caused by variations in brain structure
across subjects. We conducted experiments with our proposed method and
state-of-the-art reconstruction models. The results demonstrate that our
method achieves strong reconstruction performance, outperforms previous
reconstruction models, and exhibits good generation ability.
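The cross-subject alignment step can be sketched as learning a linear map from one subject's voxel space to another's, fitted on responses to shared stimuli. This is a hedged illustration of the general idea (a ridge-regression functional alignment), not the paper's pretrained alignment model; the dimensions, noise level, and function names are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: two subjects' fMRI responses to the same stimuli differ by an
# unknown linear transform plus noise (all sizes are illustrative).
n_stim, n_vox = 100, 50
source = rng.normal(size=(n_stim, n_vox))                 # subject A responses
mixing = rng.normal(size=(n_vox, n_vox)) / np.sqrt(n_vox)
target = source @ mixing + 0.05 * rng.normal(size=(n_stim, n_vox))  # subject B

def fit_alignment(src, tgt, lam=1e-2):
    """Ridge-regression map from one subject's voxel space to another's."""
    return np.linalg.solve(src.T @ src + lam * np.eye(src.shape[1]), src.T @ tgt)

W = fit_alignment(source, target)
aligned = source @ W  # subject A responses mapped into subject B's space
```

Once responses are mapped into a common space, a reconstruction model trained on one subject can be reused for another without retraining from scratch.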
Robust Transcoding Sensory Information With Neural Spikes
Neural coding, including encoding and decoding, is one of the key problems in neuroscience for understanding how the brain uses neural signals to relate sensory perception and motor behavior to neural systems. However, most existing studies deal only with the continuous signals of neural systems, neglecting a unique feature of biological neurons, the spike, which is the fundamental information unit of neural computation as well as a building block for brain-machine interfaces. To address these limitations, we propose a transcoding framework that encodes multi-modal sensory information into neural spikes and then reconstructs the stimuli from those spikes. Sensory information can be compressed to 10% of its original size in terms of neural spikes, while 100% of the information can be re-extracted by reconstruction. Our framework can not only feasibly and accurately reconstruct dynamical visual and auditory scenes, but also rebuild stimulus patterns from functional magnetic resonance imaging (fMRI) brain activity. More importantly, it shows strong noise immunity against various types of artificial noise and background signals. The proposed framework provides efficient ways to perform multimodal feature representation and reconstruction in a high-throughput fashion, with potential applications in efficient neuromorphic computing in noisy environments
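The transcode-then-reconstruct loop can be illustrated with the simplest spike code there is: delta modulation, which emits a spike whenever the signal drifts past a threshold and reconstructs by integrating the spike train. This is a toy stand-in, assuming nothing about the paper's actual encoder, but it shows how a dense signal becomes a sparse spike train that still supports reconstruction.

```python
import numpy as np

def encode_spikes(signal, threshold=0.1):
    """Delta-modulation encoder: emit a +1/-1 spike whenever the signal
    drifts more than `threshold` away from the running reconstruction."""
    recon_level, spikes = 0.0, []
    for s in signal:
        spike = 0
        if s - recon_level > threshold:
            spike = 1
        elif recon_level - s > threshold:
            spike = -1
        recon_level += spike * threshold
        spikes.append(spike)
    return np.array(spikes)

def decode_spikes(spikes, threshold=0.1):
    """Reconstruct the signal by integrating the spike train."""
    return np.cumsum(spikes) * threshold

t = np.linspace(0, 2 * np.pi, 500)
signal = np.sin(t)
spikes = encode_spikes(signal)   # sparse: most entries are zero
recon = decode_spikes(spikes)    # tracks the signal to within ~threshold
```

For this slowly varying signal, well under 20% of the samples carry a spike, yet the integrated spike train stays within roughly one threshold of the original, mirroring the compress-then-recover claim in spirit.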
Bibliographic survey of Multiview learning for image classification
This article presents a bibliographic review of the academic literature on "Image classification with Multiview learning", together with an analysis of the information in each of the reviewed sources, in order to propose a conceptual, theoretical, and statistical basis for research works that develop or touch on this topic. It also briefly presents how MVL is approached in different application scenarios, both academic and practical
Robust Decoding of Rich Dynamical Visual Scenes With Retinal Spikes
Sensory information transmitted to the brain activates neurons to produce a series of coping behaviors. Understanding the mechanisms of neural computation and reverse engineering the brain to build intelligent machines requires establishing a robust relationship between stimuli and neural responses. Neural decoding aims to reconstruct the original stimuli that trigger neural responses. With the recent upsurge of artificial intelligence, neural decoding provides an insightful perspective for designing novel brain-machine interface algorithms. For humans, vision is the dominant contributor to the interaction between the external environment and the brain. In this study, using retinal neural spike data collected over multiple trials with visual stimuli of two movies with different levels of scene complexity, we used a neural network decoder to quantify the decoded visual stimuli with six different image quality assessment metrics, establishing a comprehensive inspection of decoding. Through a detailed and systematic study of the effects of single versus multiple trials of data, different types of noise in spikes, and blurred images, our results provide an in-depth investigation of decoding dynamical visual scenes from retinal spikes. These results offer insights into the neural coding of visual scenes and serve as a guideline for designing next-generation decoding algorithms for neuroprostheses and other brain-machine interface devices.
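Quantifying decoded stimuli with image quality metrics, as described above, typically boils down to pixel-wise comparisons between the reconstruction and the ground-truth frame. As a minimal sketch, here are two of the standard metrics (MSE and PSNR) implemented from their definitions; the abstract does not list which six metrics were used, so these two are only representative assumptions.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images with values in [0, 1]."""
    return float(np.mean((a - b) ** 2))

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match."""
    err = mse(a, b)
    return float("inf") if err == 0 else 10 * np.log10(peak ** 2 / err)

rng = np.random.default_rng(2)
img = rng.random((32, 32))                                   # "ground truth"
noisy = np.clip(img + 0.05 * rng.normal(size=img.shape), 0, 1)  # "reconstruction"
```

Structural metrics such as SSIM complement these pixel-wise scores by comparing local luminance, contrast, and structure, which is why decoding studies usually report several metrics side by side.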
Decoding Pixel-Level Image Features from Two-Photon Calcium Signals of Macaque Visual Cortex
Images of visual scenes comprise essential features important for the brain's visual cognition. The complexity of visual features lies at different levels, from simple artificial patterns to natural images of different scenes. Much work has focused on using stimulus images to predict neural responses; however, it remains unclear how to extract image features from neuronal responses. Here we address this question by leveraging two-photon calcium neural data recorded from the visual cortex of awake macaque monkeys. With stimuli including various categories of artificial patterns and diverse natural images, we employed a deep neural network decoder inspired by image segmentation techniques. Consistent with the notion of sparse coding for natural images, a few neurons with stronger responses dominated the decoding performance, whereas decoding artificial patterns required a large number of neurons. When natural images were decoded using the model pretrained on artificial patterns, salient features of natural scenes could be extracted, as well as conventional category information. Altogether, our results give a new perspective on studying neural encoding principles through reverse-engineering decoding strategies
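The finding that a few strongly responding neurons dominate decoding can be illustrated with a toy linear readout: plant a handful of high-gain neurons in a noisy population, rank neurons by how strongly their responses covary with the stimulus, and decode from the top k. This is a hedged sketch under linear assumptions, not the paper's segmentation-inspired deep decoder; all sizes and gains are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy population: 5 strongly driven neurons carry most stimulus information.
n_trials, n_neurons = 200, 100
stimulus = rng.normal(size=n_trials)
gains = np.zeros(n_neurons)
gains[:5] = rng.uniform(2.0, 3.0, size=5)       # the dominant neurons
responses = np.outer(stimulus, gains) + rng.normal(size=(n_trials, n_neurons))

def decode_topk(resp, stim, k):
    """Least-squares decode of the stimulus from the k strongest responders."""
    strength = np.abs(resp.T @ stim)            # |covariance| with the stimulus
    top = np.argsort(strength)[-k:]
    w, *_ = np.linalg.lstsq(resp[:, top], stim, rcond=None)
    pred = resp[:, top] @ w
    return np.corrcoef(pred, stim)[0, 1]

# Just the 5 strong neurons already decode the stimulus almost perfectly.
r_top5 = decode_topk(responses, stimulus, 5)
```

In this sparse-coding-style regime, adding the remaining 95 weakly tuned neurons barely improves the readout, echoing the contrast the abstract draws between natural images and artificial patterns.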
Neural Encoding and Decoding with a Flow-based Invertible Generative Model
Recent studies on visual neural encoding and decoding have made significant progress, benefiting from the latest advances in deep neural networks with powerful representations. However, two challenges remain. First, current decoding algorithms based on deep generative models often struggle with information loss, which may cause blurry reconstructions. Second, most studies model the neural encoding and decoding processes separately, neglecting the inherent dual relationship between the two tasks. In this paper, we propose a novel neural encoding and decoding method with a two-stage flow-based invertible generative model to tackle these issues. First, a convolutional auto-encoder is trained to bridge the stimulus space and the feature space. Second, an adversarial cross-modal normalizing flow is trained to build a bijective transformation between image features and neural signals, with local and global constraints imposed on the latent space to enforce cross-modal alignment. The method ultimately achieves bi-directional generation of visual stimuli and neural responses by combining the flow-based generator with the auto-encoder. The flow-based invertible generative model can minimize information loss and unify neural encoding and decoding in a single framework. Experimental results on different neural signals, including spike signals and functional magnetic resonance imaging, demonstrate that our model achieves the best overall performance among the compared models
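The reason a normalizing flow avoids information loss is that every layer is bijective: the inverse is computed exactly, not learned. A minimal sketch of one RealNVP-style affine coupling layer makes this concrete; the linear "conditioner" maps below stand in for the neural networks a real flow would use, and nothing here reflects the paper's specific architecture.

```python
import numpy as np

rng = np.random.default_rng(4)

# One affine coupling layer: transform half the dimensions conditioned on the
# other half. Invertible by construction, whatever the conditioners compute.
d = 4
W_s = 0.1 * rng.normal(size=(d // 2, d // 2))   # toy "scale network"
W_t = 0.1 * rng.normal(size=(d // 2, d // 2))   # toy "shift network"

def forward(x):
    x1, x2 = x[: d // 2], x[d // 2:]
    s, t = W_s @ x1, W_t @ x1
    return np.concatenate([x1, x2 * np.exp(s) + t])

def inverse(y):
    y1, y2 = y[: d // 2], y[d // 2:]
    s, t = W_s @ y1, W_t @ y1                   # same conditioners, re-applied
    return np.concatenate([y1, (y2 - t) * np.exp(-s)])

x = rng.normal(size=d)
y = forward(x)
x_back = inverse(y)   # recovers x up to floating-point precision
```

Because encoding (stimulus features to neural signals) and decoding (neural signals back to features) are the forward and inverse passes of the same bijection, the two tasks share one set of parameters, which is the unification the abstract describes.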