
    Comparing brain-like representations learned by vanilla, residual, and recurrent CNN architectures

    Though it has been hypothesized that state-of-the-art residual networks approximate the recurrent visual system, it remains to be seen whether the representations learned by these biologically inspired CNNs are actually closer to neural data. CNNs and DNNs that are most functionally similar to the brain are likely to contain mechanisms most like those the brain uses. In this thesis, we investigate how different CNN architectures approximate the representations learned along the ventral (object recognition and processing) stream of the brain. We specifically evaluate how recent approximations of biological neural recurrence, such as residual connections, dense residual connections, and a biologically inspired implementation of recurrence, affect the representations learned by each CNN. We first investigate the representations learned by layers throughout several state-of-the-art CNNs: VGG-19 (a vanilla CNN), ResNet-152 (a CNN with residual connections), and DenseNet-161 (a CNN with dense connections). To control for differences in model depth, we then extend this analysis to the CORnet family of biologically inspired CNN models with matching high-level architectures. The CORnet family has three models: a vanilla CNN (CORnet-Z), a CNN with biologically valid recurrent dynamics (CORnet-R), and a CNN with both recurrent and residual connections (CORnet-S). We compare the representations of these six models to functionally aligned (via hyperalignment) fMRI brain data acquired during a naturalistic visual task. We take two approaches to comparing these CNN and brain representations. We first use forward encoding, a predictive approach that uses CNN features to predict neural responses across the whole brain. We then use representational similarity analysis (RSA) and centered kernel alignment (CKA) to measure the similarity between representations in CNN layers and specific brain ROIs. We show that, compared to vanilla CNNs, CNNs with residual and recurrent connections exhibit representations more similar to those learned by the human ventral visual stream, and we achieve state-of-the-art forward encoding and RSA performance with the residual and recurrent CNN models.
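
    Both comparison metrics have compact closed forms. Below is a minimal sketch of RSA and linear CKA between a CNN layer's activations and an ROI's voxel responses; the array shapes, random data, and function names are illustrative stand-ins, not the thesis's actual pipeline.

        import numpy as np
        from scipy.spatial.distance import pdist
        from scipy.stats import spearmanr

        def rsa_score(cnn_feats, roi_data):
            # Spearman correlation between the two representational
            # dissimilarity matrices (RDMs), one per representation.
            rdm_cnn = pdist(cnn_feats, metric="correlation")
            rdm_roi = pdist(roi_data, metric="correlation")
            return spearmanr(rdm_cnn, rdm_roi).correlation

        def linear_cka(x, y):
            # Linear centered kernel alignment between two response matrices.
            x = x - x.mean(axis=0)
            y = y - y.mean(axis=0)
            hsic = np.linalg.norm(x.T @ y, "fro") ** 2
            return hsic / (np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro"))

        # Hypothetical example: 100 stimuli, 512 CNN units, 200 ROI voxels.
        rng = np.random.default_rng(0)
        cnn_feats = rng.standard_normal((100, 512))
        roi_data = rng.standard_normal((100, 200))
        print(rsa_score(cnn_feats, roi_data), linear_cka(cnn_feats, roi_data))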

    Neural Encoding and Decoding with Deep Learning for Natural Vision

    The overarching objective of this work is to bridge neuroscience and artificial intelligence to ultimately build machines that learn, act, and think like humans. In the context of vision, the brain enables humans to readily make sense of the visual world, e.g. recognizing visual objects. Developing human-like machines requires understanding the working principles underlying human vision. In this dissertation, I ask how the brain encodes and represents dynamic visual information from the outside world, whether brain activity can be directly decoded to reconstruct and categorize what a person is seeing, and whether neuroscience theory can be applied to artificial models to advance computer vision. To address these questions, I used deep neural networks (DNNs) to establish encoding and decoding models that describe the relationships between the brain and visual stimuli. Given video stimuli, the encoding models were able to predict the functional magnetic resonance imaging (fMRI) responses throughout the visual cortex; the decoding models were able to reconstruct and categorize the visual stimuli based on fMRI activity. To further advance the DNN model, I implemented a new bidirectional and recurrent neural network based on predictive coding theory. As a theory in neuroscience, predictive coding explains the interaction among feedforward, feedback, and recurrent connections. The results showed that this brain-inspired model significantly outperforms feedforward-only DNNs in object recognition. These studies have a positive impact on understanding the neural computations underlying human vision and on improving computer vision with knowledge from neuroscience.
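
    As a simplified illustration of such an encoding model, the sketch below fits a voxel-wise ridge regression from DNN features to fMRI responses and scores it by held-out prediction correlation. All arrays are random stand-ins, not the dissertation's data or code.

        import numpy as np
        from sklearn.linear_model import RidgeCV

        # Hypothetical data: 240 fMRI volumes, 512 DNN features, 1000 voxels.
        rng = np.random.default_rng(0)
        dnn_feats = rng.standard_normal((240, 512))
        bold = rng.standard_normal((240, 1000))

        # Fit one linear weight map per voxel; evaluate on held-out volumes.
        train, test = slice(0, 200), slice(200, 240)
        enc = RidgeCV(alphas=np.logspace(0, 4, 9)).fit(dnn_feats[train], bold[train])
        pred = enc.predict(dnn_feats[test])
        r = np.array([np.corrcoef(pred[:, v], bold[test, v])[0, 1]
                      for v in range(bold.shape[1])])
        print("median held-out encoding accuracy:", np.median(r))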

    Kervolutional Neural Networks

    Convolutional neural networks (CNNs) have enabled state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly leverage activation layers, which can only provide point-wise non-linearity. To address this problem, a new operation, kervolution (kernel convolution), is introduced to approximate complex behaviors of the human perception system by leveraging the kernel trick. It generalizes convolution, enhances model capacity, and captures higher-order interactions of features via patch-wise kernel functions, without introducing additional parameters. Extensive experiments show that kervolutional neural networks (KNNs) achieve higher accuracy and faster convergence than baseline CNNs. (Comment: oral paper at CVPR 2019)
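
    The core idea is to swap the patch-filter inner product of convolution for a kernel evaluation. The PyTorch sketch below does this with a polynomial kernel, one of the kernels discussed in the paper; the function name, defaults, and layout are illustrative, not the authors' released code.

        import torch
        import torch.nn.functional as F

        def kerv2d(x, weight, cp=1.0, dp=2, stride=1, padding=0):
            # Replace convolution's inner product <x, w> with the polynomial
            # kernel (x . w + cp) ** dp; no extra learned parameters.
            n, c, h, w = x.shape
            out_c, _, kh, kw = weight.shape
            patches = F.unfold(x, (kh, kw), stride=stride, padding=padding)  # (n, c*kh*kw, L)
            inner = weight.view(out_c, -1) @ patches                         # (n, out_c, L)
            oh = (h + 2 * padding - kh) // stride + 1
            ow = (w + 2 * padding - kw) // stride + 1
            return ((inner + cp) ** dp).view(n, out_c, oh, ow)

        x = torch.randn(2, 3, 8, 8)
        weight = torch.randn(16, 3, 3, 3)
        print(kerv2d(x, weight, padding=1).shape)  # torch.Size([2, 16, 8, 8])

    With cp=0 and dp=1 the operation reduces to ordinary (bias-free) convolution, which is the sense in which kervolution generalizes it.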

    Investigating the Relationship between Human Visual Brain Activity and Emotions

    Master's thesis, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, August 2019 (advisor: Gunhee Kim).

    Encoding models predict brain activity elicited by stimuli and are used to investigate how information is processed in the brain, whereas decoding models predict information about the stimuli from brain activity and aim to identify whether such information is present. Both models are often used in conjunction. The brain's visual system has been shown to decode stimulus-related emotional information [15, 20]. However, visual-system activity induced by the same stimuli with their pixels scrambled has also been able to decode the same emotional information [20]. Considering these results, we ask to what extent encoded visual information also encodes emotional information. We use encoding models to select brain regions related to low-, mid-, and high-level visual features, and then use these brain regions to decode related emotional information. We find that these features are encoded not only in the occipital lobe but also in later regions extending to the orbitofrontal cortex. These regions were not able to decode emotion information, whereas other brain regions and plain CNN features were. These results show that the brain regions encoding low-, mid-, and high-level visual features are not related to the previously reported emotional decoding performance; the decoding performance associated with the occipital lobe should therefore be attributed to non-vision-related processing.
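
    The two-stage analysis can be sketched compactly: fit an encoding model, keep the voxels it predicts well, then test whether an emotion classifier works on those voxels alone. Everything below is a hypothetical stand-in for the thesis's pipeline, with random data and an arbitrary selection threshold.

        import numpy as np
        from sklearn.linear_model import Ridge
        from sklearn.svm import LinearSVC
        from sklearn.model_selection import cross_val_score

        # Hypothetical stand-ins: 300 trials, 100 visual features, 2000 voxels.
        rng = np.random.default_rng(0)
        feats = rng.standard_normal((300, 100))
        bold = rng.standard_normal((300, 2000))
        emotion = rng.integers(0, 2, 300)

        # Stage 1: encoding model; keep voxels predicted well on held-out trials.
        pred = Ridge(alpha=10.0).fit(feats[:200], bold[:200]).predict(feats[200:])
        r = np.array([np.corrcoef(pred[:, v], bold[200:, v])[0, 1]
                      for v in range(bold.shape[1])])
        roi = bold[:, r > 0.1]  # arbitrary selection threshold

        # Stage 2: decode emotion labels from the selected voxels only.
        print(cross_val_score(LinearSVC(), roi, emotion, cv=5).mean())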

    Sharing deep generative representation for perceived image reconstruction from human brain activity

    Decoding human brain activities via functional magnetic resonance imaging (fMRI) has gained increasing attention in recent years. While encouraging results have been reported in brain state classification tasks, reconstructing the details of human visual experience still remains difficult. Two main challenges that hinder the development of effective models are the perplexing fMRI measurement noise and the high dimensionality of limited data instances. Existing methods generally suffer from one or both of these issues and yield unsatisfactory results. In this paper, we tackle this problem by casting the reconstruction of visual stimuli as the Bayesian inference of a missing view in a multi-view latent variable model. Sharing a common latent representation, our joint generative model of external stimulus and brain response is not only "deep" in extracting nonlinear features from visual images, but also powerful in capturing correlations among voxel activities of fMRI recordings. The nonlinearity and deep structure endow our model with strong representation ability, while the correlations of voxel activities are critical for suppressing noise and improving prediction. We devise an efficient variational Bayesian method to infer the latent variables and the model parameters. To further improve the reconstruction accuracy, the latent representations of testing instances are enforced to be close to those of their neighbours from the training set via posterior regularization. Experiments on three fMRI recording datasets demonstrate that our approach can more accurately reconstruct visual stimuli.
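
    The missing-view idea can be sketched in its simplest form: with a shared latent z, a linear-Gaussian fMRI view, and a Gaussian prior, the posterior over z given fMRI alone has a closed form, and the image view is then decoded from z. The deep image arm and the paper's variational machinery are omitted here; all names and shapes below are hypothetical.

        import numpy as np

        def infer_latent(y_fmri, B, sigma2=1.0, prior_var=1.0):
            # Posterior mean of z under y = B z + noise, with noise
            # N(0, sigma2*I) and prior z ~ N(0, prior_var*I).
            k = B.shape[1]
            precision = B.T @ B / sigma2 + np.eye(k) / prior_var
            return np.linalg.solve(precision, B.T @ y_fmri / sigma2)

        # Hypothetical shapes: 2000 voxels, 64-dimensional shared latent space.
        rng = np.random.default_rng(0)
        B = rng.standard_normal((2000, 64))
        y = rng.standard_normal(2000)
        z_hat = infer_latent(y, B)
        # image_hat = image_decoder(z_hat)  # hypothetical deep generative image arm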

    Constraint-free Natural Image Reconstruction from fMRI Signals Based on Convolutional Neural Network

    In recent years, research on decoding brain activity based on functional magnetic resonance imaging (fMRI) has made remarkable achievements. However, constraint-free natural image reconstruction from brain activity is still a challenge. Existing methods simplified the problem by using semantic prior information or by reconstructing only simple images such as letters and digits. Without semantic prior information, we present a novel method to reconstruct natural images from fMRI signals of the human visual cortex based on the computational model of the convolutional neural network (CNN). First, we extracted the unit outputs of viewed natural images in each layer of a pre-trained CNN as CNN features. Second, we transformed image reconstruction from fMRI signals into a CNN feature visualization problem by training a sparse linear regression to map from the fMRI patterns to CNN features. By iterative optimization to find the matched image, whose CNN unit features are most similar to those predicted from the brain activity, we achieved promising results for the challenging constraint-free natural image reconstruction. As no semantic prior information about the stimuli was used when training the decoding model, images of any category (not constrained by the training set) could in theory be reconstructed. We found that the reconstructed images resembled the natural stimuli, especially in position and shape. The experimental results suggest that hierarchical visual features can effectively express the visual perception process of the human brain.
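
    The optimization step amounts to gradient descent on the pixels so that the image's CNN features match those predicted from fMRI. The sketch below shows that loop in PyTorch under stated assumptions: `cnn` is some pretrained feature extractor returning the matched layer's activations, and `feat_hat` comes from a sparse linear regression (e.g. a lasso) fitted from fMRI patterns to CNN features; none of this is the paper's released code.

        import torch

        def reconstruct(cnn, feat_hat, steps=500, lr=0.05):
            # Optimize pixels so the image's CNN features match the
            # features predicted from brain activity (feat_hat).
            img = torch.zeros(1, 3, 224, 224, requires_grad=True)
            opt = torch.optim.Adam([img], lr=lr)
            target = torch.as_tensor(feat_hat, dtype=torch.float32)
            for _ in range(steps):
                opt.zero_grad()
                loss = ((cnn(img) - target) ** 2).mean()
                loss.backward()
                opt.step()
            return img.detach()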