GRASS: Generative Recursive Autoencoders for Shape Structures
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry.

Comment: Corresponding author: Kai Xu ([email protected]).
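The merge-and-split recursion at the heart of such an RvNN autoencoder can be illustrated with a minimal numpy sketch; the code dimension, the random weights, and the fixed three-part hierarchy below are illustrative assumptions, not the paper's actual architecture or training setup:

```python
import numpy as np

rng = np.random.default_rng(0)
CODE = 8  # fixed dimensionality of every part/shape code (illustrative)

# Hypothetical weights: a merge encoder (two child codes -> one parent code)
# and a split decoder (one parent code -> two child codes).
W_enc = rng.standard_normal((2 * CODE, CODE)) * 0.1
W_dec = rng.standard_normal((CODE, 2 * CODE)) * 0.1

def merge(left, right):
    """Encode two sibling part codes into a single fixed-size parent code."""
    return np.tanh(np.concatenate([left, right]) @ W_enc)

def split(parent):
    """Decode a parent code back into two child codes."""
    out = np.tanh(parent @ W_dec)
    return out[:CODE], out[CODE:]

# Encode a flat three-part layout bottom-up along a fixed hierarchy:
parts = [rng.standard_normal(CODE) for _ in range(3)]
root = merge(merge(parts[0], parts[1]), parts[2])

# Decode top-down, recovering a hierarchy of codes from the single root code.
left, right = split(root)
grand_left, grand_right = split(left)
```

In the actual model the hierarchy is discovered rather than fixed and the weights are learned; the sketch only shows how an arbitrary part layout compresses to one fixed-size code and decodes back into a hierarchy.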
Neural similarity between overlapping events at learning differentially affects reinstatement across the cortex
Episodic memory often involves high overlap between the actors, locations, and objects of everyday events. Under some circumstances, it may be beneficial to distinguish, or differentiate, neural representations of similar events to avoid interference at recall. Alternatively, forming overlapping representations of similar events, or integration, may aid recall by linking shared information between memories. It is currently unclear how the brain supports these seemingly conflicting functions of differentiation and integration. We used multivoxel pattern similarity analysis (MVPA) of fMRI data and neural-network analysis of visual similarity to examine how highly overlapping naturalistic events are encoded in patterns of cortical activity, and how the degree of differentiation versus integration at encoding affects later retrieval. Participants performed an episodic memory task in which they learned and recalled naturalistic video stimuli with high feature overlap. Visually similar videos were encoded in overlapping patterns of neural activity in temporal, parietal, and occipital regions, suggesting integration. We further found that encoding processes differentially predicted later reinstatement across the cortex. In visual processing regions in occipital cortex, greater differentiation at encoding predicted later reinstatement. Higher-level sensory processing regions in temporal and parietal lobes showed the opposite pattern, whereby highly integrated stimuli showed greater reinstatement. Moreover, integration in high-level sensory processing regions during encoding predicted greater accuracy and vividness at recall. These findings provide novel evidence that encoding-related differentiation and integration processes across the cortex have divergent effects on later recall of highly similar naturalistic events
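The pattern-similarity logic behind such an analysis can be sketched in a few lines of numpy; the voxel count and the simulated event patterns are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels = 50  # illustrative region size

# Simulated voxel patterns: two overlapping events share most structure,
# while a third event is unrelated (all values invented for illustration).
event_a1 = rng.standard_normal(n_voxels)
event_a2 = event_a1 + 0.3 * rng.standard_normal(n_voxels)  # high feature overlap
event_b = rng.standard_normal(n_voxels)                    # unrelated event

def pattern_similarity(x, y):
    """Pearson correlation between two multivoxel activity patterns."""
    return np.corrcoef(x, y)[0, 1]

# Integration predicts high similarity between overlapping events;
# differentiation predicts similarity pushed below the unrelated baseline.
sim_overlap = pattern_similarity(event_a1, event_a2)
sim_baseline = pattern_similarity(event_a1, event_b)
```

In real MVPA the patterns come from fMRI response estimates and similarity is compared across conditions and regions, but the core computation is this pairwise pattern correlation.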
Investigating the Relationship between Human Visual Brain Activity and Emotions
Master's thesis, Seoul National University, College of Engineering, Department of Computer Science and Engineering, August 2019.

Encoding models predict brain activity elicited by stimuli and are used to investigate how information is processed in the brain. Decoding models, in contrast, predict information about the stimuli from brain activity and aim to identify whether such information is present. The two models are often used in conjunction. The brain's visual system has been shown to decode stimulus-related emotional information [15, 20]. However, brain activity in the visual system induced by the same visual stimuli with scrambled pixels has also been able to decode the same emotional information [20]. Considering these results, we ask to what extent encoded visual information also encodes emotional information. We use encoding models to select brain regions related to low-, mid-, and high-level visual features and use these brain regions to decode related emotional information. We found that these features are encoded not only in the occipital lobe, but also in later regions extending to the orbitofrontal cortex. Said brain regions were not able to decode emotional information, whereas other brain regions and plain CNN features were. These results show that brain regions encoding low-, mid-, and high-level visual features are not related to the previously found emotional decoding performance, and thus the decoding performance related to the occipital lobe should be attributed to non-vision-related processing.

Chapter 1 Introduction 1
Chapter 2 Background 4
2.1 Emotions and the Visual System 4
2.1.1 Visual System 4
2.1.2 Emotions 6
2.2 Functional Magnetic Resonance Imaging 7
2.2.1 BOLD Signal 8
2.2.2 Analysis of fMRI 9
2.2.3 Encoding Model 10
2.2.4 Decoding Model 11
2.3 Related Work 13
Chapter 3 Materials & Methods 17
3.1 Experimental Data 18
3.2 Encoding Model 19
3.3 Decoding Model 22
Chapter 4 Results 24
4.1 Encoding 24
4.2 Decoding 28
Chapter 5 Discussion and Limitations 31
5.1 Encoding 31
5.2 Decoding 33
5.3 Limitations and Feature Directions 35
Chapter 6 Conclusion 37
Abstract (in Korean) 42
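The encoding and decoding models at the core of this thesis can be sketched as linear regressions run in opposite directions; the dimensions, simulated data, and ridge penalty below are illustrative assumptions, not the thesis's actual settings:

```python
import numpy as np

rng = np.random.default_rng(2)
n_stim, n_feat, n_vox = 100, 10, 30  # illustrative sizes

# Simulated stimulus features (e.g. CNN activations) and voxel responses,
# generated from a hypothetical linear mapping plus noise.
X = rng.standard_normal((n_stim, n_feat))
B_true = rng.standard_normal((n_feat, n_vox))
Y = X @ B_true + 0.1 * rng.standard_normal((n_stim, n_vox))

def ridge(A, T, lam=1.0):
    """Closed-form ridge regression predicting targets T from inputs A."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ T)

B_enc = ridge(X, Y)  # encoding model: stimulus features -> voxel activity
B_dec = ridge(Y, X)  # decoding model: voxel activity -> stimulus features

# Encoding fit for one voxel (in practice evaluated on held-out stimuli).
r = np.corrcoef(Y[:, 0], (X @ B_enc)[:, 0])[0, 1]
```

The direction of the regression is the whole distinction: the encoding model asks which voxels a feature explains, while the decoding model asks which features a pattern of voxels reveals.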
Attention as a Mechanism for Object-Object Binding in Complex Scenes
The current study attempted to determine whether direct binding between objects in complex scenes occurs as a function of directed attention at encoding. In Experiment 1, participants viewed objects in one of three different types of contexts: unique scenes, similar scenes, or arrays with no contextual information. Critically, only half of the objects were attended for each encoding trial. Participants then completed an associative recognition task on pairs of items created from the previously studied scenes. Test pairs consisted of two attended or unattended objects, and were associated with a unique scene, a similar scene, or an array. Evidence of binding for attended objects was clear: associative recognition was better for attended pairs, relative to unattended pairs, regardless of the type of context in which the objects were studied. Object-context binding was not observed in memory for attended object pairs, but was observed for unattended object pairs. Experiment 2 explored the extent to which binding strength between object relationships varies as a function of temporal and/or spatial proximity. The procedure for Experiment 2 was identical to that of Experiment 1, with the exception that all of the objects in the encoding trials were attended. There were no significant main effects or interactions of spatial and temporal distance on binding strength, as measured by associative recognition.
Image Classification of Marine-Terminating Outlet Glaciers using Deep Learning Methods
A wealth of research has focused on elucidating the key controls on mass loss from the Greenland and Antarctic ice sheets in response to climate forcing, specifically in relation to the drivers of marine-terminating outlet glacier change. Despite the burgeoning availability of medium resolution satellite data, the manual methods traditionally used to monitor change of marine-terminating outlet glaciers from satellite imagery are time-consuming and can be subjective, especially where a mélange of icebergs and sea-ice exists at the terminus. To address this, recent advances in deep learning applied to image processing have created a new frontier in the field of automated delineation of glacier termini. However, at this stage, there remains a paucity of research on the use of deep learning for pixel-level semantic image classification of outlet glacier environments. This project develops and tests a two-phase deep learning approach based on a well-established convolutional neural network (CNN) called VGG16 for automated classification of Sentinel-2 satellite images. The novel workflow, termed CNN-Supervised Classification (CSC), was originally developed for fluvial settings but is adapted here to produce multi-class outputs for test imagery of glacial environments containing marine-terminating outlet glaciers in eastern Greenland. Results show mean F1 scores up to 95% for in-sample test imagery and 93% for out-of-sample test imagery, with significant improvements over traditional pixel-based methods such as band ratio techniques. This demonstrates the robustness of the deep learning workflow for automated classification despite the complex characteristics of the imagery.
Future research could focus on the integration of deep learning classification workflows with platforms such as Google Earth Engine (GEE), to classify imagery more efficiently and produce datasets for a range of glacial applications without the need for substantial prior experience in coding or deep learning
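The two-phase CSC idea, tile-level CNN classification followed by pixel-level labeling, can be caricatured in a few lines of numpy; the tiny tile size and the brightness-threshold stand-in for the trained VGG16 are assumptions made purely for illustration:

```python
import numpy as np

PATCH = 4  # illustrative tile size; the real workflow uses larger CNN tiles

def classify_patch(patch):
    """Stand-in for the phase-1 CNN (VGG16 in the project): label a tile
    by mean brightness. 0 = ocean/melange, 1 = glacier ice (toy classes)."""
    return int(patch.mean() > 0.5)

def csc_classify(image):
    """Tile the image, classify each tile, and broadcast the tile label
    back to every pixel it covers, giving a pixel-level class map."""
    h, w = image.shape
    labels = np.zeros((h, w), dtype=int)
    for i in range(0, h, PATCH):
        for j in range(0, w, PATCH):
            tile = image[i:i + PATCH, j:j + PATCH]
            labels[i:i + PATCH, j:j + PATCH] = classify_patch(tile)
    return labels

# Synthetic 8x8 scene: bright "glacier" on the left, dark "ocean" on the right.
img = np.zeros((8, 8))
img[:, :4] = 1.0
pixel_map = csc_classify(img)
```

The actual workflow replaces `classify_patch` with the trained CNN and uses the tile labels to drive a second, pixel-level classification phase, but the tiling-and-broadcast structure is the same.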
Eye-specific detection and a multi-eye integration model of biological motion perception
"Biological motion" refers to the distinctive kinematics observed in many living organisms, where visually perceivable points on the animal move at fixed distances from each other. Across the animal kingdom, many species have developed specialized visual circuitry to recognize such biological motion and to discriminate it from other patterns. Recently, this ability has been observed in the distributed visual system of jumping spiders. These eight-eyed animals use six eyes to perceive motion, while the remaining two (the principal anterior medial eyes) are shifted across the visual scene to further inspect detected objects. When presented with a biologically moving stimulus and a random one, jumping spiders turn to face the latter, clearly demonstrating the ability to discriminate between them. However, it remains unclear whether the principal eyes are necessary for this behavior, whether all secondary eyes can perform this discrimination, or whether a single eye-pair is specialized for this task. Here, we systematically tested the ability of jumping spiders to discriminate between biological and random visual stimuli by testing each eye-pair alone. Spiders were able to discriminate stimuli only when the anterior lateral eyes were unblocked, and performed at chance levels in other configurations. Interestingly, spiders showed a preference for biological motion over random stimuli, unlike in past work. We therefore propose a new model describing how specialization of the anterior lateral eyes for detecting biological motion contributes to multi-eye integration in this system. This integration generates more complex behavior through the combination of simple, single-eye responses. We posit that this in-built modularity may be a solution to the limited resources of these invertebrates' brains, constituting a novel approach to visual processing.
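The proposed modular integration, in which a single specialized eye-pair drives discrimination, can be caricatured with a toy weighting scheme; the eye-pair abbreviations follow standard jumping-spider anatomy, but the scores and weights are invented for illustration:

```python
# Toy integration model: each secondary eye-pair reports a biological-motion
# detection score; only the anterior lateral eyes (ALE) carry discriminative
# weight, per the specialization proposed in the abstract. (The principal AME
# pair is omitted, since it inspects objects rather than detecting motion.)
WEIGHTS = {"ALE": 1.0, "PLE": 0.0, "PME": 0.0}  # assumed specialization

def integrate(eye_scores, blocked=()):
    """Combine per-eye-pair scores; blocked eye-pairs contribute nothing."""
    return sum(WEIGHTS[eye] * score
               for eye, score in eye_scores.items() if eye not in blocked)

scores = {"ALE": 0.9, "PLE": 0.5, "PME": 0.4}
full_view = integrate(scores)                      # ALE available
ale_blocked = integrate(scores, blocked=("ALE",))  # chance-level behavior
```

The point of the model is that complex whole-animal behavior can emerge from such simple, single-eye responses combined by a fixed rule, rather than from central processing of all eyes equally.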
Computational models of the human visual cortex: on individual differences and ecologically valid input statistics
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the
eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex.
Vision science investigates these transformations that give rise to high-level processing of
visual objects, such as object recognition. In this thesis I investigate computational models
of the human visual cortex with regard to their ability to predict cortical responses to visual
objects. In particular, I describe two factors playing an important role in using deep neural
networks (DNNs) to better understand cortical functioning: the initial weight state and
ecologically more valid input statistics.
In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural
networks as a modeling framework for the visual cortex. Next, I will lay out the motivation
for the research questions investigated in this thesis and described in detail in Chapters 2, 3,
and 4.
Chapter 2 focuses on the impact of the initial weight state of a model on its ability
to predict cortical representations. I describe work in which we demonstrate that two
DNN instances, identical in every aspect but their initial weights, yield very dissimilar
representations. Relying on single network instances to predict cortical activation patterns
in response to sensory stimuli poses a problem for computational neuroscience: depending
on the initial set of weights the ability to mirror the cortical representations of these stimuli
might vary. Thus, results based on single ("off-the-shelf") model instances, as commonly
used in computational neuroscience, may not generalize. In contrast, using multiple DNN
instances might alleviate this problem, as they allow insights into the variability of a given
model architecture's ability to predict cortical representations. These individual differences between
model instances suggest that, to allow results to generalize more easily, the model instances
should be treated similarly to human experimental participants.
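The seed-dependence point can be illustrated by comparing the representational geometry of two instances that differ only in their random initialization; the one-layer "networks", the stimulus set, and the RDM-correlation comparison below are a minimal stand-in for the full analysis:

```python
import numpy as np

def random_net(seed, d_in=20, d_hidden=50):
    """One model 'instance': a single random tanh layer; instances differ
    only in their random initialization seed (illustrative stand-in)."""
    W = np.random.default_rng(seed).standard_normal((d_in, d_hidden)) * 0.5
    return lambda X: np.tanh(X @ W)

def rdm(acts):
    """Representational dissimilarity matrix: 1 - pairwise pattern correlation."""
    return 1.0 - np.corrcoef(acts)

rng = np.random.default_rng(42)
stimuli = rng.standard_normal((30, 20))  # 30 hypothetical stimuli

net_a, net_b = random_net(0), random_net(1)  # same architecture, two seeds
rdm_a, rdm_b = rdm(net_a(stimuli)), rdm(net_b(stimuli))

# Second-order agreement between the two instances' representational
# geometries; anything below 1.0 reflects seed-driven individual differences.
iu = np.triu_indices(30, k=1)
agreement = np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]
```

Averaging such agreement over many seed pairs, rather than reporting one instance, is the analogue of reporting group-level rather than single-participant results.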
In Chapter 3 I focus on ecologically more valid input statistics (in the form of training
images) aiming to improve a modelโs ability to predict cortical representations. The most
successful models of the human visual cortex to date are DNNs trained on object recognition
tasks designed with machine learning goals in mind. However, the image sets used for training
these DNNs are often not ecologically realistic. For example, training on the most widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does
not contain visual object categories encountered frequently in everyday human life (e.g.
woman, man, or child). This suggests that taking into account the human visual experience
when training models of the human visual cortex on a categorization task might help to
predict cortical representations. In this Chapter I describe the creation of a set of images
aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million
images from 565 basic level categories and is the largest image set specifically designed for
computational neuroscience to date. Ecoset is freely available to allow the community to test
their own hypotheses of models trained with input statistics matched to the human visual
environment.
In Chapter 4 we build on the results from the previous two Chapters. Using multiple
DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on
ecologically more valid input statistics (ecoset) might improve its ability to predict cortical
representations. I first demonstrate that ecoset might improve an architectureโs ability to
mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms state-of-the-art
computer vision and computational neuroscience models in terms of mirroring cortical
representations in the human brain. Thus, incorporating biological and ecological aspects,
such as brain-inspired architectural features and ecologically more valid input statistics, into
computational models may yield better predictions of response patterns in the human visual
cortex.
Treating DNN instances similarly to human experimental participants and considering
ecological and biological factors for building these DNNs may be an important step towards
better models of the human visual cortex. Such models might allow a better understanding of
the cortical processes underlying high-level vision in the human brain.

Cambridge Trust - Vice Chancellor's Award 2015
Cambridge Philosophical Society
MRC Cognition and Brain Sciences Unit