GRASS: Generative Recursive Autoencoders for Shape Structures
We introduce a novel neural network architecture for encoding and synthesis
of 3D shapes, particularly their structures. Our key insight is that 3D shapes
are effectively characterized by their hierarchical organization of parts,
which reflects fundamental intra-shape relationships such as adjacency and
symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a
flat, unlabeled, arbitrary part layout to a compact code. The code effectively
captures hierarchical structures of man-made 3D objects of varying structural
complexities despite being fixed-dimensional: an associated decoder maps a code
back to a full hierarchy. The learned bidirectional mapping is further tuned
using an adversarial setup to yield a generative model of plausible structures,
from which novel structures can be sampled. Finally, our structure synthesis
framework is augmented by a second trained module that produces fine-grained
part geometry, conditioned on global and local structural context, leading to a
full generative pipeline for 3D shapes. We demonstrate that without
supervision, our network learns meaningful structural hierarchies adhering to
perceptual grouping principles, produces compact codes which enable
applications such as shape classification and partial matching, and supports
shape synthesis and interpolation with significant variations in topology and
geometry.

Comment: Corresponding author: Kai Xu ([email protected]).
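The merge-and-split recursion at the heart of such an RvNN autoencoder can be illustrated with a minimal numpy sketch; the code dimension, the random weights, and the fixed three-part hierarchy below are illustrative assumptions, not the paper's actual architecture or training setup:

```python
import numpy as np

rng = np.random.default_rng(0)
CODE = 8  # fixed dimensionality of every part/shape code (illustrative)

# Hypothetical weights: a merge encoder (two child codes -> one parent code)
# and a split decoder (one parent code -> two child codes).
W_enc = rng.standard_normal((2 * CODE, CODE)) * 0.1
W_dec = rng.standard_normal((CODE, 2 * CODE)) * 0.1

def merge(left, right):
    """Encode two sibling part codes into a single fixed-size parent code."""
    return np.tanh(np.concatenate([left, right]) @ W_enc)

def split(parent):
    """Decode a parent code back into two child codes."""
    out = np.tanh(parent @ W_dec)
    return out[:CODE], out[CODE:]

# Encode a flat three-part layout bottom-up along a fixed hierarchy:
parts = [rng.standard_normal(CODE) for _ in range(3)]
root = merge(merge(parts[0], parts[1]), parts[2])

# Decode top-down, recovering a hierarchy of codes from the single root code.
left, right = split(root)
grand_left, grand_right = split(left)
```

In the actual model the hierarchy is discovered rather than fixed and the weights are learned; the sketch only shows how an arbitrary part layout compresses to one fixed-size code and decodes back into a hierarchy.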
Neural similarity between overlapping events at learning differentially affects reinstatement across the cortex
Episodic memory often involves high overlap between the actors, locations, and objects of everyday events. Under some circumstances, it may be beneficial to distinguish, or differentiate, neural representations of similar events to avoid interference at recall. Alternatively, forming overlapping representations of similar events, or integration, may aid recall by linking shared information between memories. It is currently unclear how the brain supports these seemingly conflicting functions of differentiation and integration. We used multivoxel pattern similarity analysis (MVPA) of fMRI data and neural-network analysis of visual similarity to examine how highly overlapping naturalistic events are encoded in patterns of cortical activity, and how the degree of differentiation versus integration at encoding affects later retrieval. Participants performed an episodic memory task in which they learned and recalled naturalistic video stimuli with high feature overlap. Visually similar videos were encoded in overlapping patterns of neural activity in temporal, parietal, and occipital regions, suggesting integration. We further found that encoding processes differentially predicted later reinstatement across the cortex. In visual processing regions in occipital cortex, greater differentiation at encoding predicted later reinstatement. Higher-level sensory processing regions in temporal and parietal lobes showed the opposite pattern, whereby highly integrated stimuli showed greater reinstatement. Moreover, integration in high-level sensory processing regions during encoding predicted greater accuracy and vividness at recall. These findings provide novel evidence that encoding-related differentiation and integration processes across the cortex have divergent effects on later recall of highly similar naturalistic events
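The pattern-similarity logic behind such an analysis can be sketched in a few lines of numpy; the voxel count and the simulated event patterns are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_voxels = 50  # illustrative region size

# Simulated voxel patterns: two overlapping events share most structure,
# while a third event is unrelated (all values invented for illustration).
event_a1 = rng.standard_normal(n_voxels)
event_a2 = event_a1 + 0.3 * rng.standard_normal(n_voxels)  # high feature overlap
event_b = rng.standard_normal(n_voxels)                    # unrelated event

def pattern_similarity(x, y):
    """Pearson correlation between two multivoxel activity patterns."""
    return np.corrcoef(x, y)[0, 1]

# Integration predicts high similarity between overlapping events;
# differentiation predicts similarity pushed below the unrelated baseline.
sim_overlap = pattern_similarity(event_a1, event_a2)
sim_baseline = pattern_similarity(event_a1, event_b)
```

In real MVPA the patterns come from fMRI response estimates and similarity is compared across conditions and regions, but the core computation is this pairwise pattern correlation.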
Investigating the Relationship between Human Visual Brain Activity and Emotions
Master's thesis, Seoul National University, College of Engineering, Department of Computer Science and Engineering, August 2019.

Encoding models predict brain activity elicited by stimuli and are used to investigate how information is processed in the brain. Decoding models, in contrast, predict information about the stimuli from brain activity and aim to identify whether such information is present. The two models are often used in conjunction. The brain's visual system has been shown to decode stimulus-related emotional information [15, 20]. However, brain activity in the visual system induced by the same visual stimuli with scrambled pixels has also been able to decode the same emotional information [20]. Considering these results, we ask to what extent encoded visual information also encodes emotional information. We use encoding models to select brain regions related to low-, mid-, and high-level visual features and use these brain regions to decode related emotional information. We found that these features are encoded not only in the occipital lobe, but also in later regions extending to the orbitofrontal cortex. Said brain regions were not able to decode emotional information, whereas other brain regions and plain CNN features were. These results show that brain regions encoding low-, mid-, and high-level visual features are not related to the previously found emotional decoding performance, and thus the decoding performance related to the occipital lobe should be attributed to non-vision-related processing.

Chapter 1 Introduction 1
Chapter 2 Background 4
2.1 Emotions and the Visual System 4
2.1.1 Visual System 4
2.1.2 Emotions 6
2.2 Functional Magnetic Resonance Imaging 7
2.2.1 BOLD Signal 8
2.2.2 Analysis of fMRI 9
2.2.3 Encoding Model 10
2.2.4 Decoding Model 11
2.3 Related Work 13
Chapter 3 Materials & Methods 17
3.1 Experimental Data 18
3.2 Encoding Model 19
3.3 Decoding Model 22
Chapter 4 Results 24
4.1 Encoding 24
4.2 Decoding 28
Chapter 5 Discussion and Limitations 31
5.1 Encoding 31
5.2 Decoding 33
5.3 Limitations and Feature Directions 35
Chapter 6 Conclusion 37
Abstract (in Korean) 42
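The encoding and decoding models at the core of this thesis can be sketched as linear regressions run in opposite directions; the dimensions, simulated data, and ridge penalty below are illustrative assumptions, not the thesis's actual settings:

```python
import numpy as np

rng = np.random.default_rng(2)
n_stim, n_feat, n_vox = 100, 10, 30  # illustrative sizes

# Simulated stimulus features (e.g. CNN activations) and voxel responses,
# generated from a hypothetical linear mapping plus noise.
X = rng.standard_normal((n_stim, n_feat))
B_true = rng.standard_normal((n_feat, n_vox))
Y = X @ B_true + 0.1 * rng.standard_normal((n_stim, n_vox))

def ridge(A, T, lam=1.0):
    """Closed-form ridge regression predicting targets T from inputs A."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ T)

B_enc = ridge(X, Y)  # encoding model: stimulus features -> voxel activity
B_dec = ridge(Y, X)  # decoding model: voxel activity -> stimulus features

# Encoding fit for one voxel (in practice evaluated on held-out stimuli).
r = np.corrcoef(Y[:, 0], (X @ B_enc)[:, 0])[0, 1]
```

The direction of the regression is the whole distinction: the encoding model asks which voxels a feature explains, while the decoding model asks which features a pattern of voxels reveals.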
Attention as a Mechanism for Object-Object Binding in Complex Scenes
The current study attempted to determine whether direct binding between objects in complex scenes occurs as a function of directed attention at encoding. In Experiment 1, participants viewed objects in one of three different types of contexts: unique scenes, similar scenes, or arrays with no contextual information. Critically, only half of the objects were attended for each encoding trial. Participants then completed an associative recognition task on pairs of items created from the previously studied scenes. Test pairs consisted of two attended or unattended objects, and were associated with a unique scene, a similar scene, or an array. Evidence of binding for attended objects was clear: associative recognition was better for attended pairs, relative to unattended pairs, regardless of the type of context in which the objects were studied. Object-context binding was not observed in memory for attended object pairs, but was observed for unattended object pairs. Experiment 2 explored the extent to which binding strength between object relationships varies as a function of temporal and/or spatial proximity. The procedure for Experiment 2 was identical to that of Experiment 1, with the exception that all of the objects in the encoding trials were attended. There were no significant main effects or interactions of spatial and temporal distance on binding strength, as measured by associative recognition.
Image Classification of Marine-Terminating Outlet Glaciers using Deep Learning Methods
A wealth of research has focused on elucidating the key controls on mass loss from the Greenland and Antarctic ice sheets in response to climate forcing, specifically in relation to the drivers of marine-terminating outlet glacier change. Despite the burgeoning availability of medium resolution satellite data, the manual methods traditionally used to monitor change of marine-terminating outlet glaciers from satellite imagery are time-consuming and can be subjective, especially where a mélange of icebergs and sea-ice exists at the terminus. To address this, recent advances in deep learning applied to image processing have created a new frontier in the field of automated delineation of glacier termini. However, at this stage, there remains a paucity of research on the use of deep learning for pixel-level semantic image classification of outlet glacier environments. This project develops and tests a two-phase deep learning approach based on a well-established convolutional neural network (CNN) called VGG16 for automated classification of Sentinel-2 satellite images. The novel workflow, termed CNN-Supervised Classification (CSC), was originally developed for fluvial settings but is adapted here to produce multi-class outputs for test imagery of glacial environments containing marine-terminating outlet glaciers in eastern Greenland. Results show mean F1 scores up to 95% for in-sample test imagery and 93% for out-of-sample test imagery, with significant improvements over traditional pixel-based methods such as band ratio techniques. This demonstrates the robustness of the deep learning workflow for automated classification despite the complex characteristics of the imagery.
Future research could focus on the integration of deep learning classification workflows with platforms such as Google Earth Engine (GEE), to classify imagery more efficiently and produce datasets for a range of glacial applications without the need for substantial prior experience in coding or deep learning
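The two-phase CSC idea, tile-level CNN classification followed by pixel-level labeling, can be caricatured in a few lines of numpy; the tiny tile size and the brightness-threshold stand-in for the trained VGG16 are assumptions made purely for illustration:

```python
import numpy as np

PATCH = 4  # illustrative tile size; the real workflow uses larger CNN tiles

def classify_patch(patch):
    """Stand-in for the phase-1 CNN (VGG16 in the project): label a tile
    by mean brightness. 0 = ocean/melange, 1 = glacier ice (toy classes)."""
    return int(patch.mean() > 0.5)

def csc_classify(image):
    """Tile the image, classify each tile, and broadcast the tile label
    back to every pixel it covers, giving a pixel-level class map."""
    h, w = image.shape
    labels = np.zeros((h, w), dtype=int)
    for i in range(0, h, PATCH):
        for j in range(0, w, PATCH):
            tile = image[i:i + PATCH, j:j + PATCH]
            labels[i:i + PATCH, j:j + PATCH] = classify_patch(tile)
    return labels

# Synthetic 8x8 scene: bright "glacier" on the left, dark "ocean" on the right.
img = np.zeros((8, 8))
img[:, :4] = 1.0
pixel_map = csc_classify(img)
```

The actual workflow replaces `classify_patch` with the trained CNN and uses the tile labels to drive a second, pixel-level classification phase, but the tiling-and-broadcast structure is the same.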
Eye-specific detection and a multi-eye integration model of biological motion perception
"Biological motion" refers to the distinctive kinematics observed in many living organisms, where visually perceivable points on the animal move at fixed distances from each other. Across the animal kingdom, many species have developed specialized visual circuitry to recognize such biological motion and to discriminate it from other patterns. Recently, this ability has been observed in the distributed visual system of jumping spiders. These eight-eyed animals use six eyes to perceive motion, while the remaining two (the principal anterior medial eyes) are shifted across the visual scene to further inspect detected objects. When presented with a biologically moving stimulus and a random one, jumping spiders turn to face the latter, clearly demonstrating the ability to discriminate between them. However, it remains unclear whether the principal eyes are necessary for this behavior, whether all secondary eyes can perform this discrimination, or whether a single eye-pair is specialized for this task. Here, we systematically tested the ability of jumping spiders to discriminate between biological and random visual stimuli by testing each eye-pair alone. Spiders were able to discriminate stimuli only when the anterior lateral eyes were unblocked, and performed at chance levels in other configurations. Interestingly, spiders showed a preference for biological motion over random stimuli, unlike in past work. We therefore propose a new model describing how specialization of the anterior lateral eyes for detecting biological motion contributes to multi-eye integration in this system. This integration generates more complex behavior through the combination of simple, single-eye responses. We posit that this in-built modularity may be a solution to the limited resources of these invertebrates' brains, constituting a novel approach to visual processing.
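The proposed modular integration, in which a single specialized eye-pair drives discrimination, can be caricatured with a toy weighting scheme; the eye-pair abbreviations follow standard jumping-spider anatomy, but the scores and weights are invented for illustration:

```python
# Toy integration model: each secondary eye-pair reports a biological-motion
# detection score; only the anterior lateral eyes (ALE) carry discriminative
# weight, per the specialization proposed in the abstract. (The principal AME
# pair is omitted, since it inspects objects rather than detecting motion.)
WEIGHTS = {"ALE": 1.0, "PLE": 0.0, "PME": 0.0}  # assumed specialization

def integrate(eye_scores, blocked=()):
    """Combine per-eye-pair scores; blocked eye-pairs contribute nothing."""
    return sum(WEIGHTS[eye] * score
               for eye, score in eye_scores.items() if eye not in blocked)

scores = {"ALE": 0.9, "PLE": 0.5, "PME": 0.4}
full_view = integrate(scores)                      # ALE available
ale_blocked = integrate(scores, blocked=("ALE",))  # chance-level behavior
```

The point of the model is that complex whole-animal behavior can emerge from such simple, single-eye responses combined by a fixed rule, rather than from central processing of all eyes equally.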
Computational models of the human visual cortex: on individual differences and ecologically valid input statistics
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the
eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex.
Vision science investigates these transformations that give rise to high-level processing of
visual objects, such as object recognition. In this thesis I investigate computational models
of the human visual cortex with regard to their ability to predict cortical responses to visual
objects. In particular, I describe two factors playing an important role in using deep neural
networks (DNNs) to better understand cortical functioning: the initial weight state and
ecologically more valid input statistics.
In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural
networks as a modeling framework for the visual cortex. Next, I will lay out the motivation
for the research questions investigated in this thesis and described in detail in Chapters 2, 3,
and 4.
Chapter 2 focuses on the impact of the initial weight state of a model on its ability
to predict cortical representations. I describe work in which we demonstrate that two
DNN instances, identical in every aspect but their initial weights, yield very dissimilar
representations. Relying on single network instances to predict cortical activation patterns
in response to sensory stimuli poses a problem for computational neuroscience: depending
on the initial set of weights the ability to mirror the cortical representations of these stimuli
might vary. Thus, results based on single ("off-the-shelf") model instances, as commonly
used in computational neuroscience, may not generalize. In contrast, using multiple DNN
instances might alleviate this problem, as they allow insights into the variability of a given
model architecture's ability to predict cortical representations. These individual differences between
model instances suggest that, to allow results to generalize more easily, the model instances
should be treated similarly to human experimental participants.
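The seed-dependence point can be illustrated by comparing the representational geometry of two instances that differ only in their random initialization; the one-layer "networks", the stimulus set, and the RDM-correlation comparison below are a minimal stand-in for the full analysis:

```python
import numpy as np

def random_net(seed, d_in=20, d_hidden=50):
    """One model 'instance': a single random tanh layer; instances differ
    only in their random initialization seed (illustrative stand-in)."""
    W = np.random.default_rng(seed).standard_normal((d_in, d_hidden)) * 0.5
    return lambda X: np.tanh(X @ W)

def rdm(acts):
    """Representational dissimilarity matrix: 1 - pairwise pattern correlation."""
    return 1.0 - np.corrcoef(acts)

rng = np.random.default_rng(42)
stimuli = rng.standard_normal((30, 20))  # 30 hypothetical stimuli

net_a, net_b = random_net(0), random_net(1)  # same architecture, two seeds
rdm_a, rdm_b = rdm(net_a(stimuli)), rdm(net_b(stimuli))

# Second-order agreement between the two instances' representational
# geometries; anything below 1.0 reflects seed-driven individual differences.
iu = np.triu_indices(30, k=1)
agreement = np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]
```

Averaging such agreement over many seed pairs, rather than reporting one instance, is the analogue of reporting group-level rather than single-participant results.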
In Chapter 3 I focus on ecologically more valid input statistics (in the form of training
images) aiming to improve a modelโs ability to predict cortical representations. The most
successful models of the human visual cortex to date are DNNs trained on object recognition
tasks designed with machine learning goals in mind. However, the image sets used for training
these DNNs are often not ecologically realistic. For example, training on the most widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does
not contain visual object categories encountered frequently in everyday human life (e.g.
woman, man, or child). This suggests that taking into account the human visual experience
when training models of the human visual cortex on a categorization task might help to
predict cortical representations. In this Chapter I describe the creation of a set of images
aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million
images from 565 basic level categories and is the largest image set specifically designed for
computational neuroscience to date. Ecoset is freely available to allow the community to test
their own hypotheses of models trained with input statistics matched to the human visual
environment.
In Chapter 4 we build on the results from the previous two Chapters. Using multiple
DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on
ecologically more valid input statistics (ecoset) might improve its ability to predict cortical
representations. I first demonstrate that ecoset might improve an architectureโs ability to
mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms state-of-the-art
computer vision and computational neuroscience models in terms of mirroring cortical
representations in the human brain. Thus, incorporating biological and ecological aspects,
such as brain-inspired architectural features and ecologically more valid input statistics, into
computational models may yield better predictions of response patterns in the human visual
cortex.
Treating DNN instances similarly to human experimental participants and considering
ecological and biological factors for building these DNNs may be an important step towards
better models of the human visual cortex. Such models might allow a better understanding of
the cortical processes underlying high-level vision in the human brain.

Cambridge Trust - Vice Chancellor's Award 2015
Cambridge Philosophical Society
MRC Cognition and Brain Sciences Unit