24 research outputs found

    GRASS: Generative Recursive Autoencoders for Shape Structures

    We introduce a novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures. Our key insight is that 3D shapes are effectively characterized by their hierarchical organization of parts, which reflects fundamental intra-shape relationships such as adjacency and symmetry. We develop a recursive neural net (RvNN) based autoencoder to map a flat, unlabeled, arbitrary part layout to a compact code. The code effectively captures hierarchical structures of man-made 3D objects of varying structural complexities despite being fixed-dimensional: an associated decoder maps a code back to a full hierarchy. The learned bidirectional mapping is further tuned using an adversarial setup to yield a generative model of plausible structures, from which novel structures can be sampled. Finally, our structure synthesis framework is augmented by a second trained module that produces fine-grained part geometry, conditioned on global and local structural context, leading to a full generative pipeline for 3D shapes. We demonstrate that without supervision, our network learns meaningful structural hierarchies adhering to perceptual grouping principles, produces compact codes which enable applications such as shape classification and partial matching, and supports shape synthesis and interpolation with significant variations in topology and geometry. Comment: Corresponding author: Kai Xu.
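    The recursive encode/decode step at the heart of this pipeline can be sketched compactly. Below is a minimal illustration in PyTorch, assuming per-part box features and a binary part hierarchy; module names such as MergeEncoder and SplitDecoder are illustrative placeholders, not the authors' released code, and the adversarial tuning and geometry-synthesis stages are omitted.

    import torch
    import torch.nn as nn

    CODE_DIM = 80  # fixed-dimensional structure code (illustrative size)

    class MergeEncoder(nn.Module):
        """Folds two child codes (e.g., adjacent or symmetric parts) into one parent code."""
        def __init__(self, dim=CODE_DIM):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())

        def forward(self, left, right):
            return self.net(torch.cat([left, right], dim=-1))

    class SplitDecoder(nn.Module):
        """Inverse operation: expands a parent code back into two child codes."""
        def __init__(self, dim=CODE_DIM):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim, 2 * dim), nn.Tanh())

        def forward(self, parent):
            left, right = self.net(parent).chunk(2, dim=-1)
            return left, right

    def encode_hierarchy(node, leaf_embed, merge):
        """Recursively folds a binary part hierarchy into a single root code.
        `node` is either a leaf feature tensor or a (left, right) tuple."""
        if isinstance(node, tuple):
            return merge(encode_hierarchy(node[0], leaf_embed, merge),
                         encode_hierarchy(node[1], leaf_embed, merge))
        return leaf_embed(node)

    # Toy usage: three parts with 12-D box features, grouped as ((p0, p1), p2).
    leaf_embed = nn.Sequential(nn.Linear(12, CODE_DIM), nn.Tanh())
    merge = MergeEncoder()
    parts = [torch.randn(12) for _ in range(3)]
    root = encode_hierarchy(((parts[0], parts[1]), parts[2]), leaf_embed, merge)
    print(root.shape)  # torch.Size([80]) -- same size regardless of part count

    Decoding proceeds in the opposite direction: SplitDecoder repeatedly expands a code into child codes until leaf codes are reached, which is what lets a single fixed-dimensional vector unfold into a full part hierarchy.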

    Investigation the Relationship between Human Visual Brain Activity and Emotions

    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2019. 8. ๊น€๊ฑดํฌ.์ธ์ฝ”๋”ฉ ๋ชจ๋ธ์€ ์ž๊ทน์œผ๋กœ๋ถ€ํ„ฐ ์ด‰๋ฐœ๋œ ๋‡Œ ํ™œ๋™์„ ์˜ˆ์ธกํ•˜๊ณ , ๋‡Œ๊ฐ€ ์ •๋ณด๋ฅผ ์–ด๋–ป ๊ฒŒ ์ฒ˜๋ฆฌํ•˜๋Š”์ง€ ๋ถ„์„ํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ๋œ๋‹ค.๋ฐ˜๋ฉด ๋””์ฝ”๋”ฉ ๋ชจ๋ธ์€ ๋‡Œ ํ™œ๋™์œผ๋กœ๋ถ€ํ„ฐ ์ž๊ทน์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์˜ˆ์ธกํ•˜๊ณ , ํ˜„์žฌ ํŠน์ • ์ž๊ทน์ด ์กด์žฌํ•˜๋Š”์ง€๋ฅผ ํŒ๋‹จํ•˜๋Š” ๊ฒƒ ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. ๋‘ ๋ชจ๋ธ์€ ์ข…์ข… ํ•จ๊ป˜ ์‚ฌ์šฉ๋œ๋‹ค. ๋‡Œ์˜ ์‹œ๊ฐ ์ฒด๊ณ„๋Š” ์ž๊ทน์— ๋Œ€ํ•œ ๊ฐ์ • ์ •๋ณด๋ฅผ ๋‹ด๊ณ  ์žˆ๊ณ  [15, 20], ํ”ฝ์…€๋“ค์ด ๋ฌด์ž‘์œ„๋กœ ์„ž์—ฌ ์žˆ๋Š” ์ž๊ทน์œผ ๋กœ๋ถ€ํ„ฐ ์œ ๋„๋œ ์‹œ๊ฐ ์ฒด๊ณ„์˜ ํ™œ๋™์œผ๋กœ๋ถ€ํ„ฐ๋„ ๊ฐ™์€ ๊ฐ์ • ์ •๋ณด๋ฅผ ์ถ”์ถœํ•ด๋‚ผ ์ˆ˜ ์žˆ๋‹ค๋Š” ๊ฒƒ์ด ์•Œ๋ ค์ ธ ์žˆ๋‹ค [20]. ์ด๋Ÿฐ ์—ฐ๊ตฌ๋“ค์„ ๊ณ ๋ คํ•˜์—ฌ, ์šฐ๋ฆฌ๋Š” ์‹œ๊ฐ ์ฒด๊ณ„๊ฐ€ ์–ด๋Š ์ˆ˜์ค€๊นŒ์ง€ ๊ฐ์ • ์ •๋ณด๋ฅผ ๋‹ด๊ณ  ์žˆ๋Š”์ง€ ํƒ๊ตฌํ•œ๋‹ค. ์šฐ๋ฆฌ๋Š” ์ธ์ฝ”๋”ฉ ๋ชจ๋ธ์„ ์‚ฌ ์šฉํ•˜์—ฌ ์ƒ์œ„/์ค‘์œ„/ํ•˜์œ„ ์‹œ๊ฐ ํŠน์„ฑ(feature)๊ณผ ๊ฐ๊ฐ ๊ด€๋ จ์ด ์žˆ๋Š” ๋‡Œ ์˜์—ญ์„ ์„ ํƒํ•˜๊ณ , ์ด ๋‡Œ ์˜์—ญ๋“ค๋กœ๋ถ€ํ„ฐ ๊ฐ์ • ์ •๋ณด๋ฅผ ๋””์ฝ”๋”ฉ ํ•œ๋‹ค. ์šฐ๋ฆฌ๋Š” ํ›„๋‘์—ฝ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ์•ˆ์™€์ „๋‘ํ”ผ์งˆ๊นŒ์ง€ ์ด์–ด์ง€๋Š” ์˜์—ญ๋“ค์ด ์ด๋Ÿฐ ํŠน์„ฑ๋“ค์„ ์ธ์ฝ”๋”ฉ ํ•˜๊ณ  ์žˆ ๋‹ค๋Š” ๊ฒƒ์„ ๋ฐํžŒ๋‹ค. ๋‹ค๋ฅธ ๋‡Œ ์˜์—ญ๋“ค๊ณผ ๋‹จ์ˆœํ•œ CNN ํŠน์„ฑ๋“ค๊ณผ๋Š” ๋‹ฌ๋ฆฌ, ์ด๋Ÿฌํ•œ ๋‡Œ ์˜์—ญ๋“ค๋กœ๋ถ€ํ„ฐ๋Š” ๊ฐ์ • ์ •๋ณด๋ฅผ ๋””์ฝ”๋”ฉ ํ•  ์ˆ˜ ์—†์—ˆ๋‹ค. ์ด ๊ฒฐ๊ณผ๋“ค์€ ์ƒ์œ„/ ์ค‘์œ„/ํ•˜์œ„ ์‹œ๊ฐ ํŠน์„ฑ๋“ค์„ ์ธ์ฝ”๋”ฉ ํ•˜๊ณ  ์žˆ๋Š” ๋‡Œ ์˜์—ญ๋“ค์ด ์•ž์„œ ๋ฐํ˜€์ง„ ๊ฐ์ • ์ •๋ณด ๋””์ฝ”๋”ฉ๊ณผ ๊ด€๋ จ์ด ์—†์Œ์„ ๋ณด์—ฌ์ฃผ๋ฉฐ, ๋”ฐ๋ผ์„œ ํ›„๋‘์—ฝ๊ณผ ๊ด€๋ จ๋œ ๊ฐ์ • ์ •๋ณด ๋””์ฝ”๋”ฉ ์„ฑ๋Šฅ์€ ์‹œ๊ฐ๊ณผ ๊ด€๋ จ ์—†๋Š” ์ •๋ณด ์ฒ˜๋ฆฌ์— ๊ธฐ์ธํ•œ๋‹ค.Encoding models predict brain activity elicited by stimuli and are used to investigate how information is processed in the brain. Whereas decod- ing models predict information about the stimuli using brain activity and aim to identify whether such information is present. Both models are of- ten used in conjunction. The brains visual system has shown to decode stimuli related emotional information [15, 20]. However brain activity in the visual system induced by the same visual stimuli but scrambled, has also been able to decode the same emotional information [20]. Consid- ering these results, we raise the question to what extent encoded visual information also encodes emotional information. We use encoding models to select brain regions related to low-, mid- and high- level visual features and use these brain regions to decode related emotional information. We found that these features are encoded not only in the occipital lobe, but also in later regions extending to the orbito-frontal cortex. Said brain re- gions were not able to decode emotion information, whereas other brain regions and plain CNN features were. 
These results show that brain re- gions encoding low-, mid- and high- level visual features are not related to the previously found emotional decoding performance and thus, the decoding performance related to the occipital lobe should be contributed to non-vision related processing.Chapter 1 Introduction 1 Chapter 2 Background 4 2.1 Emotions and the Visual System 4 2.1.1 Visualsystem 4 2.1.2 Emotions 6 2.2 functional Magnetic Resonance Imaging 7 2.2.1 BOLDsignal 8 2.2.2 Analysis of fMRI 9 2.2.3 EncodingModel 10 2.2.4 DecodingModel 11 2.3 RelatedWork 13 Chapter 3 Materials & Methods 17 3.1 Experimental data 18 3.2 Encoding model 19 3.3 Decoding Model 22 Chapter 4 Results 24 4.1 Encoding 24 4.2 Decoding 28 Chapter 5 Discussion and Limitations 31 5.1 Encoding 31 5.2 Decoding 33 5.3 Limitations and Feature Directions 35 Chapter 6 Conclusion 37 ์š”์•ฝ 42Maste
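    The encoding-then-decoding analysis described above follows a common two-step pattern: first fit a voxel-wise encoding model on visual features and keep the voxels it predicts well, then test whether those voxels alone can decode the emotion label. The sketch below illustrates that pattern with scikit-learn on synthetic arrays; the shapes, regularization, and region-selection rule are stand-ins, not the thesis implementation.

    import numpy as np
    from sklearn.linear_model import Ridge, LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_features, n_voxels = 200, 512, 1000
    cnn_features = rng.standard_normal((n_trials, n_features))  # low/mid/high-level visual features
    bold = rng.standard_normal((n_trials, n_voxels))            # voxel responses (BOLD)
    emotion = rng.integers(0, 2, size=n_trials)                 # e.g., positive vs. negative valence

    # 1) Encoding: predict each voxel from the visual features, then keep the
    #    voxels that are predicted best on a held-out half (feature-encoding ROI).
    half = n_trials // 2
    enc = Ridge(alpha=1.0).fit(cnn_features[:half], bold[:half])
    pred = enc.predict(cnn_features[half:])
    voxel_r = np.array([np.corrcoef(pred[:, v], bold[half:, v])[0, 1] for v in range(n_voxels)])
    roi = np.argsort(voxel_r)[-100:]

    # 2) Decoding: try to read out the emotion label from those voxels only.
    acc = cross_val_score(LogisticRegression(max_iter=1000), bold[:, roi], emotion, cv=5).mean()
    print(f"emotion decoding accuracy from feature-encoding voxels: {acc:.2f}")

    Chance-level accuracy from the selected voxels in such an analysis would correspond to the thesis finding that the feature-encoding regions do not themselves carry the emotion signal.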

    Attention as a Mechanism for Object-Object Binding in Complex Scenes

    The current study attempted to determine whether direct binding between objects in complex scenes occurs as a function of directed attention at encoding. In Experiment 1, participants viewed objects in one of three types of context: unique scenes, similar scenes, or arrays with no contextual information. Critically, only half of the objects were attended on each encoding trial. Participants then completed an associative recognition task on pairs of items created from the previously studied scenes. Test pairs consisted of two attended or two unattended objects, and were associated with a unique scene, a similar scene, or an array. Evidence of binding for attended objects was clear. Associative recognition was better for attended pairs, relative to unattended pairs, regardless of the type of context in which the objects were studied. Object-context binding was not observed in memory for attended object pairs, but was observed for unattended object pairs. Experiment 2 explored the extent to which binding strength between object relationships varies as a function of temporal and/or spatial proximity. The procedure for Experiment 2 was identical to that of Experiment 1, with the exception that all of the objects in the encoding trials were attended. There were no significant main effects or interactions of spatial and temporal distance on binding strength, as measured by associative recognition.

    Image Classification of Marine-Terminating Outlet Glaciers using Deep Learning Methods

    A wealth of research has focused on elucidating the key controls on mass loss from the Greenland and Antarctic ice sheets in response to climate forcing, specifically in relation to the drivers of marine-terminating outlet glacier change. Despite the burgeoning availability of medium resolution satellite data, the manual methods traditionally used to monitor change of marine-terminating outlet glaciers from satellite imagery are time-consuming and can be subjective, especially where a mélange of icebergs and sea-ice exists at the terminus. To address this, recent advances in deep learning applied to image processing have created a new frontier in the field of automated delineation of glacier termini. However, at this stage, there remains a paucity of research on the use of deep learning for pixel-level semantic image classification of outlet glacier environments. This project develops and tests a two-phase deep learning approach based on a well-established convolutional neural network (CNN) called VGG16 for automated classification of Sentinel-2 satellite images. The novel workflow, termed CNN-Supervised Classification (CSC), was originally developed for fluvial settings but is adapted here to produce multi-class outputs for test imagery of glacial environments containing marine-terminating outlet glaciers in eastern Greenland. Results show mean F1 scores up to 95% for in-sample test imagery and 93% for out-of-sample test imagery, with significant improvements over traditional pixel-based methods such as band ratio techniques. This demonstrates the robustness of the deep learning workflow for automated classification despite the complex characteristics of the imagery. Future research could focus on the integration of deep learning classification workflows with platforms such as Google Earth Engine (GEE), to classify imagery more efficiently and produce datasets for a range of glacial applications without the need for substantial prior experience in coding or deep learning.
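    As a rough sketch of the first phase of such a CSC workflow, the snippet below builds a VGG16-based tile classifier in TensorFlow/Keras. The class list, tile size, and three-band input are illustrative assumptions, not the project's actual configuration.

    import tensorflow as tf

    NUM_CLASSES = 5   # e.g., glacier ice, mélange, open water, bedrock, snow (illustrative)
    TILE = 96         # tile size in pixels (illustrative)

    # VGG16 backbone with ImageNet weights; assumes tiles are reduced to 3 bands.
    base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                       input_shape=(TILE, TILE, 3))
    base.trainable = False  # reuse pretrained features, train only the small head

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(tiles, tile_labels, ...)  # tiles cut from Sentinel-2 scenes

    In the second phase of CSC, the tile-level predictions of such a network supervise a lighter pixel-level classifier, which produces the final per-pixel semantic map that scores like the F1 values above evaluate.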

    Eye-specific detection and a multi-eye integration model of biological motion perception

    โ€˜Biological motionโ€™ refers to the distinctive kinematics observed in many living organisms, where visually perceivable points on the animal move at fixed distances from each other. Across the animal kingdom, many species have developed specialized visual circuitry to recognize such biological motion and to discriminate it from other patterns. Recently, this ability has been observed in the distributed visual system of jumping spiders. These eight-eyed animals use six eyes to perceive motion, while the remaining two (the principal anterior medial eyes) are shifted across the visual scene to further inspect detected objects. When presented with a biologically moving stimulus and a random one, jumping spiders turn to face the latter, clearly demonstrating the ability to discriminate between them. However, it remains unclear whether the principal eyes are necessary for this behavior, whether all secondary eyes can perform this discrimination, or whether a single eye-pair is specialized for this task. Here, we systematically tested the ability of jumping spiders to discriminate between biological and random visual stimuli by testing each eye-pair alone. Spiders were able to discriminate stimuli only when the anterior lateral eyes were unblocked, and performed at chance levels in other configurations. Interestingly, spiders showed a preference for biological motion over random stimuli โ€“ unlike in past work. We therefore propose a new model describing how specialization of the anterior lateral eyes for detecting biological motion contributes to multi-eye integration in this system. This integration generates more complex behavior through the combination of simple, single-eye responses. We posit that this in-built modularity may be a solution to the limited resources of these invertebrates' brains, constituting a novel approach to visual processing