    Deep Anchored Convolutional Neural Networks

    Convolutional Neural Networks (CNNs) have proven extremely successful at solving computer vision tasks. State-of-the-art methods favor deep network architectures for their accuracy, at the cost of a massive number of parameters and high weight redundancy. Previous work has studied how to prune the weights of such CNNs. In this paper, we go to the other extreme and analyze the performance of a network that stacks a single convolution kernel across layers, as well as other weight-sharing techniques. We name it the Deep Anchored Convolutional Neural Network (DACNN). Sharing the same kernel weights across layers reduces the model size tremendously; more precisely, the network is compressed in memory by a factor of L, where L is the desired depth of the network, disregarding the fully connected prediction layer. The number of parameters in a DACNN barely increases as the network grows deeper, which allows us to build deep DACNNs without any concern about memory costs. We also introduce a partially shared weights network (DACNN-mix) as well as an easy plug-in module, coined regulators, to boost the performance of our architecture. We validated our idea on three datasets: CIFAR-10, CIFAR-100 and SVHN. Our results show that our model saves massive amounts of memory while maintaining high accuracy.
    Comment: This paper is accepted to the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
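
    As a rough illustration of the cross-layer weight-sharing idea (a minimal PyTorch sketch, not the authors' implementation), the snippet below reuses a single 3x3 convolution at every depth, so the parameter count stays nearly constant as the network grows deeper; the class name, channel width, and per-layer BatchNorm are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AnchoredCNN(nn.Module):
        """Toy network that reuses one convolution kernel at every depth."""
        def __init__(self, channels=64, depth=10, num_classes=10):
            super().__init__()
            self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
            # The single "anchored" kernel shared by all `depth` layers.
            self.shared = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            # Per-layer BatchNorm (an assumption) keeps statistics separate
            # even though the convolution weights are identical across layers.
            self.norms = nn.ModuleList([nn.BatchNorm2d(channels) for _ in range(depth)])
            self.head = nn.Linear(channels, num_classes)

        def forward(self, x):
            x = F.relu(self.stem(x))
            for bn in self.norms:
                x = F.relu(bn(self.shared(x)))   # same conv weights applied at every layer
            x = F.adaptive_avg_pool2d(x, 1).flatten(1)
            return self.head(x)

    model = AnchoredCNN(depth=10)
    print(sum(p.numel() for p in model.parameters()))   # barely grows when depth is increased
    print(model(torch.randn(2, 3, 32, 32)).shape)       # torch.Size([2, 10]) for CIFAR-sized input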

    End-to-End Deep Image Reconstruction From Human Brain Activity

    Deep neural networks (DNNs) have recently been applied successfully to brain decoding and image reconstruction from functional magnetic resonance imaging (fMRI) activity. However, direct training of a DNN with fMRI data is often avoided because the size of the available data is thought to be insufficient for training a complex network with numerous parameters. Instead, a pre-trained DNN usually serves as a proxy for hierarchical visual representations, and fMRI data are used to decode individual DNN features of a stimulus image with a simple linear model; the decoded features are then passed to a reconstruction module. Here, we directly trained a DNN model with fMRI data and the corresponding stimulus images to build an end-to-end reconstruction model. We accomplished this by training a generative adversarial network with an additional loss term defined in a high-level feature space (feature loss), using up to 6,000 training samples (natural images and fMRI responses). The trained model was tested on independent datasets and directly reconstructed images from fMRI patterns as input. Reconstructions obtained from our proposed method resembled the test stimuli (natural and artificial images), and reconstruction accuracy increased as a function of training-data size. Ablation analyses indicated that the feature loss played a critical role in achieving accurate reconstruction. Our results show that the end-to-end model can learn a direct mapping between brain activity and perception.
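
    As a hedged sketch of how such a feature loss can be combined with pixel and adversarial terms (illustrative, not the authors' code), the snippet below uses a fixed pretrained VGG-19 to define the high-level feature space; the choice of VGG-19, the helper name generator_loss, and the weighting coefficients are assumptions.

    import torch
    import torch.nn.functional as F
    from torchvision import models

    # Fixed, pretrained feature extractor defining the high-level feature space
    # (VGG-19 is an assumption; any pretrained vision DNN could play this role).
    feat_net = models.vgg19(weights="IMAGENET1K_V1").features.eval()
    for p in feat_net.parameters():
        p.requires_grad_(False)

    def generator_loss(recon, stimulus, d_fake_logits,
                       w_pix=1.0, w_feat=1.0, w_adv=0.01):
        """Combine pixel, feature, and adversarial terms (weights are illustrative)."""
        pixel = F.mse_loss(recon, stimulus)                        # image-space reconstruction term
        feature = F.mse_loss(feat_net(recon), feat_net(stimulus))  # feature loss in DNN feature space
        adversarial = F.binary_cross_entropy_with_logits(          # encourage fooling the discriminator
            d_fake_logits, torch.ones_like(d_fake_logits))
        return w_pix * pixel + w_feat * feature + w_adv * adversarial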

    Neuromatch Academy: a 3-week, online summer school in computational neuroscience

    Neuromatch Academy (https://academy.neuromatch.io; van Viegen et al., 2021) was designed as an online summer school to cover the basics of computational neuroscience in three weeks. The materials cover dominant and emerging computational neuroscience tools, show how they complement one another, and focus specifically on how they can help us better understand how the brain functions. A distinctive component of the materials is their focus on modeling choices: how to choose the right approach, how to build models, and how to evaluate models to determine whether they provide real (meaningful) insight. This meta-modeling component of the instructional materials asks which questions can be answered by different techniques, and how to apply them meaningfully to gain insight into brain function.

    Understanding how visual information is represented in humans and machines

    In the human brain, the incoming light to the retina is transformed into meaningful representations that allow us to interact with the world. In a similar vein, RGB pixel values are transformed by a deep neural network (DNN) into meaningful representations relevant to the computer vision task it was trained for. In my research, I therefore aim to reveal insights into the visual representations in the human visual cortex and in DNNs trained on vision tasks.

    Over the previous decade, DNNs have emerged as the state-of-the-art models for predicting neural responses in the human and monkey visual cortex. Research has shown that training on a task related to a brain region's function leads to better predictivity than a randomly initialized network. Based on this observation, we proposed that DNNs trained on different computer vision tasks can be used to identify the functional mapping of the human visual cortex. To validate this idea, we first investigated the occipital place area (OPA) using DNNs trained on a scene parsing task and a scene classification task. From previous investigations of OPA's function, we knew that it encodes navigational affordances, which require spatial information about the scene. We therefore hypothesized that OPA's representation should be closer to a scene parsing model than to a scene classification model, since the scene parsing task explicitly requires spatial information about the scene. Our results showed that scene parsing models had representations closer to OPA than scene classification models, validating our approach.

    We then selected multiple DNNs trained on a wide range of computer vision tasks, from low-level tasks such as edge detection, to 3D tasks such as surface normal estimation, to semantic tasks such as semantic segmentation. We compared the representations of these DNNs with all regions of the visual cortex, thereby revealing the functional representations of its different regions. Our results converged strongly with previous investigations of these brain regions, validating the feasibility of the proposed approach for finding functional representations of the human brain. They also provided new insights into under-investigated brain regions that can serve as starting hypotheses and promote further investigation of those regions.

    We applied the same approach to gain representational insights about DNNs themselves. A DNN usually consists of multiple layers, each performing a computation, leading to a final layer that makes the prediction for a given task. Training on different tasks can lead to very different representations. We therefore first investigated at which stage the representations of DNNs trained on different tasks start to differ, and whether DNNs trained on similar tasks develop similar representations while dissimilar tasks lead to more dissimilar representations. We used the same set of DNNs as in the previous work, trained on the Taskonomy dataset on a diverse range of 2D, 3D and semantic tasks. Given a DNN trained on a particular task, we compared the representations of its layers to the corresponding layers of the other DNNs, aiming to reveal where in the network architecture task-specific representations become prominent. We found that task specificity increases as we go deeper into the DNN architecture and that similar tasks cluster into groups.
    The grouping obtained from representational similarity was highly correlated with a grouping based on transfer learning, which suggests an interesting application of the approach to model selection for transfer learning. In previous work, several new measures had been introduced to compare DNN representations. We identified the commonalities among these measures and unified them into a single framework, referred to as duality diagram similarity, which opens up new possibilities for similarity measures in understanding DNN representations. Besides demonstrating a much higher correlation with transfer learning than previous state-of-the-art measures, we extended the framework to the layer-wise representations of models trained on the ImageNet and Places datasets using different tasks, and demonstrated its applicability to layer selection for transfer learning. In all the previous works, we used task-specific DNN representations to understand the representations in the human visual cortex and in other DNNs. We could interpret our findings in terms of computer vision tasks such as edge detection, semantic segmentation, and depth estimation; however, we could not map the representations onto human-interpretable concepts. In our most recent work, we therefore developed a new method that associates individual artificial neurons with human-interpretable concepts. Overall, the work in this thesis revealed new insights into the representations of the visual cortex and of DNNs...

    In the human brain, the light arriving at the retina is transformed into meaningful representations that allow us to interact with the world. In a similar way, RGB pixel values are transformed by a deep neural network (DNN) into meaningful representations relevant to the computer vision task it was trained for. In my research, I therefore aim to gain insights into how visual information is represented in the human visual cortex and in DNNs trained to solve visual tasks. The main idea in the first part of the thesis is to examine the representations of both the human visual cortex and DNNs by comparing DNNs trained on different tasks. To this end, we compare a brain region or a DNN layer with the task-specific representations of several DNNs trained on different tasks. The comparison tells us which task-relevant representation is closest to the representation of the brain region or target DNN.

    Chapter 1: Understanding the representations in the human visual cortex. In the first chapter, I focus on understanding the representations of different regions of the visual cortex. We first examine whether our proposed approach yields insights into the representation of a brain region that are consistent with previous investigations of that region. Having validated the approach, we can apply it to understand the representations of less studied brain regions and gain insights into their functional roles. In the first part of Chapter 1, we therefore validate our approach on the well-studied scene-selective regions, the occipital place area (OPA) and the parahippocampal place area (PPA). In the second part of the chapter, we apply our approach to several regions of the visual cortex and thereby provide insights into their representations.

    Probing scene-selective regions: Scene-selective regions are brain regions that respond strongly to scene images compared with images from other categories and scrambled images. A neuroimaging study showed that OPA, one of the scene-selective regions, is involved in predicting which parts of an indoor environment are relevant for navigation (navigational affordances). To identify these navigational affordances, spatial information about where obstacles are located and where the exit of the scene lies is crucial. Our hypothesis was therefore that the representation in OPA should be closer to a computational model trained to parse scenes into their components (obstacles, floor, wall, etc.) than to a model trained to identify the category of the scene. To evaluate this hypothesis, we selected scene parsing and scene classification models and compared their representations with the representation of OPA. To ensure the generalizability of our results, we used three architectures for both scene parsing and scene classification. We found that, across all three architectures, the scene parsing models better predicted the responses of the scene-selective region OPA. These results confirm our hypothesis and thus the feasibility of the proposed approach for understanding the representations of brain regions in the visual cortex.

    Probing the entire visual cortex: Having validated the proposed approach in the previous part, we extend the set of models and brain regions considered. To ensure that differences in representational similarity between a given brain region and a DNN are due only to the task, our criterion for model selection was that all models be trained on the same dataset (no influence of training data) and have an identical architecture (no influence of architecture). We therefore selected a large set of models trained on the Taskonomy dataset, covering a wide range of tasks, from simple 2D tasks to tasks requiring a three-dimensional understanding of the scene and semantic knowledge about it. For the brain regions, we selected the entire visual cortex and subdivided it into regions using a probabilistic anatomical atlas...
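
    A minimal sketch of the representational similarity comparison that this kind of approach relies on (illustrative, not the thesis code): build a representational dissimilarity matrix (RDM) for a brain region and for each task-trained DNN, then rank tasks by how well their RDMs correlate with the region's RDM. The toy data shapes, the task names, and the Spearman comparison are assumptions.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    def rdm(responses):
        """responses: (n_stimuli, n_features) activations -> condensed RDM vector."""
        return pdist(responses, metric="correlation")    # 1 - Pearson r for every stimulus pair

    def rdm_similarity(rdm_a, rdm_b):
        rho, _ = spearmanr(rdm_a, rdm_b)                 # rank correlation of the two RDMs
        return rho

    rng = np.random.default_rng(0)
    brain = rng.normal(size=(100, 500))                  # toy data: 100 stimuli x 500 voxels in one region
    dnn_layers = {                                       # toy activations from two task-trained DNNs
        "scene_parsing": rng.normal(size=(100, 2048)),
        "scene_classification": rng.normal(size=(100, 2048)),
    }

    brain_rdm = rdm(brain)
    scores = {task: rdm_similarity(brain_rdm, rdm(acts)) for task, acts in dnn_layers.items()}
    print(max(scores, key=scores.get))                   # task whose representation is closest to the region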

    The spatiotemporal neural dynamics of object location representations in the human brain

    To interact with objects in complex environments, we must know what they are and where they are despite challenging viewing conditions. Here, we investigated where, how and when representations of object location and category emerge in the human brain when objects appear on cluttered natural scene images, using a combination of functional magnetic resonance imaging, electroencephalography and computational models. We found location representations to emerge along the ventral visual stream towards the lateral occipital complex, mirrored by their gradual emergence in deep neural networks. Time-resolved analysis suggested that computing object location representations involves recurrent processing in high-level visual cortex. Object category representations also emerged gradually along the ventral visual stream, with evidence for recurrent computations. These results resolve the spatiotemporal dynamics of the ventral visual stream that give rise to representations of where and what objects are present in a scene under challenging viewing conditions.
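
    A hedged sketch of the kind of time-resolved decoding such analyses rely on (illustrative, not the study's pipeline): a linear classifier is trained and cross-validated at each time point to decode object location from the sensor pattern, and the resulting accuracy time course indicates when location information becomes available. Data shapes, labels, classifier, and cross-validation settings below are toy assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_trials, n_sensors, n_times = 200, 64, 50
    X = rng.normal(size=(n_trials, n_sensors, n_times))    # toy EEG: trials x sensors x time points
    y = rng.integers(0, 4, size=n_trials)                   # toy labels: one of four object locations

    # Fit and cross-validate a linear classifier independently at each time point.
    accuracy = np.array([
        cross_val_score(LogisticRegression(max_iter=1000), X[:, :, t], y, cv=5).mean()
        for t in range(n_times)
    ])
    print(accuracy.shape)                                   # one decoding accuracy per time point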