
    Computational Models of Feature Representations in the Ventral Visual Stream

    Understanding vision requires unpacking the representations of the visual processing hierarchy. One major and unresolved challenge is to understand the representations of high-level category-selective areas – areas that respond preferentially to certain semantic categories of stimuli (e.g., scene-selective areas respond more to scenes than objects). Attempts at characterizing the representations of category-selective areas have been hampered by the difficulty of describing their complex perceptual representations in words — these representations exist in an “ineffable valley” between the describable patterns of perceptual features (e.g., edges, colors) and the commonsense concepts of visual cognition (e.g., object categories). Here I developed a novel approach to identify the emergent properties of mid-level representations in purely feedforward deep convolutional neural network (CNN) models of category-selective cortex. Using this approach, CNN models were fit to scene-evoked fMRI responses in both scene-selective cortex and object-selective cortex. This method uses a semantically-guided image-occlusion procedure together with behavioral ratings to systematically characterize the tuning profiles of the category-selective CNNs. I found that while the representations in category-selective CNNs appear complex and difficult to describe at a surface level, large-scale computational analyses can reveal 1) interpretable descriptions of mid-level feature representations and 2) the emergence of semantic selectivity through purely bottom-up perceptual feature tuning. Specifically, these models provide a proof-of-principle demonstration of how the semantic selectivity of category-selective regions could arise through perceptual-feature tuning in a small series of feedforward computations. These effects were robust to variations of model hyperparameters and were reproducible across different CNN architectures and training procedures. 
Taken together, I demonstrated how large datasets and in-silico computational models can be used to reveal the tuning profiles of category-selective regions and to identify how semantic preferences could emerge through bottom-up processes.
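The semantically-guided image-occlusion idea can be sketched in miniature: occlude each labeled region of an image and score it by how much the model unit's response drops. The functions and the toy "unit" below are hypothetical stand-ins for illustration, not the CNN models fit to fMRI responses in this work:

```python
import numpy as np

def unit_response(image):
    # Stand-in for a model unit fit to brain responses; this toy
    # "unit" simply responds to mean intensity in the upper half.
    return image[: image.shape[0] // 2].mean()

def occlusion_importance(image, masks):
    """Score each labeled region by the response drop when it is occluded."""
    baseline = unit_response(image)
    scores = {}
    for label, mask in masks.items():
        occluded = image.copy()
        occluded[mask] = image.mean()  # replace the region with mean intensity
        scores[label] = baseline - unit_response(occluded)
    return scores

# Toy image: bright "sky" on top, dark "ground" below.
img = np.vstack([np.ones((4, 8)), np.zeros((4, 8))])
masks = {
    "sky": np.vstack([np.ones((4, 8), bool), np.zeros((4, 8), bool)]),
    "ground": np.vstack([np.zeros((4, 8), bool), np.ones((4, 8), bool)]),
}
scores = occlusion_importance(img, masks)  # "sky" scores higher for this unit
```

Repeating this over many images and semantic labels, and relating the scores to behavioral ratings, is the kind of large-scale analysis that yields interpretable tuning profiles.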

    Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense

    Recent progress in deep learning is essentially based on a "big data for small tasks" paradigm, under which massive amounts of data are used to train a classifier for a single narrow task. In this paper, we call for a shift that flips this paradigm upside down. Specifically, we propose a "small data for big tasks" paradigm, wherein a single artificial intelligence (AI) system is challenged to develop "common sense", enabling it to solve a wide range of tasks with little training data. We illustrate the potential power of this new paradigm by reviewing models of common sense that synthesize recent breakthroughs in both machine and human vision. We identify functionality, physics, intent, causality, and utility (FPICU) as the five core domains of cognitive AI with humanlike common sense. Taken as a unified concept, FPICU is concerned with the questions of "why" and "how", beyond the dominant "what" and "where" framework for understanding vision. These domains are invisible in terms of pixels but nevertheless drive the creation, maintenance, and development of visual scenes. We therefore coin them the "dark matter" of vision. Just as our universe cannot be understood by merely studying observable matter, we argue that vision cannot be understood without studying FPICU. We demonstrate the power of this perspective to develop cognitive AI systems with humanlike common sense by showing how to observe and apply FPICU with little training data to solve a wide range of challenging tasks, including tool use, planning, utility inference, and social learning. In summary, we argue that the next generation of AI must embrace "dark" humanlike common sense for solving novel tasks.

    The computational neurology of active vision

    In this thesis, we appeal to recent developments in theoretical neurobiology – namely, active inference – to understand the active visual system and its disorders. Chapter 1 reviews the neurobiology of active vision. This introduces some of the key conceptual themes around attention and inference that recur through subsequent chapters. Chapter 2 provides a technical overview of active inference, and its interpretation in terms of message passing between populations of neurons. Chapter 3 applies the material in Chapter 2 to provide a computational characterisation of the oculomotor system. This deals with two key challenges in active vision: deciding where to look, and working out how to look there. The homology between this message passing and the brain networks solving these inference problems provides a basis for in silico lesion experiments, and an account of the aberrant neural computations that give rise to clinical oculomotor signs (including internuclear ophthalmoplegia). Chapter 4 picks up on the role of uncertainty resolution in deciding where to look, and examines the role of beliefs about the quality (or precision) of data in perceptual inference. We illustrate how abnormal prior beliefs influence inferences about uncertainty and give rise to neuromodulatory changes and visual hallucinatory phenomena (of the sort associated with synucleinopathies). We then demonstrate how synthetic pharmacological perturbations that alter these neuromodulatory systems give rise to the oculomotor changes associated with drugs acting upon these systems. Chapter 5 develops a model of visual neglect, using an oculomotor version of a line cancellation task. We then test a prediction of this model using magnetoencephalography and dynamic causal modelling. Chapter 6 concludes by situating the work in this thesis in the context of computational neurology.
This illustrates how the variational principles used here to characterise the active visual system may be generalised to other sensorimotor systems and their disorders.
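The role of precision in perceptual inference can be illustrated with a toy Gaussian model in which precision-weighted prediction errors drive belief updating. This is a minimal sketch of the general principle, not the thesis's generative model; the variables, precisions, and learning rate are invented for illustration:

```python
def infer(obs, mu_prior, pi_obs, pi_prior, steps=200, lr=0.05):
    """Gradient ascent on the log joint of a Gaussian model:
    obs ~ N(mu, 1/pi_obs), mu ~ N(mu_prior, 1/pi_prior)."""
    mu = mu_prior
    for _ in range(steps):
        eps_obs = pi_obs * (obs - mu)         # sensory prediction error
        eps_pri = pi_prior * (mu - mu_prior)  # prior prediction error
        mu += lr * (eps_obs - eps_pri)        # precision-weighted update
    return mu

# With high sensory precision the posterior estimate tracks the data;
# deflating that precision (as aberrant beliefs about data quality might)
# pulls the estimate back toward the prior.
trusting = infer(obs=2.0, mu_prior=0.0, pi_obs=9.0, pi_prior=1.0)
doubting = infer(obs=2.0, mu_prior=0.0, pi_obs=0.1, pi_prior=1.0)
```

The fixed point of the update is the analytic posterior mean, (pi_obs * obs + pi_prior * mu_prior) / (pi_obs + pi_prior), so lowering pi_obs shifts inference toward the prior — the kind of imbalance invoked to explain hallucinatory phenomena.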

    Visually Guided Control of Movement

    The papers given at an intensive, three-week workshop on visually guided control of movement are presented. The participants were researchers from academia, industry, and government, with backgrounds in visual perception, control theory, and rotorcraft operations. The papers included invited lectures and preliminary reports of research initiated during the workshop. Three major topics are addressed: extraction of environmental structure from motion; perception and control of self-motion; and spatial orientation. Each topic is considered from both theoretical and applied perspectives. Implications for control and display are suggested.

    How to improve learning from video, using an eye tracker

    The initial trigger of this research on learning from video was the availability of log files from users of video material. The video modality is seen as attractive because it is associated with the relaxed mood of watching TV. The experiments in this research aimed to gain more insight into the viewing patterns of students watching video. Students received an awareness instruction about possible alternative viewing behaviors, to see whether this would enhance their learning. We found that:
    - the learning effects of students with a narrow viewing repertoire were smaller than those of students with a broad viewing repertoire or of strategic viewers;
    - students with some basic knowledge of the topics covered in the videos benefited most from the use of possible alternative viewing behaviors, and students with low prior knowledge benefited the least;
    - the knowledge gain of students with low prior knowledge disappeared after a few weeks; knowledge construction seems worse when doing two things at the same time;
    - media players could offer more options to help students search for the content they want to view again;
    - there was no correlation between pervasive personality traits and students' viewing behavior.
    The right use of video in higher education will lead to students and teachers who are more aware of their learning and teaching behavior, to better videos, to enhanced media players, and, finally, to higher learning effects that let users improve their learning from video.

    Developmental Bootstrapping of AIs

    Although some current AIs surpass human abilities in closed artificial worlds such as board games, their abilities in the real world are limited. They make strange mistakes and do not notice them. They cannot be instructed easily, fail to use common sense, and lack curiosity. They do not make good collaborators. Mainstream approaches for creating AIs are the traditional manually-constructed symbolic AI approach and generative and deep learning AI approaches including large language models (LLMs). These systems are not well suited for creating robust and trustworthy AIs. Although it is outside of the mainstream, the developmental bootstrapping approach has more potential. In developmental bootstrapping, AIs develop competences like human children do. They start with innate competences. They interact with the environment and learn from their interactions. They incrementally extend their innate competences with self-developed competences. They interact and learn from people and establish perceptual, cognitive, and common grounding. They acquire the competences they need through bootstrapping. However, developmental robotics has not yet produced AIs with robust adult-level competences. Projects have typically stopped at the Toddler Barrier corresponding to human infant development at about two years of age, before their speech is fluent. They also do not bridge the Reading Barrier, to skillfully and skeptically draw on the socially developed information resources that power current LLMs. The next competences in human cognitive development involve intrinsic motivation, imitation learning, imagination, coordination, and communication. This position paper lays out the logic, prospects, gaps, and challenges for extending the practice of developmental bootstrapping to acquire further competences and create robust, resilient, and human-compatible AIs.

    Proceedings of KogWis 2012. 11th Biannual Conference of the German Cognitive Science Society

    The German cognitive science conference is an interdisciplinary event where researchers from different disciplines -- mainly from artificial intelligence, cognitive psychology, linguistics, neuroscience, philosophy of mind, and anthropology -- and application areas -- such as education, clinical psychology, and human-machine interaction -- bring together different theoretical and methodological perspectives to study the mind. The 11th Biannual Conference of the German Cognitive Science Society took place from September 30 to October 3, 2012 at Otto-Friedrich-Universität in Bamberg. The proceedings cover all contributions to this conference, that is, five invited talks, seven invited symposia and two symposia, a satellite symposium, a doctoral symposium, three tutorials, 46 abstracts of talks, and 23 poster abstracts.

    The role of phonology in visual word recognition: evidence from Chinese

    Posters - Letter/Word Processing V: abstract no. 5024
    The hypothesis of bidirectional coupling of orthography and phonology predicts that phonology plays a role in visual word recognition, as observed in the effects of feedforward and feedback spelling-to-sound consistency on lexical decision. However, because orthography and phonology are closely related in alphabetic languages (homophones in alphabetic languages are usually orthographically similar), it is difficult to exclude an influence of orthography on phonological effects in visual word recognition. Chinese languages contain many written homophones that are orthographically dissimilar, allowing a test of the claim that phonological effects can be independent of orthographic similarity. We report a study of visual word recognition in Chinese based on a mega-analysis of lexical decision performance with 500 characters. The results from multiple regression analyses, after controlling for orthographic frequency, stroke number, and radical frequency, showed main effects of feedforward and feedback consistency, as well as interactions between these variables and phonological frequency and number of homophones. Implications of these results for resonance models of visual word recognition are discussed.
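The kind of multiple regression reported in this abstract — estimating consistency effects after controlling for covariates — can be sketched on simulated data. The predictors, effect sizes, and noise level below are invented for illustration and are not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # one row per character, as in the mega-analysis

# Hypothetical standardized predictors: three covariates
# (orthographic frequency, stroke number, radical frequency)
# plus the two consistency measures of interest.
X_cov = rng.standard_normal((n, 3))
ff_cons = rng.standard_normal(n)  # feedforward consistency
fb_cons = rng.standard_normal(n)  # feedback consistency

# Simulated lexical-decision RTs (ms): higher consistency -> faster responses.
rt = (600.0
      - 8.0 * ff_cons - 5.0 * fb_cons
      + X_cov @ np.array([-10.0, 6.0, -3.0])
      + rng.normal(0.0, 20.0, n))

# Fit the full linear model by least squares; with the covariates in the
# design matrix, the consistency coefficients are estimated after
# controlling for them.
X_full = np.column_stack([np.ones(n), X_cov, ff_cons, fb_cons])
beta, *_ = np.linalg.lstsq(X_full, rt, rcond=None)
# beta[4] and beta[5] recover the (negative) feedforward and feedback
# consistency effects on RT.
```

Interaction effects of the kind reported (e.g., consistency × phonological frequency) would be tested by adding product terms to the same design matrix.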