21 research outputs found

    How does the primate ventral visual stream causally support core object recognition?

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 161-173). Primates are able to rapidly, accurately and effortlessly perform the computationally difficult visual task of invariant object recognition: the ability to discriminate between different objects in the face of high variation in object viewing parameters and background conditions. This ability is thought to rely on the ventral visual stream, a hierarchy of visual cortical areas culminating in inferior temporal (IT) cortex. In particular, decades of research strongly suggest that the population of neurons in IT supports invariant object recognition behavior. However, direct causal evidence for this decoding hypothesis has been equivocal to date, especially beyond the specific case of face-selective sub-regions of IT. This research aims to directly test the general causal role of IT in invariant object recognition. To do so, we first characterized human and macaque monkey behavior over a large behavioral domain consisting of binary discriminations between images of basic-level objects, establishing behavioral metrics and benchmarks for computational models of this behavior. This work suggests that, in the domain of basic-level core object recognition, humans and monkeys are remarkably similar in their behavioral responses, while leading models of the visual system significantly diverge from primate behavior. We then reversibly inactivated individual, millimeter-scale regions of IT via injection of muscimol while monkeys performed several interleaved binary object discrimination tasks. We found that inactivating different millimeter-scale regions of primate IT resulted in different patterns of object recognition deficits, each predicted by the local region's neuronal selectivity. Our results provide causal evidence that IT directly underlies primate object recognition behavior in a topographically organized manner. Taken together, these results establish quantitative experimental constraints for computational models of the ventral visual stream and object recognition behavior. By Rishi Rajalingham. Ph. D.

    Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes

    Humans and animals have a rich and flexible understanding of the physical world, which enables them to infer the underlying dynamical trajectories of objects and events and plausible future states, and to use these to plan and anticipate the consequences of actions. However, the neural mechanisms underlying these computations are unclear. We combine a goal-driven modeling approach with dense neurophysiological data and high-throughput human behavioral readouts to directly address this question. Specifically, we construct and evaluate several classes of sensory-cognitive networks to predict the future state of rich, ethologically relevant environments, ranging from self-supervised end-to-end models with pixel-wise or object-centric objectives to models that predict the future in the latent space of purely static image-based or dynamic video-based pretrained foundation models. We find strong differentiation across these model classes in their ability to predict neural and behavioral data both within and across diverse environments. In particular, we find that neural responses are currently best predicted by models trained to predict the future state of their environment in the latent space of pretrained foundation models optimized for dynamic scenes in a self-supervised manner. Notably, models that predict the future in the latent space of video foundation models that are optimized to support a diverse range of sensorimotor tasks reasonably match both human behavioral error patterns and neural dynamics across all environmental scenarios that we were able to test. Overall, these findings suggest that the neural mechanisms and behaviors of primate mental simulation are thus far most consistent with being optimized to predict the future on dynamic, reusable visual representations that are useful for embodied AI more generally. Comment: 17 pages, 6 figures

    Characterization, modeling and analysis of neural signals for brain-machine interface applications

    Part 1: Brain-machine interfaces provide a means of communication between the brain and the environment by extracting and decoding neural signals for control of external devices such as prosthetic limbs. Although such devices are promising solutions to help paralyzed patients, progress is limited by the quantity of information that can be safely extracted from the brain. The strategy proposed in this project is to maximize the amount of information recorded from a minimum number of electrodes by increasing the number of independent modalities recorded. We recorded local oxygen concentration using a fluorescence-based optical sensor, simultaneously with the electrical activity of single neurons in the posterior parietal cortex of rhesus macaque monkeys. Local oxygen was measured using a fluorescence-quenching optical oxygen sensor, while electrical neural signals were recorded with a microelectrode. Throughout the recordings, monkeys performed a delayed memory reach task to targets in the 2D fronto-parallel plane. We characterized the acquired signal and its relationship to spikes and local field potentials (LFPs). Results showed that local oxygen increased from baseline during the memory period, independently of the recorded neural activity. Additionally, this modulation in oxygen concentration was used in conjunction with spiking and LFP activity to decode reach directions. Class-dependent information was measured on all combinations of the recorded modalities, using a cross-validated classification scheme. We found that oxygen contains information complementary to spikes and LFPs, and that this information arises early in the trial. This early predictive signal would be a useful input for cognitive brain-machine interfaces. Part 2: Previous research has shown that the expected value of reward associated with a reach modulates neural activity in the medial intraparietal cortex (MIP).
In this study, we investigated the temporal dynamics of reward modulation in MIP by examining how given reward magnitudes are encoded when presented in different contexts or schedules. We recorded neural activity while monkeys performed a delayed reach task under two reward schedules. In the variable schedule, an equal number of small and large rewards were randomly interleaved trial by trial, while in the constant schedule only one of these rewards was delivered for a block of trials. Each recording session consisted of a block of trials with a variable schedule as well as blocks with both small and large constant schedules. Neural activity for the same reward was observed to vary significantly between the two schedules. Specifically, the discrimination between large and small reward at the neural level was significantly less in the variable reward schedule than in the constant reward schedule. This result is accounted for by the dependence of instantaneous firing rate on past trials, which is shown using information theoretic metrics. We modeled the neural firing rate as a linear system response to reward using a systems identification approach, and showed that the modeled systems effectively low-pass filtered reward signals. Given this observation, we hypothesize that this filtering mechanism leads to a robust memory of low-risk rewards, advantageous to decision-making mechanisms in the brain that assess rewards against risks.
    Part 1: Brain-machine interfaces offer a means of communication between the brain and the environment by extracting and decoding neural signals. These signals can then be used to drive an external device such as an artificial limb. Although these devices are a promising solution for people with paralysis, progress is limited by the amount of information that can be safely extracted. The strategy proposed in this project is to maximize the amount of information sampled from a minimum number of electrodes by increasing the number of modalities. We simultaneously recorded local oxygen concentration, using a fluorescence-based optical sensor, and the electrical activity of a single neuron in the posterior parietal cortex of a rhesus macaque monkey. Local oxygen was measured with a fluorescence-quenching optical oxygen sensor, while electrical activity was recorded with a microelectrode. During recording, the monkeys performed a delayed memory reach task, with the target presented in a 2D plane in front of the monkey. We characterized the acquired signal and its relationship to the neuron's spikes and the local field potential. The results show that local oxygen increased from its baseline level during the memory period, independently of the recorded neural activity. In addition, this modulation of oxygen concentration was used together with spikes and the local field potential to decode movement direction. Class-dependent information was measured for all combinations of recorded modalities, using a cross-validated classification method. We found that oxygen contains information complementary to that conveyed by spikes and the local field potential, and that it is available earlier in the trial. Part 2: Previous studies have shown that the value of the anticipated reward associated with a movement modulates neural activity in the medial intraparietal area (MIP). In our study, we examine the temporal dynamics of reward modulation in MIP by asking how reward magnitude is encoded when presented in different contexts or schedules. We recorded neural activity while monkeys performed a delayed reach task under two reward conditions. In the variable condition, equal numbers of trials yielding small or large rewards were randomly interleaved, whereas in the constant condition a single reward size was delivered for the duration of a block of trials. Each recording session consisted of one block of trials in the variable condition and blocks in constant conditions delivering a small and a large reward. Neural activity for the same reward varied significantly between the two conditions. Specifically, the difference in firing rate between the small and the large reward was smaller in the variable condition than in the constant reward condition. These results indicate that, at the neural level, the reward on a given trial is not encoded independently; rather, neural activity is modulated by both present and past rewards. Using system identification, we modeled spiking as a function of reward as a linear system. We observed that the modeled system low-pass filtered the reward signal. We hypothesize that this mechanism yields a robust memory for low-risk rewards, which is advantageous for the brain's decision-making mechanisms that weigh expected reward against incurred risk.
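
The low-pass filtering account in Part 2 can be illustrated with a toy simulation. The sketch below is not the model identified in the thesis: the first-order filter form, the smoothing constant `alpha`, the reward values, and the trial counts are all assumptions chosen for illustration. It passes a blocked (constant) and a randomly interleaved (variable) reward sequence through the same filter and compares how well the filtered response separates large from small rewards.

```python
import random

SMALL, LARGE = 0.0, 1.0  # hypothetical reward magnitudes (arbitrary units)


def low_pass(rewards, alpha=0.3):
    """First-order low-pass filter: y[t] = (1 - alpha) * y[t-1] + alpha * r[t]."""
    y = rewards[0]  # start the filter state at the first reward
    responses = []
    for r in rewards:
        y = (1 - alpha) * y + alpha * r
        responses.append(y)
    return responses


def discrimination(rewards, responses):
    """Mean response on large-reward trials minus mean response on small-reward trials."""
    large = [y for r, y in zip(rewards, responses) if r == LARGE]
    small = [y for r, y in zip(rewards, responses) if r == SMALL]
    return sum(large) / len(large) - sum(small) / len(small)


def simulate(n_per_reward=200, alpha=0.3, seed=0):
    """Return (constant-schedule, variable-schedule) discrimination for one toy session."""
    # Constant schedule: one block of small rewards, then one block of large rewards.
    constant = [SMALL] * n_per_reward + [LARGE] * n_per_reward
    # Variable schedule: the same rewards, randomly interleaved trial by trial.
    variable = constant[:]
    random.Random(seed).shuffle(variable)
    d_const = discrimination(constant, low_pass(constant, alpha))
    d_var = discrimination(variable, low_pass(variable, alpha))
    return d_const, d_var


if __name__ == "__main__":
    d_const, d_var = simulate()
    print(f"constant schedule: {d_const:.2f}, variable schedule: {d_var:.2f}")
```

In this toy setting the filtered response separates the two reward sizes less in the interleaved schedule, because each response mixes in recent rewards of the other size.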

    The PMP certification exam study guide

    To support accurate memory-guided reaching, the brain must represent both the direction and amplitude of reaches in a movement plan. Several cortical areas have been shown to represent the direction of a planned reaching movement, but the neuronal representation of reach amplitude is still unclear, especially in sensory-motor integration areas. To investigate this, we recorded from neurons in the medial intraparietal area (MIP) of monkeys performing a variable amplitude memory reach task. In one monkey, we additionally recorded from the dorsal premotor cortex (PMd) for direct cross-area comparisons. In both areas, we found modest but significant proportions of neurons with movement-planning activity sensitive to reach amplitude. However, reach amplitude was under-represented relative to direction in the neuronal population, with approximately one third as many selective neurons. We observed an interaction between neuronal selectivity for amplitude and direction; neurons in both areas exhibited significant modulation of neuronal activity by reach amplitude in some but not all directions. Consistent with an encoding of reach goals as a position in visual space, the response patterns of MIP/PMd neurons were best predicted by a 2D Gaussian position encoding model, in contrast to a number of alternative direction and amplitude tuning models. Taken together, these results suggest that amplitude and direction jointly modulate activity in MIP, as in PMd, to form representations of intended reach position.
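
To make the position-encoding idea concrete, here is a minimal Python sketch of a 2D Gaussian tuning function over reach-goal position. It is an illustration under stated assumptions, not the fitted model from the paper: the parameter names and values (`x0`, `y0`, `sigma`, `baseline`, `gain`) and the near and far reach amplitudes are hypothetical. It shows how a single position-tuned unit can appear amplitude-sensitive in some reach directions but not others.

```python
import math


def gaussian_rate(x, y, x0=8.0, y0=0.0, sigma=4.0, baseline=5.0, gain=30.0):
    """Firing rate (spikes/s) for a reach goal at (x, y) under a 2D Gaussian position model.

    All parameter values are hypothetical and chosen only for illustration.
    """
    d2 = (x - x0) ** 2 + (y - y0) ** 2
    return baseline + gain * math.exp(-d2 / (2 * sigma ** 2))


def amplitude_modulation(direction_deg, near=4.0, far=8.0):
    """Absolute change in rate between a near and a far reach in one direction."""
    theta = math.radians(direction_deg)
    rates = [
        gaussian_rate(a * math.cos(theta), a * math.sin(theta))
        for a in (near, far)
    ]
    return abs(rates[1] - rates[0])


if __name__ == "__main__":
    # Amplitude modulation depends on direction even though the unit
    # encodes only position relative to its preferred location.
    for direction in (0, 90, 180, 270):
        print(direction, round(amplitude_modulation(direction), 2))
```

With the preferred position placed along the 0-degree direction, reaches in that direction show a large rate difference between amplitudes, while reaches in the opposite direction stay near baseline at both amplitudes.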

    Reversible Inactivation of Different Millimeter-Scale Regions of Primate IT Results in Different Patterns of Core Object Recognition Deficits

    Extensive research suggests that the inferior temporal (IT) population supports visual object recognition behavior. However, causal evidence for this hypothesis has been equivocal, particularly beyond the specific case of face-selective subregions of IT. Here, we directly tested this hypothesis by pharmacologically inactivating individual, millimeter-scale subregions of IT while monkeys performed several core object recognition subtasks, interleaved trial-by-trial. First, we observed that IT inactivation resulted in reliable contralateral-biased subtask-selective behavioral deficits. Moreover, inactivating different IT subregions resulted in different patterns of subtask deficits, predicted by each subregion’s neuronal object discriminability. Finally, the similarity between different inactivation effects was tightly related to the anatomical distance between corresponding inactivation sites. Taken together, these results provide direct evidence that the IT cortex causally supports general core object recognition and that the underlying IT coding dimensions are topographically organized. National Eye Institute (Grant R01-EY014970); United States. Office of Naval Research. Multidisciplinary University Research Initiative (Grant MURI-114407); Simons Foundation. Simons Collaboration on the Global Brain (Grant 325500)

    Example spatial response patterns.

    Each row shows the spatial response patterns of an example neuron, for each epoch considered. One example neuron from each area is shown. The measured average firing rates at spatial positions of the reach target are plotted (colored circles) overlaid on the interpolated spatial response pattern (11x11 colored maps), using the same color scale. Warmer colors indicate higher firing rates. For visualization purposes, the interpolated spatial response patterns have been slightly smoothed, using a two-dimensional Gaussian kernel of width 0.5 bins.

    Description of all computational models tested, with mathematical characterizations, explicitly stated optimization parameters and constraints.

    Description of all computational models tested, with mathematical characterizations, explicitly stated optimization parameters and constraints.

    Comparison of Object Recognition Behavior in Human and Monkey

    Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks using a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, and data from 33 separate Mechanical Turk subjects were used to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. National Institutes of Health (U.S.); Natural Sciences and Engineering Research Council of Canada
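
One simple way to quantify how similar two confusion patterns are is to correlate the off-diagonal entries of two confusion matrices. The sketch below is a generic illustration and not necessarily the metric used in the study; the two 3x3 matrices are made-up toy numbers, not data from the paper, and the function names are hypothetical.

```python
import math


def off_diagonal(matrix):
    """Flatten the off-diagonal (confusion) entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def confusion_similarity(a, b):
    """Correlate the confusion (off-diagonal) patterns of two matrices."""
    return pearson(off_diagonal(a), off_diagonal(b))


# Toy confusion matrices for three objects (rows: true object, columns: response).
# These values are invented for illustration only.
monkey = [[0.90, 0.08, 0.02],
          [0.10, 0.85, 0.05],
          [0.03, 0.04, 0.93]]
human = [[0.92, 0.06, 0.02],
         [0.09, 0.88, 0.03],
         [0.02, 0.05, 0.93]]

if __name__ == "__main__":
    print(round(confusion_similarity(monkey, human), 3))
```

Excluding the diagonal keeps overall accuracy from dominating the comparison, so the score reflects which objects are confused with which.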