    Predictive Encoding of Contextual Relationships for Perceptual Inference, Interpolation and Prediction

    We propose a new neurally inspired model that learns to encode the global relationship context of visual events across time and space and uses this contextual information to modulate the analysis-by-synthesis process in a predictive coding framework. The model learns latent contextual representations by maximizing the predictability of visual events based on local and global contextual information through both top-down and bottom-up processes. In contrast to standard predictive coding models, the prediction error in this model is used to update the contextual representation but does not alter the feedforward input for the next layer, and is thus more consistent with neurophysiological observations. We establish the computational feasibility of this model by demonstrating its capabilities in several respects. We show that our model can outperform the state-of-the-art performance of gated Boltzmann machines (GBMs) in estimating contextual information. Our model can also interpolate missing events or predict future events in image sequences while simultaneously estimating contextual information. We show that it achieves state-of-the-art prediction accuracy in a variety of tasks and can interpolate missing frames, a function that GBMs lack.

    A Learning-Style Theory for Understanding Autistic Behaviors

    Understanding autism's ever-expanding array of behaviors, from sensation to cognition, is a major challenge. We posit that autistic and typically developing brains implement different algorithms that are better suited to learn, represent, and process different tasks; consequently, they develop different interests and behaviors. Computationally, a continuum of algorithms exists, from lookup table (LUT) learning, which aims to store experiences precisely, to interpolation (INT) learning, which focuses on extracting underlying statistical structure (regularities) from experiences. We hypothesize that autistic and typical brains, respectively, are biased toward LUT and INT learning, in low- and high-dimensional feature spaces, possibly because of their narrow and broad tuning functions. The LUT style is good at learning relationships that are local, precise, rigid, and contain little regularity for generalization (e.g., the name–number association in a phonebook). However, it is poor at learning relationships that are context dependent, noisy, flexible, and do contain regularities for generalization (e.g., associations between gaze direction and intention, language and meaning, sensory input and interpretation, motor-control signal and movement, and social situation and proper response). The LUT style compresses information poorly, resulting in inefficiency, sensory overload (overwhelm), restricted interests, and resistance to change. It also leads to poor prediction and anticipation, frequent surprises and over-reaction (hyper-sensitivity), impaired attentional selection and switching, concreteness, strong local focus, weak adaptation, and superior and inferior performance on simple and complex tasks, respectively. The spectrum nature of autism can be explained by different degrees of LUT learning among different individuals, and in different systems of the same individual. Our theory suggests that therapy should focus on training the autistic LUT algorithm to learn regularities.
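
    The contrast between the two learning styles can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the one-dimensional toy data, and the use of a least-squares line as the "interpolation" learner are all assumptions made for illustration. The lookup-table learner reproduces stored experiences exactly but has no answer for an unseen input, whereas the interpolation learner extracts the underlying regularity and generalizes.

```python
import random

def fit_lut(xs, ys):
    """Lookup-table learner (hypothetical): store each experience exactly."""
    return dict(zip(xs, ys))

def predict_lut(table, x):
    """Return the stored answer, or None for an input never seen before."""
    return table.get(x)

def fit_int(xs, ys):
    """Interpolation learner (hypothetical): least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_int(params, x):
    """Evaluate the fitted line at x, whether or not x was in the training set."""
    slope, intercept = params
    return slope * x + intercept

# Toy data (assumed): a noisy linear regularity y = 2x + 1.
rng = random.Random(0)
train_x = list(range(10))
train_y = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in train_x]

lut = fit_lut(train_x, train_y)
line = fit_int(train_x, train_y)

print(predict_lut(lut, 3))     # exact recall of a stored experience
print(predict_lut(lut, 4.5))   # None: no generalization to a new input
print(predict_int(line, 4.5))  # close to 10, inferred from the regularity
```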

    Contributions of local speech encoding and functional connectivity to audio-visual speech perception

    Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR, strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments.

    Reconstructing Representations of Dynamic Visual Objects in Early Visual Cortex

    As raw sensory data are partial, our visual system extensively fills in missing details, creating enriched percepts based on incomplete bottom-up information. Despite evidence for internally generated representations at early stages of cortical processing, it is not known whether these representations include missing information about dynamically transforming objects. Long-range apparent motion (AM) provides a unique test case because objects in AM can undergo changes both in position and in features. Using fMRI and encoding methods, we found that the “intermediate” orientation of an apparently rotating grating, never presented in the retinal input but interpolated during AM, is reconstructed in population-level, feature-selective tuning responses in the region of early visual cortex (V1) that corresponds to the retinotopic location of the AM path. This neural representation is absent when AM inducers are presented simultaneously and when AM is visually imagined. Our results demonstrate dynamic filling-in in V1 for object features that are interpolated during kinetic transformations.

    Perceptions as Hypotheses: Saccades as Experiments

    If perception corresponds to hypothesis testing (Gregory, 1980), then visual searches might be construed as experiments that generate sensory data. In this work, we explore the idea that saccadic eye movements are optimal experiments, in which data are gathered to test hypotheses or beliefs about how those data are caused. This provides a plausible model of visual search that can be motivated from the basic principles of self-organized behavior: namely, the imperative to minimize the entropy of hidden states of the world and their sensory consequences. This imperative is met if agents sample hidden states of the world efficiently. This efficient sampling of salient information can be derived in a fairly straightforward way, using approximate Bayesian inference and variational free-energy minimization. Simulations of the resulting active inference scheme reproduce sequential eye movements that are reminiscent of empirically observed saccades and provide some counterintuitive insights into the way that sensory evidence is accumulated or assimilated into beliefs about the world.

    Learning under uncertainty in the young and older human brain: Common and distinct mechanisms of different attentional and intentional systems

    The human brain is able to infer the probability of future events by combining information from past observations with current sensory input. Naturally, we are surrounded by more stimuli than we can pay attention to, so selection of relevant input is crucial. The present thesis aimed first at identifying common and distinct neural correlates engaged in predictive processing in spatial attention (selection of attended locations) and motor intention (selection of prepared motor responses). Secondly, age-related influences on probabilistic inference in spatial attention, feature-based attention (selection of attended color) and motor intention, and the impact of task difficulty, were considered. Orienting attention during goal-directed behavior can be supported by visual cues, whereas reorienting to unexpected events following misguiding information is linked to behavioral costs and updating of predictions. These processes can be investigated with a cueing paradigm in which differences in reaction time (RT) between validly and invalidly cued trials increase with higher cue validity (%CV) (Posner, 1980). Bayesian models can describe the experience-dependent learning effects of inferring %CV following novel events (Vossel et al., 2014c; Vossel, Mathys, Stephan & Friston, 2015). The principal aim of the first experiment was to identify and compare the neural correlates involved in inferring probabilities in the spatial attentional and motor intentional domains. Cues either indicated the possible location of the target or prepared the motor response associated with it. Instead of a fixed probability context, participants were exposed to a volatile environment, in which the validity of the cue information changed unpredictably over time.
    Combining functional magnetic resonance imaging (fMRI) data with behavioral estimates derived from a Bayesian learning model (Mathys, Daunizeau, Friston & Stephan, 2011) unveiled domain-specific predictability-dependent responses within the right temporoparietal junction (TPJ) for spatial attention and the left angular gyrus (ANG) and anterior cingulate (ACC) in the motor intention task. The blood oxygen level dependent (BOLD) amplitude particularly increased in accord with violations of cue predictability in high cue validity contexts (i.e., when invalid trials were least expected). Valid trials, however, induced no modulation (TPJ and ANG) or decreased modulation (ACC). A further aim was to examine possible commonalities in the neural signatures of predictability-dependent processing. Connectivity analysis uncovered common coupling of all three seed regions involved in predictability-dependent processing with the right anterior hippocampus. Since cognitive functions undergo substantial changes in healthy aging, a second behavioral study was conducted to test whether age differentially influences probabilistic inference in different attentional subsystems, and how task difficulty affects learning performance. Thus, following up on the first experiment, similar tasks and the same computational model were used to assess updating behavior in healthy aging. Older and younger adults performed two separate experiments with different difficulty levels. Each experiment included three versions of a cueing task, entailing predictive spatial cues (i.e., location), feature cues (i.e., color of target) and motor intention cues (i.e., prepare response). Results of the easier version demonstrated a preserved ability of older adults to generate predictions and profit from all cue types. Interestingly, increased task demand uncovered a reduced ability to use motor intention cues to update predictions in older compared to younger adults.
    In conclusion, the results provide evidence for a segregated functional anatomy of probabilistic inference in spatial attention and motor intention. Nonetheless, a common connectivity profile with the hippocampus also points to commonalities. Finally, age seems to differentially impact the efficiency of learning behavior in the motor intention system, supporting the notion of independence of the attentional and intentional subsystems.
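
    The idea of inferring cue validity trial by trial in a volatile environment can be sketched as follows. This is a deliberately simplified stand-in, not the Bayesian learning model cited above: it assumes a Beta-Bernoulli estimate with exponential forgetting, and the function name, the `decay` parameter, and the simulated trial sequence are all hypothetical.

```python
def track_cue_validity(outcomes, decay=0.9):
    """Return the running estimate of P(cue is valid) after each trial.

    outcomes: iterable of booleans, True when the cue was valid.
    decay: hypothetical forgetting factor in (0, 1]; lower values discount
           old trials faster, which helps when validity changes over time.
    """
    valid, invalid = 1.0, 1.0  # pseudo-counts of a uniform Beta(1, 1) prior
    estimates = []
    for was_valid in outcomes:
        # Discount all past evidence before adding the new observation.
        valid *= decay
        invalid *= decay
        if was_valid:
            valid += 1.0
        else:
            invalid += 1.0
        estimates.append(valid / (valid + invalid))
    return estimates

# Simulated volatile block (assumed): the cue is valid, then turns invalid.
trials = [True] * 20 + [False] * 20
estimates = track_cue_validity(trials)

print(round(estimates[19], 2))  # high estimated validity before the switch
print(round(estimates[-1], 2))  # low estimated validity after the switch
```

    With `decay=1.0` the estimate reduces to a plain running proportion, which adapts to a change in cue validity far more slowly.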

    Brain Responses Track Patterns in Sound

    This thesis uses specifically structured sound sequences, with electroencephalography (EEG) recording and behavioural tasks, to understand how the brain forms and updates a model of the auditory world. Experimental chapters 3-7 address different effects arising from statistical predictability, stimulus repetition and surprise. Stimuli comprised tone sequences, with frequencies varying in regular or random patterns. In Chapter 3, EEG data demonstrate fast recognition of predictable patterns, shown by an increase in responses to regular relative to random sequences. Behavioural experiments investigate attentional capture by stimulus structure, suggesting that regular sequences are easier to ignore. Responses to repetitive stimulation generally exhibit suppression, thought to form a building block of regularity learning. However, the patterns used in this thesis show the opposite effect, where predictable patterns show a strongly enhanced brain response, compared to frequency-matched random sequences. Chapter 4 presents a study which reconciles auditory sequence predictability and repetition in a single paradigm. Results indicate a system for automatic predictability monitoring which is distinct from, but concurrent with, repetition suppression. The brain’s internal model can be investigated via the response to rule violations. Chapters 5 and 6 present behavioural and EEG experiments where violations are inserted in the sequences. Outlier tones within regular sequences evoked a larger response than matched outliers in random sequences. However, this effect was not present when the violation comprised a silent gap. Chapter 7 concerns the ability of the brain to update an existing model. Regular patterns transitioned to a different rule, keeping the frequency content constant. Responses show a period of adjustment to the rule change, followed by a return to tracking the predictability of the sequence. 
    These findings are consistent with the notion that the brain continually maintains a detailed representation of ongoing sensory input and that this representation shapes the processing of incoming information.

    Generic Object Detection and Segmentation for Real-World Environments

