    Reinforcement learning or active inference?

    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimise their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming, namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
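
For readers unfamiliar with the benchmark, the following is a minimal sketch of the mountain-car dynamics as they are usually stated in the reinforcement-learning literature. It is not the free-energy scheme of the abstract above, and the constants, the start state, and the two hand-written policies (`always_right`, `pump`) are assumptions chosen for illustration. It only shows why the task is hard: the engine is too weak to climb directly, so the car must first move away from the goal.

```python
import math

# Classic textbook constants (assumed here, not taken from the abstract).
X_MIN, X_MAX, V_MAX, GOAL = -1.2, 0.5, 0.07, 0.5

def step(x, v, a):
    """Advance the car one time step; a is -1 (left), 0 (coast) or +1 (right)."""
    v = v + 0.001 * a - 0.0025 * math.cos(3.0 * x)   # engine force plus gravity
    v = max(-V_MAX, min(V_MAX, v))                   # velocity limit
    x = x + v
    if x <= X_MIN:                                   # inelastic left wall
        x, v = X_MIN, 0.0
    return min(x, X_MAX), v

def run(policy, x=-0.5, v=0.0, max_steps=1000):
    """Return the number of steps taken to reach the goal, or None on failure."""
    for t in range(1, max_steps + 1):
        x, v = step(x, v, policy(x, v))
        if x >= GOAL:
            return t
    return None

def always_right(x, v):
    # Naive policy: drive straight at the goal.
    return 1

def pump(x, v):
    # Push in the direction of travel so the engine always adds energy.
    return 1 if v >= 0 else -1

steps_right = run(always_right)   # stays trapped in the valley
steps_pump = run(pump)            # rocks back and forth, then escapes
print("always right:", steps_right)
print("energy pump: ", steps_pump)
```

The naive policy never reaches the goal, while the rocking policy does. Any learning scheme applied to this task, whether based on reward or on free-energy, has to discover behaviour of the second kind.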

    On the pathogenesis of penile venous leakage: role of the tunica albuginea

    Background: The etiology of venogenic erectile dysfunction is not precisely known. Various pathologic processes have been implicated, but none has proved entirely satisfactory. These include the presence of large venous channels draining the corpora cavernosa, Peyronie's disease, diabetes, and structural alterations in the fibroblastic components of the trabeculae and cavernous smooth muscles. We investigated the hypothesis that tunica albuginea atrophy, with resulting subluxation and redundancy, causes venous leakage during erection. Methods: Eighteen patients (mean age 33.6 ± 2.8 SD years) with venogenic erectile dysfunction and 17 control volunteers (mean age 31.7 ± 2.2 SD years) were studied. Intracorporal pressure was recorded in all subjects; tunica albuginea biopsies were taken from 18 patients and 9 controls and stained with hematoxylin and eosin and Masson's trichrome stains. Results: In the flaccid phase, mean intracorporal pressure was 11.8 ± 0.8 cm H2O in controls and 5.2 ± 0.6 cm H2O in patients; during induced erection it was 98.4 ± 6.2 and 5.9 ± 0.7 cm H2O, respectively. Microscopically, the tunica albuginea of controls consisted of circularly oriented collagen impregnated with elastic fibers. The tunica albuginea of patients showed degenerative and atrophic changes of collagen fibers; elastic fibers were scarce or absent. Conclusion: The study has shown that, during erection, the intracorporal pressure of patients with venogenic erectile dysfunction was significantly lower than that of controls. Tunica albuginea collagen fibers exhibited degenerative and atrophic changes, which presumably lead to tunica albuginea subluxation and floppiness. These changes seem to explain the lowered intracorporal pressure, which apparently results from loss of the tunica albuginea veno-occlusive mechanism. The causes of the atrophic changes and subluxation of the tunica albuginea need to be studied.

    Invasive characteristics of human prostatic epithelial cells: understanding the metastatic process

    Prostate cancer has a predilection to metastasise to the bone marrow stroma (BMS) by an as yet uncharacterised mechanism. We have defined a series of coculture models of invasion, which simulate the blood/BMS boundary and allow the elucidation of the signalling and mechanics of trans-endothelial migration within the complex bone marrow environment. Confocal microscopy shows that prostate epithelial cells bind specifically to bone marrow endothelial-to-endothelial cell junctions and initiate endothelial cell retraction. Trans-endothelial migration proceeds via an epithelial cell pseudopodial process, with complete epithelial migration occurring after 232±43 min. Stromal-derived factor-1 (SDF-1)/CXCR4 signalling induced PC-3 cells to invade across a basement membrane, although the level of invasion was 3.5-fold less than invasion towards BMS (P=0.0007) or bone marrow endothelial cells (P=0.004). Maximal SDF-1 signalling of invasion was completely inhibited by 10 μM of the SDF-1 inhibitor T140. However, 10 μM T140 only reduced invasion towards BMS and bone marrow endothelial cells by 59% (P=0.001) and 29% (P=0.011), respectively. This study highlights the need to examine the potential roles of signalling molecules and/or inhibitors, not just in single-cell models but in coculture models that mimic the complex environment of the bone marrow.

    Fluctuation-Driven Neural Dynamics Reproduce Drosophila Locomotor Patterns

    The neural mechanisms determining the timing of even simple actions, such as when to walk or rest, are largely mysterious. One intriguing, but untested, hypothesis posits a role for ongoing activity fluctuations in neurons of central action selection circuits that drive animal behavior from moment to moment. To examine how fluctuating activity can contribute to action timing, we paired high-resolution measurements of freely walking Drosophila melanogaster with data-driven neural network modeling and dynamical systems analysis. We generated fluctuation-driven network models whose outputs (locomotor bouts) matched those measured from sensory-deprived Drosophila. From these models, we identified those that could also reproduce a second, unrelated dataset: the complex time-course of odor-evoked walking for genetically diverse Drosophila strains. Dynamical models that best reproduced both Drosophila basal and odor-evoked locomotor patterns exhibited specific characteristics. First, ongoing fluctuations were required. In a stochastic resonance-like manner, these fluctuations allowed neural activity to escape stable equilibria and to exceed a threshold for locomotion. Second, odor-induced shifts of equilibria in these models caused a depression in locomotor frequency following olfactory stimulation. Our models predict that activity fluctuations in action selection circuits cause behavioral output to more closely match sensory drive and may therefore enhance navigation in complex sensory environments. Together these data reveal how simple neural dynamics, when coupled with activity fluctuations, can give rise to complex patterns of animal behavior.
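
The role of fluctuations described above can be illustrated with a toy model. The sketch below is not one of the data-driven network models from the abstract: it assumes a single activity variable relaxing towards a stable equilibrium at zero, driven by Gaussian noise, with a hypothetical locomotion threshold at 1.0. All parameter values are illustrative.

```python
import math
import random

def fraction_above_threshold(sigma, threshold=1.0, tau=1.0,
                             dt=0.01, n_steps=10_000, seed=0):
    """Simulate dx = -(x / tau) dt + sigma dW with Euler-Maruyama steps.

    Returns the fraction of time steps on which activity x exceeds the
    threshold, used here as a crude stand-in for time spent walking.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    x, above = 0.0, 0                  # start at the stable equilibrium
    for _ in range(n_steps):
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += -(x / tau) * dt + noise   # relaxation plus fluctuation
        above += x > threshold         # count supra-threshold steps
    return above / n_steps

quiet = fraction_above_threshold(sigma=0.0)   # no fluctuations
noisy = fraction_above_threshold(sigma=1.0)   # ongoing fluctuations
print("without noise:", quiet)
print("with noise:   ", noisy)
```

Without noise the unit sits at its equilibrium and never crosses the threshold. With noise it makes intermittent excursions above it, which is the qualitative behaviour the abstract attributes to fluctuation-driven models.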

    An Imperfect Dopaminergic Error Signal Can Drive Temporal-Difference Learning

    An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavior adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards.
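
As a point of reference for the classical discrete-time algorithm mentioned above, here is a minimal sketch of tabular TD(0) value learning on a short deterministic chain with a single terminal reward. The optional `floor` argument is a hypothetical and deliberately crude stand-in for an asymmetric error signal: it simply clips negative TD errors. It does not implement the spiking model or the modified TD algorithm derived in the paper, and the chain task and all parameter values are assumptions.

```python
def td0_chain(reward, floor=None, n_states=5,
              alpha=0.1, gamma=0.9, episodes=2000):
    """Tabular TD(0) on a chain s0 -> s1 -> ... -> terminal.

    The reward is delivered only on the final transition. If floor is
    given, the TD error is clipped from below (assumed asymmetry).
    """
    V = [0.0] * (n_states + 1)          # last entry is the terminal state
    for _ in range(episodes):
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0
            delta = r + gamma * V[s + 1] - V[s]   # TD error
            if floor is not None:
                delta = max(delta, floor)         # cannot signal below floor
            V[s] += alpha * delta
    return V[:n_states]

v_pos = td0_chain(reward=+1.0)                     # classical, positive reward
v_pos_clipped = td0_chain(reward=+1.0, floor=0.0)  # clipped, positive reward
v_neg = td0_chain(reward=-1.0)                     # classical, negative reward
v_neg_clipped = td0_chain(reward=-1.0, floor=0.0)  # clipped, negative reward
print(v_pos, v_pos_clipped, v_neg, v_neg_clipped, sep="\n")
```

In this toy setting, clipping leaves learning from positive reward unchanged, because every TD error is non-negative, but prevents any learning from purely negative reward. That mirrors, in a much simplified form, the asymmetry the abstract reports.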

    MYT1L mutations cause intellectual disability and variable obesity by dysregulating gene expression and development of the neuroendocrine hypothalamus

    Deletions at chromosome 2p25.3 are associated with a syndrome consisting of intellectual disability and obesity. The smallest region of overlap for deletions at 2p25.3 contains PXDN and MYT1L. MYT1L is expressed only within the brain in humans. We hypothesized that single nucleotide variants (SNVs) in MYT1L would cause a phenotype resembling deletion at 2p25.3. To examine this we sought MYT1L SNVs in exome sequencing data from 4,296 parent-child trios. Further variants were identified through a GeneMatcher-facilitated collaboration. We report 9 patients with MYT1L SNVs (4 loss of function and 5 missense). The phenotype of SNV carriers overlapped with that of 2p25.3 deletion carriers. To identify the transcriptomic consequences of MYT1L loss of function we used CRISPR-Cas9 to create a knockout cell line. Gene Ontology analysis in knockout cells demonstrated altered expression of genes that regulate gene expression and that are localized to the nucleus. These differentially expressed genes were enriched for the OMIM disease ontology term “mental retardation”. To study the developmental effects of MYT1L loss of function we created a zebrafish knockdown using morpholinos. Knockdown zebrafish manifested loss of oxytocin expression in the preoptic neuroendocrine area. This study demonstrates that MYT1L variants are associated with syndromic obesity in humans. The mechanism is related to dysregulated expression of neurodevelopmental genes and altered development of the neuroendocrine hypothalamus.

    Fine-Tuning and the Stability of Recurrent Neural Networks

    A central criticism of standard theoretical approaches to constructing stable, recurrent model networks is that the synaptic connection weights need to be finely tuned. This criticism is severe because proposed rules for learning these weights have been shown to have various limitations to their biological plausibility. Hence it is unlikely that such rules are used to continuously fine-tune the network in vivo. We describe a learning rule that is able to tune synaptic weights in a biologically plausible manner. We demonstrate and test this rule in the context of the oculomotor integrator, showing that only known neural signals are needed to tune the weights. We demonstrate that the rule appropriately accounts for a wide variety of experimental results, and is robust under several kinds of perturbation. Furthermore, we show that the rule is able to achieve stability as good as or better than that provided by the linearly optimal weights often used in recurrent models of the integrator. Finally, we discuss how this rule can be generalized to tune a wide variety of recurrent attractor networks, such as those found in head direction and path integration systems, suggesting that it may be used to tune a wide variety of stable neural systems.
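
The fine-tuning criticism can be made concrete with a standard textbook simplification, which is an assumption here and not the network or learning rule of the abstract: a single linear unit with intrinsic time constant tau and recurrent weight w, obeying tau * dr/dt = -r + w * r. Such a unit holds its activity perfectly only when w equals 1, and otherwise decays with an effective time constant tau / (1 - w). The parameter values below are illustrative.

```python
import math

def effective_time_constant(tau, w):
    """Decay time constant of tau * dr/dt = -r + w * r, valid for w < 1."""
    if w >= 1.0:
        raise ValueError("w must be below 1 for a decaying solution")
    return tau / (1.0 - w)

def remaining_activity(tau, w, t_end=1.0, dt=1e-4):
    """Euler-integrate the unit from r = 1 with no input; return r(t_end)."""
    r = 1.0
    for _ in range(int(round(t_end / dt))):
        r += dt * (-r + w * r) / tau
    return r

tau = 0.1  # intrinsic time constant in seconds (assumed value)
for w in (0.0, 0.9, 0.99):
    print(f"w={w:<4}  tau_eff={effective_time_constant(tau, w):6.2f} s  "
          f"r(1 s)={remaining_activity(tau, w):.3f}")
```

With these assumed values, holding activity for about 10 s requires the recurrent weight to be within 1% of its ideal value, and a 10% error shortens the memory to about 1 s. This sensitivity is the reason a plausible tuning mechanism is needed.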

    Invasive Plants and Enemy Release: Evolution of Trait Means and Trait Correlations in Ulex europaeus

    Several hypotheses that attempt to explain invasive processes are based on the fact that plants have been introduced without their natural enemies. Among them, the EICA (Evolution of Increased Competitive Ability) hypothesis is the most influential. It states that, due to enemy release, exotic plants evolve a shift in resource allocation from defence to reproduction or growth. In the native range of the invasive species Ulex europaeus, traits involved in reproduction and growth have been shown to be highly variable and genetically correlated. Thus, in order to explore the joint evolution of life history traits and susceptibility to seed predation in this species, we investigated changes in both trait means and trait correlations. To do so, we compared plants from native and invaded regions grown in a common garden. In line with the expectations of the EICA hypothesis, we observed an increase in seedling height. However, there was little change in other trait means. By contrast, correlations exhibited a clear pattern: the correlations between life history traits and infestation rate by seed predators were always weaker in the invaded range than in the native range. In U. europaeus, the role of enemy release in shaping life history traits thus appeared to involve trait correlations rather than trait means. In the invaded regions studied, the correlations involving infestation rates and key life history traits such as flowering phenology, growth and pod density were reduced, enabling more independent evolution of these key traits and potentially facilitating local adaptation to a wide range of environments. These results led us to hypothesise that a relaxation of genetic correlations may be involved in the expansion of invasive species.