
    When Does Reward Maximization Lead to Matching Law?

    What kind of strategies subjects follow in various behavioral circumstances has been a central issue in decision making. In particular, which behavioral strategy, maximizing or matching, is more fundamental to animals' decision behavior has been a matter of debate. Here, we prove that any algorithm to achieve the stationary condition for maximizing the average reward should lead to matching when it ignores the dependence of the expected outcome on the subject's past choices. We may term this strategy of partial reward maximization “matching strategy”. Then, this strategy is applied to the case where the subject's decision system updates the information for making a decision. Such information includes the subject's past actions or sensory stimuli, and the internal storage of this information is often called “state variables”. We demonstrate that the matching strategy provides an easy way to maximize reward when combined with the exploration of the state variables that correctly represent the crucial information for reward maximization. Our results reveal for the first time how a strategy to achieve matching behavior is beneficial to reward maximization, providing a novel insight into the relationship between maximizing and matching.
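
    The abstract above does not spell out the matching law itself. As a minimal sketch, assuming the standard (Herrnstein) formulation in which the fraction of choices allocated to an option equals the fraction of total reward earned from it, the check below compares the two sets of fractions. The function names and the example counts are illustrative assumptions, not taken from the paper.

```python
def choice_and_income_fractions(choices, rewards):
    """Return (choice fractions, income fractions) per option.

    choices[i]: number of times option i was chosen (hypothetical data).
    rewards[i]: total reward earned from option i (hypothetical data).
    """
    total_choices = sum(choices)
    total_rewards = sum(rewards)
    choice_frac = [c / total_choices for c in choices]
    income_frac = [r / total_rewards for r in rewards]
    return choice_frac, income_frac


def is_matching(choices, rewards, tol=1e-9):
    """True if choice fractions equal income fractions within tol."""
    choice_frac, income_frac = choice_and_income_fractions(choices, rewards)
    return all(abs(c - r) <= tol for c, r in zip(choice_frac, income_frac))


# Option 0 is chosen 75% of the time and yields 75% of the reward: matching.
print(is_matching([30, 10], [15, 5]))  # True
# Choices are split evenly, but reward is not: no matching.
print(is_matching([20, 20], [15, 5]))  # False
```

    Under matching, the reward earned per choice is the same for every option (0.5 in the first example), which is one way to read the stationary condition mentioned in the abstract.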

    An Imperfect Dopaminergic Error Signal Can Drive Temporal-Difference Learning

    An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavioral adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards.
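
    For orientation, the classical discrete-time TD algorithm that the abstract uses as its baseline can be sketched as the textbook TD(0) value update. This is not the spiking actor-critic model described above, and it omits the self-adapting parameters and offset that the abstract attributes to the biological version; the function name, state labels, and parameter values are illustrative assumptions.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one textbook TD(0) update in place and return the TD error.

    values: dict mapping each state to its current value estimate.
    alpha:  learning rate (assumed value).
    gamma:  discount factor (assumed value).
    """
    # TD error: reward plus discounted next-state value, minus current estimate.
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# One rewarded transition A -> B, starting from zero value estimates.
values = {"A": 0.0, "B": 0.0}
error = td0_update(values, "A", reward=1.0, next_state="B", alpha=0.5)
print(error, values["A"])  # 1.0 0.5
```

    In this classical form the TD error can be positive or negative with equal ease; the abstract's point is that a realistic dopaminergic signal does not share that symmetry.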

    West Nile Virus Genetic Diversity is Maintained during Transmission by Culex pipiens quinquefasciatus Mosquitoes

    Due to error-prone replication, RNA viruses exist within hosts as a heterogeneous population of non-identical but related viral variants. These populations may undergo bottlenecks during transmission that stochastically reduce variability, leading to fitness declines. Such bottlenecks have been documented for several single-host RNA viruses. Their role in the population biology of obligate two-host viruses such as arthropod-borne viruses (arboviruses) in vivo is unclear, although it is of central importance in understanding arbovirus persistence and emergence. Therefore, we tracked the composition of West Nile virus (WNV; Flaviviridae, Flavivirus) populations during infection of the vector mosquito, Culex pipiens quinquefasciatus, to determine whether WNV populations undergo bottlenecks during transmission by this host. Quantitative, qualitative and phylogenetic analyses of WNV sequences in mosquito midguts, hemolymph and saliva failed to document reductions in genetic diversity during mosquito infection. Further, migration analysis of individual viral variants revealed that while there was some evidence of compartmentalization, anatomical barriers do not impose genetic bottlenecks on WNV populations. Together, these data suggest that the complexity of WNV populations is not significantly diminished during the extrinsic incubation period of mosquitoes.

    Exposure–response relationship of AMG 386 in combination with weekly paclitaxel in recurrent ovarian cancer and its implication for dose selection

    To characterize exposure-response relationships of AMG 386 in a phase 2 study in advanced ovarian cancer for the facilitation of dose selection in future studies. A population pharmacokinetic model of AMG 386 (N = 141) was developed and applied in an exposure-response analysis using data from patients (N = 160) with recurrent ovarian cancer who received paclitaxel plus AMG 386 (3 or 10 mg/kg once weekly) or placebo. Reduction in the risk of progression or death with increasing exposure (steady-state area under the concentration-versus-time curve [AUC(ss)]) was assessed using Cox regression analyses. Confounding factors were tested in multivariate analysis. Alternative AMG 386 doses were explored with Monte Carlo simulations using population pharmacokinetic and parametric survival models. There was a trend toward increased PFS with increased AUC(ss) (hazard ratio [HR] for each one-unit increment in AUC(ss), 0.97; P = 0.097), suggesting that the maximum effect on prolonging PFS was not achieved at the highest dose tested (10 mg/kg). Among patients with AUC(ss) ≥ 9.6 mg h/mL, PFS was 8.1 months versus 5.7 months for AUC(ss) < 9.6 mg h/mL and 4.6 months for placebo. No relationship between AUC(ss) and grade ≥ 3 adverse events was observed. Simulations predicted that AMG 386 15 mg/kg once weekly would result in an AUC(ss) ≥ 9.6 mg h/mL in > 90% of patients with median PFS of 8.2 months versus 5.0 months for placebo (HR [15 mg/kg vs. placebo], 0.56). Increased exposure to AMG 386 was associated with improved clinical outcomes in recurrent ovarian cancer, supporting the evaluation of a higher dose in future studies.

    Grid Cells, Place Cells, and Geodesic Generalization for Spatial Reinforcement Learning

    Reinforcement learning (RL) provides an influential characterization of the brain's mechanisms for learning to make advantageous choices. An important problem, though, is how complex tasks can be represented in a way that enables efficient learning. We consider this problem through the lens of spatial navigation, examining how two of the brain's location representations, hippocampal place cells and entorhinal grid cells, are adapted to serve as basis functions for approximating value over space for RL. Although much previous work has focused on these systems' roles in combining upstream sensory cues to track location, revisiting these representations with a focus on how they support this downstream decision function offers complementary insights into their characteristics. Rather than localization, the key problem in learning is generalization between past and present situations, which may not match perfectly. Accordingly, although neural populations collectively offer a precise representation of position, our simulations of navigational tasks verify the suggestion that RL gains efficiency from the more diffuse tuning of individual neurons, which allows learning about rewards to generalize over longer distances given fewer training experiences. However, work on generalization in RL suggests the underlying representation should respect the environment's layout. In particular, although it is often assumed that neurons track location in Euclidean coordinates (that a place cell's activity declines “as the crow flies” away from its peak), the relevant metric for value is geodesic: the distance along a path, around any obstacles. We formalize this intuition and present simulations showing how Euclidean, but not geodesic, representations can interfere with RL by generalizing inappropriately across barriers. Our proposal that place and grid responses should be modulated by geodesic distances suggests novel predictions about how obstacles should affect spatial firing fields, which provides a new viewpoint on data concerning both spatial codes.
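
    The contrast between Euclidean and geodesic distance in the abstract above can be illustrated on a toy grid. The sketch below is not the authors' simulation: it assumes a hypothetical 4-connected grid world in which `#` marks a barrier cell, and it measures geodesic distance as the shortest number of steps found by breadth-first search.

```python
import math
from collections import deque


def euclidean_distance(a, b):
    """Straight-line ("as the crow flies") distance between two (row, col) cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def geodesic_distance(grid, start, goal):
    """Shortest 4-connected path length from start to goal, avoiding '#' cells.

    Returns None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


# A wall in column 2 separates the two top corners; the only gap is in the bottom row.
GRID = [
    "..#..",
    "..#..",
    ".....",
]
print(euclidean_distance((0, 1), (0, 3)))       # 2.0
print(geodesic_distance(GRID, (0, 1), (0, 3)))  # 6
```

    The two cells are close in Euclidean terms but three times farther apart along any walkable path, which is the situation in which a Euclidean representation would generalize value across the barrier.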

    Anti-angiogenic tyrosine kinase inhibitors: what is their mechanism of action?

    Tyrosine kinases are important cellular signaling proteins that have a variety of biological activities including cell proliferation and migration. Multiple kinases are involved in angiogenesis, including receptor tyrosine kinases such as the vascular endothelial growth factor receptor. Inhibition of angiogenic tyrosine kinases has been developed as a systemic treatment strategy for cancer. Three anti-angiogenic tyrosine kinase inhibitors (TKIs), sunitinib, sorafenib and pazopanib, with differential binding capacities to angiogenic kinases were recently approved for treatment of patients with advanced cancer (renal cell cancer, gastro-intestinal stromal tumors, and hepatocellular cancer). Many other anti-angiogenic TKIs are being studied in phase I-III clinical trials. In addition to their beneficial anti-tumor activity, clinical resistance and toxicities have also been observed with these agents. In this manuscript, we give an overview of the design and development of anti-angiogenic TKIs. We describe their molecular structure and classification, their mechanism of action, and their inhibitory activity against specific kinase signaling pathways. In addition, we provide insight into the extent to which selective targeting of angiogenic kinases by TKIs may contribute to the clinically observed anti-tumor activity, resistance, and toxicity. We feel that it is of crucial importance to increase our understanding of the clinical mechanism of action of anti-angiogenic TKIs in order to further optimize their clinical efficacy.

    Unraveling the mechanism of cascade reactions of Zincke aldehydes

    The thermal pericyclic cascade rearrangement of Zincke aldehydes (5-(dialkylamino)-2,4-pentadienals) to afford Z-α,β,γ,δ-unsaturated amides, discovered by the Vanderwal group, has been studied in depth using quantum mechanical methods. Two mechanistic possibilities that had previously been put forth to explain this internal redox process, one that had been discounted by experiment and the other that had withstood experimental scrutiny, were evaluated. Both of these mechanisms suffered from energetic barriers that appeared too high to allow rearrangement to proceed under the conditions used; however, computational study of a third possibility that implicates the intermediacy of vinylketenes revealed that it is the most likely pathway of rearrangement. Further computational studies accounted for the relative rates of rearrangement in substituted Zincke aldehydes, predicted the feasibility of related processes for other donor-acceptor dienes, and provided insight into the rearrangement of allylamine-derived Zincke aldehydes that provide either dihydropyridones or polycyclic lactams by further pericyclic processes.