12 research outputs found

    A spiking network model of decision making employing rewarded STDP.

    Reward-modulated spike-timing-dependent plasticity (STDP) combines unsupervised STDP with a reinforcement signal that modulates synaptic changes. It was proposed as a learning rule capable of solving the distal reward problem in reinforcement learning. Nonetheless, the performance and limitations of this learning mechanism in solving biological problems have yet to be tested. In our work, rewarded STDP was implemented to model foraging behavior in a simulated environment. Over the course of training, the network of spiking neurons developed the capability of producing highly successful decision-making. The network performance remained stable even after significant perturbations of synaptic structure. Rewarded STDP alone was insufficient to learn effective decision-making because of the difficulty of maintaining homeostatic equilibrium of synaptic weights and the development of local performance maxima. Our study predicts that successful learning requires stabilizing mechanisms that allow neurons to balance their input and output synapses, as well as synaptic noise.
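The idea of a reinforcement signal gating STDP-driven changes can be illustrated with a minimal sketch. It is not the model from this paper: it assumes a single synapse, an exponential STDP window, and a slowly decaying eligibility trace that bridges the delay between a spike pairing and a later reward. All names and parameter values (`stdp_window`, `run_rewarded_stdp`, `tau_trace_ms`, `lr`) are hypothetical.

```python
import math


def stdp_window(dt_ms, a_plus=1.0, a_minus=1.0, tau_ms=20.0):
    """Unsupervised STDP kernel, with dt_ms = t_post - t_pre.

    Pre-before-post (dt_ms > 0) gives a positive value,
    post-before-pre (dt_ms < 0) gives a negative value.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0


def run_rewarded_stdp(pairings, rewards, n_steps, w0=0.5, lr=0.1,
                      tau_trace_ms=1000.0, dt_ms=1.0,
                      w_min=0.0, w_max=1.0):
    """Simulate one synapse under an assumed reward-gated STDP rule.

    pairings: dict mapping time step -> (t_post - t_pre) in ms
    rewards:  dict mapping time step -> reward (+) or punishment (-)
    Returns the final synaptic weight.
    """
    w = w0
    trace = 0.0  # eligibility trace: decaying memory of recent STDP events
    decay = math.exp(-dt_ms / tau_trace_ms)
    for step in range(n_steps):
        trace *= decay
        if step in pairings:
            # STDP marks the synapse as eligible; it does not change w yet
            trace += stdp_window(pairings[step])
        reward = rewards.get(step, 0.0)
        # The weight only moves when a reward signal arrives
        w = min(w_max, max(w_min, w + lr * reward * trace))
    return w


# A causal pairing at step 10 followed by a delayed reward at step 500
w_rewarded = run_rewarded_stdp({10: 5.0}, {500: 1.0}, n_steps=600)
# The same pairing followed by punishment, and with no feedback at all
w_punished = run_rewarded_stdp({10: 5.0}, {500: -1.0}, n_steps=600)
w_no_feedback = run_rewarded_stdp({10: 5.0}, {}, n_steps=600)
```

In this sketch the same spike pairing strengthens the synapse when followed by reward, weakens it when followed by punishment, and leaves it unchanged without feedback. The hard clipping to `[w_min, w_max]` is only a placeholder and does not stand in for the stabilizing mechanisms the abstract describes.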

    Effect of noise and STDP strength on learning performance.

    STDP strength is the scale factor in equation 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0090821#pone.0090821.e008) of the methods section (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0090821#s5). (A) Plot of mean final performance at different levels of variability in synaptic release. Twenty-five simulations were run under each noise condition and final performance was recorded after 4 million moves. Red dashed lines show the limits of standard error. (B) Plot of mean performance over time at different levels of variability in synaptic release, represented by different lines. Twenty-five simulations were run under each noise condition over 8 million moves. Noise level: 2% orange, 4% gold, 8% dark green, 12% blue, 16% red, 32% magenta, 64% brown. (C) Plot of mean final performance with variable STDP coefficient strength. Twenty-five simulations were run under each STDP coefficient condition and final performance was recorded after 8 million moves. Two sets were run with different noise levels: 16% release noise is shown in blue and 8% is shown in green. Red lines show standard error. (D) Plot of mean performance over time for different STDP strengths. Twenty-five simulations were run for each STDP strength over 8 million moves (4 million shown). Release noise is set to 16%. STDP strength: 0.25 orange, 0.5 gold, 1 dark green, 1.5 light blue, 2 dark blue, 4 purple, 8 magenta, 16 red.

    Effect of synaptic noise.

    (A) Mean final performance of 8 runs with different levels of random perturbation of excitatory synaptic weights from middle layer to output layer. The simulations shown in green applied the perturbation only at the start. Those shown in blue applied perturbations at regular intervals. The thin red lines represent the limits of standard error. (B) 50% random variations applied to synaptic weights of the trained network. Learning was turned off and synapses were held at a fixed strength from 4,000,000 to 6,000,000 iterations. (C) Performance when reward and punishment conditions are reversed in an attempt to train avoidance behavior. Performance in “food” acquisition falls well below random (indicating successful learning) but the model failed to explicitly avoid all food.

    Effect of changing environment on synaptic strength.

    (A) Synaptic strengths of the outputs of a middle layer cell to all output cells during learning under normal conditions. This cell indicates food immediately above the entity. (B) Synaptic strengths of the same cell after the environment was changed to a vertical arrangement. (C and D) Same as A and B but for a cell indicating food immediately to the left. (E) Location of cells in the middle layer and the color representation of outputs by destination cell.

    Model properties.

    (A) Steady-state response pattern of an isolated spiking neuron for three different levels of the resting potential: black – , green – , blue – . (B) Network organization. Arrowed lines indicate outgoing connections of a sample of cells in each layer with excitatory cells shown in blue, inhibitory cells shown in red and output cells shown in green. (C) Sample IPSP in the postsynaptic neuron (bottom trace) triggered by a spike in presynaptic inhibitory neurons (top trace).

    Performance after elimination of different model features over 8 million movement iterations.

    Each line corresponds to performance after removing one feature. Green is default. Blue corresponds to the network when punishment was turned off. Magenta shows a network with no output balancing. Orange represents a network with no variability in synaptic release.

    Non-homogeneous extracellular resistivity affects the current-source density profiles of up–down state oscillations

    Rhythmic local field potential (LFP) oscillations observed during deep sleep are the result of synchronized electrical activities of large neuronal ensembles, which consist of alternating periods of activity and silence, termed ‘up’ and ‘down’ states, respectively. Current-source density (CSD) analysis indicates that the up states of these slow oscillations are associated with current sources in superficial cortical layers and sinks in deep layers, while the down states display the opposite pattern of source–sink distribution. We show here that a network model of up and down states displays this CSD profile only if a frequency-filtering extracellular medium is assumed. When frequency filtering was modelled as inhomogeneous conductivity, this simple model had considerably more power in slow frequencies, resulting in significant differences in LFP and CSD profiles compared with the constant-resistivity model. These results suggest that the frequency-filtering properties of extracellular media may have important consequences for the interpretation of the results of CSD analysis.
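Why the conductivity assumption matters for CSD estimates can be seen in a small sketch of the standard one-dimensional relation CSD(z) = -d/dz(σ(z) dφ/dz), discretized on equally spaced recording depths. This is not the network model from the paper, and it does not reproduce frequency filtering; the function name `csd_1d`, the midpoint averaging of σ, and the sample values are assumptions made for illustration.

```python
def csd_1d(phi, sigma, h):
    """Estimate current-source density at the interior recording depths.

    phi:   potentials at equally spaced depths
    sigma: conductivity at the same depths (constant list = homogeneous medium)
    h:     spacing between adjacent depths
    Returns len(phi) - 2 values, one per interior depth.
    """
    if len(phi) != len(sigma):
        raise ValueError("phi and sigma must have the same length")
    # Current flux sigma * dphi/dz at the midpoints between depths;
    # the midpoint conductivity is taken as the mean of its two neighbours.
    flux = [
        0.5 * (sigma[i] + sigma[i + 1]) * (phi[i + 1] - phi[i]) / h
        for i in range(len(phi) - 1)
    ]
    # CSD is minus the divergence of that flux
    return [-(flux[i] - flux[i - 1]) / h for i in range(1, len(flux))]


# A potential that is linear in depth
phi_linear = [0.0, 1.0, 2.0, 3.0]

# Homogeneous medium: the estimate reports no sources or sinks
csd_homogeneous = csd_1d(phi_linear, [1.0, 1.0, 1.0, 1.0], h=1.0)

# Conductivity that changes with depth: the same potential now implies
# nonzero current-source density
csd_inhomogeneous = csd_1d(phi_linear, [1.0, 1.0, 3.0, 3.0], h=1.0)
```

The same potential profile yields different source and sink estimates depending on the conductivity profile, which is the general reason an analysis that assumes constant resistivity can differ from one that does not.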

    Effect of changing the environment.

    (A) Normal “food” distribution. (B) A vertically biased “food” distribution. (C) Performance over time of the network starting in a normal environment then being switched to a vertically biased environment at 2,000,000 iterations. Learning was turned off and all synaptic weights were held constant until 3,000,000 epochs when learning was turned on again. (D) Performance over time of the network starting in a vertically biased environment then being switched to a normal environment at 2,000,000 epochs. Learning was turned off and all synaptic weights were held constant until 3,000,000 epochs when learning was turned on again.

    The change in rate of “food” acquisition as a result of learning.

    (A, B) Trajectory of the movement in the virtual environment. (A) Before training. (B) After training (one million iterations). Light green dots represent “food” locations. Red dots are locations without food. The dark green line traces the entire movement. (C) Performance for 6 independent trials (different colors) over 4 million iterations. One of the trials (blue line) failed to achieve normal rates of performance. Horizontal lines represent constant performance of other strategies in solving the same problem. 1: blind strategy; 2: collecting adjacent food; 3: moving towards the closest “food” within three grid squares; 4: searching through all possible sets of moves within the visual field.