Volume 14 Supplement 1

Abstracts from the Twenty Second Annual Computational Neuroscience Meeting: CNS*2013

Open Access

Requirements for the robust operant conditioning of neural firing rates

  • Robert R Kerr (1, 2, 3),
  • David B Grayden (1, 2, 3, 4),
  • Doreen A Thomas (5),
  • Matthieu Gilson (6) and
  • Anthony N Burkitt (1, 2, 3, 4)
BMC Neuroscience 2013, 14(Suppl 1):P48

DOI: 10.1186/1471-2202-14-S1-P48

Published: 8 July 2013

Operant conditioning experiments have shown that changes in the firing rates of individual neurons in the motor cortex of monkeys can be elicited [1, 2]. In these experiments, the firing rates of the neurons were measured using an implanted electrode, and the monkeys were presented with feedback based on these rates and rewarded for increasing them. Behavioral learning such as this is assumed to be due to plasticity at the synaptic level, and reward-modulated spike-timing-dependent plasticity (RSTDP) has previously been proposed as a model of this plasticity [3]. In this study, we propose a generalization of the existing RSTDP model (classical RSTDP) that can account for experiments in which dopamine differentially modulates the amplitudes of long-term potentiation and depression (LTP and LTD) [4]. Using analytical techniques and numerical simulations with leaky integrate-and-fire (LIF) neurons, we compare classical RSTDP (see Figure 1A) with our generalized model (see Figure 1B). We consider the potential for these models to elicit the increased firing rates observed in operant conditioning experiments [1, 2] and find two requirements. The first requirement is that, relative to their base-level amplitudes, the strengthening of LTP by the reward signal must be greater than the strengthening of LTD. Classical RSTDP cannot exhibit this and, contrary to previous studies [3], we predict that it consequently cannot robustly elicit an increased firing rate. The second requirement is that the reinforced neuron must be able to exhibit inter-spike intervals (ISIs) that are short relative to its mean ISI. For the LIF neurons we consider, this corresponds to being in a fluctuation-driven regime, such as when receiving a balance of excitatory and inhibitory inputs. The findings of this study are consistent with existing experimental studies, and they also make testable predictions for possible future experiments.
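The second requirement can be illustrated with a minimal sketch, which is not the simulation used in this study. It assumes a dimensionless LIF neuron driven by Gaussian white noise (Euler-Maruyama integration) and compares a mean-driven input (mean above threshold, weak noise) with a fluctuation-driven input (mean below threshold, strong noise). The function names, parameter values, and the "shorter than half the mean ISI" criterion are all illustrative assumptions.

```python
import numpy as np

def simulate_lif_isis(mu, sigma, t_total=20000.0, dt=0.1, tau=20.0,
                      v_th=1.0, v_reset=0.0, seed=0):
    """Return the inter-spike intervals (ms) of a noise-driven LIF neuron.

    Assumed model: tau * dV/dt = mu - V + sigma * sqrt(2 * tau) * xi(t),
    with threshold v_th and reset v_reset (all voltages dimensionless).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(t_total / dt)
    # Pre-draw the noise increments for every Euler-Maruyama step.
    noise = rng.standard_normal(n_steps) * sigma * np.sqrt(2.0 * dt / tau)
    v = v_reset
    spike_times = []
    for i in range(n_steps):
        v += (mu - v) * dt / tau + noise[i]
        if v >= v_th:
            spike_times.append(i * dt)
            v = v_reset
    return np.diff(spike_times)

def isi_cv(isis):
    """Coefficient of variation of the ISIs."""
    return float(np.std(isis) / np.mean(isis))

def short_isi_fraction(isis, ratio=0.5):
    """Fraction of ISIs shorter than `ratio` times the mean ISI."""
    return float(np.mean(isis < ratio * np.mean(isis)))

# Mean-driven: input mean above threshold, weak fluctuations (assumed values).
mean_driven = simulate_lif_isis(mu=1.5, sigma=0.05)
# Fluctuation-driven: input mean below threshold, strong fluctuations.
fluct_driven = simulate_lif_isis(mu=0.7, sigma=0.3)

for name, isis in [("mean-driven", mean_driven),
                   ("fluctuation-driven", fluct_driven)]:
    print(name, "CV:", isi_cv(isis),
          "short-ISI fraction:", short_isi_fraction(isis))
```

Under these assumptions, the mean-driven neuron fires almost regularly, so very few of its ISIs fall far below the mean, whereas the fluctuation-driven neuron produces a broad ISI distribution that includes short intervals.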
https://static-content.springer.com/image/art%3A10.1186%2F1471-2202-14-S1-P48/MediaObjects/12868_2013_Article_2982_Fig1_HTML.jpg
Figure 1

Effective STDP learning windows that show the synaptic change, ΔK, caused by a spike pair with the timing difference, Δt, between the pre- and post-synaptic spikes. The different windows are for different reward signal levels (y = 0, 1, 2, 3, 4, 5, and 6, shown as green, blue, purple, magenta, red, orange, and yellow lines, respectively). A: Classical RSTDP has no LTD or LTP when there is no reward and increasing amounts of both as reward increases. B: Generalized RSTDP allows separate modulation of LTP and LTD, as well as non-zero LTP and LTD when there is no reward. Shown here is generalized RSTDP parameterized to match the effect that dopamine has been observed to have on STDP [4]. At high reward values, LTD switches to LTP.
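The qualitative difference between the two windows can be sketched as follows. This is an illustrative parameterization only, not the one used for Figure 1. It assumes exponential windows with a common time constant, amplitudes that depend linearly on the reward level y, and the sign convention Δt = t_post − t_pre (so positive Δt gives LTP). The function name `stdp_window` and all numerical values are hypothetical.

```python
import numpy as np

def stdp_window(dt, y, a_ltp=(0.0, 1.0), a_ltd=(0.0, 1.0), tau=20.0):
    """Effective synaptic change for timing difference dt (ms) at reward y.

    a_ltp and a_ltd are (base amplitude, reward gain) pairs, so each
    amplitude is base + gain * y. Assumed convention: dt = t_post - t_pre.
    """
    dt = np.asarray(dt, dtype=float)
    amp_ltp = a_ltp[0] + a_ltp[1] * y
    amp_ltd = a_ltd[0] + a_ltd[1] * y
    shape = np.exp(-np.abs(dt) / tau)
    # Potentiation branch for dt >= 0, depression branch for dt < 0.
    return np.where(dt >= 0, amp_ltp * shape, -amp_ltd * shape)

def classical(dt, y):
    """Classical RSTDP: the whole window scales with the reward."""
    return stdp_window(dt, y, a_ltp=(0.0, 1.0), a_ltd=(0.0, 1.0))

def generalized(dt, y):
    """Generalized RSTDP: separate base amplitudes and reward gains.

    The negative LTD gain (an assumed value) makes the depression
    branch change sign once y exceeds 0.5 / 0.15.
    """
    return stdp_window(dt, y, a_ltp=(0.5, 0.25), a_ltd=(0.5, -0.15))

timing = np.linspace(-100.0, 100.0, 201)
for y in range(7):
    print(y, classical(-10.0, y), generalized(-10.0, y))
```

With these assumed values, the classical window vanishes at y = 0, while the generalized window shows non-zero LTP and LTD without reward and turns the depression branch into potentiation at the highest reward levels.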

Declarations

Acknowledgements

Funding is acknowledged from the Australian Research Council (ARC Discovery Project #DP1096699). The Bionics Institute acknowledges the support it receives from the Victorian Government through its Operational Infrastructure Support Program.

Authors’ Affiliations

(1) NeuroEngineering Laboratory, Dept. of Electrical & Electronic Engineering, University of Melbourne
(2) Centre for Neural Engineering, University of Melbourne
(3) NICTA, Victoria Research Lab, University of Melbourne
(4) Bionics Institute
(5) Department of Mechanical Engineering, University of Melbourne
(6) Laboratory for Neural Circuit Theory, RIKEN Brain Science Institute

References

  1. Fetz EE, Baker MA: Operantly conditioned patterns of precentral unit activity and correlated responses in adjacent cells and contralateral muscles. J Neurophysiol. 1973, 36 (2): 179-204.
  2. Kobayashi S, Schultz W, Sakagami M: Operant conditioning of primate prefrontal neurons. J Neurophysiol. 2010, 103 (4): 1843-1855. 10.1152/jn.00173.2009.
  3. Legenstein R, Pecevski D, Maass W: A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. PLoS Comput Biol. 2008, 4 (10): e1000180. 10.1371/journal.pcbi.1000180.
  4. Zhang J-C, Lau P-M, Bi G-Q: Gain in sensitivity and loss in temporal contrast of STDP by dopaminergic modulation at hippocampal synapses. Proc Natl Acad Sci USA. 2009, 106 (31): 13028-13033. 10.1073/pnas.0900546106.

Copyright

© Kerr et al; licensee BioMed Central Ltd. 2013

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
