A spiking temporal-difference learning model based on dopamine-modulated plasticity

Potjans, Wiebke; Morrison, Abigail; Diesmann, Markus

doi:10.1186/1471-2202-10-S1-P140

Volume 10 Supplement 1

Eighteenth Annual Computational Neuroscience Meeting: CNS*2009

Poster presentation
Open access
Published: 13 July 2009

BMC Neuroscience volume 10, Article number: P140 (2009)

Making predictions about future rewards and adapting behavior accordingly is crucial for any higher organism. One theory specialized for prediction problems is temporal-difference (TD) learning. Experimental findings suggest that TD learning is implemented in the mammalian brain. In particular, the resemblance of dopaminergic activity to the TD error signal [1] and the modulation of corticostriatal plasticity by dopamine [2] lend support to this hypothesis. We recently proposed the first spiking neural network model to implement actor-critic TD learning [3], enabling it to solve a complex task with sparse rewards. However, that model calculates an approximation of the TD error signal in each synapse rather than using a neuromodulatory system.
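For orientation, the TD error mentioned above is conventionally written as follows in the standard discrete-time formulation. The symbols are textbook notation and are assumptions here, not taken from the abstract: r is the reward, V the value function, s the state, \gamma the discount factor and \alpha the learning rate.

```latex
\delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t),
\qquad
V(s_t) \leftarrow V(s_t) + \alpha \, \delta_t
```

A positive \delta_t signals that the outcome was better than predicted and a negative one that it was worse, which is the property that dopaminergic activity is reported to resemble [1].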

Here, we propose a spiking neural network model that dynamically generates a dopamine signal, based on the actor-critic architecture proposed by Houk et al. [4]. This signal acts as a third factor that modulates the plasticity of the synapses encoding the value function and the policy. The proposed model simultaneously accounts for multiple experimental results, such as the generation of a TD-like dopaminergic signal with realistic firing rates in conditioning protocols [1] and the roles of presynaptic activity, postsynaptic activity and dopamine in the plasticity of corticostriatal synapses [5]. The excellent agreement between the predictions of our synaptic plasticity rules and the experimental findings is particularly noteworthy because the update rules were postulated using a purely top-down approach.
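The idea of dopamine acting as a third factor can be illustrated with a generic rate-based sketch. This is not the plasticity rule of the proposed model, which the abstract does not specify; the function name, the use of activity traces, the dopamine baseline and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def three_factor_update(w, pre_trace, post_trace, dopamine, baseline,
                        lr=0.01, w_min=0.0, w_max=1.0):
    """Generic three-factor weight update (illustrative, hypothetical rule).

    w          : synaptic weights, shape (n_post, n_pre)
    pre_trace  : filtered presynaptic activity, shape (n_pre,)
    post_trace : filtered postsynaptic activity, shape (n_post,)
    dopamine   : current dopamine level (scalar)
    baseline   : assumed baseline dopamine level (scalar)
    """
    # Factors one and two: coincidence of pre- and postsynaptic activity.
    eligibility = np.outer(post_trace, pre_trace)
    # Factor three: the deviation of dopamine from baseline sets the sign
    # and size of the change, playing the role of a TD-like error signal.
    modulation = dopamine - baseline
    # Keep weights inside the assumed bounds.
    return np.clip(w + lr * modulation * eligibility, w_min, w_max)


if __name__ == "__main__":
    w = np.full((2, 3), 0.5)
    pre = np.array([1.0, 0.0, 1.0])
    post = np.array([1.0, 0.0])
    # Dopamine above baseline strengthens co-active synapses only.
    print(three_factor_update(w, pre, post, dopamine=2.0, baseline=1.0, lr=0.1))
```

In this sketch a synapse changes only when presynaptic activity, postsynaptic activity and a dopamine deviation coincide; with dopamine at baseline the weights stay fixed regardless of activity.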

We performed simulations in NEST [6] to test the learning behavior of the model in a two-dimensional grid-world task with a single rewarded state. The network learns to evaluate the states with respect to their reward proximity and adapts its policy accordingly. The learning speed and equilibrium performance are comparable to those of a discrete-time algorithmic TD learning implementation.
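A discrete-time algorithmic reference of the kind mentioned above could look like the following tabular actor-critic sketch. It is not the implementation used in the study: the grid size, goal position, softmax policy, learning rates, discount factor and step cap are all assumptions made for illustration.

```python
import numpy as np


def run_actor_critic(size=5, goal=(4, 4), episodes=1000, gamma=0.9,
                     alpha_v=0.2, alpha_p=0.5, max_steps=200, seed=0):
    """Tabular actor-critic TD learning in a grid world with one rewarded state."""
    rng = np.random.default_rng(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = np.zeros((size, size))                  # critic: state values
    prefs = np.zeros((size, size, 4))           # actor: action preferences
    steps_per_episode = []
    for _ in range(episodes):
        # Start each episode in a random non-goal state.
        s = goal
        while s == goal:
            s = (int(rng.integers(size)), int(rng.integers(size)))
        steps = 0
        for steps in range(1, max_steps + 1):
            # Softmax policy over the preferences of the current state.
            p = np.exp(prefs[s[0], s[1]] - prefs[s[0], s[1]].max())
            p /= p.sum()
            a = int(rng.choice(4, p=p))
            # Moves into a wall leave the agent where it is.
            nxt = (min(max(s[0] + moves[a][0], 0), size - 1),
                   min(max(s[1] + moves[a][1], 0), size - 1))
            done = nxt == goal
            r = 1.0 if done else 0.0
            # TD error; the goal is terminal, so its value counts as zero.
            delta = r + (0.0 if done else gamma * V[nxt]) - V[s]
            V[s] += alpha_v * delta                    # critic update
            prefs[s[0], s[1], a] += alpha_p * delta    # actor update
            s = nxt
            if done:
                break
        steps_per_episode.append(steps)
    return V, prefs, steps_per_episode


if __name__ == "__main__":
    V, prefs, steps = run_actor_critic()
    print(np.round(V, 2))
    print("mean steps, first 100 episodes:", np.mean(steps[:100]))
    print("mean steps, last 100 episodes:", np.mean(steps[-100:]))
```

The learned value table should increase toward the rewarded state and the number of steps per episode should fall as the policy improves, which are the two quantities (state evaluation and performance) that the comparison above refers to.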

The proposed model paves the way for investigations of the role of the dynamics of the dopaminergic system in reward-based learning. For example, we can use lesion studies to analyze the effects of dopamine treatment in Parkinson's patients. Finally, the experimentally constrained model can be used as the centerpiece of closed-loop functional models.

References

1. Schultz W, Dayan P, Montague PR: A neural substrate of prediction and reward. Science. 1997, 275: 1593-1599. 10.1126/science.275.5306.1593.
2. Reynolds JN, Hyland BI, Wickens JR: A cellular mechanism of reward-related learning. Nature. 2001, 413: 67-70. 10.1038/35092560.
3. Potjans W, Morrison A, Diesmann M: A spiking neural network model of an actor-critic learning agent. Neural Computation. 2009, 21: 301-339. 10.1162/neco.2008.08-07-593.
4. Houk JC, Adams JL, Barto AG: A model of how the basal ganglia generate and use neural signals that predict reinforcement. 1995, MIT Press, Cambridge, MA.
5. Reynolds JN, Hyland BI, Wickens JR: Dopamine-dependent plasticity of corticostriatal synapses. Neural Networks. 2002, 15: 507-521. 10.1016/S0893-6080(02)00045-X.
6. Gewaltig M-O, Diesmann M: NEST (neural simulation tool). Scholarpedia. 2007, 2: 1430.

Acknowledgements

Partially funded by EU Grant 15879 (FACETS), BMBF Grant 01GQ0420 to BCCN Freiburg, Next-Generation Supercomputer Project of MEXT, Japan, and the Helmholtz Alliance on Systems Biology.

Author information

Authors and Affiliations

Theoretical Neuroscience Group, RIKEN Brain Science Institute, Wako City, Saitama, Japan
Wiebke Potjans, Abigail Morrison & Markus Diesmann
Institute of Neurosciences and Medicine, Research Center Jülich, Jülich, Germany
Wiebke Potjans
Brain and Neural Systems Team, RIKEN Computational Science Research Program, Wako City, Saitama, Japan
Markus Diesmann
Bernstein Center for Computational Neuroscience, Albert-Ludwigs-University, Freiburg, Germany
Markus Diesmann

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Potjans, W., Morrison, A. & Diesmann, M. A spiking temporal-difference learning model based on dopamine-modulated plasticity. BMC Neurosci 10 (Suppl 1), P140 (2009). https://doi.org/10.1186/1471-2202-10-S1-P140
