Spike timing dependent plasticity implements reinforcement learning

Santiago, Roberto A; Roberts, Patrick D; Lafferriere, Gerardo

doi:10.1186/1471-2202-8-S2-S16

Volume 8 Supplement 2

Sixteenth Annual Computational Neuroscience Meeting: CNS*2007

Oral presentation
Open access
Published: 06 July 2007

Roberto A Santiago²,
Patrick D Roberts¹ &
Gerardo Lafferriere²

BMC Neuroscience volume 8, Article number: S16 (2007)

An explanatory model is developed to show how synaptic learning mechanisms modeled through spike-timing-dependent plasticity (STDP) can result in longer-term adaptations consistent with reinforcement learning models. In particular, the reinforcement learning model known as temporal difference (TD) learning has been used to model neuronal behavior in the orbitofrontal cortex (OFC) and ventral tegmental area (VTA) of macaque monkeys during reinforcement learning. While some research has empirically observed a connection between STDP and TD, there is as yet no explanatory model directly connecting TD to STDP. Through analysis of the STDP rule, the connection between STDP and TD is explained. We further show that an STDP learning rule drives the spike probability of reward-predicting neurons to a stable equilibrium. The equilibrium solution has an increasing slope, and the steepness of that slope predicts the probability of the reward. This connection begins to shed light on more recent data gathered from VTA and OFC that are not well modeled by TD. We suggest that STDP provides the underlying mechanism for explaining reinforcement learning and other higher-level perceptual and cognitive functions.
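For readers unfamiliar with the two rules the abstract relates, the following is a minimal sketch of their common textbook forms: a pair-based exponential STDP window and the TD(0) prediction error. It is not the authors' model, which the abstract does not specify. All function names, the amplitudes `a_plus` and `a_minus`, the time constants, the learning rate `alpha`, and the discount `gamma` are illustrative assumptions.

```python
import math


def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    """Textbook pair-based STDP window (assumed form, not from the abstract).

    dt_ms is t_post - t_pre in milliseconds. A presynaptic spike that
    precedes the postsynaptic spike (dt_ms > 0) potentiates the synapse;
    the reverse order (dt_ms < 0) depresses it.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


def td_error(reward, value_now, value_next, gamma=0.9):
    """TD(0) prediction error: delta = r + gamma * V(next) - V(now)."""
    return reward + gamma * value_next - value_now


def td0_pass(values, rewards, alpha=0.1, gamma=0.9):
    """One TD(0) sweep over a single trial.

    values has len(rewards) + 1 entries; the last entry is the terminal
    state, which is never updated and so keeps its initial value.
    """
    updated = list(values)
    for t, r in enumerate(rewards):
        delta = td_error(r, updated[t], updated[t + 1], gamma)
        updated[t] += alpha * delta
    return updated


# Hypothetical three-step trial with a reward delivered on the last step.
values = [0.0, 0.0, 0.0, 0.0]
rewards = [0.0, 0.0, 1.0]
for _ in range(500):
    values = td0_pass(values, rewards)
```

With these assumed settings the learned values approach gamma squared, gamma, and 1 for the three pre-terminal steps, so the value estimate ramps up toward the time of reward. The STDP function only illustrates the sign convention of the timing window; how such a window gives rise to TD-like behavior is the subject of the authors' analysis and is not reproduced here.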

Author information

Authors and Affiliations

Neurological Sciences Institute, Oregon Health & Science University, Portland, OR, USA
Patrick D Roberts
Department of Mathematics and Statistics, Portland State University, Portland, OR, USA
Roberto A Santiago & Gerardo Lafferriere

Corresponding author

Correspondence to Patrick D Roberts.

Rights and permissions

Open Access: This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Santiago, R.A., Roberts, P.D. & Lafferriere, G. Spike timing dependent plasticity implements reinforcement learning. BMC Neurosci 8 (Suppl 2), S16 (2007). https://doi.org/10.1186/1471-2202-8-S2-S16
