Spike-based reinforcement learning of navigation
© Vasilaki et al; licensee BioMed Central Ltd. 2008
Published: 11 July 2008
We have studied a spiking reinforcement learning model derived from reward maximization [1, 2], in which causal relations between pre- and postsynaptic activity set a synaptic eligibility trace [2, 3]. Neurons are modeled as integrate-and-fire units with escape noise. Synapses are binary, and learning acts on their release probability, which is updated when a global reward signal (such as dopamine) is received.
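The ingredients named above can be illustrated with a minimal sketch. It is not the authors' learning rule, which is derived from reward maximization; it swaps in a simple Hebbian-style eligibility trace gated by a scalar reward. The class name `EscapeNoiseNeuron`, the exponential escape rate, and all parameter values (`tau_m`, `tau_e`, `theta`, `delta_u`, `rho0`, `weight`, `eta`, and the clipping bounds 0.05 and 0.95) are assumptions chosen for illustration.

```python
import math
import random


class EscapeNoiseNeuron:
    """Leaky integrate-and-fire unit with escape noise and binary synapses.

    Illustrative sketch only: all parameter values are assumed, and the
    eligibility trace is a plain pre-before-post coincidence trace.
    """

    def __init__(self, n_inputs, rng, dt=1.0, tau_m=20.0, tau_e=200.0,
                 theta=1.0, delta_u=0.2, rho0=0.05, weight=0.5, eta=0.05):
        self.rng = rng
        self.dt = dt
        self.decay_m = math.exp(-dt / tau_m)   # membrane / presynaptic trace decay
        self.decay_e = math.exp(-dt / tau_e)   # eligibility trace decay
        self.theta, self.delta_u, self.rho0 = theta, delta_u, rho0
        self.weight, self.eta = weight, eta
        self.u = 0.0                           # membrane potential
        self.p = [0.5] * n_inputs              # release probabilities (learned)
        self.pre_trace = [0.0] * n_inputs      # recent successful releases
        self.elig = [0.0] * n_inputs           # synaptic eligibility traces

    def step(self, pre_spikes):
        """Advance one time step; return True if the neuron fired."""
        self.u *= self.decay_m
        for i, spike in enumerate(pre_spikes):
            self.pre_trace[i] *= self.decay_m
            self.elig[i] *= self.decay_e
            # Binary synapse: a release either happens (fixed amplitude) or not.
            if spike and self.rng.random() < self.p[i]:
                self.u += self.weight
                self.pre_trace[i] += 1.0
        # Escape noise: firing is stochastic, with a rate that grows
        # exponentially as u approaches and exceeds the threshold theta.
        exponent = min((self.u - self.theta) / self.delta_u, 50.0)
        rate = self.rho0 * math.exp(exponent)
        fired = self.rng.random() < 1.0 - math.exp(-rate * self.dt)
        if fired:
            # Causal pairing: releases that preceded the spike become eligible.
            for i in range(len(self.elig)):
                self.elig[i] += self.pre_trace[i]
            self.u = 0.0
        return fired

    def reward(self, r):
        """Apply a global reward signal r to all eligible synapses."""
        for i in range(len(self.p)):
            updated = self.p[i] + self.eta * r * self.elig[i]
            self.p[i] = min(0.95, max(0.05, updated))


if __name__ == "__main__":
    neuron = EscapeNoiseNeuron(n_inputs=2, rng=random.Random(0))
    for _ in range(200):
        neuron.step([1, 0])    # only input 0 is active
    neuron.reward(1.0)         # a positive global reward arrives
    print(neuron.p)
```

In this sketch the release probability changes only when the reward arrives, and only at synapses whose releases preceded a postsynaptic spike; a synapse that never released keeps a zero eligibility trace and is left untouched.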
1. Pfister JP, Toyoizumi T, Barber D, Gerstner W: Optimal Spike-Timing Dependent Plasticity for Precise Action Potential Firing in Supervised Learning. Neural Computation. 2006, 18 (6): 1309-1339. 10.1162/neco.2006.18.6.1318.
2. Florian RV: Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation. 2007, 19 (6): 1468-1502. 10.1162/neco.2007.19.6.1468.
3. Izhikevich EM: Solving the Distal Reward Problem through Linkage of STDP and Dopamine Signaling. Cerebral Cortex. 2007, 17: 2443-2452. 10.1093/cercor/bhl152.
4. Strösslin T, Sheynikhovich D, Chavarriaga R, Gerstner W: Robust self-localisation and navigation based on hippocampal place cells. Neural Networks. 2005, 18 (9): 1125-1140. 10.1016/j.neunet.2005.08.012.