Spike-based reinforcement learning of navigation


We have studied a spiking reinforcement learning model derived from reward maximization [1, 2], in which causal relations between pre- and postsynaptic activity set a synaptic eligibility trace [2, 3]. Neurons are modeled as integrate-and-fire units with escape noise. Synapses are binary, and learning acts on their release probability, which is updated when a global reward signal (such as dopamine) is received.
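The interplay of eligibility trace and global reward can be illustrated with a minimal sketch. It is not the rule derived in [1, 2]: the exponential decay, the product form of the trace, and all names and parameter values (`update_eligibility`, `apply_reward`, `tau_e`, `lr`) are assumptions chosen for illustration only.

```python
import numpy as np

def update_eligibility(trace, pre_recent, post_spike, dt=1.0, tau_e=500.0):
    """Decay the per-synapse trace, then mark causal pre-before-post pairings.

    pre_recent: 1.0 where a presynaptic release shortly preceded this time
                step, else 0.0 (hypothetical simplification).
    post_spike: 1.0 if the postsynaptic neuron fired in this step, else 0.0.
    """
    decayed = trace * np.exp(-dt / tau_e)
    return decayed + pre_recent * post_spike

def apply_reward(p_release, trace, reward, lr=0.1):
    """Change release probabilities only when a global reward arrives."""
    return np.clip(p_release + lr * reward * trace, 0.0, 1.0)

# Three synapses; the first and third took part in a causal pairing.
trace = update_eligibility(np.zeros(3), np.array([1.0, 0.0, 1.0]), post_spike=1.0)
p = apply_reward(np.full(3, 0.5), trace, reward=1.0)
```

In this sketch the trace only tags synapses; with `reward=0.0` the release probabilities stay unchanged, which is the sense in which the reward signal gates learning.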

We have applied the learning algorithm to a model of the Morris water maze task. The simulated rat initially explores the environment by random search. After only a few trials, the rat has learned to approach the goal from arbitrary starting conditions (see Figure 1). The model generalizes automatically in state and action space because both are coded by overlapping tuning profiles of place cells and action cells [4].
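Coding by overlapping profiles can be sketched as follows. The Gaussian place fields, the population-vector readout of the action cells, and every name and parameter here (`place_cell_rates`, `population_vector`, `sigma`) are illustrative assumptions, not details taken from the model.

```python
import numpy as np

def place_cell_rates(pos, centers, sigma=0.2):
    """Gaussian place fields: nearby positions activate overlapping cells."""
    d2 = np.sum((centers - pos) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def population_vector(action_rates, directions):
    """Movement direction as the rate-weighted sum of preferred directions."""
    vec = action_rates @ directions
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Four action cells with preferred directions east, north, west, south.
angles = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)

centers = np.array([[0.0, 0.0], [1.0, 0.0]])
rates = place_cell_rates(np.array([0.0, 0.0]), centers)
heading = population_vector(np.array([1.0, 1.0, 0.0, 0.0]), directions)
```

Because neighbouring place fields overlap, a weight change learned at one position also affects nearby positions, and equal activity in the east and north action cells yields a north-east heading rather than a choice between two discrete actions.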

Figure 1

Escape latency versus number of trials. Escape latency is the time the simulated rat takes to reach a hidden platform from arbitrary initial conditions. Learning is achieved in fewer than 20 trials. Error bars indicate the 25th and 75th percentiles.
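Summary statistics of this kind could be computed as in the sketch below. The array layout (one row per simulated rat, one column per trial) and the numbers are placeholders for illustration, not data from the study.

```python
import numpy as np

# Placeholder latencies: rows are simulated rats, columns are trials.
latencies = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, 8.0],
    [5.0, 10.0],
])

# Per-trial quartiles across rats; q25 and q75 would give the error bars.
q25, median, q75 = np.percentile(latencies, [25, 50, 75], axis=0)
```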


  1. Pfister JP, Toyoizumi T, Barber D, Gerstner W: Optimal Spike-Timing Dependent Plasticity for Precise Action Potential Firing in Supervised Learning. Neural Computation. 2006, 18 (6): 1309-1339. 10.1162/neco.2006.18.6.1318.

  2. Florian RV: Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation. 2007, 19 (6): 1468-1502. 10.1162/neco.2007.19.6.1468.

  3. Izhikevich EM: Solving the Distal Reward Problem through Linkage of STDP and Dopamine Signaling. Cerebral Cortex. 2007, 17: 2443-2452. 10.1093/cercor/bhl152.

  4. Strösslin T, Sheynikhovich D, Chavarriaga R, Gerstner W: Robust self-localisation and navigation based on hippocampal place cells. Neural Networks. 2005, 18 (9): 1125-1140. 10.1016/j.neunet.2005.08.012.

Author information

Correspondence to Eleni Vasilaki.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article


  • Reinforcement Learning
  • Action Space
  • Random Search
  • Place Cell
  • Maze Task