  • Poster presentation
  • Open access

Policy gradient rules for populations of spiking neurons

Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions despite high neuronal variability. Here, we present two general recipes for deriving learning rules from a policy gradient approach for different neural codes and decision-making networks: one based on partial integration across feature values, and one based on a linear approximation around a target feature. The first technique leads to a tightly code-specific learning rule, in which the details of the code-irrelevant spiking information are integrated away and the code-specificity enters at the synaptic level. The second technique yields modular learning rules that can be weakly code-specific: a spike-timing-dependent base synaptic plasticity rule is modulated by a code-specific population and decision signal. Decisions can be binary, multi-valued, or even continuous-valued. For illustration, we consider a spike count code and a spike latency code. We test the resulting rules on simple model problems and assess whether tight code-specificity outperforms weak code-specificity. While code-specific rules increase performance only marginally for a single neuron [1], our tightly code-specific rule designed for population coding can strongly boost performance. Both code-specific learning rules improve in performance with increasing population size, in contrast to standard reinforcement learning [2]. For mathematical clarity, we present the rules for an episodic learning scenario, but a biologically plausible implementation of a fully online scheme is also possible [2, 3].
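As a point of reference for the episodic scenario described above, the following is a minimal sketch of a generic policy gradient (REINFORCE-style) update for a small population, not the code-specific rules derived in this work. It relies on the standard identity that the gradient of the expected reward equals the expected value of the reward times the gradient of the log-probability of the emitted activity. Everything concrete here is an assumption made for illustration: each neuron is reduced to a single Bernoulli "spike" per episode (a toy spike count code), the population decision is a majority vote, the reward is +1 or -1, and the names `spike_prob`, `run_episode`, `train`, and `accuracy` are hypothetical.

```python
import math
import random


def spike_prob(w, x):
    """Probability that a toy neuron emits one spike, given input x (assumed sigmoid)."""
    u = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-u))


def run_episode(weights, x, rng):
    """Run one episode; return the binary population decision and per-neuron eligibilities."""
    spikes, eligibilities = [], []
    for w in weights:
        p = spike_prob(w, x)
        s = 1 if rng.random() < p else 0
        spikes.append(s)
        # Gradient of log P(s | x; w) for a Bernoulli unit: (s - p) * x
        eligibilities.append([(s - p) * xi for xi in x])
    # Toy spike count code: decide 1 if more than half of the neurons spiked.
    decision = 1 if 2 * sum(spikes) > len(spikes) else 0
    return decision, eligibilities


def train(n_neurons=9, n_episodes=5000, lr=0.1, seed=0):
    """Episodic reward-modulated learning on a two-pattern toy task (assumed setup)."""
    rng = random.Random(seed)
    # Each pattern is (input vector, target decision); the last input acts as a bias.
    patterns = [([1.0, 0.0, 1.0], 1), ([0.0, 1.0, 1.0], 0)]
    weights = [[0.0, 0.0, 0.0] for _ in range(n_neurons)]
    for _ in range(n_episodes):
        x, target = rng.choice(patterns)
        decision, eligibilities = run_episode(weights, x, rng)
        reward = 1.0 if decision == target else -1.0
        # Every synapse is updated by the same global reward times its own eligibility.
        for w, e in zip(weights, eligibilities):
            for i in range(len(w)):
                w[i] += lr * reward * e[i]
    return weights, patterns


def accuracy(weights, patterns, n_trials=500, seed=1):
    """Fraction of correct population decisions over randomly drawn test episodes."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        x, target = rng.choice(patterns)
        decision, _elig = run_episode(weights, x, rng)
        correct += int(decision == target)
    return correct / n_trials
```

Calling `weights, patterns = train()` followed by `accuracy(weights, patterns)` gives the fraction of correct decisions after learning. In this sketch every neuron receives only the global reward, which corresponds to the "standard reinforcement learning" baseline mentioned above; a code-specific rule would additionally shape the update using a population or decision signal.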


  1. Sprekeler H, Hennequin G, Gerstner W: Code-specific policy gradient rules for spiking neurons. Advances in Neural Information Processing Systems. Edited by: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, and Culotta A. 2009, 22: 1741-1749.

  2. Urbanczik R, Senn W: Reinforcement learning in populations of spiking neurons. Nature Neuroscience. 2009, 12 (3): 250-252. 10.1038/nn.2264.

  3. Friedrich J, Urbanczik R, Senn W: Learning spike-based population codes by reward and population feedback. Neural Computation. 2010, 22 (7): 1698-1717. 10.1162/neco.2010.05-09-1010.

Corresponding author

Correspondence to Johannes Friedrich.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Friedrich, J., Urbanczik, R. & Senn, W. Policy gradient rules for populations of spiking neurons. BMC Neurosci 12 (Suppl 1), P111 (2011).
