Operant behavior controlled by position of a moving object – a reinforcement learning model

Brom, Cyril; Klement, Daniel; Preuss, Michal

doi:10.1186/1471-2202-9-S1-P75

Volume 9 Supplement 1

Seventeenth Annual Computational Neuroscience Meeting: CNS*2008

Poster presentation
Open access
Published: 11 July 2008

Cyril Brom¹,
Daniel Klement² &
Michal Preuss¹

BMC Neuroscience volume 9, Article number: P75 (2008)

Description

It has been demonstrated that operant behavior can be controlled by spatial stimuli. In one of our experiments, rats were conditioned to press a lever for reward when a moving object was passing through a particular region of the experimental room (unpublished data). Although the stimulus changed smoothly, the transitions between the rewarded and non-rewarded conditions were sudden. Consequently, the animals anticipated the arrival at the rewarded zone by responding in its vicinity.

We developed a reinforcement learning model to simulate this anticipatory behavior and to study its spatial and temporal components. An output neuron integrated inputs from four classes of sensory neurons: (1) neurons detecting the position of the object, (2) neurons indicating the time elapsed since the last reward and (3) since the last operant response, and (4) a neuron signaling the presence or absence of the reward. The output neuron was a leaky integrator with a binary activation function, which served to send a motor signal to press the lever. The sensory neurons, by contrast, were simple nodes without temporal dynamics that signaled the presence of a stimulus in their receptive field in a rate-coded manner. The synapses between the sensory neurons and the output neuron were modified according to a rule based on the Rescorla-Wagner rule [1]. The overall model resembles the spectral-timing model of Grossberg and Schmajuk [2] extended to the spatial domain.
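The two building blocks named above, a Rescorla-Wagner style weight update and a leaky-integrator output with a binary activation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract gives no equations or parameter values, so the one-hot position coding, the discrete-time leak, and every name and number here (`learning_rate`, `leak`, `threshold`, `n_positions`, `rewarded_position`) are assumptions. The sketch also omits the timing neurons and therefore does not reproduce anticipation by itself.

```python
import numpy as np

def rescorla_wagner_update(weights, inputs, reward, learning_rate=0.1):
    """One Rescorla-Wagner step: dw_i = learning_rate * x_i * (reward - sum_j w_j x_j)."""
    prediction = float(np.dot(weights, inputs))
    return weights + learning_rate * inputs * (reward - prediction)

class LeakyIntegratorUnit:
    """Discrete-time leaky integrator with a binary (threshold) output.

    The update form and the parameter values are assumptions for illustration.
    """

    def __init__(self, leak=0.5, threshold=0.5):
        self.leak = leak
        self.threshold = threshold
        self.potential = 0.0

    def step(self, drive):
        # Move the potential a fraction `leak` of the way toward the current drive.
        self.potential = (1.0 - self.leak) * self.potential + self.leak * drive
        # Binary activation: True stands for a lever press.
        return bool(self.potential >= self.threshold)

# Hypothetical task: the object visits 8 discrete positions; only one is rewarded.
n_positions = 8
rewarded_position = 3
weights = np.zeros(n_positions)

for _ in range(200):
    for position in range(n_positions):
        inputs = np.eye(n_positions)[position]  # one-hot position neuron (assumed coding)
        reward = 1.0 if position == rewarded_position else 0.0
        weights = rescorla_wagner_update(weights, inputs, reward)

# Drive the output unit with the learned weight while the object stays in the zone.
unit = LeakyIntegratorUnit(leak=0.5, threshold=0.6)
responses = [unit.step(weights[rewarded_position]) for _ in range(3)]
```

With these assumed values, the weight of the rewarded position approaches 1 while the others stay at 0, and the output unit starts responding on the second time step because the leaky potential needs time to cross the threshold.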

Depending on the settings of the learning parameters for the different classes of sensory neurons, the network can learn the spatial and/or temporal features of the task, resulting in spatial and/or temporal anticipation of the reward. The network approximates well the data observed in real animals.
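One way to picture class-specific learning parameters is a separate learning rate for each class of sensory neuron. The sketch below is hypothetical: the abstract does not say how the parameters enter the rule, so the per-class rate, the function name `class_wise_update`, and the class labels are assumptions. It shows only that setting the rate of one class to zero leaves that class's synapses unchanged, so the unit can learn from position alone.

```python
import numpy as np

def class_wise_update(weights, inputs, reward, rates, classes):
    """Rescorla-Wagner style step with one learning rate per input class (assumed form)."""
    prediction = float(np.dot(weights, inputs))
    per_synapse_rate = np.array([rates[c] for c in classes])
    return weights + per_synapse_rate * inputs * (reward - prediction)

# Two hypothetical position neurons and two hypothetical timing neurons.
classes = ["position", "position", "time", "time"]
inputs = np.array([1.0, 0.0, 1.0, 0.0])
weights = np.zeros(4)

# Spatial-only learning: the timing synapses are frozen by a zero rate.
spatial_only = class_wise_update(
    weights, inputs, reward=1.0,
    rates={"position": 0.2, "time": 0.0}, classes=classes,
)

# Mixed learning: both active synapses share the prediction error.
mixed = class_wise_update(
    weights, inputs, reward=1.0,
    rates={"position": 0.2, "time": 0.1}, classes=classes,
)
```

In this toy setting, `spatial_only` changes only the active position synapse, whereas `mixed` also strengthens the active timing synapse.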

References

1. Rescorla RA, Wagner AR: A theory of Pavlovian conditioning: The effectiveness of reinforcement and non-reinforcement. Classical Conditioning II: Current Research and Theory. 1972, New York: Appleton-Century-Crofts, 64-69.
2. Grossberg S, Schmajuk NA: Neural dynamics of adaptive timing and temporal discrimination during associative learning. Neural Networks. 1989, 2: 79-102. 10.1016/0893-6080(89)90026-9.

Acknowledgements

This work was supported by grants of MSMT (1M0517, LC554, and MSM0021620838) and research projects AVOZ50110509 and 1ET100300517.

Author information

Authors and Affiliations

Dept. of Software and Computer Science Education, Faculty of Mathematics and Physics, Charles University in Prague, 118 00, Czech Republic
Cyril Brom & Michal Preuss
Dept. of Neurophysiology of Memory and Computational Neuroscience, Institute of Physiology, Academy of Sciences of the Czech Republic, Prague, 142 00, Czech Republic
Daniel Klement

Corresponding author

Correspondence to Cyril Brom.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Brom, C., Klement, D. & Preuss, M. Operant behavior controlled by position of a moving object – a reinforcement learning model. BMC Neurosci 9 (Suppl 1), P75 (2008). https://doi.org/10.1186/1471-2202-9-S1-P75

