Neuro-Inspired Reinforcement Learning to Improve Trajectory Prediction in Reward-Guided Behavior

Bo Wei Chen, Shih Hung Yang, Chao Hung Kuo, Jia Wei Chen, Yu Chun Lo, Yun Ting Kuo, Yi Chen Lin, Hao Cheng Chang, Sheng Huang Lin, Xiao Yu, Boyi Qu, Shuan Chu Vina Ro, Hsin Yi Lai*, You Yin Chen*

*Corresponding author for this work

Research output: Article (peer-reviewed)

7 citations (Scopus)

Abstract

Hippocampal pyramidal cells and interneurons play a key role in spatial navigation. In goal-directed behavior associated with rewards, the spatial firing pattern of pyramidal cells is modulated by the animal's moving direction toward a reward, with a dependence on auditory, olfactory, and somatosensory stimuli for head orientation. Additionally, interneurons in the CA1 region of the hippocampus that are monosynaptically connected to CA1 pyramidal cells are modulated by a complex set of interacting brain regions related to reward and recall. The computational method of reinforcement learning (RL) has been widely used to investigate spatial navigation, which in turn has been increasingly used to study rodent learning associated with reward. The rewards in RL are used for discovering a desired behavior through the integration of two streams of neural activity: trial-and-error interactions with the external environment to achieve a goal, and the intrinsic motivation, primarily driven by the brain's reward system, to accelerate learning. Recognizing the potential benefit of the neural representation of this reward design for novel RL architectures, we propose an RL algorithm based on Q-learning with a perspective on biomimetics (neuro-inspired RL) to decode rodent movement trajectories. The reward function, inspired by the neuronal information processing uncovered in the hippocampus, combines the preferred direction of pyramidal cell firing as the extrinsic reward signal with the coupling between pyramidal cell-interneuron pairs as the intrinsic reward signal. Our experimental results demonstrate that the neuro-inspired RL, with a combined use of extrinsic and intrinsic rewards, outperforms other spatial decoding algorithms, including RL methods that use a single reward function. The new RL algorithm could help accelerate learning convergence rates and improve the prediction accuracy for moving trajectories.
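The abstract does not give the reward formula, so the following is only a minimal sketch of the general idea: tabular Q-learning in which the reward is a weighted sum of an extrinsic term and an intrinsic term. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-direction action set, the use of a cosine between the action and a preferred firing direction as the extrinsic term, a single scalar `coupling` value standing in for pyramidal cell-interneuron coupling, the weights `w_ext` and `w_int`, and the learning parameters `alpha` and `gamma`.

```python
import math

# Hypothetical action set: four unit moves on a grid (not from the paper).
ACTIONS = {"E": (1, 0), "N": (0, 1), "W": (-1, 0), "S": (0, -1)}


def extrinsic_reward(action, preferred_angle):
    """Cosine between the action vector and an assumed preferred firing direction."""
    dx, dy = ACTIONS[action]
    return dx * math.cos(preferred_angle) + dy * math.sin(preferred_angle)


def combined_reward(action, preferred_angle, coupling, w_ext=1.0, w_int=0.5):
    """Weighted sum of the extrinsic and intrinsic terms (weights are illustrative)."""
    return w_ext * extrinsic_reward(action, preferred_angle) + w_int * coupling


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place; returns the new value."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def greedy_action(q, state):
    """Action with the highest current Q-value in the given state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


# One illustrative step: move east while the assumed preferred direction is east.
q_table = {}
r = combined_reward("E", preferred_angle=0.0, coupling=0.2)
q_update(q_table, (0, 0), "E", r, (1, 0))
print(greedy_action(q_table, (0, 0)))
```

In this sketch the intrinsic term is the same for every action, so it shifts state values rather than the greedy choice within a single state; a decoder along the lines the abstract describes would derive both terms from recorded neural activity.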

Original language: English
Article number: 2250038
Journal: International Journal of Neural Systems
Volume: 32
Issue number: 9
Publication status: Published - 1 Sep 2022
