Communication-less Cooperative Q-learning Agents in Maze Problem


This paper introduces a reinforcement learning technique with an internal reward for a multi-agent cooperation task. The proposed method extends Q-learning by replacing the ordinary (external) reward with an internal reward so that agents can cooperate without communication. To strengthen the grounding of the proposed method, we theoretically investigate what values should be set for each agent to select its goal for cooperation among agents. To show the effectiveness of the proposed method, we conduct intensive simulations on a maze problem as the agent-cooperation task and confirm the following implications: (1) the proposed method successfully enables agents to acquire cooperative behaviors, while a conventional method does not always acquire such behaviors; (2) cooperation among agents according to their internal rewards is achieved without communication; and (3) a condition for cooperation among any number of agents is indicated.
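As a rough illustration of the setting, the sketch below runs tabular Q-learning on a toy 1-D corridor maze. The `internal_reward` hook is a hypothetical placeholder for the paper's internal-reward mechanism (the paper's actual goal-selection values are not reproduced here); by default it passes the external reward through, which recovers ordinary single-agent Q-learning.

```python
import random

def q_learning_maze(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                    epsilon=0.1, internal_reward=None, seed=0):
    """Tabular Q-learning on a 1-D corridor maze.

    States are 0..n_states-1; the goal is the rightmost state. Actions are
    0 (left) and 1 (right). `internal_reward` is a hypothetical hook standing
    in for an internal-reward mechanism: it maps (state, next_state,
    external_reward) to the reward the agent actually learns from. By default
    the plain external reward is used, i.e. ordinary Q-learning.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    if internal_reward is None:
        internal_reward = lambda s, s2, r: r
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)          # explore
            else:                             # exploit, breaking ties at random
                best = max(Q[s])
                a = rng.choice([i for i in (0, 1) if Q[s][i] == best])
            s2 = max(0, s - 1) if a == 0 else min(goal, s + 1)
            r_ext = 1.0 if s2 == goal else 0.0   # external reward at the goal only
            r = internal_reward(s, s2, r_ext)    # learn from the (internal) reward
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning_maze()
# Greedy policy over the non-goal states; 1 means "move right toward the goal".
policy = [Q[s].index(max(Q[s])) for s in range(5)]
```

In a multi-agent version of this sketch, each agent would replace `internal_reward` with its own goal-selection rule, so that agents spread over different goals without exchanging messages.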


Title: Communication-less Cooperative Q-learning Agents in Maze Problem
Authors: Fumito Uwano and Keiki Takadama
Venue: Intelligent and Evolutionary Systems: Proceedings of the 20th Asia-Pacific Symposium on Intelligent and Evolutionary Systems (IES2016)
Details: Canberra, Australia, November 2016, pp. 453-467


Fumito Uwano and Keiki Takadama. Communication-less Cooperative Q-learning Agents in Maze Problem. In Intelligent and Evolutionary Systems, pages 453-467. Springer, November 2016.
@inproceedings{uwano2016communication,
  title={Communication-less Cooperative Q-learning Agents in Maze Problem},
  author={Uwano, Fumito and Takadama, Keiki},
  booktitle={Intelligent and Evolutionary Systems},
  pages={453--467},
  year={2016},
  publisher={Springer}
}