Utilizing Observed Information for No-Communication Multi-Agent Reinforcement Learning toward Cooperation in Dynamic Environment


Abstract

This paper proposes a multi-agent reinforcement learning method without communication for dynamic environments, called profit minimizing reinforcement learning with oblivion of memory (PMRL-OM). PMRL-OM extends PMRL by defining a memory range so that agents use only the valuable information from the environment. Since agents do not require information observed before an environmental change, the memory range restricts them to the information acquired after a certain iteration. In addition, PMRL-OM improves the update function for the goal value, which serves as the priority of a purpose, and updates the goal value based on newer information. To evaluate the effectiveness of PMRL-OM, this study compares PMRL-OM with PMRL in five dynamic maze environments: state changes for two types of cooperation, position changes for two types of cooperation, and a case combining these four. The experimental results revealed that (a) PMRL-OM was effective for cooperation in all five dynamic environments examined in this study; (b) PMRL-OM was more effective than PMRL in these dynamic environments; and (c) PMRL-OM performed well with a memory range of 100 to 500.
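To illustrate the idea of a memory range, the following is a minimal sketch, not the paper's implementation. It assumes each agent stores one number per iteration (here called `steps_to_goal`, a hypothetical quantity) and that older entries are simply discarded once the range is exceeded. The class name `BoundedMemory` and its methods are illustrative only, and the actual goal-value update function of PMRL-OM is not reproduced here.

```python
from collections import deque


class BoundedMemory:
    """Keeps only the most recent `memory_range` observations (hypothetical helper)."""

    def __init__(self, memory_range: int) -> None:
        if memory_range <= 0:
            raise ValueError("memory_range must be positive")
        # A deque with maxlen drops the oldest entry automatically when full.
        self._steps = deque(maxlen=memory_range)

    def record(self, steps_to_goal: int) -> None:
        """Store the observation from the latest iteration."""
        self._steps.append(steps_to_goal)

    def minimum(self):
        """Smallest value still inside the memory range, or None if empty."""
        return min(self._steps) if self._steps else None


# Assumed scenario: the environment changes after the first iteration,
# so the early observation (4) is no longer achievable.
memory = BoundedMemory(memory_range=3)
for steps in [4, 9, 9, 9]:
    memory.record(steps)

# The stale value 4 has been forgotten; only post-change data remains.
print(memory.minimum())
```

With an unbounded memory the minimum would stay at the outdated value, which is the kind of stale information the abstract says agents do not need after an environmental change.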

Journal Information

Title: Utilizing Observed Information for No-Communication Multi-Agent Reinforcement Learning toward Cooperation in Dynamic Environment
Authors: Fumito Uwano and Keiki Takadama
Journal: SICE Journal of Control, Measurement, and System Integration (JCMSI)
Details: Volume 12, Number 5, 2019, pp.199-208

BibTeX or Download

Fumito Uwano, Keiki Takadama. Utilizing Observed Information for No-Communication Multi-Agent Reinforcement Learning toward Cooperation in Dynamic Environment. SICE Journal of Control, Measurement, and System Integration, 12(5): 199-208, 2019.

@article{uwano2019utilizing,
  title={Utilizing Observed Information for No-Communication Multi-Agent Reinforcement Learning toward Cooperation in Dynamic Environment},
  author={Fumito Uwano and Keiki Takadama},
  journal={SICE Journal of Control, Measurement, and System Integration},
  year={2019},
  volume={12},
  number={5},
  pages={199--208},
  publisher={SICE}
}


Miscellaneous Notes - Fumito Uwano's pages -
