Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-Optimal Sequential Action Data


Abstract

Inverse reinforcement learning (IRL) estimates a reward function that lets an agent reproduce expert behavior, such as human operation data. However, expert data usually contain redundant parts, which degrade the agent’s performance. This study extends IRL to sub-optimal action data that include missing actions and detours. The proposed method searches for new actions to determine optimal expert action data. The method was evaluated on maze problems with sub-optimal expert action data. The experimental results show that it finds optimal expert data better than the conventional method, and that the proposed search mechanisms outperform random search.
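
To make the notion of a "detour" in expert data concrete, the following is a minimal sketch, not the paper's method. It assumes a small hypothetical grid maze and a hypothetical expert trajectory, and simply counts how many moves the trajectory spends beyond the shortest path found by breadth-first search. The names `MAZE`, `EXPERT`, `shortest_path_length`, and `redundant_steps` are illustrative only.

```python
from collections import deque

# Hypothetical 4x4 grid maze (assumption): 0 = free cell, 1 = wall.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)

# Hypothetical expert demonstration (assumption): it steps into the
# dead-end cell (0, 3) and back, which is a two-move detour.
EXPERT = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 2),
          (1, 2), (2, 2), (2, 3), (3, 3)]


def shortest_path_length(maze, start, goal):
    """Return the minimum number of moves from start to goal (BFS), or None."""
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and maze[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


def redundant_steps(maze, trajectory):
    """Count the moves a trajectory uses beyond the shortest path."""
    optimal = shortest_path_length(maze, trajectory[0], trajectory[-1])
    return (len(trajectory) - 1) - optimal


if __name__ == "__main__":
    print("shortest path:", shortest_path_length(MAZE, START, GOAL), "moves")
    print("redundant moves in demonstration:", redundant_steps(MAZE, EXPERT))
```

A demonstration with a non-zero count is sub-optimal in the sense used above. The sketch only measures that redundancy and does not estimate a reward function or search for replacement actions.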

Journal Information

Title: Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-Optimal Sequential Action Data
Authors: Fumito Uwano, Satoshi Hasegawa, and Keiki Takadama
Journal: Journal of Advanced Computational Intelligence and Intelligent Informatics
Details: Vol. 28, No. 2, pp. 380-392, 2024.

BibTeX or Download

Fumito Uwano, Satoshi Hasegawa, Keiki Takadama. Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-optimal Sequential Action Data. Journal of Advanced Computational Intelligence and Intelligent Informatics, 28(2): 380-392, 2024.
@article{uwano2024inverse,
  title={Inverse Reinforcement Learning with Agents’ Biased Exploration Based on Sub-optimal Sequential Action Data},
  author={Fumito Uwano and Satoshi Hasegawa and Keiki Takadama},
  journal={Journal of Advanced Computational Intelligence and Intelligent Informatics},
  year={2024},
  volume={28},
  number={2},
  pages={380--392},
  publisher={Fuji Technology Press Ltd.}
}