Reward design in multi-agent systems using successor features and multi-information source Bayesian optimization

  • Park, Kyeonghyeon
  • Concha, David Molina
  • Lee, Hyun-Rok
  • Lee, Taesik
  • Lee, Chi-Guhn
Citations

  • Web of Science: 2
  • Scopus: 1

Abstract

Coordinating self-interested agents in multi-agent systems to achieve system-level objectives presents significant challenges due to the inherent misalignment between individual and collective goals. Mechanism design offers a solution by employing a bi-level optimization framework, where a designer agent intervenes in the reward structures to incentivize desired behaviors among self-interested agents. However, a major obstacle in reward optimization lies in solving multi-agent reinforcement learning problems given a reward structure. This paper addresses this challenge by introducing a novel algorithm that leverages successor features (SFs) at both levels of the optimization. Specifically, SFs help reduce the number of design iterations at the upper level by using previously learned equilibria as biased information sources and accelerate equilibrium learning at the lower level by transferring equilibria from previously solved Markov games. This innovative approach leads to significant computational savings, making the process up to ten times faster compared to traditional methods.
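
As a rough illustration of why successor features make reward redesign cheap, the sketch below uses the standard single-agent identity: if the reward is linear in features, r(s) = phi(s) · w, then the value of a fixed policy is V(s) = psi(s) · w, where psi is the discounted sum of expected future features. This is not the paper's algorithm (which involves Markov games, mean-field reinforcement learning, and multi-information source Bayesian optimization); the toy transition matrix, feature table, and all function names are hypothetical and chosen only for illustration.

```python
# Minimal sketch, assuming a tabular single-agent setting with a FIXED policy.
# P, Phi, and the weight vectors below are made-up toy values.

def successor_features(P, Phi, gamma, iters=500):
    """Iterate psi <- Phi + gamma * P @ psi until (numerically) converged."""
    n, d = len(Phi), len(Phi[0])
    psi = [row[:] for row in Phi]
    for _ in range(iters):
        psi = [[Phi[s][k] + gamma * sum(P[s][t] * psi[t][k] for t in range(n))
                for k in range(d)]
               for s in range(n)]
    return psi

def value_from_sf(psi, w):
    """Evaluate the policy under reward weights w with one dot product per state."""
    return [sum(p * x for p, x in zip(row, w)) for row in psi]

def value_direct(P, Phi, w, gamma, iters=500):
    """Reference: ordinary policy evaluation with reward r(s) = Phi[s] . w."""
    n = len(Phi)
    r = [sum(f * x for f, x in zip(Phi[s], w)) for s in range(n)]
    v = [0.0] * n
    for _ in range(iters):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v

# State-to-state transition matrix induced by the fixed policy (rows sum to 1).
P = [[0.5, 0.5, 0.0],
     [0.1, 0.6, 0.3],
     [0.0, 0.2, 0.8]]
# Two reward features per state.
Phi = [[1.0, 0.0],
       [0.0, 1.0],
       [0.5, 0.5]]
gamma = 0.9

psi = successor_features(P, Phi, gamma)   # computed once

w_old = [1.0, 0.0]    # one candidate reward design
w_new = [0.2, 1.5]    # a redesigned reward
v_old = value_from_sf(psi, w_old)         # no re-learning needed
v_new = value_from_sf(psi, w_new)
v_check = value_direct(P, Phi, w_new, gamma)
```

Here psi is computed once and reused for every candidate weight vector, which is the basic reason a previously learned solution can serve as a cheap (though, once the policy is no longer optimal for the new reward, biased) estimate under a different reward design.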

Keywords

  • Reward design
  • Multi-information source Bayesian optimization
  • Mean-field reinforcement learning
  • Transfer learning
  • Successor feature
  • INEFFICIENCY
DOI
10.1007/s13042-025-02622-z
Publication date
2025-04-18
Type
Article
Journal
International Journal of Machine Learning and Cybernetics, vol. 16, no. 9
Pages
6249–6270