RL-POSGs: Reinforcement Learning for Zero-Sum Partially Observable Stochastic Games
Ongoing · RUG
2023–2027
Due to the omnipresence of artificial agents in society, developing algorithms that train them to act in their best interests in the face of other agents has become inevitable. In this project, we formalize the interactions among multiple artificial agents as a partially observable stochastic game (POSG). In such a setting, agents can neither see the true state of the world nor share their information with one another, a problem known as the silent coordination dilemma. This dilemma partially explains why finding an optimal solution for an infinite-horizon cooperative POSG is undecidable, why finite-horizon cooperative POSGs are hard for the class NEXP, and why non-cooperative variants are hard for the class NEXP^NP. To circumvent these negative complexity results, we adopt the central planning for decentralized control approach, which recasts POSGs into simpler games whose solutions can be transferred back to the original games. In this project, we aim to extend this approach to POSGs with competing and mixed-motive interests.
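For readers unfamiliar with the formalism, a POSG is commonly defined as the following tuple. This is the standard textbook definition, not notation specific to this project:

```latex
\[
\mathcal{G} = \langle I,\, S,\, \{A_i\}_{i \in I},\, \{O_i\}_{i \in I},\, T,\, \Omega,\, \{R_i\}_{i \in I},\, b_0 \rangle
\]
% I: finite set of agents; S: set of hidden world states
% A_i, O_i: actions and private observations of agent i
% T(s' | s, a): state transition function over joint actions a = (a_1, ..., a_n)
% \Omega(o | s', a): joint observation function
% R_i: reward function of agent i; b_0: initial state distribution
```

In the two-player zero-sum case studied here, $R_1 = -R_2$; in the fully cooperative case (a Dec-POMDP), all agents share a single reward function $R$. Because each agent acts only on its own observation history, solution methods must reason over these histories rather than over the hidden state.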
Team
- Jilles Dibangoye — PI
- Matthia Sabatelli — Co-PI
- Erwan Escudie — PhD Student