Pcgd_arxiv
Joint work with Jeffrey, Alistair, Yuanyuan, and Anima is now on the arXiv. We extend competitive gradient descent to an arbitrary number of agents, prove its local convergence, and show that it learns significantly more performant policies when used for multi-agent reinforcement learning.