The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games
by Chao Yu, …, Eugene Vinitsky, et al.
- The paper doesn't propose a new algorithm, but provides empirical evidence of the usefulness of PPO in MARL
- PPO is overlooked in MARL because people think it would be sample inefficient
- But the authors found that it was sample efficient as well as producing good results
- They use the usual modifications to PPO, such as GAE (Generalized Advantage Estimation), along with MARL-specific modifications such as Death Masking (pg. 14)
- Limitation: the experiments were only done on environments with (pg. 10):
- discrete action space
- collaborative problems
- homogeneous agents
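
For reference, the GAE modification mentioned above can be sketched as follows. This is a generic, minimal single-trajectory version of Generalized Advantage Estimation, not the authors' implementation; the function name `compute_gae`, its argument layout, and the default `gamma` and `lam` values are assumptions made for illustration.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Return GAE advantages for one trajectory (illustrative sketch only).

    rewards[t], values[t], dones[t] describe step t for t = 0..T-1.
    last_value is the value estimate used to bootstrap after the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; no bootstrapping across an episode boundary.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        # Exponentially weighted sum of TD errors, reset at episode ends.
        running = delta + gamma * lam * not_done * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return minus the value baseline, so `lam` trades bias against variance.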