Past Work
- self play: this is a \text{coNP} vs \text{NP} problem: whereas competitive self-play must defend against all strategies, collaborative self-play only needs to find one useful strategy; this doesn't generalize well to humans, because a human is not a partner the agent trained with
- behavior cloning:
- Population Based Training: computationally super expensive

Novelty
- instead, learn a generative model from either simulated agents or human data
- then, sample from this generative model

Notable Methods

Key Figs

New Concepts

Notes
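A toy sketch of the Novelty idea (fit a generative model to observed partners, then sample new partners from it). Everything here is an assumption for illustration: the notes do not say what kind of generative model is used, so this stands in a diagonal Gaussian over hypothetical partner parameter vectors, and the names `fit_diagonal_gaussian`, `sample_partner`, and `observed_partners` are made up.

```python
import random
import statistics


def fit_diagonal_gaussian(partners):
    """Fit an independent Gaussian to each coordinate of the partner vectors.

    `partners` is a list of equal-length parameter vectors, one per observed
    partner (simulated agent or human). Returns (means, stds).
    """
    columns = list(zip(*partners))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def sample_partner(model, rng):
    """Draw one new partner parameter vector from the fitted model."""
    means, stds = model
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


# Hypothetical data: each row summarizes one observed partner's behavior,
# e.g. how often it picks each of two roles in a cooperative task.
observed_partners = [
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
    [0.1, 0.9],
]

model = fit_diagonal_gaussian(observed_partners)
rng = random.Random(0)  # seeded so the sketch is repeatable

# Sampled partners could then serve as training partners for the agent.
new_partners = [sample_partner(model, rng) for _ in range(5)]
```

A diagonal Gaussian ignores correlations between coordinates, so it is only a placeholder for whatever richer model the paper actually trains.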
