FV-POMCPs — Jemoka Knowledge Base

Main problem: the number of joint actions and joint observations grows exponentially with the number of agents.

Solution: sample-based online planning for multiagent systems. We do this with the factored-value POMCP (FV-POMCP), which relies on two ideas:

factored statistics: reduces the number of joint actions (through action-selection statistics)
factored trees: reduces the number of histories

Multiagent Definition

I: set of agents
S: set of states
A_{i}: set of actions for each agent i
T: state transitions
R: reward function
Z_{i}: set of observations for each agent i
O: observation probabilities

Coordination Graphs

A coordination graph encodes how agents influence each other. You can use sum-product elimination to reduce the Bayesian network of the agents' coordination graph.

Mixture of Experts

Directly search for the best joint action; it is computed by maximum likelihood estimation (MLE) of the total value.
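To make the joint-action search concrete, here is a minimal sketch of picking the best joint action when the total value is a sum of pairwise factors on a coordination graph. It is an illustration under stated assumptions, not the FV-POMCP implementation: the graph is assumed to be a chain (agent i only shares a factor with agent i+1), the factor tables are made-up numbers, and the function names `brute_force_best` and `chain_elimination_best` are hypothetical. The elimination shown is the max-sum variant of variable elimination, since the goal here is an argmax rather than a marginal.

```python
from itertools import product


def brute_force_best(n_agents, n_actions, factors):
    """Enumerate every joint action: cost grows as n_actions ** n_agents."""
    best, best_val = None, float("-inf")
    for joint in product(range(n_actions), repeat=n_agents):
        # Total value is the sum of the local factor values.
        val = sum(q[joint[i]][joint[j]] for (i, j), q in factors.items())
        if val > best_val:
            best, best_val = joint, val
    return best, best_val


def chain_elimination_best(n_agents, n_actions, factors):
    """Max-sum elimination on a chain; factors are keyed (i, i + 1).

    Cost grows as n_agents * n_actions ** 2 instead of exponentially.
    """
    # msg[a]: best value achievable by the already-eliminated agents,
    # given that the current agent plays action a.
    msg = [0.0] * n_actions
    back = []  # back[i][a_next]: best action of agent i given agent i+1's action
    for i in range(n_agents - 1):
        q = factors[(i, i + 1)]
        new_msg, choice = [], []
        for a_next in range(n_actions):
            vals = [msg[a] + q[a][a_next] for a in range(n_actions)]
            a_best = max(range(n_actions), key=vals.__getitem__)
            new_msg.append(vals[a_best])
            choice.append(a_best)
        msg = new_msg
        back.append(choice)
    # Pick the last agent's action, then walk back to recover the rest.
    last = max(range(n_actions), key=msg.__getitem__)
    joint = [last]
    for choice in reversed(back):
        joint.append(choice[joint[-1]])
    joint.reverse()
    return tuple(joint), msg[last]


# Hypothetical local value tables for 3 agents with 2 actions each.
factors = {
    (0, 1): [[1.0, 0.0], [0.0, 2.0]],
    (1, 2): [[0.5, 3.0], [1.0, 0.0]],
}
print(brute_force_best(3, 2, factors))
print(chain_elimination_best(3, 2, factors))
```

Both functions return the same joint action and value on this example; the difference is that the elimination never enumerates the full joint action space, which is the point of factoring the value over the coordination graph.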