Create an approximation of the value function U_{\phi} using Approximate Value Function, and use Policy Gradient to optimize a Monte Carlo tree search policy.
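As a rough illustration of the value-approximation half only, the sketch below fits a linear U_{\phi} to sampled returns by gradient descent on squared error. The feature map, the toy data, and the names `features`, `u_phi`, and `fit_value_function` are assumptions made for this example, not taken from the source, and the policy-gradient and tree-search steps are not shown.

```python
import random

def features(s):
    # Hypothetical feature map for a scalar state: [1, s, s^2].
    return [1.0, s, s * s]

def u_phi(phi, s):
    # Linear approximation U_phi(s) = phi . features(s).
    return sum(p * f for p, f in zip(phi, features(s)))

def fit_value_function(samples, lr=0.2, epochs=2000):
    # Batch gradient descent on the mean squared error between
    # U_phi(s) and the sampled return for each state s.
    phi = [0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for s, ret in samples:
            err = u_phi(phi, s) - ret
            for i, f in enumerate(features(s)):
                grad[i] += 2.0 * err * f / n
        phi = [p - lr * g for p, g in zip(phi, grad)]
    return phi

# Toy data (assumed): noiseless returns 1 - s^2 for states in [-1, 1].
rng = random.Random(0)
states = [rng.uniform(-1.0, 1.0) for _ in range(50)]
samples = [(s, 1.0 - s * s) for s in states]
phi = fit_value_function(samples)
```

In a fuller pipeline, a fitted U_{\phi} of this kind would typically serve as the leaf-value estimate inside the tree search, with the policy parameters tuned separately by a gradient method.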