What if you don’t know the probability of success? Beta Distribution time!

Multi-Arm Bandit

See Multi-Arm Bandit.

Strategies:

- Upper confidence bound: take the action with the highest n-th confidence bound.
- Posterior sampling: take a sample from each action's Beta distribution; take the action that has a higher probability of success based on their r
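
The two strategies above can be sketched roughly as follows, using only Python's standard library. This is a minimal illustration, not a definitive implementation. It assumes each action pays out success or failure, and that every action starts from a uniform Beta(1, 1) prior (my assumption, the note does not name a prior). The function names, the 0.95 quantile, and the Monte Carlo estimate of the upper bound are all hypothetical choices made for the example.

```python
import random


def posterior_sample_choice(successes, failures, rng):
    """Posterior sampling: draw one sample per action, pick the largest."""
    # Assumed prior: Beta(1, 1), so the posterior is Beta(s + 1, f + 1).
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)


def upper_bound_choice(successes, failures, rng, quantile=0.95, draws=2000):
    """Upper confidence bound: pick the action with the highest upper quantile.

    The quantile of each Beta posterior is estimated by sorting random
    draws, which avoids needing an inverse CDF from a third-party library.
    """
    bounds = []
    for s, f in zip(successes, failures):
        xs = sorted(rng.betavariate(s + 1, f + 1) for _ in range(draws))
        bounds.append(xs[int(quantile * (draws - 1))])
    return max(range(len(bounds)), key=bounds.__getitem__)


def run_bandit(true_rates, rounds, choose, seed=0):
    """Simulate a bandit whose actions succeed with the given hidden rates."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(rounds):
        arm = choose(successes, failures, rng)
        # Observe a success or failure and update that action's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_bandit([0.2, 0.8], 500, posterior_sample_choice)
    print("pulls per action:", [a + b for a, b in zip(s, f)])
```

With either strategy, the pulls should concentrate on the action with the higher hidden success rate as the counts accumulate, because a poorly performing action's Beta posterior shifts toward low values and it is chosen less often.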
