Armed Bandits
- In AI and decision theory, armed bandits are gambling devices modelled on slot machines (the "one-armed bandit").
- A one-armed bandit is typically described by the probability of a positive payout, with every other outcome paying 0.
- For instance, you might have a 10% chance of winning £100 and a 90% chance of losing your bet (see the worked expected value after this list).
- Often people are asked what they'd bet to play this bandit.
- The key is that the experimenter can set the probabilities, or arrange the payouts explicitly.
- You can use this to show how people don't choose rationally (Kahneman).
- There is also a lot of work with multi-armed bandits (with K or N arms).
- Part of the question here is: if you can play over and over again, how do you discover which arm to play?
- If you pick the right bandit, you can maximise your expected return.
- This involves the exploration-exploitation dilemma (see the epsilon-greedy sketch after this list).
- Moreover, the probabilities can change over time (a non-stationary bandit); see the constant step-size note after this list.
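
A worked expected value for the single-bandit example above, under the assumed reading that a win pays £100 and a loss forfeits the stake b:

```latex
\mathbb{E}[\text{return}] = 0.1 \times 100 - 0.9 \times b
```

So playing is worthwhile whenever b < 100/9, roughly £11.11. Under the alternative reading where the stake is paid regardless of the outcome, the expectation is 10 - b and the break-even stake is £10.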
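One standard way to handle the exploration-exploitation dilemma is an epsilon-greedy strategy: mostly pull the arm with the best estimated value, but occasionally pull a random arm to keep learning about the others. The sketch below is illustrative only; the function name, the £100 payout, and the example arm probabilities are assumptions, not something from these notes.

```python
import random

def epsilon_greedy_bandit(true_probs, payout=100.0, steps=1000, epsilon=0.1, seed=0):
    """Play a K-armed Bernoulli bandit with an epsilon-greedy policy.

    true_probs: hypothetical per-arm win probabilities (unknown to the agent).
    """
    rng = random.Random(seed)
    k = len(true_probs)
    q = [0.0] * k   # estimated value of each arm
    n = [0] * k     # number of pulls per arm
    total = 0.0
    for _ in range(steps):
        # Exploration vs exploitation: with probability epsilon pick a random
        # arm, otherwise pick the arm that currently looks best.
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: q[a])
        reward = payout if rng.random() < true_probs[arm] else 0.0
        n[arm] += 1
        q[arm] += (reward - q[arm]) / n[arm]   # incremental sample average
        total += reward
    return q, total

# Example: three arms with win probabilities 5%, 10%, and 2%.
estimates, total_reward = epsilon_greedy_bandit([0.05, 0.10, 0.02])
print(estimates, total_reward)
```

Here epsilon controls how often the agent explores; the sample-average estimates converge to the true arm values as long as the payout probabilities stay fixed.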
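When the probabilities drift over time, the sample-average update above reacts too slowly. A common adjustment is a constant step size, sketched below; the name and the choice of alpha = 0.1 are assumptions for illustration.

```python
def update_nonstationary(q, reward, alpha=0.1):
    # Constant step size weights recent rewards more heavily, so the
    # estimate can track a payout probability that changes over time.
    return q + alpha * (reward - q)
```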