Two Choice Task Learning Model

One task we ran in both CABot2 and CABot3 was learning the correct action to choose.
We ran this task on some further data from Friedman.
Here the user was given two options and rewarded for the correct choice.
We added a random activation network and a reward network to modify the Hebbian learning.
Initially, when offered a choice, the random activation network chooses an action at random.
If it chooses correctly, that choice is strengthened.
If it chooses incorrectly, another action is chosen at random and that action is strengthened instead.
In the long run, this leads to the correct behaviour.
This is a simple form of reinforcement learning.
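
The learning loop described above can be sketched in a few lines of Python. This is only an abstract illustration, not the neural implementation: the two networks are reduced to a pair of action weights, and the function name, the learning rate, the number of trials and the bounded update rule are all assumptions made for the example.

```python
import random


def learn_two_choice(correct_action, n_trials=200, learning_rate=0.1, seed=0):
    """Hypothetical sketch of reward-modulated learning on a two-choice task.

    weights[a] is read as the probability of choosing action a.
    Equal starting weights stand in for the random activation network.
    """
    rng = random.Random(seed)
    weights = [0.5, 0.5]  # no preference yet, so the first choices are random

    for _ in range(n_trials):
        # Choose an action in proportion to its current weight.
        action = 0 if rng.random() < weights[0] else 1

        if action == correct_action:
            # Rewarded: strengthen the action that was taken.
            target = action
        else:
            # Not rewarded: pick another action at random and strengthen it.
            # With two options this is always the remaining action.
            others = [a for a in range(len(weights)) if a != action]
            target = rng.choice(others)

        # Assumed update rule: move the strengthened action toward 1 and
        # the other toward 0, so the weights always sum to 1.
        for a in range(len(weights)):
            goal = 1.0 if a == target else 0.0
            weights[a] += learning_rate * (goal - weights[a])

    return weights


if __name__ == "__main__":
    print(learn_two_choice(correct_action=1))
```

In this sketch the correct action is fixed, so every trial ends up strengthening it and its weight approaches 1. A task with probabilistic rewards would need the reward test replaced by a random draw.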
People do not reproduce the reward percentages exactly; their choice rates tend to lie closer to 50%.
This is probably related to the trade-off between exploration and exploitation.
Our system duplicates these results.