Cognitive Model of Two Choice Task
- We used reinforcement learning within the system.
- One of the problems is that if the system chooses a bad rule,
the neurons that make the choice and the neurons that encode the rule still co-fire.
- Via Hebbian learning, those connections are then strengthened anyway.
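The failure mode above can be sketched in a few lines. This is a minimal illustration, not the model's actual learning rule: plain Hebbian learning strengthens the synapse between any pre- and postsynaptic units that fire together, regardless of whether the chosen rule was good. The function name and rates here are assumptions for illustration.

```python
def hebbian(w, pre, post, eta=0.1):
    """Plain Hebb: the weight grows whenever pre and post fire together."""
    return w + eta * pre * post

# Suppose the system picks a *bad* rule: its neurons still co-fire
# with the choice neurons, so the synapse strengthens anyway.
w_choice_rule = 0.0
for _ in range(5):
    w_choice_rule = hebbian(w_choice_rule, pre=1.0, post=1.0)

print(w_choice_rule)  # 0.5 -- grows even though the rule was bad
```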

- The explore subnet stimulates the choice of an action (a). By itself
(when the precondition (s) holds), it chooses randomly
between the available actions.
- Without it, the current action persists.
- The value net inhibits explore: when a reward arrives,
exploration stops and the rewarded behaviour is reinforced.
- This mechanism works for verb learning (see the CABots)
and for the two-choice task cognitive model.
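The explore/value interaction can be sketched as a tiny simulation. This is a hedged sketch, not the actual spiking model: one state, two actions, a scalar value weight per action standing in for the value net, and a threshold standing in for its inhibition of the explore subnet. All names and constants are assumptions.

```python
import random

random.seed(0)

REWARD_ACTION = 0     # assumed: action 0 is the one that pays off
w = [0.0, 0.0]        # value-net weights: learned value of each action
eta = 0.2             # learning rate for the Hebbian-style update
threshold = 0.5       # value above this inhibits the explore subnet

action = random.choice([0, 1])   # initial action
for trial in range(50):
    reward = 1.0 if action == REWARD_ACTION else 0.0
    # chosen (co-firing) action's weight moves toward the reward
    w[action] += eta * (reward - w[action])
    if w[action] <= threshold:
        # explore subnet active: pick randomly between the actions
        action = random.choice([0, 1])
    # else: value net inhibits explore, so the rewarded action persists

print(action, [round(v, 2) for v in w])
```

Once the rewarded action's value crosses the threshold, exploration is inhibited and the same action repeats, so its weight keeps growing while the other stays flat.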

- In the figure there is a slope-1 line representing the input, i.e. how
likely a given choice is to be rewarded. Human subjects follow the solid line,
our model the dashed one.
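The slope-1 line corresponds to probability matching: choosing each option at the rate it is rewarded. A minimal sketch of how that curve can arise, assuming (this is an illustration, not the model's mechanism) an agent that samples its choice in proportion to running reward estimates:

```python
import random

random.seed(1)

p = [0.7, 0.3]      # assumed reward probabilities of the two options
est = [0.5, 0.5]    # running reward estimates
picks = [0, 0]

for t in range(20000):
    # choose option 0 with probability proportional to its estimate
    a = 0 if random.random() < est[0] / (est[0] + est[1]) else 1
    r = 1.0 if random.random() < p[a] else 0.0
    est[a] += 0.01 * (r - est[a])
    picks[a] += 1

match = picks[0] / sum(picks)
print(round(match, 2))   # close to p[0] / (p[0] + p[1]) = 0.7
```

Since the estimates converge to the true reward rates, the choice frequency tracks the relative reward probability, which traces out the slope-1 line.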