Two Choice Model

We set up a system with four subnets.
The input was the Ss, each an orthogonal CA. The two choice model itself had just 2, but we also ran it with more inputs and outputs.
The output was the As, again orthogonal CAs.
Initially the weights were random, so the system would guess.
The guess used the explore net, which was randomly connected to the As.
The system then got environmental feedback from the value net.
If the feedback was good, it would suppress the explore net and the correct response would be reinforced.
If the feedback was not good, the explore net was not suppressed, and a new response would be generated when the first CA fatigued.
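
The loop described above can be sketched abstractly as follows. This is only a toy illustration, not the neural simulation itself: it replaces the CAs with table indices, the explore net with a small random drive, fatigue with a set of already-tried As, and the value net with a lookup in a hypothetical reward mapping. The names train, respond, mapping and lr, and all the numeric constants, are assumptions made for the example.

```python
import random


def train(mapping, n_actions, trials=200, lr=0.5, seed=0):
    """Toy S -> A learner. mapping[s] is the rewarded A for stimulus s
    (a hypothetical task definition, not taken from the original model)."""
    rng = random.Random(seed)
    n_stimuli = len(mapping)
    # Small random initial weights, so the first responses are guesses.
    w = [[rng.uniform(0.0, 0.1) for _ in range(n_actions)]
         for _ in range(n_stimuli)]
    for _ in range(trials):
        s = rng.randrange(n_stimuli)
        fatigued = set()  # As already tried on this trial
        while True:
            candidates = [a for a in range(n_actions) if a not in fatigued]
            # Stand-in for the explore net: random drive added to each A.
            a = max(candidates, key=lambda k: w[s][k] + rng.uniform(0.0, 0.1))
            if a == mapping[s]:
                # Stand-in for good feedback from the value net:
                # exploration stops and the S -> A link is reinforced.
                w[s][a] += lr
                break
            # Feedback not good: the active A fatigues and exploration
            # continues, so a different A is produced next.
            fatigued.add(a)
    return w


def respond(w, s):
    """Response to stimulus s with no exploratory drive."""
    return max(range(len(w[s])), key=lambda a: w[s][a])


if __name__ == "__main__":
    mapping = [1, 0]  # two choice case: S0 -> A1, S1 -> A0
    w = train(mapping, n_actions=2)
    print([respond(w, s) for s in range(len(mapping))])
```

The same functions accept longer mappings, which corresponds to running the system with more inputs and outputs. The reinforcement step lr is deliberately larger than the random drive, so a rewarded response dominates later guesses.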