Results
- CABot1 is a fairly sophisticated system, so it is difficult to
evaluate.
- Moreover, each subnet is generated with some randomness, so it
performs differently from run to run.
- Unit testing shows that, on the best nets, parsing succeeds over
99% of the time on a set of 23 sentences.
- The whole system can also be evaluated.
- There are four types of commands.
- The simplest are direct commands (Turn left. Move forward.). The
best nets emit the correct commands over 90% of the time and the
average is around 80%.
- The compound commands (Go left. This means left then forward.)
are successful around 85% of the time on the best nets and 75%
on average.
- The one-step context-sensitive commands (Turn toward the
stalactite/pyramid.) are successful around 75% of the time with the
best nets and 65% on average.
- The multi-step context-sensitive commands (Go to the pyramid.) are
successful 50% of the time in the best case and 35% on average (with
some nets always failing).
- The measurements for the context-sensitive commands include failure
conditions.
- In short, it works, but is far from perfect.
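
The "best nets" and "average" figures above can be read as two summaries of per-net success rates. The following is a minimal sketch of that bookkeeping, not the evaluation code actually used for CABot1: the function names, the net labels, and the trial outcomes are all made up for illustration.

```python
from statistics import mean


def success_rate(outcomes):
    """Fraction of trials that succeeded (outcomes is a list of booleans)."""
    return sum(outcomes) / len(outcomes)


def summarize(per_net_outcomes):
    """Return (best, average) success rate across all nets."""
    rates = [success_rate(o) for o in per_net_outcomes.values()]
    return max(rates), mean(rates)


# Hypothetical outcomes for three randomly generated nets on one command type.
# These values are illustrative only and are not measured results.
trials = {
    "net_a": [True, True, True, False],
    "net_b": [True, False, True, False],
    "net_c": [False, False, False, False],  # a net that always fails
}

best, average = summarize(trials)
print(f"best={best:.2f} average={average:.2f}")
```

A net that always fails, like the hypothetical net_c, pulls the average down without affecting the best-case figure, which is one way the gap between the two numbers can arise.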