Rules and Rule Learning
- In some sense, a rule is just a simple
if conditional.
- If CA1 and CA2 are on, then do CA3.
- However, for a bit more sophistication, variable binding is needed.
- A standard grammatical rule is NP + VP -> VP, where the NP is the
actor (e.g., "I saw").
- This requires at least two variables (one for the NP and one
for the VP).
- We did a simple form of this in our counting paper at
ICCM in 2006, but there we used binding via long-term potentiation (LTP).
- The first stack-based parser from CABot does the latter task.
- So, we can do rules in this system.
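
The two kinds of rule above can be sketched symbolically. This is only an illustration, not the neural implementation: the function names and the dictionary fields (`cat`, `head`, `verb`, `actor`) are assumptions made for this example.

```python
def simple_rule(active):
    """If CA1 and CA2 are on, turn CA3 on (no variables needed)."""
    if {"CA1", "CA2"} <= active:
        return active | {"CA3"}
    return active


def np_vp_rule(np, vp):
    """NP + VP -> VP, with the NP bound as the actor of the VP.

    The arguments np and vp are the two variables the rule needs.
    """
    return {"cat": "VP", "verb": vp["verb"], "actor": np["head"]}


# "I saw": the NP "I" becomes the actor of the VP "saw".
parsed = np_vp_rule({"cat": "NP", "head": "I"}, {"cat": "VP", "verb": "saw"})
```

The first rule only tests fixed items, while the second works for any NP and VP, which is what variable binding buys.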
- However, as in processing, the real issue is how the system
learns rules.
- We presented a paper at ECAI this summer about a system that learns
goal-action pairs.
- It makes use of environmental feedback in a kind of semi-supervised
learning.
- So, a goal is set.
- The system then selects an action.
- If the action results in the goal being fulfilled, the system reinforces
the action.
- If not, it selects another action.
- This works, but has a problem with the bucket brigade.
- It is a neural implementation of my colleague Roman Belavkin's
theoretical work on optimal learning. To some extent, this solves
the exploration-exploitation dilemma.
- We also need to figure out how to incorporate variables into the
learning.
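
The goal-action learning loop described above (set a goal, select an action, reinforce on success, otherwise try another action) can be sketched as follows. This is a minimal symbolic sketch, not the neural system from the ECAI paper: the strength table, the `achieves` feedback function, and selection in proportion to strength are all assumptions made for this example, and the selection scheme is a simple stand-in rather than Belavkin's method.

```python
import random


def learn_goal_action_pairs(goals, actions, achieves, trials=200, seed=0):
    """Learn which action fulfils which goal from environmental feedback.

    achieves(goal, action) -> bool is a hypothetical stand-in for the
    environment telling the system whether the goal was fulfilled.
    """
    rng = random.Random(seed)
    # Every goal-action pair starts with the same strength.
    strength = {(g, a): 1.0 for g in goals for a in actions}
    for _ in range(trials):
        goal = rng.choice(goals)            # a goal is set
        tried = set()
        while len(tried) < len(actions):
            candidates = [a for a in actions if a not in tried]
            weights = [strength[(goal, a)] for a in candidates]
            # Stronger pairs are chosen more often, weaker ones still get tried.
            action = rng.choices(candidates, weights=weights)[0]
            if achieves(goal, action):
                strength[(goal, action)] += 1.0   # reinforce the action
                break
            tried.add(action)               # otherwise select another action
    return strength


goals = ["eat", "drink"]
actions = ["bite", "sip", "wait"]
correct = {("eat", "bite"), ("drink", "sip")}
strength = learn_goal_action_pairs(goals, actions, lambda g, a: (g, a) in correct)
```

In this sketch each action is rewarded immediately, so there is no chain of actions over which credit must be passed back, and the goals and actions are fixed symbols rather than variables.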