Improvements to the Turing Test
- I am really fond of the Turing test, but it has some methodological
problems.
- First of all, the statistics are weak. A single judge mistaking a
single machine for a human does not make that machine intelligent.
- This could be made statistically significant. The test could
be run repeatedly with different judges (and human contestants).
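- Guessing is allowed, so a judge who cannot tell the contestants
apart will still be right about half the time. To pass, the machine
would have to hold the judges to near-chance accuracy.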
- Similarly, the judges need to be screened. Otherwise I could
recruit three-year-olds, or people who cannot read, as the judges.
- Another problem would be language. All three participants (the
judge and both contestants) need to use the same language. Perhaps
they need to be native speakers.
- If I were a judge, and both contestants were typing in Chinese,
I would just have to guess.
- Of course, for now this is not a problem. No system comes close,
so we are not currently concerned with these details.
- The open-ended duration is not a weakness. It seems like a judge
should be able to spend as much time conversing as he
likes to make up his mind.
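
The statistical point above (repeat the test with many judges and compare their accuracy to chance) can be sketched with an exact binomial calculation. This is only an illustration under assumptions that are not in the notes: each trial is one independent judge verdict, a guessing judge is right with probability 0.5, and the function name `p_value_at_least` and the trial counts are hypothetical.

```python
from math import comb


def p_value_at_least(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting at least `correct` right answers out of
    `trials` if every judge were purely guessing (assumed chance = 0.5)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# One judge, one verdict: a guesser is right this often anyway.
print(p_value_at_least(1, 1))  # 0.5

# Hypothetical counts, for illustration only: 60 and 80 correct
# identifications out of 100 independent trials.
print(p_value_at_least(60, 100))
print(p_value_at_least(80, 100))
```

A small value means the judges are identifying the machine more often than guessing would explain. A large value only means the data are consistent with guessing; it does not by itself prove the machine is indistinguishable, which is one more reason to run many trials.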