Natural Language Understanding
- GATE was originally designed for Text Extraction (more
commonly called Information Extraction), an NLU task.
- Typically, this involves taking a collection of texts
and processing them to fill in database entries.
- The Message Understanding Conferences (MUC) are a good example
of this. Their domains include Central American terrorist
events, rocket launches, and the movement of business executives
between companies. The inputs are usually newspaper articles.
- A typical system would consist of modules for
  - sentence splitting
  - lexicon
  - part of speech tagging
  - sentence parsing
  - template generation
- The lexicon, part of speech tagging, and sentence parsing are pretty
much all that is needed for the NLU bit of a conversation.
- As an aside, Huong Le Than is putting her discourse parser
into GATE and, together with the sentence splitting, lexicon, POS,
and sentence parsing modules, is building an automatic summarizer.
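
To make the module list for a typical system concrete, here is a minimal sketch of such a pipeline in plain Python. It is not GATE code and does not use GATE's API. The tiny lexicon, the LaunchTemplate schema, the noun-phrase chunker standing in for a real sentence parser, and the "X launched from Y" pattern are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon (word -> part-of-speech tag); not a GATE resource.
LEXICON = {
    "the": "DET", "a": "DET",
    "rocket": "NOUN", "satellite": "NOUN",
    "launched": "VERB", "carried": "VERB",
    "from": "PREP",
}


@dataclass
class LaunchTemplate:
    """A database-style entry filled from one sentence (hypothetical schema)."""
    vehicle: str
    site: str


def split_sentences(text):
    """Sentence splitting: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag(sentence):
    """Lexicon lookup plus POS tagging; unknown capitalized words become PROPN."""
    tagged = []
    for tok in re.findall(r"[A-Za-z0-9-]+", sentence):
        pos = LEXICON.get(tok.lower())
        if pos is None:
            pos = "PROPN" if tok[0].isupper() else "UNK"
        tagged.append((tok, pos))
    return tagged


def chunk(tagged):
    """Shallow stand-in for sentence parsing: group runs of DET/NOUN/PROPN
    tokens into noun phrases (NP); other tokens stay as single chunks."""
    chunks, current = [], []
    for tok, pos in tagged:
        if pos in {"DET", "NOUN", "PROPN"}:
            current.append(tok)
            continue
        if current:
            chunks.append(("NP", " ".join(current)))
            current = []
        chunks.append((pos, tok))
    if current:
        chunks.append(("NP", " ".join(current)))
    return chunks


def fill_template(chunks):
    """Template generation: match the pattern NP 'launched' ... 'from' NP."""
    for i, (label, text) in enumerate(chunks):
        is_launch = label == "VERB" and text.lower() == "launched"
        if not (is_launch and i > 0 and chunks[i - 1][0] == "NP"):
            continue
        for j in range(i + 1, len(chunks) - 1):
            is_from = chunks[j][0] == "PREP" and chunks[j][1].lower() == "from"
            if is_from and chunks[j + 1][0] == "NP":
                return LaunchTemplate(vehicle=chunks[i - 1][1],
                                      site=chunks[j + 1][1])
    return None


def extract(text):
    """Run every module in order and collect the templates that were filled."""
    results = []
    for sentence in split_sentences(text):
        template = fill_template(chunk(tag(sentence)))
        if template is not None:
            results.append(template)
    return results
```

Calling extract("The Ariane rocket launched from Kourou. Officials were pleased.") fills one template from the first sentence and skips the second, which does not match the pattern. A real system would replace each function with a much more capable module, but the flow of data from raw text to database entry is the same.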