Corpora with Semantics

martym@cs.utexas.edu
Tue, 24 Dec 1996 05:32:49 -0600

Hello,

I'd appreciate it if anyone could point out some corpora that have
semantic information contained within them, preferably deeper than
case-role analyses, although a substantial corpus with just this
would do well as a starting point. The semantic information need
not be explicit; it could take the form of query targets or frames,
for example. However, the more explicit the better. As more concrete
examples of what I'm looking for, I understand that the most recent
ATIS 3 corpus provides a measure of semantics (per Miller et al.'s
paper "A Fully Statistical Approach to Natural Language Interfaces"),
as does an older tagging of Hemingway's "The Old Man and the Sea",
which I've been unable to find (I'm told it's referenced in Gary
Cottrell's dissertation).
Any pointers to corpora with both syntactic and semantic tagging
would be extremely helpful.

Another, more modest, request is for an unabridged wordlist of the
English language (preferably based on the OED), with part-of-speech
tagging, so that I can apply morphological rules to generate a
(supposedly) exhaustive list of words for another project I'm working
on.

Thanks in advance for any help in these searches.

Marty Mayberry
University of Texas at Austin
Neural Nets Research Group, Computer Science Department
martym@cs.utexas.edu