Re: Corpora: NLP and Syntax in the Classroom

Manaris Bill Z (bzm3402@bp.ucs.usl.edu)
Tue, 3 Feb 1998 12:21:33 -0600 (CST)

Greetings to all,

I apologize for using valuable bandwidth, but I have a minor problem with
the continued announcements regarding this "patent-pending" tool.

I do not doubt that this tool may be useful to some. Many NLP tools are.
However, based on the verbiage used to promote the tool (quoted below),
this novel approach sounds all too familiar. I am afraid there is nothing
new about a system that takes advantage of synonymy in its linguistic model.
Burton's SOPHIE did this in the 1970s. A major issue is overgeneration,
i.e., dealing with input that the system recognizes when it shouldn't; for
instance, given the example (of a rule?) provided in the announcement:

send/mail/email Bob a message/email/letter/memo/fax (that/which says)
saying, "meeting at five"

the system would accept the input:

email Bob a email which says saying, "meeting at five"

which makes no sense; even worse, this "overgenerated" sentence could
actually mean something completely different from what the system developer
had intended. Put that in a command-and-control situation and you have a
serious problem.
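
To make the overgeneration concrete, here is a minimal sketch in Python. It
assumes the quoted rule is read literally as a sequence of slots, with the
parenthesized part optional. The RULE encoding and the expand() helper are
hypothetical illustrations of mine, not part of the announced tool.

```python
from itertools import product

# Hypothetical encoding of the quoted rule: each slot lists its alternatives,
# and an empty string marks an optional slot that may be skipped.
RULE = [
    ["send", "mail", "email"],
    ["Bob"],
    ["a"],
    ["message", "email", "letter", "memo", "fax"],
    ["", "that says", "which says"],
    ['saying, "meeting at five"'],
]


def expand(rule):
    """Yield every sentence the rule licenses, one per combination of slot choices."""
    for choice in product(*rule):
        yield " ".join(part for part in choice if part)


sentences = set(expand(RULE))

# 3 verbs * 5 nouns * 3 choices for the optional slot = 45 sentences.
print(len(sentences))

# The nonsensical input from above is among them.
print('email Bob a email which says saying, "meeting at five"' in sentences)
```

Under this literal reading, nothing in the rule ties the choice in one slot to
the choice in another, which is exactly why the bad combinations slip through.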

Actually, there have been several papers that describe such an approach to
language modeling/parsing (other than Burton's), so I am somewhat skeptical
with respect to the novelty of the approach (at least based on the
examples/discussion provided).

Again, I do not doubt that the system may be useful in certain domains.

Sincerely,
Bill Manaris

--
Bill Manaris, Ph.D.                      | Office : (318) 482-6638           
Computer Science Department              | Fax    : (318) 482-5791           
Univ. of SW Louisiana, P.O. Box 41771    | E-mail : manaris@usl.edu          
2 Rex St., Lafayette, LA 70504-1771, USA | WWW    : http://www.usl.edu/~manaris

> To: corpora@lists.uib.no
> From: "Philip A. Bralich, Ph.D." <bralich@hawaii.edu>
> Subject: Corpora: NLP and Syntax in the Classroom
> Date: Mon, 2 Feb 1998 22:21:17 -1000
> Resent-Date: Tue, 3 Feb 1998 09:21:42 +0100
> Resent-From: corpora-request@lists.uib.no
>
> Since the original announcement of the availability of
> "BracketDoctor" to generate trees and labeled brackets in the
> style of Linguistic Data Consortium's Penn Treebank II guidelines,
> there have already been requests by several professors and
> students about the use of the software in the classroom.
[cut]
> In addition to the obvious enhancements to database and Internet
> searching, web site assistance, and dialoging with game characters, one
> of the most common comments about possible new products is that this
> will likely increase the number of possible commands in speech rec
> systems from a set of a few hundred to thousands. This is because
> exponential growth, which for years was a problem for NLP actually works
> in our favor where it is possible to ask for a file (and hundreds of
> other things) in these and more combinations.
>
> (could/would/can/will you) (please) open/get/find/grab/take (me)
> the file/document called/named//which/that is called/
> named//which/that/0 I named/called manual.doc
>
> or
>
> send/mail/email Bob a message/email/letter/memo/fax (that/which says)
> saying, "meeting at five"
>
> Thus, the fact that this parsing system allows all the above variations for
> all possible commands leads to an exponential growth in the number of
> possible commands to allow thousands of possibilities over a few dozen
> commands that are currently allowed while only requiring an increase in
> vocabulary of a few hundred to a few thousand words. Of course, the real
> advantage though is that this makes it possible for users of speech rec
> technologies to do command and control without the need to refer to a list
> of fixed commands. The user can just speak as though he were talking to a
> friend or a neighbor. Those of you who are already working in the area
> are aware of the value and rarity of such abilities. Thus, for students
> working on projects in syntax or for students looking to design projects
> in NLP, the BracketDoctor can be a very useful tool, and we would like to
> encourage professors and students alike to contact us for support,
> discussion, and commentary.
[cut]