Corpora: NLP and Syntax in the Classroom

Philip A. Bralich, Ph.D. (bralich@hawaii.edu)
Mon, 2 Feb 1998 22:21:17 -1000

Since the original announcement of the availability of
"BracketDoctor" to generate trees and labeled brackets in the
style of the Linguistic Data Consortium's Penn Treebank II guidelines,
several professors and students have already asked about using
the software in the classroom. This is certainly allowed, and I
would like to offer my services as a consultant to those who would
like to use the BracketDoctor as a classroom tool. I am willing to
serve as a clearinghouse for materials and to provide commentary on
what does and does not work, and why. I can also implement
improvements in the parser as gaps or errors become apparent.
Because the development of this software has been done through
private investment and it is currently proprietary and patent
pending, I cannot reveal the source code or the exact nature of
the linguistic theory that is being used, but I can provide
general discussion of the Penn Treebank guidelines and of
theoretical syntax and English syntax. If this becomes popular,
we can provide a shared space on our server for handouts, problems,
discussions, and so on. If there is sufficient interest, we
could also set up an email discussion list. Of course, this might
also provide invaluable contact with students and professors at
other universities.

The only requirement for group use is that each member of the class
or group download his or her own copy of the software. The license that
is part of the standard setup is written only for single users, so
rather than copying it from someone, it is necessary to download it
from our web site or get it by email from me.

There are also those who have begun to speculate on possible
improvements to current NLP devices using the underlying technology.
We definitely support such speculation and would encourage you to talk
to us about such possibilities. However, please be advised that this
is a copyrighted product and the parser that underlies it is patent
pending. You cannot make such developments on your own without a
license from us. Such licenses will be easy or difficult to obtain
depending on the commercial viability of the project being described,
the relative role of the parsing technology in the overall value of
the project, and the intended uses.

In addition to the obvious enhancements to database and Internet
searching, web site assistance, and dialoguing with game characters, one
of the most common comments about possible new products is that this
will likely increase the number of possible commands in speech
recognition systems from a set of a few hundred to thousands. This is
because exponential growth, which for years was a problem for NLP,
actually works in our favor: it becomes possible to ask for a file (and
hundreds of other things) in the following combinations and more.

(could/would/can/will you) (please) open/get/find/grab/take (me)
the file/document called/named // which/that is called/named //
which/that/0 I named/called manual.doc

or

send/mail/email Bob a message/email/letter/memo/fax (that/which says)
saying, "meeting at five"

Thus, because this parsing system allows all of the above variations for
every command, the number of possible commands grows exponentially:
thousands of possibilities instead of the few dozen commands that are
currently allowed, while requiring an increase in vocabulary of only a
few hundred to a few thousand words. Of course, the real advantage is
that this makes it possible for users of speech recognition
technologies to do command and control without the need to refer to a list
of fixed commands. The user can just speak as though he were talking to a
friend or a neighbor. Those of you who are already working in the area
are aware of the value and rarity of such abilities. Thus, for students
working on projects in syntax or for students looking to design projects
in NLP, the BracketDoctor can be a very useful tool, and we would like to
encourage professors and students alike to contact us for support,
discussion, and commentary.
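
As a rough illustration of the arithmetic behind this growth, the
following Python sketch counts the phrasings generated by a simplified
version of the "open the file" pattern above. The slot lists are an
approximate transcription of that pattern made for this example, and
the names OPEN_FILE_SLOTS, count_variants, and expand are hypothetical.
None of this is the BracketDoctor's grammar or code; it only shows how
a handful of small slots multiply into thousands of word strings.

```python
from itertools import product
from math import prod

# Assumed slot lists, loosely transcribed from the "open the file" pattern.
# An empty string marks an optional slot that is left out.
OPEN_FILE_SLOTS = [
    ["", "could you", "would you", "can you", "will you"],
    ["", "please"],
    ["open", "get", "find", "grab", "take"],
    ["", "me"],
    ["the file", "the document"],
    [
        "called", "named",
        "which is called", "that is called",
        "which is named", "that is named",
        "which I named", "that I named", "I named",
        "which I called", "that I called", "I called",
    ],
    ["manual.doc"],
]


def count_variants(slots):
    """Number of phrasings: the product of the number of choices per slot."""
    return prod(len(choices) for choices in slots)


def expand(slots):
    """Yield every phrasing, skipping the empty (omitted) choices."""
    for combination in product(*slots):
        yield " ".join(part for part in combination if part)


if __name__ == "__main__":
    print(count_variants(OPEN_FILE_SLOTS))  # 5 * 2 * 5 * 2 * 2 * 12 * 1 = 2400
    for phrasing in list(expand(OPEN_FILE_SLOTS))[:3]:
        print(phrasing)
```

Seven slots built from about thirty distinct words already yield 2400
phrasings of a single command. This sketch only enumerates strings, so
it says nothing about how a parser would recognize them.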

For those of you who have not received the BracketDoctor executable, it is
available for download at http://www.ergo-ling.com or by sending an email
request to me at bralich@hawaii.edu.

To save on bandwidth let me remind you that discussion of this matter
beyond this invitation may not be appropriate for the entire list, so
rather than just hitting the "reply" button, please respond to me
directly at bralich@hawaii.edu.

Sincerely,

Phil Bralich

Philip A. Bralich, Ph.D.
President and CEO
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822

Tel: (808)539-3920
Fax: (808)539-3924
