Re: Corpora: Parser analysis

Philip A. Bralich, Ph.D. (bralich@hawaii.edu)
Thu, 19 Feb 1998 09:51:59 -1000

At 06:51 AM 2/19/98 -1000, David Coniam wrote:
>Some interesting discussion has been taking place over Ego’s parser.
>
>I thought I’d check out a few sentences to see how the parsers cope with
>real, but not too awkward, sentences (as one might expect, the parsers
>cope fine with sentences such as “Time flies like an arrow”!!)

But other parsers get only a part-of-speech analysis of such sentences,
while we get a full analysis, including the ability to manipulate the
sentences.

>I mentioned a while ago that I’d been looking at speech recognition
>technology – my specific aim here was to see how the technology might be
>used with ESL students. (In short, I don’t think it yet can, but that’s
>another story.)
>
>In a small study with 10 very fluent ESL students, I’ve been getting
>subjects to read 1,000 words of Arthur C. Clarke’s 2001 into Dragon
>Naturally Speaking and then analysing the output. So I thought a few
>sentences from this source might be worth trying with the Ego parser. I’ve
>been using the English Sentence Enhancer, which, we’ve been informed by Dr
>Bralich, is for ESL students. The parsing algorithms will, I imagine, be
>similar to the other tools in the suite.

As I stated in the announcement of the ESL grammar checker, the program
represents six months of an 18-month development and it is designed
for Junior High School students in Japan. The main point of the early
release is to get serious researchers in this area to take a look and
work with the new functions we offer that other grammar checkers do not,
specifically the ability to transform one structure type into another.

>First a couple of sentences from the original – fairly short ones.
>
You might want to get copies of the textbooks from Japanese Junior
High Schools, as they are the basis of the dictionary.

LOOK CLOSELY AT THE FOLLOWING. This may not be 90% of what is possible
in principle but it is 1000% beyond anything currently available. Believe
me when I say that stockholders LOVE IT when you show you can do something
a thousand times better than the competition.

>Parts of Speech
>"It" is a pronoun
>"was" is a verb
>"mounted" is an adjective
>"like" is a preposition
>"a" is an indefinite article
>"gunsight" is a noun
>"on" is a preposition
>"the" is a definite article
>"rim" is a noun
>"of" is a preposition
>"the" is a definite article
>"ship's" is a noun
>"long-range" is a noun
>"antenna" is a noun
>"and" is a coordinate conjunction
>"checked" is a verb
>"that" is a complementizer
>"the" is a definite article
>"great" is an adjective
>"parabolic" is an adjective
>"bowl" is a noun
>"was" is a verb
>"rigidly" is an adverb
>"locked" is an adjective
>"upon" is a preposition
>"its" is a pronoun
>"distant" is an adjective
>"target" is a noun
>
>Parts of Sentence
>"It" is the subject of the verb "was checked"
>"mounted like a gunsight on the rim of the ship's long-range antenna" is
>the complement of the verb "was checked"
>"a gunsight on the rim of the ship's long-range antenna" is object of the
>preposition "like"
>"the rim of the ship's long-range antenna" is object of the preposition "on"
>"the ship's long-range antenna" is object of the preposition "of"
>"that the great parabolic bowl was rigidly locked upon its distant target"
>is a direct object of the verb "was checked"
>"the great parabolic bowl" is the subject of the verb "was"
>"rigidly locked" is the complement of the verb "was"
>"its distant target" is object of the preposition "upon"
>
>Sentence Type
>This is a statement.
>
>Tense and Voice
>Active Simple Past
>
>Simple/Compound/Complex
>This sentence is simple.
>
>Statement to Question
>Y/N Question
>was It mounted like a gunsight on the rim of the ship's long-range antenna
>and checked that the great parabolic bowl was rigidly locked upon its
>distant target
>WH Question
>What was mounted like a gunsight on the rim of the ship's long-range
>antenna and checked that the great parabolic bowl was rigidly locked upon
>its distant target
>What was It mounted like a gunsight on the rim of the ship's long-range
>antenna and checked
>
>Question to Statement
>Question to Statement not necessary
>
>Active to Passive
>Active to Passive not possible.
>
>Change Tense
[snip]
>(4) "mounted like a gunsight on the rim of the ship's long-range antenna"
>is the complement of the verb "was checked"
>
>- hmm.
>
>(5) Simple/Compound/Complex
>This sentence is simple.
>
>- In certain theories (more best?) of syntax, perhaps.
>
>(6) Past Progressive - It was was mounted like a gunsight on the rim of the
>ship's long-range antenna and checking that the great parabolic bowl was
>rigidly locked upon its distant target
>
>- This brings us to the question of nonsensical overgeneration problems
>that were mentioned not so long ago by Bill Manaris (3 Feb), and which Ego
>denied.

This again is terribly important. We are not saying that we can handle
90% of what anyone can dream up. We are merely saying that we can handle
thousands more sentences than current systems. Just grab any
speech rec system and you will find that you are limited to a few
hundred commands. By adding our parser, that can be increased to
thousands. Currently, magazine reviewers will RAVE if you get a few
hundred more commands than your competitor. Imagine what they will
say if you increase it to thousands. Here is a more appropriate
example to view what we are talking about. (Taken from my response
to Bill Manaris).

> send/mail/email Bob a message/email/letter/memo/fax (that/which says)
> saying, "meeting at five"

Note that in these cases the bane of NLP, exponential growth, has actually
turned in our favor. Now again, this is not going to get 90% of anything
anyone can throw at it, but it does increase by thousands the number of
possible commands that a speech rec system can handle. Again, I am sure
my stockholders will be in ecstasy with news like this, and they might
want to hunt me down if my competitor comes up with news like this. This
is what I am saying: we are NOT the ideal parser for all times, we
are MERELY a very important step forward.
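
To make the multiplication concrete, here is a minimal Python sketch that
expands the command template quoted above into every variant it covers.
The slot lists are my own reading of that one line (in particular, treating
"that says", "which says" and "saying" as three alternatives for a single
slot is an assumption), and the names SLOTS and expand are hypothetical.
It only enumerates strings; it is not a description of how the parser
itself works.

```python
from itertools import product

# Hypothetical slot lists read off the quoted template. The grouping of
# the connector slot ("that says" / "which says" / "saying") is an assumption.
SLOTS = [
    ["send", "mail", "email"],
    ["Bob"],
    ["a"],
    ["message", "email", "letter", "memo", "fax"],
    ["that says", "which says", "saying"],
    ['"meeting at five"'],
]


def expand(slots):
    """Return every command string formed by picking one option per slot."""
    return [" ".join(choice) for choice in product(*slots)]


commands = expand(SLOTS)
print(len(commands))   # number of distinct commands from this one template
print(commands[0])     # first variant in enumeration order
```

Under this reading the single template yields 3 x 5 x 3 = 45 commands, and
each further slot with alternatives (other recipients, other message
texts) multiplies the count again.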

>I finally chose a few of the sentences produced by Dragon Naturally
>Speaking from one of the ESL subjects. One of these sentences was:
>(7) It is mounted like guns liked on the remote ships long-range antenna,
>and checked the repair voted all was rigidly locked upon this distant targets.
>
>When passed through to ESE for analysis, neither (7) nor any other of the
>10 subjects’ outputs for this sentence could be analysed. All that was
>returned was the message:
>“This sentence is ungrammatical”.

Again, that product is meant to be used with sentences and structures that
are a part of the Japanese junior high school curriculum.

>This is true, of course, but as a tool for ESL students, I wouldn’t have
>thought it represented billion-dollar potential.

Sad to say, there will never be a billion-dollar grammar checker, for
ESL students or for others. The NLP market, however, continues to be
a billion-dollar market. Note this week's _Business Week_ and the
December 8th issue of _Fortune Magazine_.

Phil Bralich
Philip A. Bralich, Ph.D.
President and CEO
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822

Tel: (808)539-3920
Fax: (808)539-3924