Corpora: Ergo's Parsing Contest

Philip A. Bralich, Ph.D. (bralich@hawaii.edu)
Sun, 21 Feb 1999 11:01:54 -1000 (HST)

The parsing contest we announced six weeks ago is still open
and will close in six more weeks (at the end of March), though this
could be extended if there are requests to do so. To date no one
has dared to take up our challenge on these very practical and
relatively easy parsing tasks. We can only assume that the
computational linguistics community is that far behind us:
if there were any tools that came even close to the practical
abilities we offer, those who had such tools would have a simple
and straightforward opportunity to demonstrate the superiority of
their methods and tools over ours with just one demonstration.
That is, we assume that this opportunity is being ignored because
there are no tools currently superior to those we offer.

To reiterate briefly, Ergo Linguistic Technologies is offering
its first annual parsing contest based on a fixed set of
sentences and a fixed set of tasks to be performed on that set
of sentences. The area of NLP to be explored is that of increased
syntactic analysis to provide: 1) improvements in navigation and control
technology through more complex commands and chained commands, 2)
improvements in the implementation of question/answer, statement/
response dialogs with computers and computer characters, and 3)
improvements in web and database searching using natural language
queries.

The contest will be based on a comparison of results for parses of
a fixed set of sentences (included on our web site) and various
tasks that can be performed as a result of those parses. Ergo's
results on these tasks for these sentences, as well as for the
Air Travel Information System (ATIS) sentences, can be downloaded from our
site. That is, the comparison will be based on the actual parse tree
and the ability to use that parsed output to generate theory-independent
parse trees and output and to perform various NLP tasks. The
judging will be based on the standards for evaluating NLP that have
been proposed previously on this list by myself and Derek Bickerton
and which are currently being developed into an ISO standard for the
Virtual Reality Modeling Language (VRML) as part of the VRML
Consortium's development efforts
(http://www.vrml.org/WorkingGroups/NLP-ANIM). The standards proposed are
theory and field independent
standards which allow both linguists and non-linguists to evaluate NLP
systems in the areas of navigation and control, question/answer
dialogues, and database and web searching.

The sentences chosen for this contest are rather simple, but as we find
more and more parsers that can accomplish the tasks on this list, we
will add more complex sentences and tasks to the list. Please be aware
that systems designed for large corpora of unrestricted text
cannot actually work in this domain. Thus, while such systems may be
useful for certain searching tasks, they are not useful in the domain
explored in this contest, as is evidenced by their inability to
perform on tests such as the one provided here.

The full contest instructions and an HTML document of Ergo's results in
this area can be found at http://www.ergo-ling.com. The standards were
designed to allow the developers of a parsing system (statistical or
syntactic) to demonstrate the thoroughness and accuracy of the parses they
produce by using the parsed output to perform a number of straightforward,
traditional syntactic tasks such as changing a statement to a question or
an active to a passive, as well as demonstrating an ability to create
standard trees (using the Penn Treebank II guidelines) and standard
grammatical analyses. All the standards were chosen to be
theory-independent measures of the accuracy of a parse through the use of
standard and ordinary grammatical and syntactic output.
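
To give a rough idea of the kind of task meant here, the following is a minimal sketch in Python of turning a statement into a yes/no question from a Penn Treebank-style bracketed parse. It is an illustration only, not Ergo's method or software: the function names (parse_brackets, leaves, to_question) and the sample sentence are made up for this example, and the sketch assumes a simple S -> NP VP parse whose VP begins with a modal (tag MD).

```python
def parse_brackets(text):
    """Parse a Penn Treebank-style bracketed string into (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())      # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1]:
        words.extend(leaves(child))
    return words


def to_question(tree):
    """Subject-auxiliary inversion; handles only S -> NP VP with a modal-initial VP."""
    label, children = tree
    if label != "S":
        raise ValueError("expected an S node at the root")
    phrases = {child[0]: child for child in children if not isinstance(child, str)}
    subject, predicate = phrases.get("NP"), phrases.get("VP")
    if subject is None or predicate is None:
        raise ValueError("expected S -> NP VP")
    first, *rest = predicate[1]
    if isinstance(first, str) or first[0] != "MD":
        raise ValueError("this sketch only handles a VP that starts with a modal")
    # Move the modal in front of the subject, then append the rest of the VP.
    words = leaves(first) + leaves(subject)
    for node in rest:
        words.extend(leaves(node))
    words[0] = words[0].capitalize()
    return " ".join(words) + "?"


# Hypothetical sample parse, written by hand for this illustration.
sample = "(S (NP (DT the) (NN robot)) (VP (MD can) (VP (VB open) (NP (DT the) (NN door)))) (. .))"
print(to_question(parse_brackets(sample)))
```

A real entry would of course need to cover far more constructions (do-support, forms of "be" and "have", embedded clauses, and so on); the point is only that the transformation is driven by the structure of the parse rather than by the word string.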

The contest officially began on January 15th and will close on March
31st. This allows developers 2.5 months to develop tools and to work
on trouble spots that they may have with the set of sentences offered in
this contest. The contest will be offered in subsequent years from January
to March. Over time we hope the parsers, the contest rules, and the
test sentences will all grow in sophistication and scope. However, as most
parsers have existed many more years than ours, it is reasonable to think
these tools exist already.

Philip A. Bralich, Ph.D.
President and CEO
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822

Tel: (808)539-3920
Fax: (808)539-3924
bralich@hawaii.edu
http://www.ergo-ling.com
