Corpora: Re: Parser analysis

Timo Honkela (tho@james.hut.fi)
Fri, 20 Feb 1998 11:17:39 +0200 (EET)

On Thu, 19 Feb 1998, David Coniam wrote:

> Some interesting discussion has been taking place over Ego’s parser.
>
> I thought I’d check out a few sentences to see how the parsers cope with
> real, but not too awkward, sentences (as one might expect, the parsers
> cope fine with sentences such as “Time flies like an arrow”!!)
>
> [...]

For comparison, I ran the examples through the ENGCG system, a demo
of which is available on the WWW site of Lingsoft, Inc.

http://www.lingsoft.fi/

There one can run one's own experiments and also find descriptions
of the morphological tags, syntactic tags and other notation.
Below are some of the ENGCG parser's results for the examples
provided by David Coniam.

The objectives of the two parsers may be a little different, but
the comparison may still be of some interest.
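
For readers who want to process such listings themselves, here is a
minimal Python sketch. It assumes only the layout visible in the output
quoted in this message: a word-form line such as "<was>" followed by
zero or more reading lines, each starting with a quoted base form, and
syntactic tags starting with "@". The function name parse_cohorts and
the dictionary keys are my own illustrative choices, not part of ENGCG
or any Lingsoft software.

```python
import re

# A few lines copied from the analysis of sentence (1) in this message.
SAMPLE = '''
"<*texas>"
"texas" <*> <Proper> N NOM SG @SUBJ
"<was>"
"be" <SV> <SVC/N> <SVC/A> V PAST SG1,3 VFIN @+FMAINV
"<invisible>"
"invisible" <DER:ble> A ABS @PCOMPL-S
'''

def parse_cohorts(text):
    """Group ENGCG-style lines into (word_form, readings) pairs.

    Assumption: a line of the form "<...>" opens a new word form, and
    every other line starting with a double quote is a reading of it.
    """
    cohorts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith('"<') and line.endswith('>"'):
            # Word-form line: strip the surrounding "< and >".
            cohorts.append((line[2:-2], []))
        elif line.startswith('"') and cohorts:
            match = re.match(r'"([^"]*)"\s*(.*)', line)
            if match is None:
                continue  # not a well-formed reading line; skip it
            tokens = match.group(2).split()
            cohorts[-1][1].append({
                "base": match.group(1),
                # Tokens starting with "@" are treated as syntactic tags.
                "syntax": [t for t in tokens if t.startswith("@")],
                "tags": [t for t in tokens if not t.startswith("@")],
            })
    return cohorts

for word, readings in parse_cohorts(SAMPLE):
    print(word, [r["base"] for r in readings], [r["syntax"] for r in readings])
```

Punctuation entries that have no reading line simply come back with an
empty list of readings.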

> First a couple of sentences from the original – fairly short ones.
>
> (1) But now Texas was invisible, and even the United States was hard to see.

---- ENGCG: ----

"<*but>"
"but" <*> CC @CC
"<now>"
"now" ADV @ADVL
"<*texas>"
"texas" <*> <Proper> N NOM SG @SUBJ
"<was>"
"be" <SV> <SVC/N> <SVC/A> V PAST SG1,3 VFIN @+FMAINV
"<invisible>"
"invisible" <DER:ble> A ABS @PCOMPL-S
"<$\,>"
"<and>"
"and" CC @CC
"<even>"
"even" ADV @ADVL
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<*united_*states>"
"united_*states" <*> <Proper> N NOM PL @SUBJ
"<was>"
"be" <SV> <SVC/N> <SVC/A> V PAST SG1,3 VFIN @+FMAINV
"<hard>"
"hard" A ABS @PCOMPL-S
"<to>"
"to" INFMARK> @INFMARK>
"<see>"
"see" <as/SVOC/A> <SVO> <SV> <InfComp> V INF @<NOM-FMAINV

> (2) Probably no one would ever know this; it did not matter.

---- ENGCG: ----

"<*probably>"
"probable" <*> <DER:bly> ADV @ADVL
"<no_one>"
"no_one" PRON NOM SG @SUBJ
"<would>"
"would" V AUXMOD VFIN @+FAUXV
"<ever>"
"ever" ADV @ADVL
"<know>"
"know" <Vcog> <SVO> <SV> <InfComp> <P/of> V INF @-FMAINV
"<this>"
"this" PRON DEM SG @OBJ
"<$\;>"
"<it>"
"it" <NonMod> PRON NOM SG3 SUBJ @SUBJ
"<did>"
"do" <SVO> <SVOO> <SV> V PAST VFIN @+FAUXV
"<not>"
"not" NEG-PART @NEG
"<matter>"
"matter" <SV> V INF @-FMAINV
"<$.>"


> For both (1) and (2), ESE threw up the analysis:
> “This sentence is ungrammatical.”
> [...]
>
> The ESE program did manage to analyse the following sentence:
> (3) It was mounted like a gunsight on the rim of the ship's long-range
> antenna, and checked that the great parabolic bowl was rigidly locked upon
> its distant target.

---- ENGCG: ----

"<*it>"
"it" <*> <NonMod> PRON NOM SG3 SUBJ @SUBJ
"<was>"
"be" <SV> <SVC/N> <SVC/A> V PAST SG1,3 VFIN @+FAUXV
"<mounted>"
"mount" <SVO> <SV> <P/on> PCP2 @-FMAINV
"<like>"
"like" PREP @ADVL
"<a>"
"a" <Indef> DET CENTRAL ART SG @DN>
"<gunsight>"
"gunsight" <?> N NOM SG @<P
"<on>"
"on" PREP @<NOM @ADVL
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<rim>"
"rim" N NOM SG @<P
"<of>"
"of" PREP @<NOM-OF
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<ship's>"
"ship" N GEN SG @GN>
"<long-range>"
"long-range" <Attr> A ABS @AN>
"<antenna>"
"antenna" N NOM SG @<P
"<and>"
"and" CC @CC
"<checked>"
"check" <SVO> <SV> <P/for> <P/with> <P/on> V PAST VFIN @+FMAINV
"check" <SVO> <SV> <P/for> <P/with> <P/on> PCP2 @-FMAINV
"<that>"
"that" <**CLB> CS @CS
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<great>"
"great" A ABS @AN>
"<parabolic>"
"parabolic" <DER:ic> A ABS @AN>
"<bowl>"
"bowl" N NOM SG @SUBJ
"<was>"
"be" <SV> <SVC/N> <SVC/A> V PAST SG1,3 VFIN @+FAUXV
"<rigidly>"
"rigid" <DER:ly> ADV @ADVL
"<locked>"
"lock" <SVO> <SV> PCP2 @-FMAINV
"<upon>"
"upon" PREP @ADVL
"<its>"
"it" PRON GEN SG3 @GN>
"<distant>"
"distant" A ABS @AN>
"<target>"
"target" N NOM SG @<P
"<$.>"
"<$2-NL>"

According to what I have read, the results of the morphological
and syntactic analyzer have been used successfully in a large
number of real-life applications.
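
Anyone using such output in an application will probably want to know
where the analysis is still ambiguous. In sentence (3) above, "on"
carries two syntactic tags (@<NOM and @ADVL) and "checked" keeps two
readings. The following Python sketch flags such cases. It relies only
on the line layout visible in this message, and the function name
find_ambiguities and the reason strings are hypothetical, chosen for
illustration.

```python
# Lines copied from the analysis of sentence (3) in this message.
SAMPLE = '''
"<on>"
"on" PREP @<NOM @ADVL
"<the>"
"the" <Def> DET CENTRAL ART SG/PL @DN>
"<checked>"
"check" <SVO> <SV> <P/for> <P/with> <P/on> V PAST VFIN @+FMAINV
"check" <SVO> <SV> <P/for> <P/with> <P/on> PCP2 @-FMAINV
'''

def find_ambiguities(text):
    """Return (word_form, reason) pairs for entries left ambiguous.

    Assumption: a "<...>" line opens a word form, each following line
    starting with a double quote is one reading, and tokens starting
    with "@" are syntactic tags.
    """
    found = []
    word, n_readings = None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith('"<') and line.endswith('>"'):
            word, n_readings = line[2:-2], 0
        elif line.startswith('"') and word is not None:
            n_readings += 1
            n_syntax = sum(1 for tok in line.split() if tok.startswith("@"))
            if n_readings == 2:
                # Report once, when the second reading is seen.
                found.append((word, "several readings"))
            if n_syntax > 1:
                found.append((word, "several syntactic tags"))
    return found

for word, reason in find_ambiguities(SAMPLE):
    print(word, "->", reason)
```

This only counts alternatives; deciding which alternative is right
still needs knowledge that this sketch does not have.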

However, I would also like to point out that there is much more
to language than syntax, but that is another story
(see, e.g., http://www.cis.hut.fi/~tho/thesis/).

Best regards,
Timo Honkela

P.S. I am not and have not been an employee of Lingsoft. I mention
this in view of the earlier discussions about the relationship
between academic research and the product marketing of companies.
On the other hand, I am pleased to give some publicity to recent
great Finnish contributions in the area of information technology,
such as
- Nokia mobile phones (http://www.nokia.com/company/overview/),
- Linux operating system (http://www.cs.helsinki.fi/linux/),
- SOM - Kohonen Self-Organizing Map (http://www.cis.hut.fi/nnrc/),
- IRC - Internet Relay Chat (http://www.funet.fi/~irc/),
- Ssh - Secure Shell (http://www.cs.hut.fi/ssh/), etc. etc.

------------------- ---------------------------
Timo Honkela Timo.Honkela@hut.fi http://www.cis.hut.fi/~tho/
Neural Networks Research Centre, Helsinki Univ of Technology
and P.O.Box 2200 FIN-02015 HUT, Finland
Nat Lang Proc Tel. +358-9-451 3275, Fax +358-9-451 3277