Corpora: SVO from tagged text - summary

Paul Bowden (pmr@doc.ntu.ac.uk)
Mon, 10 May 1999 10:32:38 +0100

Dear Corpora list member,

Some time ago I put out a query regarding whether it is possible to obtain
SVO information from tagged text. Here is a summary of the responses I got.
Thank you to everyone who responded!

----------
Mark Stevenson's paper "Extracting Syntactic Relations using Heuristics"
shows how simple rules can be used to achieve reasonable results (precision
and recall of around 50-60%). See
http://www.dcs.shef.ac.uk/~marks/publications/esslli98.ps
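
To give a flavour of what "simple rules" over tagged text can look like,
here is a minimal sketch. It is not taken from Mark's paper: the rules
(nearest noun head to the left of a verb is the subject, nearest noun head
to the right is the object), the Penn Treebank-style tags and the function
name extract_svo are all my own assumptions for illustration.

```python
# Hypothetical nearest-noun heuristic; not the rules from Stevenson's paper.
# Assumes Penn Treebank-style tags on (word, tag) pairs.

STOP_TAGS = {"IN", ".", ",", ":"}  # assumed boundaries for the object search


def is_noun(tag):
    # Common/proper nouns and personal pronouns (PRP$ is deliberately excluded).
    return tag.startswith("NN") or tag == "PRP"


def is_verb(tag):
    return tag.startswith("VB")


def extract_svo(tagged):
    """Return (subject, verb, object) triples; object is None if not found."""
    triples = []
    for i, (word, tag) in enumerate(tagged):
        if not is_verb(tag):
            continue
        # Subject: closest noun to the left, giving up at an earlier verb.
        subj = None
        for w, t in reversed(tagged[:i]):
            if is_verb(t):
                break
            if is_noun(t):
                subj = w
                break
        # Object: first noun run to the right; its last noun is taken as head.
        obj = None
        j = i + 1
        while j < len(tagged):
            w, t = tagged[j]
            if is_verb(t) or t in STOP_TAGS:
                break
            if is_noun(t):
                while j + 1 < len(tagged) and is_noun(tagged[j + 1][1]):
                    j += 1
                obj = tagged[j][0]
                break
            j += 1
        if subj is not None:
            triples.append((subj, word, obj))
    return triples


sentence = [("The", "DT"), ("dog", "NN"), ("chased", "VBD"),
            ("the", "DT"), ("black", "JJ"), ("cat", "NN"), (".", ".")]
print(extract_svo(sentence))
```

Rules this crude will be fooled by passives, relative clauses and
prepositional phrases between subject and verb, which is consistent with
heuristic approaches landing well short of full parsing accuracy.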

----------
Daelemans, Buchholz and Veenstra (Tilburg University) will present their
paper "Memory Based Shallow Parsing" at CoNLL'99 in Bergen later this year.
The memory-based learning (MBL) approach keeps the raw training data for
consultation rather than attempting to create heuristics from that data.
Precision and recall figures are high, but I don't want to steal their
thunder, so see them at CoNLL'99!
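
The idea of keeping the raw data for consultation can be sketched as a
nearest-neighbour classifier that stores every training example and labels
a new item by looking up the most similar stored ones. The sketch below is
only an illustration of that idea and not the Tilburg system: the feature
layout (previous tag, focus tag, next tag), the labels, the unweighted
overlap distance and the class name are all assumptions of mine.

```python
from collections import Counter


def overlap_distance(a, b):
    # Number of feature positions at which the two examples differ.
    return sum(1 for x, y in zip(a, b) if x != y)


class MemoryBasedClassifier:
    """Hypothetical k-nearest-neighbour learner that keeps all examples."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # raw (features, label) pairs, never generalised

    def train(self, features, label):
        self.memory.append((tuple(features), label))

    def classify(self, features):
        if not self.memory:
            raise ValueError("no training examples stored")
        features = tuple(features)
        # Consult memory: rank stored examples by distance to the new item.
        ranked = sorted(self.memory,
                        key=lambda ex: overlap_distance(ex[0], features))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


# Invented toy examples: (previous tag, focus tag, next tag) -> role
clf = MemoryBasedClassifier(k=1)
clf.train(("DT", "NN", "VBD"), "SUBJ")
clf.train(("DT", "NN", "."), "OBJ")
clf.train(("NN", "VBD", "DT"), "VERB")

print(clf.classify(("JJ", "NN", "VBD")))
print(clf.classify(("JJ", "NN", ".")))
```

Nothing is abstracted away at training time, so rare but useful cases stay
available for lookup, at the cost of more work at classification time.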

----------
Yuval Krymolowski reports that a recent paper given at COLING-ACL'98
(Argamon, Dagan and Krymolowski) considered SV and VO data from the Penn
Treebank. I'm not sure that this answers my original query (which was for
SVO from only tagged text) but a script for doing the extraction is
available from Yuval at yuval@macs.biu.ac.il

----------
Marc Vilain et al. have also been working on chunks for a trainable
approach to grammatical relations parsing. Again, CoNLL'99 appears to be
the place to be this year! The paper "Learning Transformation Rules to Find
Grammatical Relations" by Lisa Ferro, Marc Vilain and Alexander Yeh will be
read there. I haven't read the paper yet but again the recall and precision
seem to be high.

----------
Jean-Pierre Chanod et al. have also been working on shallow PoS-based
parsers, for French and English. I haven't had time to follow these up yet,
but he provided me with the following two references: "Incremental
Finite-State Parsing" by Salah Ait Mokhtar and J.P. Chanod, in Procs. App.
Nat. Lang. Processing, Washington DC, April 1997, and "Subject and Object
Dependency Extraction Using Finite-State Transducers" in Procs. ACL
workshop on Automatic IE and Building of Lexical Semantic Resources for NLP
Applications (Madrid, 1997) (same authors).

----------
Gabriel Lopes at gpl@di.fct.unl.pt has been working in Portuguese on the
SVO problem. Loglinear models are used, e.g. to cluster verbs based upon
subcategorisation preferences. Please email Nuno Marques at
nmm@di.fct.unl.pt if you require further information, e.g. on published
papers.

----------
Adam Kilgarriff at Brighton tells me that he is using regular expression
patterns to find SVO, on the BNC. He's not yet evaluated his approach
against "pukka parsing", though. Adam also tells me that Heid and Eckle at
IMS in Stuttgart, and Gahl at FrameNet in Berkeley, are doing something
similar. I haven't yet followed up those leads.
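
For anyone wondering what a regular expression pattern over tagged text
might look like, here is a minimal sketch. It is not Adam's pattern set:
the word_TAG input format, the pattern itself and the function name
find_svo are assumptions of mine, and the tag names only loosely follow
the C5 tagset used for the BNC.

```python
import re

# Hypothetical noun phrase: optional article, any adjectives, then a noun.
# The single capture group picks out the noun as the head word.
NP = r"(?:\S+_AT0\s+)?(?:\S+_AJ0\s+)*(\S+)_NN[12]"

# Noun phrase, lexical verb (base, past or -s form), noun phrase.
SVO = re.compile(NP + r"\s+(\S+)_VV[BDZ]\s+" + NP)


def find_svo(tagged_text):
    """Return (subject, verb, object) head words from word_TAG text."""
    return SVO.findall(tagged_text)


text = "the_AT0 dog_NN1 chased_VVD the_AT0 black_AJ0 cat_NN1 ._PUN"
print(find_svo(text))
```

A pattern like this only catches adjacent subject, verb and object, so
anything with intervening adverbs, auxiliaries or clauses is missed unless
further patterns are added.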

Thank you once again to all those who replied to my original query - I hope
I haven't missed anyone off the above list!

Paul R. Bowden
Lecturer, Dept. of Computing
Nottingham Trent University