Corpora: Tagging guidelines for word alignment

Jean Veronis (Jean.Veronis@lpl.univ-aix.fr)
Sun, 26 Apr 1998 12:55:50 +0200

Dear Colleagues,

In the framework of the ARCADE evaluation exercise, we have developed
tagging guidelines for the alignment of words between parallel texts. These
guidelines take as their starting point Dan Melamed's Style Guide, developed
for the Blinker project (used with his permission). They have been
substantially adapted, however, because the task is different: the Blinker
project aimed at aligning all words between the two parallel texts,
whereas, at least for this phase, the ARCADE exercise needs alignment of
only a given set of words.

A first draft is available at

http://www.lpl.univ-aix.fr/projects/arcade/2nd/word/guide/

You can also download it as a whole in the form of a zip file (if Web
reading is too slow):

http://www.lpl.univ-aix.fr/projects/arcade/2nd/word/guide/guide.ZIP

We would be very interested in receiving comments from colleagues working
in the area, especially if they have experience in word alignment, and even
more so if they have themselves developed similar guidelines for their own
projects. Comments can be sent to the ARCADE discussion list

arcade@lpl.univ-aix.fr

or to me directly

Jean.Veronis@lpl.univ-aix.fr

General information on ARCADE can be found at

http://www.lpl.univ-aix.fr/projects/arcade/

Thanks!
Jean Véronis