Corpora: Re: ARCADE: Call for participation

Dan Melamed (melamed@unagi.cis.upenn.edu)
Thu, 12 Mar 1998 14:45:59 -0500 (EST)

Since "participants cannot withdraw during the competition and accept
the publication of the results," it is important to specify a priori
exactly what the rules of the competition are. I am interested in
participating, but I am concerned that vague rules will promote
comparison of apples with oranges, and decrease the value of the
exercise (which I think will be quite high otherwise). Could you
please clarify (at least) the following points?

1. How automatic must participating systems be? Is a system allowed
to ask a human for help on difficult cases?

2. What resources, in addition to the test bitext, are systems allowed
to exploit? Other corpora? Dictionaries? POS-taggers? Obviously,
the more resources, the easier the task. It would not be fair to
compare systems that use different resources.

3. To what degree must the systems be language-independent? E.g. is
it reasonable to rely on cognates? Is the point to find the best
system for French/English or the best system for arbitrary language
pairs? (If the answer is "both", then it might be best to formalize
two separate tracks.)

4. What are the evaluation metrics? Only exact match with the gold
standard? Or does a system get partial credit for being "close"?
Either way, which objective functions are of interest --- precision,
recall, Dice, F-measure, 11-point average precision, or what?

5. In the sentence category competition, are systems expected to
recognize inversions or only monotonic alignments? Inversions are
surprisingly frequent.

6. In the word category competition, what are "words"? Who decides
the tokenization and on what basis? Note that it is not enough to say
that we care about only the 60 selected French words --- it is also
necessary to specify how English words should be tokenized and counted
in the "correct" translations.

7. In the word category competition, on what basis will the "correct"
translations be determined? E.g. what are the rules for matching one
of the 60 selected words when it appears as part of an idiomatic
expression, or when its translation is an idiom, or both?
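
To illustrate why point 4 matters, here is a minimal sketch of the
exact-match versions of some of the measures named there. It assumes
that a system's output and the gold standard are both sets of
(source position, target position) links. That representation, the
function names, and the toy data are illustrative assumptions, not
part of the ARCADE rules.

```python
def precision_recall_f(predicted, gold):
    """Exact-match precision, recall and balanced F-measure over link sets."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # links that match the gold standard exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def dice(predicted, gold):
    """Dice coefficient between two link sets."""
    predicted, gold = set(predicted), set(gold)
    total = len(predicted) + len(gold)
    return 2 * len(predicted & gold) / total if total else 0.0


# Hypothetical toy data: (source position, target position) links.
gold_links = {(1, 1), (2, 2), (3, 4)}
system_links = {(1, 1), (2, 3), (3, 4), (4, 5)}

p, r, f = precision_recall_f(system_links, gold_links)
d = dice(system_links, gold_links)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f} Dice={d:.3f}")
# precision=0.500 recall=0.667 F=0.571 Dice=0.571
```

Under exact match, the balanced F-measure and Dice give the same
number on sets of links, so the choice among these measures matters
less than the decision about partial credit. In the toy data, the link
(2, 3) is "close" to the gold link (2, 2) but earns nothing here.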

I. Dan Melamed melamed@linc.cis.upenn.edu
University of Pennsylvania http://www.cis.upenn.edu/~melamed/