Problem w/ aligning parallel texts

Kursat Ince (kince@cs.bilkent.edu.tr)
Fri, 16 Feb 1996 13:47:06 +0300 (EET)

Hello everyone,

I am reposting this message, since the system had some delivery problems
in the last few days.

I have a question regarding Kay and Roscheisen's Text Translation Alignment
paper (Comp. Linguistics Vol 19 No 1). I have not been able to get any
response from the authors, so I thought I might look for some help here.

I will not go into the details of the algorithm; what I want to ask about
is the formula that measures the similarity of two given words, one from
each text. If the similarity is greater than some threshold, then the
words are said to be aligned. The formula on page 125 (of CL 19(1)) is:

2c / (NA(v) + NB(w))   (sorry, I could do no better)   [1]

where c is the number of corresponding positions of the two words, and
NA(v) (NB(w)) is the number of occurrences of word v (w) in text A (B).
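
For concreteness, formula [1] can be written as a small Python function.
This is only a sketch of the formula as quoted above: the function name
`similarity` and its argument names are made up for illustration and do not
come from the paper, the counts in the usage line are invented, and returning
0 when neither word occurs is an assumption.

```python
def similarity(c: int, n_a: int, n_b: int) -> float:
    """Formula [1]: 2c / (NA(v) + NB(w)).

    c   -- number of corresponding positions of the two words
    n_a -- number of occurrences of word v in text A
    n_b -- number of occurrences of word w in text B
    """
    total = n_a + n_b
    if total == 0:
        # Neither word occurs; returning 0 here is an assumption,
        # not something stated in the paper.
        return 0.0
    return 2 * c / total


# Invented counts: v occurs 4 times, w occurs 6 times, 3 positions correspond.
print(similarity(3, 4, 6))  # 0.6
```

With this reading the value is 1 only when every occurrence of v corresponds
to an occurrence of w and vice versa (c = NA(v) = NB(w)).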

The example following the formula doesn't use the formula as it is, but
somehow subtracts c from the denominator. The formula used in the
example then becomes

2 / (NA(v) + NB(w) - c).   [2]
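
To show how far apart the two readings are, here is a sketch that evaluates
both on the same invented counts. The function names are hypothetical, and
formula [2] is coded exactly as written in this message, which may not be what
the authors intended. Both functions assume the denominator is positive.

```python
def formula_1(c: int, n_a: int, n_b: int) -> float:
    """Formula [1] as printed: 2c / (NA(v) + NB(w))."""
    return 2 * c / (n_a + n_b)


def formula_2(c: int, n_a: int, n_b: int) -> float:
    """Formula [2] as the example appears to use it: 2 / (NA(v) + NB(w) - c)."""
    return 2 / (n_a + n_b - c)


# Invented counts: each word occurs 3 times and 2 positions correspond.
c, n_a, n_b = 2, 3, 3
print(formula_1(c, n_a, n_b))  # 4 / 6, about 0.67
print(formula_2(c, n_a, n_b))  # 2 / 4 = 0.5
```

For these counts the two formulas disagree, so a fixed threshold could accept
the pair under one reading and reject it under the other.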

On the other hand, on page 127 of the same paper, it is stated that c
appears in the denominator.

I cannot really understand which is the correct one. Any help or
clarifications would be appreciated.

Thanks in advance,

Kursat Ince

Bilkent University
