Re: Corpora: Tagsets

David Elworthy (dahe@cre.canon.co.uk)
Fri, 27 Mar 1998 15:05:00 +0000

Meunier Fanny wrote:
>
> Dear all,
>
> I was wondering whether (or not) studies have been published on the
> comparison of the success rates of POS taggers with a restricted tagset vs
> POS taggers with a refined tagset. Any interesting references would be most
> welcome!
> Thank you very much,
>
> Fanny

See my paper in the EACL SIGDAT workshop in 1995. You can get it from
the cmp-lg server as 9504002. I tried tagging English, French and
Swedish, and gradually simplifying the tagset by combining similar sets
of tags. The general conclusion was that the often-made assertion that
smaller tagsets give higher accuracy does not hold, and that French
behaves very differently from English and Swedish.
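
As a rough illustration of what combining similar tags can look like in
practice, here is a minimal Python sketch. It is not the code or the tagset
from the paper: the merge map, the tag names, and the toy sentence are all
invented for illustration, and the helper names simplify and accuracy are
hypothetical.

```python
# Hypothetical merge map: fine-grained tag -> coarser class (illustrative only).
MERGE = {"NN": "N", "NNS": "N", "VB": "V", "VBD": "V", "VBZ": "V"}


def simplify(tags, merge=MERGE):
    """Map each tag to its merged class; unmapped tags pass through unchanged."""
    return [merge.get(t, t) for t in tags]


def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy data invented for this sketch; not results from any experiment.
gold = ["DT", "NN", "VBZ", "NNS"]
pred = ["DT", "NNS", "VBZ", "VB"]

fine = accuracy(gold, pred)                        # scored on the full tagset
coarse = accuracy(simplify(gold), simplify(pred))  # scored after merging
```

Rescoring the same tagger output under a coarser map, as above, can only
keep or raise the score, because tags that already matched still match after
merging. The comparison the question asks about is a different one: the
tagger is retrained on the simplified corpus, and there the merged tags may
no longer carry distinctions the model relied on for context, so accuracy
can move in either direction.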

-- David

_______________________________________________________________________
David Elworthy <dahe@cre.canon.co.uk>
Canon Research Centre Europe Ltd., Guildford, Surrey, UK
URL: http://www.cre.canon.co.uk/
Phone: +44 1483 448844; Fax: +44 1483 448845