Re: Corpora: List of abbreviations

Pete Whitelock (pete@sharp.co.uk)
Fri, 01 May 1998 12:39:14 +0100

"Manuel J. Maña López" wrote:
>
> Hello,
>
> I am looking for a list of abbreviations in common use in English (such as Ltd., Mr., Inc., ...). I have found some on the Internet, but they include a lot of acronyms I am not interested in.
>
> Does anybody know if there is any available? Thanks.

Why not just build your own? Presumably you are interested only
in those which end in a full stop. Go through a corpus and make a list
of all strings followed by a full stop. Count and uniq them, and
compare the counts against the overall frequencies of the same strings
(i.e. including cases where they are not followed by a full stop). In
other words, just find those strings with the highest mutual
information with the full stop (perhaps weighted for absolute
frequency). It should be about 10 lines of Perl.
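
For illustration, here is a rough sketch of that recipe in Python rather
than Perl. It is only one way to do it, under some assumptions of mine: the
tokeniser (a run of letters with an optional trailing full stop), the
function name abbreviation_candidates, the min_count cut-off and the
log-frequency weighting are all made up for the example. It will not
handle forms with internal stops such as "e.g." without a smarter
tokeniser.

```python
import math
import re
from collections import Counter

# Assumed tokeniser: a run of letters, optionally followed by a full stop.
TOKEN = re.compile(r"([A-Za-z]+)(\.?)")


def abbreviation_candidates(text, min_count=2):
    """Rank strings by how strongly they are associated with a following full stop."""
    total = Counter()      # every occurrence of each string
    with_stop = Counter()  # occurrences immediately followed by a full stop
    for word, stop in TOKEN.findall(text):
        total[word] += 1
        if stop:
            with_stop[word] += 1

    n_tokens = sum(total.values())
    n_stops = sum(with_stop.values())

    scored = []
    for word, k in with_stop.items():
        if k < min_count:
            continue  # too rare to say anything about
        # Pointwise mutual information between the string and a following
        # full stop: log2( P(word, stop) / (P(word) * P(stop)) ).
        pmi = math.log2(k * n_tokens / (total[word] * n_stops))
        # Hypothetical weighting for absolute frequency (my choice, not a standard one).
        scored.append((pmi * math.log2(k + 1), word))

    scored.sort(reverse=True)
    return [(word, score) for score, word in scored]
```

Run over a real corpus, the top of the returned list should be dominated
by strings like Mr and Ltd, while ordinary words that merely happen to end
sentences score lower because they also occur often without a full stop.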

You have to be slightly careful, because some corpora don't put a full
stop on any abbreviations.

Pete

-- 
E-mail: pete@sharp.co.uk          \ Pete Whitelock
 Sharp global mail: SLEMV1::PETE   \ Sharp Laboratories of Europe Ltd
  phone: +44 (0)1865 747711         \ Oxford Science Park
   fax: +44 (0)1865 714170           \ Oxford, OX4 4GA, England