[Corpora-List] Frequency list of transformations

From: Marijke Koster (marijke@polderland.nl)
Date: Fri Jan 21 2005 - 09:44:46 MET

    Dear corpora list members,

    Does anyone have a suggestion for a simple method or script to extract
    a frequency list of transformations from a list of spelling errors and
    their corrections?

    For example, given this tab-separated list:

    wrong correct
    ----- -------
    occurence occurrence
    occosion occasion
    commputer computer
    live life
    heavie heavy
    geat great
    save safe

    Applying the method should produce something like this
    (count, then correct -> wrong):
    1 rr -> r
    1 a -> o
    1 m -> mm
    2 f -> v
    1 y -> ie
    1 r -> ()
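
    One possible approach, as a rough sketch only: strip the longest
    common prefix and suffix from each pair and count what remains.
    The function names, the PAIRS list and the doubled-letter rule
    (used to report "rr -> r" instead of "r -> ()") are assumptions
    made for this illustration. The sketch also assumes a single error
    region per word; a pair with several separate errors would be
    reported as one larger transformation.

```python
from collections import Counter

# (wrong, correct) pairs, as they would be read from the tab-separated list.
PAIRS = [
    ("occurence", "occurrence"),
    ("occosion", "occasion"),
    ("commputer", "computer"),
    ("live", "life"),
    ("heavie", "heavy"),
    ("geat", "great"),
    ("save", "safe"),
]


def transformation(wrong, correct):
    """Return (correct_part, wrong_part) for the one region where the words differ."""
    shortest = min(len(wrong), len(correct))
    # Length of the longest common prefix.
    p = 0
    while p < shortest and wrong[p] == correct[p]:
        p += 1
    # Length of the longest common suffix that does not overlap the prefix.
    s = 0
    while s < shortest - p and wrong[len(wrong) - 1 - s] == correct[len(correct) - 1 - s]:
        s += 1
    c_mid = correct[p:len(correct) - s]
    w_mid = wrong[p:len(wrong) - s]
    # Assumed heuristic: a pure insertion or deletion of a letter equal to the
    # letter just before it is reported as a doubling change (rr -> r, m -> mm).
    if p > 0 and (c_mid == "" or w_mid == "") and c_mid + w_mid == correct[p - 1]:
        c_mid = correct[p - 1] + c_mid
        w_mid = correct[p - 1] + w_mid
    return c_mid, w_mid


def count_transformations(pairs):
    """Count transformations over all pairs, skipping pairs that are identical."""
    counts = Counter()
    for wrong, correct in pairs:
        if wrong != correct:
            counts[transformation(wrong, correct)] += 1
    return counts


for (c_part, w_part), n in count_transformations(PAIRS).most_common():
    # An empty side is printed as (), following the notation in the list above.
    print(n, c_part or "()", "->", w_part or "()")
```

    For a real file, each line could be split on the tab character to
    build the pairs. Pairs with more than one error would need a proper
    alignment (for example an edit-distance alignment) rather than this
    prefix/suffix shortcut.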

    Thanks in advance,
    Marijke Koster
    ______________________________________
    Marijke Koster, linguistic engineer
    Polderland Language & Speech Technology BV
    The Netherlands
    http://www.polderland.nl
    Phone: +31.24.352 28 66
    Fax: +31.24.352 28 60


