Re: [Corpora-List] Frequency list of transformations

From: Stefan Th. Gries (STGries@sitkom.sdu.dk)
Date: Fri Jan 21 2005 - 11:01:15 MET

    Dear Marijke

    This is not quite what you're looking for, but maybe it's still useful
    in some connection. In R, the approximate pattern matching function
    agrep is based on the Levenshtein string edit distance, which you
    could use to at least determine "the total number of insertions,
    deletions and substitutions required to transform one string into
    another"; the help file also says that this function is "a simple
    interface to the apse library developed by Jarkko Hietaniemi (also
    used in the Perl String::Approx module)".
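    To make the quoted definition concrete, here is a minimal Python
    sketch of the unweighted Levenshtein distance. It is not agrep and
    not the apse library; the function name levenshtein is made up for
    this illustration, and all three edit operations are assumed to
    cost 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to transform string a into string b (unit costs assumed)."""
    # prev[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca by cb (or keep it)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

    The function only returns the total count; a frequency list of the
    individual operations would additionally require tracing back
    through the full distance table.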
        If you are also interested in a way to compute the similarity of
    two strings, let me know and I'll send you an R program I have
    written. It takes a list of strings as input and outputs the
    following pairwise similarity measures: Dice, a weighted version of
    Dice, XDice, a weighted version of XDice, absolute and relative
    longest common subsequence, and mean and minimum longest common
    subsequence.
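    For orientation, two of these measures can be sketched in Python as
    follows. This is not the R program mentioned above, and several
    details are assumptions made for the example: Dice is computed over
    character bigrams counted as multisets, "relative" longest common
    subsequence is taken to mean division by the length of the longer
    string, and all function names are hypothetical. The weighted and
    XDice variants are not covered.

```python
from collections import Counter
from itertools import combinations


def bigrams(s: str) -> Counter:
    """Multiset of adjacent character pairs in s."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (multiset version, assumed)."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        return 0.0  # neither string is long enough to contain a bigram
    shared = sum((x & y).values())  # size of the multiset intersection
    return 2 * shared / total


def lcs_length(a: str, b: str) -> int:
    """Absolute length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def relative_lcs(a: str, b: str) -> float:
    """LCS length divided by the longer string's length (assumed normalisation)."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def pairwise(strings):
    """Map each unordered pair of strings to (Dice, absolute LCS, relative LCS)."""
    return {
        (a, b): (dice(a, b), lcs_length(a, b), relative_lcs(a, b))
        for a, b in combinations(strings, 2)
    }


if __name__ == "__main__":
    for pair, scores in pairwise(["night", "nacht", "nicht"]).items():
        print(pair, scores)
```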
    Best,
    STG

    Stefan Th. Gries
    ----------------------------------------
    IFKI, Southern Denmark University
    http://people.freenet.de/Stefan_Th_Gries
    ----------------------------------------
