[Corpora-List] automatic search for orthographic recurring patterns

From: MARC FRYD (marc.fryd@univ-poitiers.fr)
Date: Wed Dec 08 2004 - 09:38:53 MET


    Hi,
    Perhaps someone on the List will be able to help me with the following
    data-mining problem:

    Given a corpus of isolated lexical units or collocations, I would like
    to determine recurring orthographic patterns, whether initial, e.g.
    "CARPO" (carpogenic, carpogenous, carpolite), final, e.g. "IONALISM"
    (sensationalism, functionalism, etc.), or internal, e.g. "CHRON"
    (synchrony, synchronize, etc.).
    The output should be arranged so as to show the productivity of each
    pattern.
    Important constraint: the various patterns will *not* be fed in
    initially but should be extracted as a result of the algorithm.
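
    To make the requirement concrete, here is a minimal brute-force sketch
    in Python of one possible approach, not a definitive solution. It
    assumes that "productivity" can be approximated by the number of
    distinct units containing a pattern, and that a pattern is any
    character substring of at least min_len letters. The function names
    and the min_len and min_words thresholds are hypothetical choices made
    for illustration only.

```python
from collections import defaultdict


def recurring_patterns(words, min_len=4, min_words=2):
    """Map (position, PATTERN) to the set of words containing the pattern.

    Assumption: a pattern is any substring of at least `min_len` characters
    (excluding the whole word) shared by at least `min_words` distinct words.
    """
    found = defaultdict(set)
    for word in {w.lower() for w in words}:
        n = len(word)
        for i in range(n):
            for j in range(i + min_len, n + 1):
                if j - i == n:
                    continue  # the whole word is not a sub-word pattern
                if i == 0:
                    position = "initial"
                elif j == n:
                    position = "final"
                else:
                    position = "internal"
                found[(position, word[i:j].upper())].add(word)
    # Keep only patterns that recur across enough distinct words.
    return {key: ws for key, ws in found.items() if len(ws) >= min_words}


def ranked(patterns):
    """Sort by productivity (word count), then by pattern length, descending."""
    return sorted(
        patterns.items(),
        key=lambda kv: (-len(kv[1]), -len(kv[0][1]), kv[0]),
    )


if __name__ == "__main__":
    corpus = [
        "carpogenic", "carpogenous", "carpolite",
        "sensationalism", "functionalism",
        "synchrony", "synchronize",
    ]
    for (position, pattern), ws in ranked(recurring_patterns(corpus))[:10]:
        print(f"{pattern:<12} {position:<9} {len(ws)}  {', '.join(sorted(ws))}")
```

    Note that this enumerates every substring, so shorter fragments of a
    productive pattern (e.g. "CARP" alongside "CARPO") are reported too,
    and the cost grows quadratically with word length. Keeping only the
    longest pattern for each set of words, or using a suffix tree or
    suffix array for a large corpus, would be natural refinements.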
    I'll post a summary if I get several replies.
    Regards to all list members.
    Marc Fryd




