Corpora: latin corpus

From: Christian Saam (saam2801@uni-trier.de)
Date: Fri Jan 26 2001 - 16:36:58 MET


    Dear Reader,

    I'm currently looking for a corpus of Latin for my master's thesis on
    inflectional morphology.

    My ideal corpus looks like this:

    size: 2 million running words

    annotations (per word form): grammatical features (for all possible
    readings), DISAMBIGUATION of readings, conjugation/declension class
    (paradigm), lemma, OR any subset of these that gives me enough clues to
    get to a disambiguated reading of every word form
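
    As an illustration only, the per-word annotation described above could
    be modelled roughly as follows. This is a sketch, not an existing corpus
    format: the class names, field names and feature labels (Reading, Token,
    "case", "gen", ...) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reading:
    """One possible morphological analysis of a word form (hypothetical schema)."""
    lemma: str
    paradigm: str             # conjugation/declension class
    features: Dict[str, str]  # grammatical features, e.g. case and number


@dataclass
class Token:
    """A running word with all its readings and, if known, the correct one."""
    form: str
    readings: List[Reading]
    chosen: Optional[int] = None  # index into readings; None means unresolved

    def resolved(self) -> Optional[Reading]:
        """Return the disambiguated reading, or None if none was marked."""
        if self.chosen is None:
            return None
        return self.readings[self.chosen]


def ambiguity_rate(tokens: List[Token]) -> float:
    """Fraction of tokens that have more than one possible reading."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if len(t.readings) > 1) / len(tokens)


# "rosae" (from "rosa", first declension) has four readings out of context.
# Marking the genitive singular as chosen is an arbitrary choice for the demo.
rosae = Token(
    form="rosae",
    readings=[
        Reading("rosa", "1st declension", {"case": "gen", "number": "sg"}),
        Reading("rosa", "1st declension", {"case": "dat", "number": "sg"}),
        Reading("rosa", "1st declension", {"case": "nom", "number": "pl"}),
        Reading("rosa", "1st declension", {"case": "voc", "number": "pl"}),
    ],
    chosen=0,
)
```

    A corpus that only lists the readings would leave "chosen" empty for
    every token; the disambiguation asked for above corresponds to that
    field being filled in.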

    What I've come across so far:

    The Corpus Augustinianum Gissense, which is only lemmatized; and due to
    its restricted interface, not even the lemmatization can be exploited
    for my purposes.

    The texts in the Perseus collection at Tufts University seem to be worth
    looking at. But even though an analysis can be looked up for every word,
    ambiguities are never resolved. (And I don't expect to come up with a
    quick enough solution to the resolution problem, given the (relatively)
    free word order.)

    The WordCruncher collection at the TITUS site in Frankfurt doesn't seem
    to contain any Latin texts.

    The Thesaurus Linguae Latinae server at the University of Saskatchewan
    couldn't be reached via any of the links I found.


