Re: [Corpora-List] Desperate Appeal for help - in search of Medieval French corpora

From: Serge HEIDEN
Date: Mon Nov 22 2004 - 17:14:47 MET


    Dear Claire,

    | I'm quite desperately in need of corpora of Medieval French from 800C to
    | 1600C. They must be dated and, most importantly, viewable (as
    | opposed to simply allowing for statistical analysis, etc.).

    I know that you have already been in contact with our group: the BFM project (Base de Français
    Médiéval). Its database contains about 60 texts, amounting to about 2.7 million words (the texts
    date from 842 to the 16th century, mainly the 12th and 13th centuries) [contact].
    So I suppose you have been able to access the database through the Weblex software.
    Faced with such a clear expression of despair, I cannot be content with an awkward silence.
    Could you please explain what you mean by:
    - dated texts: medieval texts are a special case when it comes to dating;
    - viewable, as opposed to other types of access.

    What interests me here is the question of corpus distribution or delivery. If, for any
    reason (copyright, technical, political...), you cannot give out the electronic texts
    'as is' (as a file or an online edition) and instead offer access through various kinds
    of tools, for example Web-based ones, questions arise about choosing the right tools,
    documenting them well, documenting the texts sufficiently, etc. One way around all
    this is to ask people to explain what they are looking for, so that we can try to do
    part of the work for them, or perhaps adapt the tools. This is what we try to do in
    the BFM project, for free: the texts are extensively documented, Weblex provides
    CQP concordances and indexes, etc., but the Web interface and its documentation
    are still rough for linguists.
    With feedback we can change things.
    Is this the right way to go about it? What do you propose?



    Serge Heiden
    ENS-LSH/CNRS - ICAR UMR5191, Institut de Linguistique Française
    15, parvis René Descartes 69342 Lyon BP7000 Cedex, tél. +33 4 37 37 63 12, fax. +33 4 37 37 62 65
