Re: [Corpora-List] Is the TEI a waste of time? / Lack of TEI software tools

From: Sylvain Loiseau (liste_linguistique@toucheraveclesyeux.com)
Date: Sat Jul 05 2003 - 11:38:34 MET DST


    > I would guess that most software uses internal, non-XML formats, as they
    > are generally easier to process from a programmer's point of view and
    > more efficient computationally; and if you've got large corpora time and
    > space efficiency are quite important. My own approach has always been
    > that TEI-style markup is fine for exchanging data, but when it is being
    > indexed and prepared for processing it'll be converted into some
    > tool-specific form.

    But I think that many people use the TEI as a ready-to-use way of encoding
    data, without having to develop a DTD or format of their own, and exploit
    the data with XSLT or other tools that do not require software development.
    Standardisation of software and formats is costly in CPU time, but it
    increases the capacity to exploit corpora without strong development
    skills, the need for which is perhaps an important bottleneck for the
    growth of interest in the TEI in linguistics (this ease of use was the
    explicit aim of XSLT, if I remember correctly). If the TEI is a waste of
    time for many people, it is perhaps due to this lack of tools.

    A simple framework that lets one just plug in and run SAX handlers for
    processing tasks, a concordancer for instance (or a conversion), would be
    of general interest, I think, and tools written in that way are more
    reusable and quicker to write.
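
    To make the idea concrete, here is a rough sketch of such a pluggable
    handler in Python, using only the standard xml.sax module. The names
    KwicHandler and run_handler, and the tiny TEI-like sample document, are
    invented for this illustration; a real tool would need proper
    tokenisation and would have to cope with inline markup inside words.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Toy TEI-like document (an assumption for this sketch, not a real corpus).
SAMPLE = (
    "<TEI.2><teiHeader><fileDesc><titleStmt>"
    "<title>The corpus sampler</title>"
    "</titleStmt></fileDesc></teiHeader>"
    "<text><body><p>The corpus is small.</p>"
    "<p>A corpus can be large.</p></body></text></TEI.2>"
)


class KwicHandler(ContentHandler):
    """Hypothetical SAX handler: keyword-in-context lines for one keyword."""

    def __init__(self, keyword, width=2):
        super().__init__()
        self.keyword = keyword.lower()
        self.width = width          # number of context tokens on each side
        self.tokens = []
        self._chunks = []
        self._in_text = False       # True only inside the <text> element

    def startElement(self, name, attrs):
        if name == "text":
            self._in_text = True

    def endElement(self, name):
        if self._in_text:
            # Simplification: treat every closing tag as a word boundary.
            self._chunks.append(" ")
        if name == "text":
            self._in_text = False

    def characters(self, content):
        # Skip the header; only the body of the document is indexed.
        if self._in_text:
            self._chunks.append(content)

    def endDocument(self):
        # Naive whitespace tokenisation, enough for the illustration.
        self.tokens = "".join(self._chunks).split()

    def concordance(self):
        """Return (left context, keyword token, right context) triples."""
        hits = []
        for i, tok in enumerate(self.tokens):
            if tok.strip(".,;:!?").lower() == self.keyword:
                left = " ".join(self.tokens[max(0, i - self.width):i])
                right = " ".join(self.tokens[i + 1:i + 1 + self.width])
                hits.append((left, tok, right))
        return hits


def run_handler(xml_text, handler):
    """The 'framework' part: feed one document to any SAX handler."""
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler


if __name__ == "__main__":
    kwic = run_handler(SAMPLE, KwicHandler("corpus", width=2))
    for left, word, right in kwic.concordance():
        print(f"{left:>12} [{word}] {right}")
```

    The point of the design is that run_handler knows nothing about
    concordances: a conversion or a frequency count would be another
    ContentHandler subclass passed to the same function.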

    With best regards,
    Sylvain Loiseau


