Re: [Corpora-List] Is the TEI a waste of time?

From: Torzec Nicolas ATER LSI (Nicolas.Torzec@enssat.fr)
Date: Tue Jul 01 2003 - 16:51:02 MET DST


    I agree with Sylvain Loiseau's remark that "It is surprising to see
    how little software there is for TEI corpora".
    Now that TEI P4 (and TEI Lite) provide a DTD for encoding TEI-conformant
    corpora in XML, it is easier for people with some background in
    computer science to develop TEI-specific tools using existing (generic)
    XML libraries.
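
    To illustrate how little code such a tool needs, here is a minimal
    sketch of a keyword-in-context concordancer built on a generic SAX
    parser (Python's standard xml.sax module). The class name
    ParagraphWords, the function concordance, its width parameter and the
    tiny sample document are all hypothetical and chosen for illustration
    only; it assumes the text of interest sits inside <p> elements and
    makes no attempt to validate against the TEI DTD.

```python
import xml.sax


class ParagraphWords(xml.sax.ContentHandler):
    """Collect the words found inside <p> elements (header text is skipped)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # number of currently open <p> elements
        self.chunks = []    # raw text pieces seen inside <p>

    def startElement(self, name, attrs):
        if name == "p":
            self.depth += 1

    def endElement(self, name):
        if name == "p":
            self.depth -= 1
            self.chunks.append(" ")  # keep adjacent paragraphs apart

    def characters(self, content):
        if self.depth > 0:
            self.chunks.append(content)

    def words(self):
        return "".join(self.chunks).split()


def concordance(xml_text, keyword, width=2):
    """Return (left context, match, right context) tuples for keyword."""
    handler = ParagraphWords()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    words = handler.words()
    hits = []
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring trailing punctuation.
        if word.strip(".,;:!?").lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits


# Hypothetical TEI P4-style sample, reduced to the bare minimum.
SAMPLE = (
    "<TEI.2><teiHeader><fileDesc><titleStmt><title>Sample</title>"
    "</titleStmt></fileDesc></teiHeader><text><body>"
    "<p>A corpus is not a bag of words.</p>"
    "<p>Every corpus needs tools.</p>"
    "</body></text></TEI.2>"
)

for left, match, right in concordance(SAMPLE, "corpus"):
    print(f"{left:>15} [{match}] {right}")
```

    Because the handler only reacts to SAX events, the same class could be
    placed in a longer pipeline or swapped for another handler without
    touching the parsing code.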
     
    Do we have to conclude:
    - that no one uses the TEI standard nowadays (i.e. no one needs
    TEI-specific XML tools to create, annotate, manage and exploit
    TEI-conformant corpora)? :-(
    - or that everyone has developed their own TEI-XML-specific tools and
    keeps them secret? ;-)

    Personally, I am in the second group, but the tools that I have
    developed are more "quick-and-dirty tools" (which is why I don't
    publicize them) than "high-quality software"!

    Is the TEI Software Page up to date? (Cf.
    http://www.tei-c.org/Software/index.html)

    Nicolas.
     

    --
    Nicolas TORZEC
     
    ENSSAT / Université de Rennes 1
    6, rue de Kerampont
    22300 Lannion
     
    Mel : nicolas.torzec@enssat.fr
    Tel : 02.96.46.27.30
    Fax : 02.96.37.01.99
    Web : http://www.enssat.fr
    --
    

    Sylvain Loiseau wrote:
    >
    > I agree with this idea. It is surprising to see how little software there
    > is for TEI corpora. The TEI is a waste of time only if the encoding is
    > under-exploited - which is a problem for the researcher, not for the TEI.
    > As said G. Williams a minimal encoding with hasty-pasted-header and
    > word-processor-regex encoding of <p> takes only a few minute. But in order
    > to exploit easily the encoding there is no public framework or set of tools
    > for treatment of TEI-corpus - such as concordancer based on SAX stream,
    > etc. Something like a set of classes for calling parser, SAX rewriting,
    > etc., allowing just to insert SAX handlers or XSLT stylesheets in the
    > pipeline could be very useful. While XML always gain ground when it
    > normalizes both the standards and the software methodologies, the TEI
    > remain a pure standard.
    >
    > I think the TEI is obviously necessary for the view G. Williams defends - a
    > corpus is not a sac of words - and for interoperability, etc. But I agree
    > that the TEI is perhaps "out to date" for some points: there is nothing for
    > morphosyntaxic or morphologic encoding, texts profiling, etc. The TEI
    > remains perhaps not sufficiently adapted to linguistic corpora. This
    > is quite obvious if we look at the projects listed on tei-c.org : it is
    > mainly philological uses of the TEI.
    >
    > Sylvain Loiseau


