[Corpora-List] Experience with linguistic annotation in separate files?

From: Scott James Cederberg (cederber@csli.Stanford.EDU)
Date: Fri Apr 18 2003 - 05:19:57 MET DST

Next message: Harold Somers: "[Corpora-List] Survey of online MT"

Previous message: Ken Litkowski: "Re: [Corpora-List] verb-sense classification"

Hello,

        Does anyone out there have some experience working with
        corpora that have linguistic annotation (e.g. for part of
        speech, syntax, multiword expressions, or word senses) kept in
        files separate from the text itself?

        This is the system recommended by the CES and XCES corpus
        encoding standards, and the TEI guidelines also provide a
        mechanism for putting tags in one document that indicate links
        to another document.

        I'm trying to get my mind around how best to enable software
        to access corpus annotation in such a format. Ideally such
        access could be provided using standard XML formats and tools,
        like XPath and XSLT.
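
        To make the setup concrete, here is a minimal sketch of what
        such access might look like, using only Python's standard
        xml.etree.ElementTree module. The element and attribute names
        (w, tag, id, ref, pos) and both sample documents are
        hypothetical, chosen for illustration; they are not the actual
        CES/XCES or TEI linking markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical primary text: tokens carry ids but no linguistic markup.
TEXT_XML = """
<text>
  <w id="w1">The</w>
  <w id="w2">dog</w>
  <w id="w3">barks</w>
</text>
""".strip()

# Hypothetical standoff annotation kept in a separate document.
# Each <tag> points back to a token by id (not real XCES or TEI markup).
POS_XML = """
<annotation type="pos">
  <tag ref="w1" pos="DT"/>
  <tag ref="w2" pos="NN"/>
  <tag ref="w3" pos="VBZ"/>
</annotation>
""".strip()


def merge_standoff(text_xml, pos_xml):
    """Return (word, pos) pairs by resolving each annotation's ref to a token."""
    text_root = ET.fromstring(text_xml)
    pos_root = ET.fromstring(pos_xml)
    # Index tokens by id so each reference is a dictionary lookup.
    tokens = {w.get("id"): w.text for w in text_root.iter("w")}
    return [(tokens[t.get("ref")], t.get("pos")) for t in pos_root.iter("tag")]


def words_with_pos(text_xml, pos_xml, pos):
    """Return the words whose standoff annotation carries the given POS tag."""
    text_root = ET.fromstring(text_xml)
    pos_root = ET.fromstring(pos_xml)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    refs = [t.get("ref") for t in pos_root.findall(f".//tag[@pos='{pos}']")]
    return [text_root.find(f".//w[@id='{r}']").text for r in refs]


pairs = merge_standoff(TEXT_XML, POS_XML)
print(pairs)
print(words_with_pos(TEXT_XML, POS_XML, "NN"))
```

        The sketch joins the two documents in application code rather
        than in XPath itself, since a plain XPath expression evaluates
        against one document at a time; whether a standard tool can do
        the cross-document step directly is the open question here.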

        Any suggestions on how best to do this, pointers to software
        or APIs that work with modular annotation, etc. would be
        invaluable.

Thanks for your help.

Scott

--
Scott Cederberg
Researcher

Infomap Project
Computational Semantics Lab
Center for the Study of Language and Information (CSLI)
Stanford University

http://infomap.stanford.edu/


This archive was generated by hypermail 2b29 : Fri Apr 18 2003 - 05:21:24 MET DST