Corpora: Polytechnic of Wales Corpus

From: Norbert Schlueter (nosch@zedat.fu-berlin.de)
Date: Thu Mar 01 2001 - 11:04:34 MET


    Dear corpus linguists,

    We are about to start a project on English child language and plan to
    use the Polytechnic of Wales Corpus (POW). We have already contacted
    Clive Souter, who has been so kind as to send us the accompanying
    manual. Unfortunately, the manual does not answer all of our
    questions. We were therefore wondering if anyone could help us with
    the following points:

    1) The manual lists a number of tags, but the corpus seems to contain
    extended tags that are not described in the manual. Does anyone
    know if a complete list of tags has been compiled?

    2) The authors of the corpus have used some tags which are identical
    to ordinary words, such as "A" and "OWN". It would be nice to have a
    "raw text version" of the corpus. Before we start writing our own
    programs, we were wondering whether anyone has already developed
    scripts or tools specifically for this corpus.

    3) Has anyone published results investigating lexical and grammatical
    features of the POW?

    Any help will be greatly appreciated. Best wishes from Berlin,

    Norbert Schlüter
    Freie Universität Berlin


