Re: [Corpora-List] Punctuation

From: Eric Atwell (eric@comp.leeds.ac.uk)
Date: Tue Jan 11 2005 - 17:56:33 MET


    Tim,
    Most English corpora since the pioneering Brown and LOB corpora of the
    1960s have included punctuation, so any of these might do.
    The British National Corpus from the 1990s has the advantage of a
    www-based trial search: you can "try before you buy" at
    http://sara.natcorp.ox.ac.uk/lookup.html

    For example, I tried the search term {'|"}
    - a regular expression finding all occurrences of ' or "
    (usage depends on the original sources, so there is no corpus-wide
      standardised punctuation)

    I'm not sure how to identify all and only scare quotes via such regular
    expressions... good luck!
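
    As a rough offline illustration of that difficulty, here is a minimal
    Python sketch that could be run over any plain-text corpus you have
    downloaded. It is not BNC query syntax. The function name, the
    three-word limit and the "no internal punctuation" test are all
    assumptions of mine: the idea is simply that scare quotes tend to wrap
    a word or short phrase, while reported speech tends to be longer. It
    will over- and under-generate, so treat the hits as candidates for
    manual checking.

```python
import re

# Quoted spans: straight or curly double quotes, or single quotes that are
# not adjacent to a word character (so apostrophes as in "don't" are skipped).
QUOTED = re.compile(
    r'"([^"\n]+)"'
    r"|“([^”\n]+)”"
    r"|(?<!\w)'([^'\n]+)'(?!\w)"
    r"|(?<!\w)‘([^’\n]+)’(?!\w)"
)


def candidate_scare_quotes(text, max_words=3):
    """Return short quoted spans as *candidate* scare quotes.

    Heuristic only (an assumption, not an established method): keep spans
    of at most max_words words that contain no sentence punctuation.
    """
    hits = []
    for match in QUOTED.finditer(text):
        # Exactly one of the four groups is set for any given match.
        inner = next(g for g in match.groups() if g is not None)
        words = inner.split()
        if 0 < len(words) <= max_words and not re.search(r"[.!?,;:]", inner):
            hits.append(inner)
    return hits


if __name__ == "__main__":
    sample = (
        "The \"experts\" said, \"We are sure it will work, trust us.\" "
        "It's a so-called 'free' lunch; don't worry."
    )
    print(candidate_scare_quotes(sample))
```

    On the sample sentence this keeps the short spans and drops the longer
    piece of reported speech, but titles, cited terms and short direct
    quotations would pass the same filter.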

    Eric Atwell, School of Computing, Leeds University

    On Tue, 11 Jan 2005, Grant, T. wrote:

    > I'm looking for a freely accessible English language corpus which allows analysis of punctuation marks - I'm interested for example in examining the use of scare quotes.
    >
    > Any ideas gratefully received.
    >
    > Tim
    >
    > ______________________________________
    > Tim Grant
    > Forensic Section - School of Psychology
    > University of Leicester
    > 106 New Walk
    > Leicester LE1 7EA
    > UK
    >
    > TG21@leicester.ac.uk
    > http://www.le.ac.uk/psychology/tg21/
    >
    > + 44(0)116 252 3658 (Direct Line) - + 44(0)116 252 2451 (Secretary) - + 44(0)116 252 3994 (Fax)
    >
    >
    >

    -- 
    Eric Atwell, Senior Lecturer, Computer Vision and Language research group,
    School of Computing, University of Leeds, LEEDS LS2 9JT, England
    TEL: +44-113-2335430  FAX: +44-113-2335468  http://www.comp.leeds.ac.uk/eric
    


