Re: Corpora: Noun phrases categories

From: Francis Bond
Date: Mon May 20 2002 - 04:32:49 MET DST


    Fuchun> I am working on classifying noun phrases into several
    Fuchun> categories, such as mass NPs and count NPs, and even dividing
    Fuchun> each category further. The goal is to develop better language
    Fuchun> models for noun phrase modeling. If it works, we can
    Fuchun> develop better language models for sentences and better NP
    Fuchun> chunkers.

    Fuchun> Has there been any previous work on this topic?
    Fuchun> How many categories should we divide noun phrases into, and is
    Fuchun> there any such labeled data?

    There is a vast literature on this in linguistics. Two references I
    found particularly interesting are:

      author = "Anna Wierzbicka",
      title = "The Semantics of Grammar",
      publisher = "John Benjamins",
      address = "Amsterdam",
      year = 1988

      author = "Keith Allan",
      title = "Nouns and Countability",
      journal = "Language",
      year = 1980,
      volume = 56,
      number = 3,
      pages = "541--67"

    On the computational side, I have been looking at countability in
    the context of Japanese-to-English MT, and suggest splitting
    countability into five types (with a couple of sub-types): fully
    countable, strongly countable, weakly countable, uncountable, and
    plural only.
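
    As a rough illustration only, the five top-level types could be
    written down as a small enumeration. This is a sketch, not code from
    any of the papers: the names Countability, TOY_LEXICON and
    countability_of are hypothetical, the example nouns are illustrative
    assumptions rather than released labeled data, and the sub-types are
    left out.

```python
from enum import Enum
from typing import Optional


class Countability(Enum):
    """The five top-level countability types named in the message."""
    FULLY_COUNTABLE = "fully countable"
    STRONGLY_COUNTABLE = "strongly countable"
    WEAKLY_COUNTABLE = "weakly countable"
    UNCOUNTABLE = "uncountable"
    PLURAL_ONLY = "plural only"


# Hypothetical toy lexicon: these noun assignments are illustrative
# assumptions, not entries from any published or released resource.
TOY_LEXICON = {
    "knife": Countability.FULLY_COUNTABLE,
    "cake": Countability.STRONGLY_COUNTABLE,
    "beer": Countability.WEAKLY_COUNTABLE,
    "furniture": Countability.UNCOUNTABLE,
    "scissors": Countability.PLURAL_ONLY,
}


def countability_of(noun: str) -> Optional[Countability]:
    """Return the type recorded for a noun, or None if it is not listed."""
    return TOY_LEXICON.get(noun.lower())
```

    A lookup such as countability_of("scissors") returns
    Countability.PLURAL_ONLY, and any noun missing from the toy lexicon
    returns None instead of a guessed type.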

    I discuss these in several papers and my dissertation:

      author = "Francis Bond and Kentaro Ogura and Satoru Ikehara",
      title = "Countability and Number in {Japanese}-to-{English}
                      Machine Translation",
      booktitle = coling-94,
      year = "1994",
      address = "Kyoto",
      month = aug,
      pages = "32--38",
      organization = "The International Committee on Computational
                      Linguistics (ICCL)"

      author = "Francis Bond and Kentaro Ogura",
      title = "Reference in {Japanese}-to-{English} Machine
                      Translation",
      journal = MT,
      volume = 13,
      number = "2--3",
      year = 1998,
      pages = "107--134"

      author = "Francis Bond",
      title = "Determiners and Number in {English} contrasted with
                      {Japanese} --- as exemplified in Machine
                      Translation",
      school = "University of Queensland",
      year = 2001,
      address = "Brisbane, Australia"

    Ann Copestake also talks a bit about countability in her dissertation
    and other publications too numerous to mention:

      author = "Ann Copestake",
      title = "The Representation of Lexical Semantic Information",
      school = "University of Sussex",
      year = 1992,
      address = "Brighton"

    As far as I know there isn't any labeled data generally available, but
    I would be happy to be proved wrong.

    Francis Bond
    NTT Communication Science Laboratories | Machine Translation Research Group
