[Corpora-List] parser for name-internal structure

From: Hal Daume III (hdaume@ISI.EDU)
Date: Mon Nov 22 2004 - 18:32:41 MET


    Hi Fellow Corpora Folks --

    At a recent ACE (Automatic Content Extraction) meeting, there was
    discussion regarding the parsing of the internal structure of names,
    something (I believe) along the lines of:

      "President George W. Bush" -->

       [ First = "George",
         Middle = "W.",
         Last = "Bush",
         PreMod = "President"
       ]

    There was a suggestion that at least one or two publicly available
    tools exist to do such parsing (rule-based, as I understood it).
    Here at ISI we've recently begun to want such a tool (both for
    ACE-specific work and otherwise). If anyone has, or knows of, such a tool
    and would be willing to share the tool or that knowledge with us, we'd be
    quite appreciative. (Alternatively, if anyone has annotated data of
    this sort, that would be a good second option.)
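
    To make the target output concrete, here is a minimal rule-based sketch
    of the kind of parsing described above. It is an illustration only, not
    one of the tools mentioned at the meeting: the function name parse_name,
    the TITLES list, and the simple positional rules are all assumptions,
    and it only handles English-style "title first middle last" names.

```python
# Hypothetical illustration: a tiny rule-based splitter for English-style
# personal names. The title list and the positional rules are assumptions.
TITLES = {"President", "Senator", "Dr.", "Prof.", "Mr.", "Mrs.", "Ms."}


def parse_name(name):
    """Split a name string into PreMod / First / Middle / Last fields."""
    tokens = name.split()
    parsed = {"PreMod": None, "First": None, "Middle": None, "Last": None}

    # Rule 1: leading tokens found in the title list form the pre-modifier.
    premod = []
    while tokens and tokens[0] in TITLES:
        premod.append(tokens.pop(0))
    if premod:
        parsed["PreMod"] = " ".join(premod)

    # Rule 2: the final remaining token is taken as the last name.
    if tokens:
        parsed["Last"] = tokens.pop()

    # Rule 3: the first remaining token is taken as the first name.
    if tokens:
        parsed["First"] = tokens.pop(0)

    # Rule 4: anything left in between is treated as the middle name(s).
    if tokens:
        parsed["Middle"] = " ".join(tokens)

    return parsed


if __name__ == "__main__":
    print(parse_name("President George W. Bush"))
```

    Positional rules like these break quickly on suffixes ("Jr."),
    multi-word surnames, and non-Western name orders, which is part of why
    an existing tool or annotated data would be preferable.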

    Best,

     - Hal

    -- 
     Hal Daume III                                   | hdaume@isi.edu
     "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume
    


