[Corpora-List] Converting the LDC NANTC to XML

From: Scott James Cederberg (cederber@csli.stanford.edu)
Date: Thu Jun 12 2003 - 22:16:14 MET DST

    Hello corpora folks,

          I'm attempting to convert the LDC North American News Text
          Corpus (NANTC; LDC95T21) to XML, using the OSX tool (descended
          from James Clark's SX).

          Has anyone else done this? One thing that stands in the way is
          that we don't have a DTD for the NANTC SGML format; does anyone
          have one?

          Any help/pointers/advice appreciated.

                                                    Scott Cederberg
                                                    CSLI
                                                    Stanford University


