Take a look at the Corpus Encoding Standard (CES) at
This EAGLES standard provides encoding conventions and a data architecture
for corpora intended for NLP and corpus linguistics research.
-----------------------------------------------------------------------------
Nancy Ide
Professor and Chair                 Tel:    (+1 914) 437 5988
Department of Computer Science      Fax:    (+1 914) 437 7498
Vassar College                      WWW:    http://www.cs.vassar.edu/~ide
124 Raymond Avenue                  E-mail: ide@cs.vassar.edu
Poughkeepsie, New York 12604-0520 USA
-----------------------------------------------------------------------------