Re: Corpora: Zarf now freely available for civilian use. (fwd)

James L. Fidelholtz (jfidel@siu.buap.mx)
Thu, 22 Apr 1999 09:55:43 -0500 (CDT)

On Thu, 22 Apr 1999 eric@scs.leeds.ac.uk wrote:

>Lou,
>surely Zarf *does* have a meaning,
>viz something like :
>Zarf: a codeword used by US National Security Agency to reference specific
>classified information; the term Zarf is UNCLASSIFIED, even though
>Information protected by the Zarf codeword will continue to require
>protection.
>
Having had some experience with US Government classification, perhaps I
can throw a little light on the subject. The whole basis of any
classification system is to limit knowledge in various ways to keep the
'enemy' (this of course includes the idle curious and reporters, as well as
real or imagined enemies) from getting hold of it. At the highest
(conceptual) level, we have CLASSIFIED vs UNCLASSIFIED. CLASSIFIED, in
turn, is divided into CONFIDENTIAL, SECRET and TOP SECRET, with ever
higher degrees of perceived danger if the information gets into the
wrong hands. In fact, there is no higher classification than TOP
SECRET, in principle. Nevertheless, as soon as something is classified,
another principle comes into effect: no one should see the information
if he does not have a NEED TO KNOW, which can be interpreted strictly or
loosely, depending on the area one is working in. In most government
areas, it is very strictly defined indeed, so that people will even
avoid trying to get something which they are not very sure they need to
know to do their job, analysis, etc. However, there are some areas of
knowledge where the information, because of its very nature, could
compromise the sources (e.g., material on the 'other side' known only to one
or two persons; if it got out, one of them would presumably have had to be
the direct or indirect source of the information).
This kind of material is very closely protected indeed, and in addition
to its classification (usually TOP SECRET), it is further protected by
the designation 'codeword'. The specific codeword for each
classification is a quite closely held secret, and usually does not even
appear on the documents with such a classification. Furthermore, these
codewords are periodically changed. I suspect that no current document
is receiving the codeword 'Zarf'. Nevertheless, older documents that
were classified with this codeword continue to be protected until they
are declassified (when it is considered fairly certain that no damage
could come about from divulging the information), which in many cases
never occurs.

While a lot has been made of the fact that sometimes published
articles or newspapers have been classified, this is not always as
stupid as it looks. Oftentimes, the mere fact that someone who
analyzes, say, the Russian military is interested in a particular
article could convey information to the Russian military about what this
person knows, and therefore about possible sources of that information,
which could result in those sources drying up (published sources or
human ones). It is in fact a very complicated and convoluted sort of
game, but with rather high stakes at times.

This is not to say that material doesn't get classified out of
pure paranoia, or to cover up mistakes, but the latter is
definitely illegal, and this sort of thing is usually frowned
upon, although it is obviously a bit difficult to control.

>is this really very different from, say:
>
>BNC: a codeword used by the international Corpus Linguistics community to
>refer to a specific dataset; the term BNC is widely-used, even though
>only a small cognoscenti have detailed knowledge of the contents and
>structure of the dataset.
>
Not a bad point, but at least we can find out all we want to
know about the BNC if we want to, since nobody can zarf it!
Jim

James L. Fidelholtz e-mail: jfidel@siu.buap.mx
Maestría en Ciencias del Lenguaje
Instituto de Ciencias Sociales y Humanidades
Benemérita Universidad Autónoma de Puebla, MÉXICO