On the other hand, it seems obvious to me that any history of electronic
corpora must mention the archives and collections, as these are part of
our history and everyday working lives. The only proviso is to be clear
about what distinguishes one from the other.
Best wishes
Geoffrey
williams@ensinfo.univ-nantes.fr
Faculte des Sciences et des Techniques
University of Nantes
France
On Tue, 1 Dec 1998, Oliver Mason wrote:
> > > Thesis and I would like to know when the following electronic corpora
> > > were compiled:
> >
> > >"The Oxford Text Archive";
> > >"International Computer Archive of Modern English".
>
> I don't want to split hairs or start an ideological flame war, but I
> personally wouldn't call those two `electronic corpora'. They're (as
> implied by the name) archives, which *contain* (amongst other data)
> corpora. A corpus is a special collection of textual material
> collected according to a certain set of criteria, like the BNC or the
> BoE, or Brown, COLT, Flob, LOB, whatever. They all made decisions
> about the composition of their data in advance and selected it
> accordingly.
>
> Also, they are homogeneous in the way they are stored/accessed. For
> the BNC you have got SARA, there's Lookup for the BoE, and CUP probably
> have their own special software for their corpus.
>
> Now, correct me if I'm wrong, but does the OTA do the same? Again, I
> DON'T want to criticise anything here, it's just a terminological
> distinction. I am worried that the term `corpus' gets watered down too
> much if it is basically used the same way as `archive'. An archive is
> less focussed on doing things with its data, and mainly concerned with
> storage, archival, and retrieval of its elements. If I want an
> electronic copy of a certain book I would use the OTA, but for
> concordance lines of some word I wouldn't.
>
> Does anybody else agree or disagree?
>
> Oliver
>
> --
> //\\ computer officer | corpus research | department of english | school of -
> //\\ humanities | university of birmingham | edgbaston | birmingham b15 2tt -
> \\// united kingdom | phone +44-(0)121-414-6206 | fax +44-(0)121-414-5668/\ -
> \\// mobile 07050 104504 | http://www-clg.bham.ac.uk | o.mason@bham.ac.uk\/ -