RE: Corpora: Corpus Linguistics User Needs

Scott Meredith (scottmer@microsoft.com)
Wed, 29 Jul 1998 08:06:39 -0700

Sorry, Geoffrey, American tourists would say "Where's the
elevator?"

> -----Original Message-----
> From: Geoffrey Sampson [SMTP:geoffs@cogs.susx.ac.uk]
> Sent: Wednesday, July 29, 1998 2:55 AM
> To: corpora@hd.uib.no
> Cc: O.Mason@bham.ac.uk; ylva.berglund@engelska.uu.se
> Subject: Re: Corpora: Corpus Linguistics User Needs
>
>
> I'm afraid my response risks sounding a little arrogant, but this is a
> point that has puzzled me for years. You are quite right to say that
> many corpus linguists do not know how to write programs, and rely on
> software produced by others which may not meet their needs. It has
> always seemed to me that the answer to a corpus linguist who sees this
> as a problem is "Learn to program, then". I have never understood why
> it has become socially acceptable for even quite junior academics to
> say "I can't program, someone else will have to do this for me", while
> they wouldn't dream of saying "I don't know how big library catalogue
> systems work, someone else will have to fetch my books".
>
> (In case anyone thinks "It's all very well for him to write that way,
> he is a computer specialist", perhaps I should mention that my first
> degree was in Chinese, mainly classical Chinese language, literature,
> Chinese history, etc., plus a little general linguistics. I decided to
> learn about computers as a graduate student because it was clear that
> they were destined to become useful tools in linguistics.)
>
> I believe this situation is not just a social oddity but is having
> unfortunate consequences for progress in corpus linguistics. There is
> now an attitude abroad that anyone who produces corpus research
> resources has not finished his job unless he also produces
> purpose-built software for extracting information from the resource.
> Since any such software necessarily will anticipate only some possible
> questions that users might want to ask, and will fail to provide for
> answering other kinds of question, this channels research into a
> limited range of "obvious" directions and discourages originality. My
> policy with SUSANNE and subsequent resources that I have been
> responsible for creating has been to give the files an extremely
> simple and well-documented structure, so that it is as easy as
> possible for researchers to write programs to extract whatever type of
> information they want. But I have become used to people asking "Where
> is the software to go with SUSANNE?", like American tourists in a
> lovely old hotel gazing upstairs to the first floor and forlornly
> asking "Where's the lift?"
>
>
> Geoffrey Sampson
>
> School of Cognitive & Computing Sciences
> University of Sussex
> Falmer, Brighton BN1 9QH, GB
>
> e-mail geoffs@cogs.susx.ac.uk
> tel. +44 1273 678525
> fax +44 1273 671320
> Web site http://www.grs.u-net.com