RE: [Corpora-List] Legal aspects of compiling corpora

From: Sampo Nevalainen (samponev@cc.joensuu.fi)
Date: Thu Jun 19 2003 - 10:26:31 MET DST

Next message: Doug Cooper: "Re: [Corpora-List] Legal aspects of compiling corpora"

Previous message: Lexikon International: "RE: [Corpora-List] Subcat Questions"
In reply to: Khalid CHOUKRI: "RE: [Corpora-List] Legal aspects of compiling corpora"
Next in thread: Khalid CHOUKRI: "RE: [Corpora-List] Legal aspects of compiling corpora"
Next in thread: Mark Sanderson: "Re: [Corpora-List] Legal aspects of compiling corpora"
Reply: Khalid CHOUKRI: "RE: [Corpora-List] Legal aspects of compiling corpora"

Hi,

>then we will face another problem of comparing approaches and techniques,
>if each of us use different corpora (without any possibility to share it
>with others because of the legal aspects) then no comparison will be possible.

My comment is clearly off topic, but I could not resist... This is one
thing I have never fully understood since I was irrevocably taken with
CL. Many textbooks on CL give the idea that a corpus should have a finite
size and be "a standard reference" (as McEnery and Wilson put it in "Corpus
Linguistics" 1996). In my humble opinion, this is rather unnatural, as,
after all, we are studying an open, ever-growing, dynamic, lively organism
(unless we are interested in "dead" languages). From this viewpoint, if we
are going to generalize anything about a language, at least I would have
more confidence in results that are based on several different corpora
rather than on a detailed description of a certain corpus. It is just as
with weather forecasts or climate studies -- the more measurement points
are available, the more reliable they are. (Clearly, one practical solution
is a kind of "monitor corpus" -- or the Internet. I understand that how
crucial this question is depends a lot on the purpose(s) of the corpus and
the aim(s) of the researcher, which, I think, should converge to some
extent.) Of course, the other side of the coin is economy. It would be a
huge waste of money and resources if everybody had to compile corpora of
their own - and preferably non-stop!

sincerely
Sampo

