Re: Corpora: Corpora of scientific texts

Tony Berber Sardinha (tony4@uol.com.br)
Wed, 21 Oct 1998 10:44:12 -0200

Hi,

I compiled a corpus of research articles for my PhD project. It consists of
100 full texts in English, about 300,000 words in total. The purpose was to
test a discourse segmentation procedure. Details of the whole corpus (which
included other text types) and the results are described in the thesis,
which is available in PostScript (and soon in Adobe Acrobat as well) at

http://homepages.infoseek.com/~corpuslinguistics/homepage.html

Please look under 'thesis'.

tony.
----------------------------------------------------------------------------
Dr Tony Berber Sardinha
Catholic University of Sao Paulo, Brazil
tony4@uol.com.br
http://sites.uol.com.br/tony4/homepage.html
http://homepages.infoseek.com/~corpuslinguistics/homepage.html
----------------------------------------------------------------------------

----------
> From: Chris Allen <Chris.Allen@ih.hh.se>
> To: corpora@hd.uib.no
> Subject: Corpora: Corpora of scientific texts
> Date: 21 October 1998 03:40
>
> I was wondering whether anyone in this list has any information about the
> creation of a corpus of scholarly scientific articles written in English. I
> am well aware that the Cobuild Bank of English includes a science subcorpus
> but this is drawn from the New Scientist magazine. The texts are what might
> loosely be termed 'popular science'.
>
> What I am after is a corpus of journal papers in the physical, chemical,
> biological or medical sciences. I'd be most grateful to hear from anyone
> with information about such a corpus.
>
> Best wishes,
>
> Chris Allen
> University of Halmstad
> Sweden
>
> direct tel. +46 35 167372 (office)
> +46 35 51527 (home)
> fax +46 35 129289
> http://www.hh.se
>