[Corpora-List] TASA corpus

From: Rykov V.V. (Moscow) (rykov@narod.ru)
Date: Fri Apr 30 2004 - 14:23:24 MET DST

Previous message: Gilles Serasset: "[Corpora-List] MLR2004: post COLING 2004 workshop on Multilingual Linguistic Ressources"
In reply to: Ute Römer: "RE: [Corpora-List] Learner Corpora"

Dear list members, does anybody know about the so-called "TASA corpus"?

It contains 10 million words of UNMARKED high-school level English text on
Language arts, Health, Home economics, Industrial arts, Science, Social studies, and Business.

It is divided into 37,600 text samples, or contexts, or "documents"
(an average of 166 words per document).

If the corpus is commercial, who owns it and what are the terms for obtaining it?

The references I know of:

http://www.rni.org/kanerva/cogsci2k-poster.txt
http://lsa.colorado.edu/spaces.html

--
Regards,
Vladimir Rykov

PhD in Computational Linguistics
Personal web-site: rykov.narod.ru
mailto: rykov2000@mail.ru
Si etiam omnes - ego non
English version: www.blkbox.com/~gigawatt/rykov.html



This archive was generated by hypermail 2b29 : Fri Apr 30 2004 - 14:46:17 MET DST