Re: inquiry

toshiko Yamaguchi (tyamag@essex.ac.uk)
Tue, 14 Mar 1995 23:42:13 GMT

On Tue, 14 Mar 1995 09:54:33 +0900 Toshiyuki TAKEZAWA wrote:

> From: Toshiyuki TAKEZAWA <takezawa@jp.co.atr.itl>
> Date: Tue, 14 Mar 1995 09:54:33 +0900
> Subject: Re: inquiry
> To: toshiko Yamaguchi <tyamag@uk.ac.essex>
> Cc: corpora@no.uib.hd, takezawa@jp.co.atr.itl
>
> Dear Ms. Yamaguchi,
>
> > I am studying at the University of Essex in England and intend to deal
> > with morphological processing in Japanese as the topic of my PhD
> > dissertation. As a lead-in to this subject I need to get some frequency
> > data of Japanese words (especially inflectional words). After having had
> > contact with R. Harald Baayen at the Max Planck Institute for
> > Psycholinguistics in Nijmegen in Holland, I received your e-mail address.
> > He wrote that you may probably be able to provide me with some convenient
> > Japanese data from your corpora list. I would be very grateful if you
> > could give me more detailed information. Thank you very much in advance.
>
> Here is a list of available Japanese text corpora.
>
> (1) spoken Japanese
>
> ATR corpus contains conversations between Japanese speakers through
> telephone and/or keyboard communications. All conversations are
> transcribed. Morphological and syntactical tags are given.
> Corresponding English is given. About half million words are
> available. The contact address of a distribution coordinator is as
> follows.
>
> ATR (Advanced Telecommunications Research Institute) International
> Research Engineering Department
> Mr. Shohei TAHARA
> 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-02, Japan
> Telephone: +81 774 95 1192
> Facsimile: +81 774 95 1179
> Email: sho@ctr.atr.co.jp
>
> (2) written Japanese
>
> EDR corpus is available. 28 million sentences are collected from
> newspapers, magazines and so on. Morphological and syntactical tags
> are given to about half million sentences.
>
> EDR (Japan Electronic Dictionary Research Institute, Ltd.)
> Telephone: +81 3 3798 5521
> Facsimile: +81 3 3798 5335
>
> I hope that this information is of help to you. Thank you.
> --
> //Toshiyuki TAKEZAWA <takezawa@itl.atr.co.jp>
> ATR Interpreting Telecommunications Research Laboratories
> Kyoto, Japan

Colchester, 14/March/95
Dear Mr TAKEZAWA
Thank you very much for your mail. I am very much interested in both of the
databases you referred to. I have already sent a mail to Mr Shohei TAHARA
concerning the spoken Japanese corpus. With respect to written Japanese I have
the following inquiries:
(1) Have you got frequency data (type and token) for inflected words such as
verbs, auxiliaries, adjectives, keiyodoshi and so on? (If they are ordered
according to the number of affixes, this would be very helpful for me.)
(2) Is it possible to receive the database on CD-ROM?
(3) Is it possible to receive a set of samples before I decide?
(4) How much does it cost?
I am looking forward to hearing from you soon.
Thank you very much in advance.
With best regards
Toshiko YAMAGUCHI
e-mail address: tyamag@essex.ac.uk
address: GPH Dep. of LANG and LING
University of Essex
Wivenhoe Park, Colchester
CO4 3SQ, England