Corpora: IREX (Japanese IR & IE contest)

Satoshi Sekine (sekine@nonki.cs.nyu.edu)
Thu, 30 Jul 98 12:53:10 EDT

This is a call for participation in IREX (an IR & IE contest for the Japanese language).
We apologise for the duplicate posting of this announcement.

Satoshi Sekine
==================

IREX
Information Retrieval and Extraction Exercise
http://cs.nyu.edu/cs/projects/proteus/irex/index-e.html

(IR & IE Contest for Japanese language)

------------------------------------------------------------
Introduction

An IR (Information Retrieval) and IE (Information Extraction)
contest for the Japanese language is being planned. The first
contest will be run in a "grass roots" style; international participants
are welcome, provided that they share the following objectives:

* To contribute to the improvement of the field
* To widen the research area and circle of researchers
* To increase the amount of available corpora and databases
* To promote this kind of effort in the future

There is no participation fee, but you must buy a newspaper
corpus, "Mainichi Shinbun (94,95)", which is sold by the company
"Nichigai-associates" (about $180 for each year).
Also, most of the information (definitions, etc.) will be distributed
in Japanese; please don't expect us to translate everything into
other languages. (Sekine (sekine@cs.nyu.edu) will privately
assist you in some cases.)

------------------------------------------------------------
Tasks

Two tasks are planned.
Anyone can participate in one or both tasks.

* NE: Named Entity Task

Basically, it is similar to the MUC-NE or MET task. There are
minor differences, such as the addition of "artifact", which includes
product names, names of services, etc. Also, there will be two kinds of
test: one on general-domain texts and the other on
specific-domain texts. The specific domain will be announced about two
weeks before the test period.

* IR: Information Retrieval

Basically, it is similar to the TREC ad hoc task. The goal is to
retrieve about 300 relevant documents from two years of newspaper
articles.

These tasks are designed for technology evaluation rather than for
commercial purposes. For example, many people have pointed out that
the interface is an important issue in IR; however, such issues
should be addressed in a future IREX.

------------------------------------------------------------
Schedule

* June 30, 1998 : Distribute draft version of definitions, sample data
* July 31, 1998 : Preliminary application due (this is not a hard deadline)
* September 16, 1998 : Close the discussion for the definitions
* October - November, 1998 : Dry-run
* February 28, 1999 : Application due

== Evaluation ==
* April 5, 1999 : Distribute IR queries
* April 12, 1999 : IR results due (JST 23:59)
* April 13, 1999 : Distribute NE tasks
* April 16, 1999 : NE results due (JST 23:59)

* September, 1999 : Workshop (planned)

------------------------------------------------------------
Registration

You can find the application form on the Japanese homepage.
If you can't read Japanese, please send email to sekine@cs.nyu.edu
and isahara@crl.go.jp.

------------------------------------------------------------
More Information (in Japanese - EUC)

http://cs.nyu.edu/cs/projects/proteus/irex/

------------------------------------------------------------
Organization

Organized by : IREX Committee
Mailing list : irex@karc.crl.go.jp
Co-chairs : S.Sekine (NYU), H.Isahara (CRL)
Advisors : M.Nagao (Kyoto-U), H.Tanaka (TITech), R.Grishman (NYU),
T.Ishikawa (ULIS)
Committee Members : T.Tokunaga (TITech), S.Kurohashi (Kyoto-U),
M.Okumura (JAIST), C.Nobata (U-Tokyo), K.Kita (Tokushima-U),
K.Inui (KIT), Y.Nakagawa (YNU), A.Fujii (ULIS), T.Wakao (TAO),
N.Kando (NACSIS), K.Hashida (ETL), E.Sumita (ATR), M.Murata,
K.Uchimoto (CRL), N.Noguchi (Matsushita), A.Okumura,
S.Fukushima (NEC), Y.Ogawa (RICOH), T.Sakai (Toshiba),
J.Fukumoto (Oki), T.Kitani, Y.Eriguchi (NTT Data),
S.Nakawatase (NTT), J.Tomiura (Mitsubishi),
R.Ochitani (Fujitsu), S.Ogino (IBM)