Corpora: LREC WORKSHOP ANNOUNCEMENT

Simone Saint Laurent (lrec@ilc.pi.cnr.it)
Tue, 17 Mar 1998 16:53:25 +0100

ADAPTING LEXICAL AND CORPUS RESOURCES TO SUBLANGUAGES AND APPLICATIONS

a workshop to be held at the

FIRST INTERNATIONAL CONFERENCE
ON LANGUAGE RESOURCES AND EVALUATION

GRANADA, SPAIN, 26 MAY 1998

The workshop will provide a forum for those researchers involved in
the development of methods to integrate corpora and MRDs, with the aim of
adding adaptive capabilities to existing linguistic resources.

Organisers: Roberto Basili (University of Roma "Tor Vergata"),
Roberta Catizone (University of Sheffield),
Maria Teresa Pazienza (University of Roma "Tor Vergata"),
Paola Velardi (University of Roma "La Sapienza"),
Yorick Wilks (University of Sheffield)

WORKSHOP SCOPE AND AIMS

Lexicons, i.e., those components of an NLP system that contain "computable"
information about words, cannot be considered as static objects. Words may
behave very differently in different domains, and there are language
phenomena that do not generalize across sublanguages.
Lexicons are a snapshot of a given stage of development of a language,
normally provided without support for adaptation to change, whether caused
by language creativity and development or by a shift to a previously
unencountered domain.

The divergence of corpus usages from lexical norms has been studied
computationally at least since the late Sixties, but only recently
has the availability of large on-line corpora made it possible to establish
methods to cope systematically with this problem.
An emerging branch of research is now engaged in studies and experiments
on corpus-driven linguistics, with the aim of complementing and
extending earlier work on lexicon acquisition based on Machine Readable
Dictionaries (MRDs): data are extracted from texts, as embodiments of
language in use, so as to capture lexical regularities and to code them
into operational forms. The purpose of this workshop is to provide an
updated snapshot of current work in the area and to promote discussion of
how to make progress.

Central topics will be (though this list is in no way exclusive):

* corpus-driven tuning of MRDs to optimize domain-specific inferences,
* terminology and jargon acquisition,
* sense extensions,
* acquisition of preference or subcategorization information from corpora,
* taxonomy adaptation,
* statistical weighting of senses, etc., with respect to domains,
* use of MRDs to provide explanations of linguistic phenomena in corpora,
* the scope of "lexical tuning",
* the evaluation of lexical tuning as a separate task, or as part
  of a more generic task.

INDUSTRIAL PANEL ***NEW***

Automatic adaptation of lexicons to new domains through the use of application
corpora makes NLP applications more adaptable and portable.
The Program Committee is organizing a joint panel to discuss this and other
issues concerning next-generation Information Extraction (IE) systems.
The panel intends to bring in industrial representatives to compare, from
their viewpoint, expectations of IE with the degree of maturity of what is
currently on offer.
The following (and other) issues will be discussed:
- Is there a market for IE?
- What is the demand in domains such as new services for citizens,
  telecommunications, management support, etc.?
- What are the technical requirements?
- Is the technology near to the market?

PROGRAM COMMITTEE

Yorick Wilks            University of Sheffield
Roberta Catizone        University of Sheffield
Paola Velardi           University of Roma "La Sapienza"
Maria Teresa Pazienza   University of Roma "Tor Vergata"
Roberto Basili          University of Roma "Tor Vergata"
Bran Boguraev           Brandeis University
Sergei Nirenburg        New Mexico State University
James Pustejovsky       Brandeis University
Ralph Grishman          New York University
Christiane Fellbaum     Princeton University

PAPER SUBMISSION

FORMATTING GUIDELINES:
Papers should not exceed 4000 words or 10 pages.

HARD COPIES:

Three hard copies should be sent to:

Paola Velardi
Dipartimento di Scienza dell'Informazione
via Salaria 113
00198 Roma
Italy

ELECTRONIC SUBMISSION:

Electronic submissions will be accepted in PostScript, Word for Mac, or RTF.
An FTP site will be made available on request.
Authors should send an info email to Paola Velardi
(velardi@dsi.uniroma1.it) even

IMPORTANT DATES *****(PLEASE NOTE EXTENDED DEADLINE)******

Paper Submission Deadline (Hard Copy/Electronic)   March 10
Paper Notification                                 April 1
Camera-Ready Papers Due                            April 20
L&CT workshop                                      May 26

Prof. Paola Velardi
Dipartimento di Scienza dell'Informazione
via Salaria 113
Universita' "La Sapienza"
00198 Roma
ph. +39-(0)6-49918356
fax +39-(0)6-8541842 8841964