[Corpora-List] Call for Participation: Coling Workshop on Arabic Script Languages [Correction]

From: Karine Megerdoomian (karinem@inxight.com)
Date: Wed Jul 14 2004 - 19:41:07 MET DST

    Note the corrected workshop date. We apologize for any duplicate postings.

                        ** Call for Participation **

                    COLING 2004 WORKSHOP ON
    COMPUTATIONAL APPROACHES TO ARABIC SCRIPT-BASED LANGUAGES

                     Geneva, Switzerland, 28 August 2004
                 Invited Speaker: Martin Kay (Stanford University)
                  http://members.cox.net/karinem/COLING2004

    WORKSHOP THEME

    Recently, there has been a surge of interest in the study of the languages of the Middle East, especially Arabic, Persian (Farsi), Pashto, Kurdish and Urdu. The use of the Arabic script gives rise to certain issues that are common to all these languages, even though they belong to distinct language families. These languages therefore share properties such as the absence of capitalization, right-to-left direction, lack of clear word boundaries, complex word structure, a high degree of ambiguity due to the non-representation of short vowels in the writing system, and related encoding issues. Yet research on these languages has rarely been brought together in a single forum, and most development has been the result of initiatives by individual research establishments or industry firms.

    The goal of this workshop is to provide a forum for those involved in the development of NLP systems for Arabic script languages to exchange ideas, approaches and implementations of computational systems; to discuss the common challenges faced by all practitioners; and to assess the state of the art in the field. The workshop also aims to identify promising areas for future collaborative research in the development of NLP systems for Arabic script languages.

    WORKSHOP PROGRAM

    I. Opening and Overview
    8:30-9:00 Computer Processing of Arabic Script-based Languages: Current State and Future Directions - Ali Farghaly

    II. Session 1: Lexicon and Corpora
    9:00-9:30 Developing an Arabic Treebank: Methods, Guidelines, Procedures, and Tools - Mohamed Maamouri and Ann Bies
    9:30-10:00 Preliminary Lexical Framework for English-Arabic Semantic Resource Construction - Anne R. Diekema
    10:00-10:30 The Architecture of a Standard Arabic Lexical Database: Some Figures, Ratios, and Categories from the DIINAR.1 Source Program - Ramzi Abbès, Joseph Dichy and Mohamed Hassoun

    10:30-10:45 Break

    III. Session 2: Morphology
    10:45-11:15 Systematic Verb Stem Generation for Arabic - Jim Yaghi and Sane Yagi
    11:15-11:45 Issues in Arabic Orthography and Morphology Analysis - Tim Buckwalter
    11:45-12:15 Finite-State Morphological Analysis of Persian - Karine Megerdoomian

    12:15-2:00 Lunch & Demo Sessions

    IV. Demonstrations
    Urdu Localization Project - Sarmad Hussain
    FarsiSum: A Persian Text Summarizer - Martin Hassel and Nima Mazdak
    Stemming the Qur'an - Naglaa Thabet
    Language Weaver Arabic->English MT - Daniel Marcu, Alex Fraser, William Wong and Kevin Knight

    V. Invited Speaker
    2:00-2:45 Arabic Script-Based Languages Deserve to be Studied Linguistically - Martin Kay

    VI. Session 3: Statistical Approaches
    2:45-3:15 An Unsupervised Approach for Bootstrapping Arabic Sense Tagging - Mona T. Diab
    3:15-3:45 Automatic Arabic Document Categorization Based on the Naive Bayes Algorithm - Mohamed El Kourdi, Amine Bensaid and Tajje-eddine Rachidi

    3:45-4:00 Break

    VII. Session 4: Speech Processing
    4:00-4:30 A Transcription Scheme for Languages Employing the Arabic Script Motivated by Speech Processing Applications - Shadi Ganjavi, Panayiotis G. Georgiou and Shrikanth Narayanan
    4:30-5:00 Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition - Dimitra Vergyri and Katrin Kirchhoff
    5:00-5:30 Letter-to-Sound Conversion for Urdu Text-to-Speech System - Sarmad Hussain

    VIII. Discussion and Closing
    5:30-6:00 Ali Farghaly and Karine Megerdoomian

    Accepted papers and formal demonstrations will be published in a proceedings volume, which will be made available at the workshop.

    WORKSHOP REGISTRATION

    For the workshop to take place, the COLING 2004 organizers require at least 20 participants to register for it. Speakers and participants are therefore asked to register as soon as possible via the official COLING 2004 website at http://www.issco.unige.ch/coling2004/.

    Workshop fees (in Swiss Francs):
    * Student early: CHF 90
    * Student late: CHF 120
    * Student on-site: CHF 150
    * Regular early: CHF 120
    * Regular late: CHF 150
    * Regular on-site: CHF 180

    ORGANIZING COMMITTEE
     
    Ali Farghaly (SYSTRAN Software, Inc.)
    Karine Megerdoomian (Inxight Software and University of California, San Diego)

    PROGRAM COMMITTEE

    Jan W. Amtrup (Bowne Global Solutions)
    Tim Buckwalter (Linguistic Data Consortium)
    Miriam Butt (Konstanz University, Germany)
    Violetta Cavalli-Sforza (Carnegie Mellon University)
    Joseph Dichy (Lyon University)
    Abdelkadir Fassi Fehri (Mohammed V University-Souissi Rabat, Morocco)
    Andrew Freeman (University of Washington)
    Nizar Habash (University of Maryland, College Park)
    Masayo Iida (Inxight Software, Inc.)
    Simin Karimi (University of Arizona)
    Martin Kay (Stanford University)
    Kevin Knight (USC/Information Sciences Institute)
    Farhad Oroumchian (University of Wollongong in Dubai)
    Ahmed Rafea (The American University in Cairo)
    Jean Senellart (SYSTRAN Software)
    Bonnie Glover Stalls (University of Southern California)
    Rémi Zajac (SYSTRAN Software)


