Corpora: Automatic Language Detection (Web Documents)

From: Arno Scharl (scharl@wu-wien.ac.at)
Date: Mon Aug 27 2001 - 00:42:48 MET DST

    Dear CORPORA subscribers,

    In order to extend the functionality of a prototype that analyzes the textual
    content of Web-based information systems (see the preceding publication alert
    on "Evolutionary Web Development"), we are currently working on a component
    for automatic language detection. We would therefore be interested in:

    (a) general papers or books on automatic language detection (based on
    words, n-grams, etc.).
    (b) lists of the most common or typical words in certain languages.

    Please reply to me personally and I'll post a summary of the responses to
    the list.

    Thank you
    & best regards,
    ~ Arno Scharl

    ------------------------------------------------------------------------------
    DDr. Arno Scharl, Associate Professor
    Information Systems Department
    Vienna University of Economics & Business Administration
    Augasse 2-6, A-1090 Vienna, Austria
    email: scharl@wu-wien.ac.at
    tel: ++(43) 1-31336-4444; fax: ++(43) 1-31336-746
    ------------------------------------------------------------------------------
    (c) 2000 Springer London:
    EVOLUTIONARY WEB DEVELOPMENT
    http://webdev.wu-wien.ac.at/
    ------------------------------------------------------------------------------


