Re: [Corpora-List] I need texts in Tagalog, Indonesian, etc in electronic form

From: Mike Maxwell (maxwell@ldc.upenn.edu)
Date: Fri May 23 2003 - 17:48:10 MET DST

    Yuri Tambovtsev wrote:
    > Dear colleagues, could you be so kind as to send me some Web sites
    > with the texts in more seldom languages? Do you know any email
    > address in Vatikan to ask for the Bible text in some exotic language
    > like Hakas, Tatar, Turkish, Choockchee, Koriak, Itelmen (Kamchadal),
    > Hawaiian, Phillippino (Tagalog), Swahili, Ainu, Indonesian or
    > Tibetan, etc, etc in the electronic form? How is it possible to get
    > it? Looking forward to hearing from you to yutamb@hotmail.com Remain
    > yours most cordially Yuri Tambovtsev

    The Vatican is probably not the most likely place to look for Bibles.
    Most Bible translation has been done by Protestant organizations, at
    least since the Reformation. There are quite a few web sites that
    contain lists of Bibles in web-accessible form. Try

        http://bible.gospelcom.net/languages/
        http://directory.google.com/Top/Society/Religion_and_Spirituality/Christianity/Bible/Various_Languages/
        http://www.seekgod.org/bible/links.html#Multiple%20Language%20OnLine%20Bibles
        http://dmoz.org/Society/Religion_and_Spirituality/Christianity/Bible/Various_Languages/
        http://bible.com/bible_read.html
        http://www.acm.ndsu.nodak.edu/NDSU_Christian/tracts/stl/trkjv.htm
        http://scriptureresources.com/downloads.asp (Guatemalan languages)
        http://www.htmlbible.com
        http://benjamin.umd.edu/parallel/

    SIL (www.sil.org) has done a lot of translation work in minority
    languages, but their translations are generally not accessible on-line.

    Also, if you're interested in particular languages, you can either
    search for "Bible Tatar", or plug a few well-chosen words from your
    target language into a search engine. We've had very good luck using
    that technique to unearth all kinds of texts in a variety of languages.
    Of course you'll have a language ID problem, if you don't know the
    language you're searching for.
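
    A minimal sketch of the word-list idea in Python, for anyone who wants
    to try it. The same handful of common words can serve both as a search
    query and as a crude language ID check on whatever the search returns.
    The seed lists, function names, and sample sentence are illustrative
    assumptions, not a description of any particular tool, and a serious
    language identifier would need far more than this.

```python
import re

# Hypothetical seed lists: a few very common function words per language.
SEED_WORDS = {
    "tagalog": ["ang", "ng", "mga", "sa", "at", "na", "ay"],
    "indonesian": ["yang", "dan", "di", "dengan", "untuk", "ini", "itu"],
}


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_query(words):
    """Join seed words into a plain search-engine query string."""
    return " ".join(words)


def overlap_score(text, seed_words):
    """Fraction of tokens in `text` that appear in `seed_words`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    seeds = {w.lower() for w in seed_words}
    return sum(1 for t in tokens if t in seeds) / len(tokens)


def guess_language(text, seed_lists=SEED_WORDS):
    """Return (best_language, scores) by seed-word overlap."""
    scores = {lang: overlap_score(text, words)
              for lang, words in seed_lists.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    print(build_query(SEED_WORDS["tagalog"]))
    print(guess_language("Ang mga bata ay naglalaro sa bahay"))
```

    Very short seed words will also turn up in unrelated languages, so
    a low score on every list is worth treating as "unknown" rather than
    trusting the best match.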

    Be aware that some of the on-line translations are older versions; newer
    translations are likely to be copyrighted and not available on the web.
    Some day...

    Also, we have occasionally found translations which are distributed as
    PDF files from which one cannot extract the text; Persian/Farsi is one
    example. (Of course one can extract text from many PDF files; I just
    mean that some PDF files are essentially images of the page. And text
    extracted from "ordinary" PDF files is often in what amounts to an
    unknown encoding.)
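
    One rough way to triage text that has already been pulled out of a PDF
    by some other tool is to check how much of it is in the script you
    expect. The sketch below uses only Python's standard `unicodedata`
    module. The function names, the thresholds, and the three labels are
    assumptions made for illustration. It does not read PDF files itself,
    and it cannot recover text from a page image or decode a custom font
    encoding.

```python
import unicodedata


def script_ratio(text, script_name):
    """Fraction of letters whose Unicode name starts with `script_name`
    (for example "ARABIC" for Persian, "CYRILLIC" for Tatar)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(1 for ch in letters
               if unicodedata.name(ch, "").startswith(script_name))
    return hits / len(letters)


def classify_extraction(text, script_name, min_letters=20, threshold=0.8):
    """Label extracted text with a coarse guess about its usability.

    min_letters and threshold are arbitrary illustrative defaults.
    """
    letter_count = sum(1 for ch in text if ch.isalpha())
    if letter_count < min_letters:
        # Almost nothing came out: the page may be a scanned image.
        return "probably image-only"
    if script_ratio(text, script_name) < threshold:
        # Plenty of letters, but in the wrong script: the font may map
        # glyphs to unrelated code points.
        return "suspect encoding"
    return "looks usable"
```

    For a Persian PDF one would call `classify_extraction(text, "ARABIC")`
    on each page's extracted text and look by hand at anything not labeled
    "looks usable".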

         Mike Maxwell
         Linguistic Data Consortium
         maxwell@ldc.upenn.edu


