Re: [Corpora-List] sorting OHG (non-ASCII) in PERL

From: Lars Nygaard (lars.nygaard@ilf.uio.no)
Date: Tue Feb 04 2003 - 16:28:57 MET


    Hi,

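    This can be done with the "locale" pragma: under "use locale", sort
    and the string comparison operators (cmp, lt, gt) follow the collation
    order of the current LC_COLLATE locale. It's all in the "perllocale"
    manpage.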
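
    A minimal sketch of what that looks like, in case it helps. The
    locale names tried below are assumptions (they differ between
    systems, see "locale -a"), locale_sort is just an illustrative
    helper name, and there is no ready-made OHG locale as far as I
    know, so this only gets you the ordering of whatever locale you
    pick:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);
use locale;    # from here on, cmp/lt/gt and a plain sort honour LC_COLLATE

# Assumption: one of these 8-bit German locales is installed.
# setlocale() returns undef when a name is unknown to the system.
my $active;
for my $name ('de_DE.ISO-8859-1', 'de_DE.iso88591', 'de_DE') {
    $active = setlocale(LC_COLLATE, $name);
    last if defined $active;
}
warn "no candidate locale found, keeping the default collation\n"
    unless defined $active;

# Hypothetical helper: sort a list with the collation rules now in effect.
sub locale_sort {
    my @words  = @_;
    my @sorted = sort @words;    # no comparison block, so sort uses cmp
    return @sorted;
}

# "\xE4" is a-umlaut in Latin-1; where it lands depends on the locale.
print "$_\n" for locale_sort('zebra', 'apfel', "\xE4rger", 'brot');
```

    If no locale is found the script still runs, but the result is the
    plain byte order you were trying to get away from.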

    Regards,
    lars nygaard

    At 15:56 04.02.2003 +0100, you wrote:
    >Hi,
    >
    >stupid question but perhaps the freaks can help me:
    >
    >we're building a database of Old High German words. Obviously, there are
    >some characters that are not in ASCII (diacritics like stress marks ' and
    >carets ^) and chars that do not follow the 'normal' sorting order (like
    >'uu' for 'w'). One possibility would be to recode these chars (e.g. get
    >rid of the diacritics for sorting and put them back on in the output),
    >but is there a more elegant and general way (e.g. in case one would like
    >to have a long 'e' after the short 'e' etc.) so that one could use it for
    >other scripts as well (UTF puts chars in an order that does not
    >necessarily reflect the 'intuitive' sequence in a language)? Is there a
    >module to tell Perl which sorting sequence one would like to use, or do I
    >have to program it myself?
    >
    >Thanx for any hints.
    >
    >Henning Reetz
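
    For the recoding approach you describe, here is a rough sketch of
    how one might program it oneself. Everything in it is an assumption
    for illustration: ohg_key and ohg_sort are made-up names, and the
    rules (strip diacritics, treat "uu" as "w", break ties so the
    unmarked vowel comes first) are only one guess at what an OHG order
    could look like. The originals are never modified, so nothing has to
    be "put back" for output:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                          # this source contains literal non-ASCII
use Unicode::Normalize qw(NFD);    # core module since Perl 5.8

# Hypothetical primary key: illustrative rules, not an established collation.
sub ohg_key {
    my ($word) = @_;
    my $key = NFD(lc $word);       # split e.g. "ê" into "e" + combining mark
    $key =~ s/\p{Mn}+//g;          # drop combining marks for the primary key
    $key =~ s/uu/w/g;              # treat the digraph "uu" as "w"
    return $key;
}

sub ohg_sort {
    my @words = @_;
    # Schwartzian transform: compute the keys once, sort, then unwrap.
    # Ties on the stripped key fall back to the decomposed form, so a
    # plain "e" sorts before "é" or "ê".
    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] cmp $b->[1] or $a->[2] cmp $b->[2] }
        map  { [ $_, ohg_key($_), NFD(lc $_) ] } @words;
    return @sorted;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for ohg_sort(qw(uuort zît wîn uuîn zit));
```

    Changing the order then means changing ohg_key only, which is also
    how the same skeleton could be reused for other scripts.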

    ________________________________________________
    larsnyg @ glossa.uio.no 22 84 40 42 (jobb)
    http://folk.uio.no/larsnyg 90 63 23 19 (mobil)


