[Corpora-List] re: pronunciation (caveat)

From: Damon Allen Davison (linguist@socal.rr.com)
Date: Wed Jul 24 2002 - 18:08:03 MET DST


    A caveat to all about relying too much on Google (and other search
    engines) for corpus research:

    Although Google lets you restrict a search by page language, it
    determines that language from the ISO language tags in the HTML source.
    Many people who run their own web sites use software that inserts an
    English-language tag into the source by default, whatever language the
    page is actually written in. A hit for a spelling that also happens to
    be a word in another language may therefore come from a page written in
    that other language, despite what the search engine claims.
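
    To illustrate the point, here is a minimal Python sketch that reads the
    language a page declares for itself. It assumes the declaration sits in
    the lang attribute of the html element or in a Content-Language meta
    tag; the function name declared_language and the sample page are made
    up for this example, and nothing here shows how any search engine
    actually works.

```python
from html.parser import HTMLParser


class DeclaredLanguageParser(HTMLParser):
    """Collect the language declarations found in a page's markup."""

    def __init__(self):
        super().__init__()
        self.html_lang = None  # value of <html lang="...">
        self.meta_lang = None  # value of <meta http-equiv="Content-Language" content="...">

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "html" and self.html_lang is None:
            self.html_lang = attrs.get("lang")
        elif tag == "meta" and self.meta_lang is None:
            if (attrs.get("http-equiv") or "").lower() == "content-language":
                self.meta_lang = attrs.get("content")


def declared_language(html_text):
    """Return the declared language tag, or None if the page declares none."""
    parser = DeclaredLanguageParser()
    parser.feed(html_text)
    return parser.html_lang or parser.meta_lang


# Hypothetical page: the markup says English, the text is German.
# "Kind", "Hand" and "Sand" are spelled the same in both languages.
page = """<html lang="en">
<head><meta http-equiv="Content-Language" content="en"></head>
<body><p>Das Kind hat die Hand in den Sand gesteckt.</p></body>
</html>"""

print(declared_language(page))  # prints: en
```

    The page declares "en" although its text is German, so a tool that
    trusts the declaration would count "Kind", "Hand" and "Sand" as English
    hits.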


