Re: [Corpora-List] Portuguese thesaurus/dictionary

From: Cédrick Fairon (fairon@tedm.ucl.ac.be)
Date: Wed Mar 17 2004 - 21:41:53 MET

    Hello,
    Have you tried the LabEL resources? See http://label.ist.utl.pt/ (click on
    "Recursos Publicos", i.e. "Public Resources").

    LabEL makes its data available for research purposes:

    Simple-word dictionaries
    - Dictionary of canonical forms (about 120,000 entries)
    - Dictionary of inflected forms (about one million)
    - Dictionary of initialisms and acronyms (about 4,000)

    Compound-word dictionaries
    - Dictionary of compound nouns (sample of 10,000 entries)
    - Dictionary of compound adverbs (sample of 1,000 entries)
    - Dictionary of compound prepositions
    - Dictionary of compound conjunctions
     
    Their full resources are also available in Unitex, an open-source corpus
    processor: http://www-igm.univ-mlv.fr/~unitex/. In that case, however, you do
    not have direct access to the raw dictionaries (they are compressed).
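
    Since the goal is a relational database, here is a minimal sketch of how
    raw dictionary lines could be loaded into SQLite. It assumes the
    downloadable files use a DELAF-style text layout, one entry per line, of
    the form inflected,lemma.CODE:features; please check this against the
    actual files. The table name, column names, helper functions and sample
    lines are all made up for illustration.

```python
import re
import sqlite3

# Assumed DELAF-style line: inflected,lemma.CODE[:features]
# An empty lemma field is taken to mean "same as the inflected form".
# Escaped commas or dots inside a form (e.g. "\,") are NOT handled here.
ENTRY = re.compile(
    r"^(?P<form>[^,]+),(?P<lemma>[^.]*)\.(?P<code>[^:]+)(?::(?P<feats>.*))?$"
)


def parse_entry(line):
    """Return (form, lemma, code, features), or None if the line does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    form = m.group("form")
    lemma = m.group("lemma") or form
    return form, lemma, m.group("code"), m.group("feats") or ""


def load_entries(conn, lines):
    """Create the (hypothetical) entry table and insert every parsable line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry ("
        "form TEXT NOT NULL, lemma TEXT NOT NULL, "
        "code TEXT NOT NULL, features TEXT NOT NULL)"
    )
    rows = [r for r in map(parse_entry, lines) if r is not None]
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Made-up sample lines, for illustration only.
SAMPLE = ["casas,casa.N:fp", "casa,.N:fs", "cantou,cantar.V:J3s", "# not an entry"]

conn = sqlite3.connect(":memory:")
n = load_entries(conn, SAMPLE)
forms = [
    row[0]
    for row in conn.execute(
        "SELECT form FROM entry WHERE lemma = ? ORDER BY form", ("casa",)
    )
]
print(n, forms)
```

    With a real file, the lines would come from open(path, encoding=...)
    rather than from the SAMPLE list; the encoding of the distributed files
    is something to verify before loading.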

    Best,

    Cedrick

    On Wednesday 17 March 2004 at 15:02, Mark Davies wrote:
    > I'm looking for a thesaurus (and perhaps also a dictionary) of Portuguese
    > in machine-readable form. In other words, I don't want an off-the-shelf
    > Portuguese electronic dictionary with which I have to use the regular user
    > interface. Rather, I need one where I can access the raw data directly. I
    > know that I can access this type of information via web-based dictionaries,
    > but it would be easier to do it with a local resource on my own machine.
    >
    > Eventually, the thesaurus (and perhaps dictionary as well) will be
    > converted to relational database form, so the closer it is to that form
    > already, the better. Also, I'm willing to pay for the resource, though
    > hopefully it won't be too much, since this will be strictly for
    > non-commercial use.
    >
    > Thanks in advance.
    >
    > Mark Davies
    >
    > =================================================
    > Mark Davies
    > Assoc. Prof., Linguistics
    > Brigham Young University
    > (phone) 801-422-9168 / (fax) 801-422-0906
    > http://davies-linguistics.byu.edu
    >
    > ** Corpus design and use // Web-database scripting **
    > ** Historical linguistics // Functional-typological grammar **
    > ** Spanish and Portuguese historical and dialectal syntax **
    > =================================================

    -- 
    Cédrick Fairon
    Director of CENTAL
    Centre de traitement automatique du langage
    Université de Louvain
    Place Blaise Pascal, 1
    1348 Louvain-la-Neuve
    Belgium
    

    =======================================
    **** JADT 2004 in Louvain-la-Neuve ****
    10-12 March 2004
    7th International Conference on the statistical analysis of textual data
    7th Journées internationales d'analyse statistique des données textuelles
    http://www.jadt.org

    Visit our web sites:
    http://cental.fltr.ucl.ac.be
    http://glossa.fltr.ucl.ac.be
    =======================================



    This archive was generated by hypermail 2b29 : Thu Mar 18 2004 - 11:38:53 MET