Re: Summary of Responses - Computerise Dialect Dictionary

J. Fix (jf4@ukc.ac.uk)
Fri, 8 Mar 1996 21:41:38 +0000 (GMT)

On February 18th 1996 I posted a message asking for help on how to
computerize a German dialect dictionary. I have finally found the time to
put together a short summary of all the responses.

People who answered were:

John Clifton <JMClifton@aol.com>
Will Dowling <will@franklin.com>
Matthias Heyn <100633.1517@compuserve.com>
Jacques Van Keymeulen <Jacques.VanKeymeulen@rug.ac.be>
Alexander King <adk8c@darwin.clas.virginia.edu>
Nenad Koncar <nk3@doc.ic.ac.uk>
Wilfried Kuhn <100737.3261@compuserve.com>
Andrea de Leeuw van Weenen <LeeuwvW@RULLET.LeidenUniv.nl>
Robin Lombard <lombard@langlab.uta.edu>
Kazuto Matsumura <kmatsum@tooyoo.l.u-tokyo.ac.jp>
Jon Mills <jon.mills@luton.ac.uk>
Ole Norling-Christensen <olenc@coco.ihi.ku.dk>
Elisabeth Seitz <elisabeth.seitz@uni-tuebingen.de>
George Smith <gsmith@zedat.fu-berlin.de>
C. M. Sperberg-McQueen <U35395@UICVM.CC.UIC.EDU>
Julie Thornton <JTHORNTO@eagle.call.gov>
Tony Vital <vitale@dectlk.enet.dec.com>
Ralf Vollmann <ralf@kfs.oeaw.ac.at>

I want to thank everyone very much indeed for their time, interest, and
patience in dealing with my queries.

The nature of my question makes it virtually impossible to give a
concise summary. My apologies if I have merely collected the bits and
pieces here rather than offering a homogeneous overview.


SGML, TEI.
~~~~~~~~~
Quite a lot of replies recommended looking at SGML, the Standard
Generalized Markup Language. This is a kind of metalanguage which allows you
to define your own markup language. As far as I understand, you mark up the
data, either manually or automatically, and view it with an appropriate
program (comparable to HTML documents, HTML being one of the markup languages
based on SGML, which are parsed and displayed by a WWW browser).

A recommended web site including many pointers to other SGML resources is:
http://www.sil.org/sgml/sgml.html

A recommended newsgroup is comp.text.sgml

Besides SGML itself, there is the Text Encoding Initiative (TEI), which
has published the so-called TEI Guidelines, intended to provide a kind of
standardized framework for text encoding in the humanities. For dictionary
people, chapter 12 on printed dictionaries is especially interesting.

TEI's web site is http://www-tei.uic.edu/orgs/tei
TEI's mailing list is TEI-L at LISTSERV@UICVM.CC.UIC.EDU

Programs to use (among others, I assume) in order to turn a dictionary
(or any other document) into SGML, or to work with it once it is in
electronic form, are:
- "sgmls" (free)
- "Author/Editor" (SoftQuad, http://www.softquad.com)
- "XGML" (the company is called Exoterica, based in Canada,
http://www.exoterica.com).
Special dictionary parsers are
- "DIPA" (used at the Danish Dictionary) and
- "LexParse" (used at the University of Tuebingen, Germany).

Other programs.
~~~~~~~~~~~~~~
Suggested and/or used by respondents to build databases (among them
dictionaries) are:
- "the SIL program Shoebox"
  unable to comment on this one
- Access (Microsoft; for Win)
  well-known RDBMS
- FileMaker Pro (Claris; for Mac and Win)
  as well
- HyperCard (for Mac)
  one of the first hypertext tools
- AskSam (for DOS)
  DBMS
- World Translator (for Win and Mac)
  look at http://www.net-shopper.co.uk/software/ibm/trans/index.htm
- Folio VIEWS
  "a free-text database management tool"; http://www.folio.com
  (educational price approx. 300 USD)
- MultiTerm (for Win)
  look at http://www.trados.com; "a commercial product and market
  leader in the field of terminology database systems"

Misc.
~~~~
A suitable programming language for creating a database that can
include graphics and sound seems to be LPA WIN-PROLOG.

Dictionary and similar projects I was referred to are:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The New OED (http://bluebox.uwaterloo.ca/OED/index.html)
- The Danish Dictionary (email: olenc@coco.ihi.ku.dk)
- Sound Database (email: ralf@kfs.oeaw.ac.at)
- Dictionary of Gamilaraay/Kamilaroi (put on the W3 at
http://coombs.anu.edu.au/WWWVLPages/AborigPages/LANG/GAMDICT/GAMDICT.HTM)
- Woordenboek van de Vlaamse Dialecten (email:
Jacques.VanKeymeulen@rug.ac.be)
- Dictionary of the Slovene Language (no contact address)
- Atlante Linguistico del Ladino Dolomitico e Dialetti Limitrofi (ALD)
(http://www.sbg.ac.at/rom/people/proj/ald/allgemei.htm)

Books.
~~~~~
An overview of electronic dictionaries in connection with SGML is given in
- Bergenholtz & Tarp (eds.): Manual of Specialized Lexicography. John
Benjamins Publishers. 1995 (in particular, pp. 37-46).
ISBN (Europe): 90 272 1612 6
ISBN (USA): 1-55619 693-8

The following book was quite useful for getting a first impression of SGML:
- van Herwijnen, Eric: Practical SGML. 2nd edition. Kluwer Academic
Publishers. 1995. (ISBN: 0-7923-9434-8)

Two interesting and pretty specialized titles for the lexicographer are:
- Frakes, William B. and Ricardo Baeza-Yates: Information Retrieval. Data
Structures and Algorithms. Prentice Hall. 1992.
- Witten, Ian H., Alistair Moffat, and Timothy C. Bell: Managing Gigabytes.
Compressing and Indexing Documents and Images. Van Nostrand Reinhold. 1994.

With reference to MS Access, though not focusing on dictionaries, two
books were recommended:
- Rob, Peter and Treyton Williams: Database Design and Application
Development with Microsoft Access 2.0. New York, London: McGraw-Hill.
1995. (ISBN: 0070530513)
- Ortmann, Dirk: Access 2.0 fuer Datenbankentwickler. Muenchen: Hanser
(= Hanser Programmier Praxis.) 1995. (ISBN: 3-446-18122-9) [German]

This is the first summary I have written for a mailing list. If it is
too short, too long, too imprecise, etc., please tell me. Although I
looked at several others before composing it, I am not sure whether it
fulfills its purpose.


------------------------------------------------------------------
Jakob Fix, University of Kent at Canterbury, jf4@ukc.ac.uk
------------------------------------------------------------------