Brown Corpus Semantic Concordance & WordNet

t-markl@microsoft.com
Thu, 8 Jun 95 10:50:06 PDT

Hi Russell,

> I recently read somewhere that there exists a version of the Brown Corpus (or
> part of it) annotated with word meanings taken from WordNet.
>
> If anyone knows any more information on this then please let me know.

The Brown Corpus version you refer to comes with the current release
of WordNet. I've included the announcement for WordNet below
(see the section on Semantic Concordance). You might also want
to subscribe to the WordNet users list.

Best wishes,
Mark Lauer
Microsoft Institute
Sydney, Australia

----------
From: "Wordnet" <wn@clarity.Princeton.EDU>
To: <clk@Princeton.EDU>; <geo@clarity.Princeton.EDU>;
<jlg@clarity.Princeton.EDU>; <wnusers-all@clarity.Princeton.EDU>
Subject: WordNet 1.5
Date: Thursday, 16 March 1995 4:42PM

*** WordNet Version 1.5 is now available ***

WordNet is an online lexical reference system. Word forms in WordNet
are represented in their familiar orthography; word meanings are
represented by synonym sets (synsets) - lists of synonymous word forms
that are interchangeable in some context. Two kinds of relations are
recognized: lexical and semantic. Lexical relations hold between word
forms; semantic relations hold between word meanings.

To learn more about WordNet, read "Five Papers on WordNet", available
via anonymous ftp and in printed form. WordNet is available in several
different packages, based on computer platform. This message contains
instructions for obtaining the WordNet system and "Five Papers on
WordNet".

Please address all email concerning WordNet to wordnet@princeton.edu.
If you have received this message via email and do not wish to remain
in the user database, please send a request to be deleted.
Information on joining the user mailing list appears at the end of
this message.

We are also establishing a 'contrib' directory. If you have a package
that you would like to have considered for addition, please send email
to wordnet@princeton.edu.

WordNet is available via ftp, as described below, or you may use and/or
ftp WordNet using a World Wide Web browser such as Mosaic or Netscape.
Our URL is: "http://www.cogsci.princeton.edu/~wn/". We will add links
to user applications and papers; to suggest one, please send email to
the above address.

Release Information:

WordNet Version 1.5 is now available. The WordNet database is 18.4
megabytes. The "sense index" file (see note below) is an additional
5.6 megabytes, and the search code may be up to 1.5MB, depending on
your computer platform. The installation Makefile as shipped for Unix
systems uses the "install" command to copy the database into the
destination directory, so an additional 18.4 megabytes is needed for
the installation to proceed normally.
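
As a rough check of the figures above, the space a Unix install needs at
its peak can be added up as follows. This is only a sketch: it assumes
the sense index is kept and the search code takes its full 1.5MB, and it
does not count the downloaded archive files themselves.

```shell
# Sizes in megabytes, taken from the release information above.
db=18.4      # WordNet database as unpacked
sense=5.6    # optional sense index
code=1.5     # search code, upper bound
copy=18.4    # second copy of the database made by the "install" step

# Sum the four figures with awk, since the shell has no decimal arithmetic.
total=$(echo "$db $sense $code $copy" | awk '{ printf "%.1f", $1 + $2 + $3 + $4 }')
echo "Peak disk space: about $total MB"
```

Once the installation has finished, the unpacked copy of the database
can presumably be removed, which brings the figure back down by 18.4MB.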

The PC package contains binaries for MS-DOS and Microsoft Windows, and
is available via ftp and on diskette. Source code is not included.

The Macintosh package contains binaries for a Macintosh Operating
System interface, and is available via ftp and on diskette. Source
code is not included.

The Unix package contains binaries for command line and X Windows
interfaces for the following platforms only, and is available via ftp
and on 8mm tape. Source code is also included in the Unix package for
users of other platforms and application developers. If you do not
use one of the following systems, you must compile WordNet locally.
Listed below is each platform for which a binary is provided, the
version of the operating system that was running, and the version of
the compiler used.

  SPARCstation under SunOS:    SunOS Release 4.1.3 using gcc version 2.6.2
  Silicon Graphics:            Mips processor running Irix 5.2 using gcc 2.6.2
  SPARCstation under Solaris:  SunOS 5.3 using gcc 2.5.8
  NeXT:                        gcc 2.2.2 (note that this is not a NeXTStep
                               application - you must have X Windows for
                               the NeXT to use the X Windows interface)
  PC:                          486 processor running Linux 1.1.18, X11R6

Some of the packages have been prepared using 'gzip', rather than
'compress'. If you don't already have 'gzip', please ask your system
administrator to download it from prep.ai.mit.edu or one of the
numerous ftp sites which mirror GNU software.

What's New:

If you are currently using an earlier version of WordNet you are
strongly encouraged to upgrade to version 1.5. Small bugs and
inconsistencies in both the database and search software have been
corrected, and the database coverage has been expanded. Many more
attribute pointers have been added.

Participial adjectives have been added to the database. A search that
groups senses by similarity of meaning has been added to the interfaces.

Senses are now generally ordered from most to least frequently used,
with the most common sense numbered 1. Frequency is determined by how
often each sense occurs in semantically tagged corpora. Senses that
have not occurred are presented in no particular order.

New with release 1.5 is the "sense index". The sense index provides
an alternate method for accessing synsets and word senses in the
WordNet database. It is useful to programs that are interested in
retrieving synsets or other information related to a specific sense,
rather than all the senses of a word. A specific WordNet sense can be
used to directly index into this file and obtain the WordNet sense
number and byte offset of the synset containing the sense. The sense
index is a 5.6MB file. It is included with the Unix package, and can
be obtained separately for the PC and Macintosh. It is not used by
the WordNet searching software, and can be deleted if you will not be
using it.
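
To illustrate the kind of lookup the sense index supports, here is a
minimal sketch. The file layout (one line per sense holding a key, a
byte offset and a sense number, separated by spaces), the sample keys,
and the function name lookup_sense are all assumptions made for
illustration; the documentation shipped with the release describes the
real format, which may differ.

```shell
# Hypothetical sample data in the ASSUMED layout: key, byte offset, sense number.
idx=$(mktemp)
cat > "$idx" <<'EOF'
example_key_a 00001234 1
example_key_b 00005678 2
example_key_c 00009012 1
EOF

# Print the byte offset and sense number for one exact key.
# $1 = key to look up, $2 = path to the index file
lookup_sense() {
    awk -v key="$1" '$1 == key { print $2, $3; exit }' "$2"
}

lookup_sense "example_key_b" "$idx"
```

The linear scan keeps the sketch short. A program doing many lookups
would more likely load the file once or search it by some faster means.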

Semantic Concordance:

Sense tags in the Semantic Concordance of Brown Corpus files released
with WordNet 1.4 have been updated to match the 1.5 database. Other
"cleanups" of the files have been made as a result of ongoing Q/A
functions. The concordance files and Escort are not available at this
time. We expect to release the 1.5 Semantic Concordance package in
May. Note that you must install WordNet 1.5 before installing and
using the semantic concordance package.

FTP Instructions:

We prefer that you ftp the WordNet system via anonymous ftp from one
of the following ftp sites:

In the US: clarity.princeton.edu [128.112.144.1]
In Europe: ftp.ims.uni-stuttgart.de [141.58.127.61]

(Note that the ftp site in Europe may lag the US site by a few days.)

Anyone who wants to mirror this release on an ftp server should notify
Princeton in advance.

**************************************************************************
* IF YOU FTP WordNet, PLEASE SEND MAIL TO wordnet@princeton.edu SO WE *
* CAN UPDATE OUR RECORDS AND KEEP TRACK OF OUR USERS FOR FUTURE MAILINGS *
* AND RELEASES. EVEN IF YOU ARE A CURRENT USER WHO IS UPDATING, IT IS *
* USEFUL TO US TO KNOW THAT YOU HAVE UPGRADED TO 1.5. *
**************************************************************************

In order to facilitate easier downloading, some of the packages have
been split into smaller chunks. If you are ftp'ing a package that has
been split, as indicated by the suffix ".tar.gz.aX" where "X" is a
lower case letter, you must download all of the individual files in
the package.

To rebuild the Unix distribution, download all the parts, and 'cat'
them into one tar.gz file, for example with the following command:

cat wn1.5unix.tar.gz.a* > wn1.5unix.tar.gz

The .gz files can then be unpacked with the 'gunzip' utility, and
untarred with 'tar'. You may also use a pipe to unbundle the whole
thing. Make sure you are in the directory from which you want to
install WordNet before untarring the file.

cat wn1.5unix.tar.gz.a* | gunzip | tar xvf -

These instructions apply to all the split packages - just substitute
the package name for "wn1.5unix.tar.gz.a*".
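
Before untarring, it can be worth confirming that the rejoined file is
intact, since a part that was cut short or fetched in ASCII mode will
corrupt the archive. A small sketch, assuming a POSIX shell and 'gzip';
the function name join_and_check is made up for this example:

```shell
# Join the parts of a split package and test the result before unpacking.
# $1 = package name without the ".aX" suffix, e.g. wn1.5unix.tar.gz
join_and_check() {
    pkg=$1
    # The shell expands a* in sorted order, so the parts are joined in sequence.
    cat "$pkg".a* > "$pkg" || return 1
    # gzip -t reads the whole file and fails if it is truncated or corrupt.
    gzip -t "$pkg"
}
```

It could then be used in front of the unpacking step, for example:

    join_and_check wn1.5unix.tar.gz && gunzip < wn1.5unix.tar.gz | tar xvf -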

***** REMEMBER TO FTP IN "binary" MODE!!! *****

Information about ftp site clarity.princeton.edu:

Host: clarity.princeton.edu [128.112.144.1]
Login: ftp
Password: Your e-mail address
Directory: pub/wordnet
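
For those who would rather script the transfer, the login details above
can be turned into a command list for a command-line ftp client. This
is a sketch only: the function name wn_ftp_commands and the example
e-mail address are placeholders, and it assumes a client that accepts
the usual 'user', 'binary', 'prompt' and 'mget' commands.

```shell
# Print the commands for an anonymous ftp session that fetches in binary mode.
# $1 = your e-mail address (sent as the password), $2 = file name or pattern
wn_ftp_commands() {
    cat <<EOF
user ftp $1
binary
cd pub/wordnet
prompt
mget $2
bye
EOF
}

# Review the commands first; nothing is sent until they are piped to ftp.
wn_ftp_commands "you@example.edu" "wn1.5unix.tar.gz.a*"
```

With a client that supports the -n option (no automatic login), the
output can be piped in directly:

    wn_ftp_commands you@example.edu "wn1.5unix.tar.gz.a*" | ftp -n clarity.princeton.edu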

(Information about ftp site ftp.ims.uni-stuttgart.de follows)

Unix version:

wn1.5unix.tar.gz.a* WordNet 1.5 for Unix systems

PC version:

readme.pc README file for PC installation

wn15.zip WordNet 1.5 for PC

wn15si.zip WordNet sense index file for PC

pkunzip.exe Program to 'unzip' the files. If you already have
this on your PC you do not need to ftp this file.

Macintosh version:

readme.mac README file for Macintosh installation.

MacWordNet1.5.sit.bin WordNet 1.5 for Macintosh

WordNet1.5si.sit.bin WordNet sense index file for Macintosh

UnStuffIt-Deluxe-TM.bin
        Program to 'unstuff' the Macintosh version.
        If you already have UnStuffIt on your
        Macintosh, you do not need to ftp this file.

Prolog version of database:

wn1.5prolog.tar.gz.a* WordNet 1.5 database in Prolog format

"Five Papers on WordNet":

5papers.tar.Z troff source format, and Makefile for
              formatting and printing

5papers.ps PostScript format

***********************************************************************

Information about ftp site ftp.ims.uni-stuttgart.de:

Host: ftp.ims.uni-stuttgart.de [141.58.127.61]
Login: ftp
Password: Your e-mail address
Directory: /pub/WordNet/1.5
('ls' sometimes doesn't work properly, 'dir' shows everything).

Due to the size of the WordNet distribution, please restrict
downloading to time frames outside office hours Central European Time
(i.e. outside 9 a.m. - 6 p.m.).

All queries regarding the installation of the system, its use and
maintenance should go to wordnet@princeton.edu, and _not_ to us. We're
just providing disk space here.

The directory contains the WordNet 1.5 release. Please read all
README* files.

README               A copy of this file
README.MAILING-LIST  how to subscribe to the mailing list
README.MIRROR        about our WordNet mirror
README.ORDER         how to order WN on disks

Obtaining WordNet through the mail:

You can order WordNet on diskette for the PC or the Macintosh. You
can also order WordNet on 8mm tape for Unix systems. See the order
form below for more information. If you have received an earlier
version of WordNet on magnetic media, you may return the media to us
and receive an upgrade for $10.

WordNet 1.5 Order Form

Name: _______________________________

Address: _______________________________

_______________________________

_______________________________

_______________________________

Email address: _______________________________

You can receive an upgrade for $10 if you return a set of
diskettes or an 8mm tape to us. For each upgrade, change
"Price" to $10.

                                      Quantity x Price = Total

WordNet for IBM PC or Compatible
    3.5" high density diskettes        ______  x  $25  = ______
    5.25" high density diskettes       ______  x  $25  = ______

Sense Index for IBM PC or Compatible
    3.5" high density diskettes        ______  x  $10  = ______
    5.25" high density diskettes       ______  x  $10  = ______

WordNet for Macintosh
    3.5" high density diskettes        ______  x  $25  = ______

Sense Index for Macintosh
    3.5" high density diskettes        ______  x  $10  = ______

WordNet for Unix systems
    8mm tape                           ______  x  $30  = ______

"Five Papers on WordNet"               ______  x  $6   = ______

                                                 TOTAL = ______

Please send a check, payable in US currency to
"Princeton University", to:

Pamela Wakefield
Department of Psychology
Green Hall
Princeton University
Princeton, NJ 08544-1010
Attn: WordNet


******* WordNet users' mailing list **********

We have a WordNet users' mailing list that is administered here
at Princeton. Items addressed to the mailing list will be
automatically forwarded to all users on the list. Please note that
this mailing list is separate from the user database. In order to
participate in the mailing list, you must specifically request to be
added. We hope that the mailing list will be a place for useful
discourse about WordNet to take place. We at Princeton are always
interested to hear what our users are doing with WordNet, and we
imagine many users wonder what other users are using it for.
Hopefully this mailing list will help to bring researchers together to
exchange their ideas, experiences, code and philosophies.

To post a message to the mailing list, address mail to
'wn-users@princeton.edu'. Requests to be added to or removed from the
mailing list should be sent to 'wn-users-request@princeton.edu'.
Although you have received this announcement, you will only be added
to the mailing list if you send a request to
'wn-users-request@princeton.edu'. Please be sure to include your
correct e-mail address in the body of your request. Also, to help us
keep our records up to date, if you are a current WordNet user it
would be helpful to us if you would include the version of WordNet you
are using (the latest release is 1.5) and the platform(s) that you are
running on.

If you have code or various flavors of the WordNet database that you
would like to share with others you can announce it to users via the
mailing list. We will be setting up a "contrib" directory on our ftp
site for user submitted programs. Please send email to
'wordnet@princeton.edu' if you have a package to contribute.

To help with the administrative end of things, items sent to
'wn-users-request@princeton.edu' should use the 'Subject' of the
message to convey the intent of the request. To be added to the
mailing list, please specify a subject of 'Add user'. Similarly, to
be removed from the list, specify a subject of 'Remove user'. Other
types of requests should attempt to make intelligent use of the
message subject.

PS. Administrative requests may only be handled once a week so please
be patient.