on the meaning of 'word sense'

Ted Dunning (ted@crl.nmsu.edu)
Tue, 2 May 1995 09:32:14 -0600

Ted Dunning writes:
> the fact is that most humans have very great difficulty performing
> sense disambiguation. doesn't this seriously bring into question
> whether the task is pertinent to language processing?

You might do me and other computational linguists a service by
summarizing the psychological evidence you're referring to
(and providing a reference or two to get started).

i am talking about the task where a subject is given dictionary
definitions, some training text and a chance to discuss with others
which sense each instance of a word has. the goal is to tag a number
of instances of a particular word with particular dictionary senses.

what i find particularly interesting is the question of whether two
such subjects will agree on the various sense-tags. i find the
question of whether the two subjects can be coerced into agreeing much
less interesting.

the actual efforts that i am aware of include rebecca bruce's efforts
to get data for her dissertation research and a recent private comment
by bob amsler. generally, subjects agree only (roughly) 70% of the
time. it is possible to find words for which this task is easier, and
words for which this task is harder. i get the impression that people
who work on automated sense disambiguation tend to select the words
they disambiguate so that they can get usable training data.
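
to make the agreement figure concrete, here is a minimal sketch of how
raw agreement between two subjects could be computed. the tags below
are made up for illustration (they are not data from bruce or amsler),
the function name is hypothetical, and the tag format homograph_sense
simply follows the stock_1_10 style used later in this message. this
is plain percent agreement with no correction for chance.

```python
def observed_agreement(tags_a, tags_b):
    """Fraction of instances on which two annotators chose the same sense tag."""
    if len(tags_a) != len(tags_b):
        raise ValueError("both annotators must tag the same instances")
    if not tags_a:
        raise ValueError("need at least one tagged instance")
    matches = sum(1 for a, b in zip(tags_a, tags_b) if a == b)
    return matches / len(tags_a)


# Made-up tags for ten instances of "stock" (illustration only, not real data).
annotator_1 = ["stock_1_10", "stock_1_01", "stock_1_10", "stock_1_02", "stock_1_10",
               "stock_1_10", "stock_1_01", "stock_1_10", "stock_1_10", "stock_1_02"]
annotator_2 = ["stock_1_10", "stock_1_02", "stock_1_10", "stock_1_02", "stock_1_10",
               "stock_1_01", "stock_1_01", "stock_1_10", "stock_1_10", "stock_1_01"]

agreement = observed_agreement(annotator_1, annotator_2)
print(f"raw agreement: {agreement:.0%}")
```

note that this measures whether the two subjects agree before any
discussion or coercion, which is the question i care about.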

a real example might be helpful. taking the word stock, LDOCE shows
the following 21 major senses split across three homographs:

stock
0100 a supply (of something) for use: a good stock of food
0200 goods for sale: Some of the stock is being taken without being
     paid for
0300 the thick part of a tree trunk
0400 (a) a piece of wood used as a support or handle, as for a gun or
     tool (b) the piece which goes across the top of an ANCHOR^1 (1)
     from side to side
0500 (a) a plant from which CUTTINGs are grown (b) a stem onto which
     another plant is GRAFTed
0600 a group of animals used for breeding
0700 farm animals usu. cattle; LIVESTOCK
0800 a family line, esp. of the stated character
0900 money lent to a government at a fixed rate of interest
1000 the money (CAPITAL) owned by a company, divided into SHAREs
1100 a type of garden flower with a sweet smell
1200 a liquid made from the juices of meat, bones, etc., used in
     cooking
1300 (in former times) a stiff cloth worn by men round the neck of a
     shirt -compare TIE
1400 in/out of stock: kept/not kept in the shop at the present moment
     and therefore able/not able to be bought: ``Have you any blue
     shirts in stock?'' ``No, I'm afraid they're out of stock, but we
     shall be having some more in next month''
1500 out of stock: having none for sale: ``Have you any blue shirts in
     stock?'' ``No, I'm afraid we're out of stock (of them) at the
     moment''
1600 take stock (of): to consider the state of things so as to take a
     decision (often in the phr. take stock of the situation) -compare
     STOCKTAKING; see also LAUGHINGSTOCK, LOCK^2 (8), stock and barrel

stock
0100 to keep supplies of: They stock all types of shoes
0200 to supply: a shop well stocked with goods
0300 to store: They've stocked their crops in the BARN -see also
     STOCK UP

stock
0100 commonly used, esp. without much meaning: a stock greeting such
     as ``Good morning''
0200 kept in STOCK^1 (14), esp. because of a standard or average type:
     stock sizes

and here are some uses of stock taken from the wall street journal.
note that stock is a particularly easy word to pick senses for in the
wall street journal since you just have to guess homograph 1, sense 10
to get pretty good accuracy. of course, this is wrong since that
definition isn't really an accurate definition of the most common
uses. many other words are much more difficult.
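
the "just guess homograph 1, sense 10" strategy is what is usually
called a most-frequent-sense baseline. a minimal sketch follows. the
tag counts are made up to illustrate a skewed distribution, not
measured from the wall street journal, and the function name is
hypothetical.

```python
from collections import Counter


def most_frequent_sense_baseline(gold_tags):
    """Return the most common tag and the accuracy of always guessing it."""
    if not gold_tags:
        raise ValueError("need at least one tagged instance")
    sense, count = Counter(gold_tags).most_common(1)[0]
    return sense, count / len(gold_tags)


# Made-up gold tags for ten instances of "stock" (illustration only).
gold = ["stock_1_10"] * 8 + ["stock_1_01", "stock_1_02"]

baseline_sense, baseline_accuracy = most_frequent_sense_baseline(gold)
print(baseline_sense, baseline_accuracy)
```

a high number here says more about how skewed the text is than about
whether the chosen definition actually fits the uses.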

in order to emphasize differences, i have selected sentences here.
these are not a representative distribution.

<s> Stocks of manufacturers were up 0.6% and their inventory-sales ratio reached
1.57 months. </s>

is stock here stock_1_10 or is it referring to inventory?

<s> Wholesale stocks inched up 0.1% with the ratio of inventories to sales
at 1.28. </s>

again, same question.

<s> Finally, with regard to stock market system reform, Mr. Melloan
cavalierly asserts that I made the comment that it "would be better to do
the wrong thing than nothing at all." </s>

this one is relatively easy, except that stock market doesn't quite
match the generative meaning.

<s> In composite trading on the American Stock Exchange, the company closed
unchanged yesterday at $8.375 a share. </s>

isn't this part of a proper noun? does it have the same sense?