Corpora: Linguistics metrics query

William E Hefley (hefley+@andrew.cmu.edu)
Mon, 9 Feb 1998 22:38:11 -0500 (EST)

Hello,

I'm hoping that readers of this "corpora" discussion board might make
some suggestions that would help direct me towards appropriate metrics
for use in a project.

I have a set of free-text responses to a set of prompt questions from
several hundred subjects. I'm looking for measures of the richness of
each subject's expression, such as the total number of unique
words/tokens used, the number of concepts in a generated concept map,
or the number of links in that map, so that I end up with a metric for
each subject.
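
To make the simplest of these concrete, here is a minimal sketch in
Python of per-subject word counts. The tokenizer, the function name
richness, the subject IDs, and the sample responses are all assumptions
made up for illustration; this is not an established standard, and the
concept-map measures are not covered.

```python
import re
from collections import Counter

def richness(text):
    """Return simple lexical richness counts for one subject's response."""
    # Assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)   # total running words
    n_types = len(counts)    # unique words
    return {
        "tokens": n_tokens,
        "types": n_types,
        # Type-token ratio; defined as 0.0 for an empty response.
        "ttr": n_types / n_tokens if n_tokens else 0.0,
        # Words that occur exactly once in the response.
        "hapax": sum(1 for c in counts.values() if c == 1),
    }

# Hypothetical responses keyed by a made-up subject ID.
responses = {
    "s01": "The cat sat on the mat.",
    "s02": "I like it. I like it a lot.",
}
metrics = {sid: richness(text) for sid, text in responses.items()}
for sid, m in metrics.items():
    print(sid, m)
```

One caution with a sketch like this: the type-token ratio tends to fall
as a response gets longer, so raw ratios are hard to compare across
subjects who wrote very different amounts.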

Is there a "standard" set of measures, or are there measures used in
previous work, that you might point me towards? I'd be especially
interested in ideas about what others have done. I'll freely admit that
this is an area I don't have much experience in, so any suggestion
would be greatly appreciated.

Cheers,

bill

-----------------------------------------------------------------------
Bill Hefley, CCP, CDP -- Social and Decision Sciences, PH-208G
Carnegie Mellon University, Pittsburgh, PA 15213 U.S.A.
Office: 412-268-3238 // Fax: 412-268-6938
Email: hefley@andrew.cmu.edu // Web: http://sds.hss.cmu.edu