Re: authorship testing

BOB WYATT (bob.wyatt@mandic.com.br)
Fri, 02 Feb 96 20:56:00 -0200

|PF> |
|PF> Hi All, |
|PF> |
|PF> I'm a reporter in D.C. and I'm wondering if anyone out there knows |
|PF> something about determining authorship through textual analysis. I'm |
|PF> betting there is software that could help one do something of the |
|PF> sort... |
|PF> |
|PF> If anyone has ideas, please send them to: pfairley@cais.com or feel |
|PF> free to call me at 202-628-3728. |
|PF> |
|PF> Many thanks, |
|PF> Peter Fairley |
+---------------------------------------------------------------------------+

Dear Peter,

Two things come to mind. First, Dr. Mike Scott at the University of
Liverpool has developed a program called 'Wordsmith', which is
available commercially through Oxford U. Publishers. It is not a
program that allows you to put a text in one end and get a
judgement out of the other, but it does give you all the tools
necessary for comparing corpora. What it can do is create
dictionaries from texts, concordance the texts, give statistics on
word frequency and percentage ranking and, last but not least, find
collocates. The collocates can be done with up to a five-word
horizon on both sides of key words. Depending on which lexical
items you choose to look at you can get a pretty fair idea of who
wrote what. It requires some understanding of statistics and some
patience until you get the hang of it, but it's a good tool. We use
it here in the department all the time to classify texts by their
lexical composition.
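
To give a rough idea of what those statistics look like, here is a
minimal Python sketch of a word-frequency list with percentages and a
collocate count within a fixed horizon. This is not WordSmith's own
code or method; the function names, the simple tokenizer and the
sample text are all made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Return (word, count, percent of all tokens), most frequent first."""
    counts = Counter(tokens)
    total = len(tokens)
    return [(w, c, 100.0 * c / total) for w, c in counts.most_common()]


def collocates(tokens, keyword, horizon=5):
    """Count words found within `horizon` tokens on either side of `keyword`."""
    found = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        left = tokens[max(0, i - horizon):i]
        right = tokens[i + 1:i + 1 + horizon]
        found.update(left + right)
    return found


# Hypothetical sample text, invented for this example.
sample = ("I then proceeded to the shop. I then observed the door. "
          "He said he saw the door.")
tokens = tokenize(sample)

for word, count, pct in word_frequencies(tokens)[:5]:
    print(f"{word:10s} {count:3d} {pct:5.1f}%")

print(collocates(tokens, "then", horizon=5).most_common(5))
```

Running the same functions over two sets of texts and comparing the
lists for a handful of chosen words is the kind of comparison I mean;
whether a difference is meaningful is where the statistics come in.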

Second, a visiting professor from the UK last year (his name
escapes me at the moment. :-\ ) showed me a confession that was
used against an accused thief. The accused said that the confession
had been padded by the interrogators; naturally, the police said
'nonsense'. Applied linguists were called in to analyze the text.
They managed to show statistically that it was not just the words
of the defendant. Since the confession had to contain only the
words of the defendant, they could demonstrate that compromising
material was written in the style that the constables use for
police reports. It was judged that the defendant would not (could
not) have emulated the 'official' style and after considerable
examination it was obvious that he couldn't have planned his
confession. The bottom line is that he was absolved of the crime.
There are people doing research in this type of analysis. As soon
as I can ask someone, I'll get back to you.

Regards,
----------
Bob Wyatt
Departamento de pos-graduacao em linguistica aplicada (LAEL)
Pontificia Universidade Catolica de Sao Paulo (PUC-SP)
bob.wyatt@mandic.com.br
