Corpora: Re: corpus of on-line dialogues; Summary of replies

harold@ccl.umist.ac.uk
Thu, 2 Apr 1998 14:06:09 +0100

On Sat, 21 Mar 1998 11:56:54 +0100 I asked

> Can anyone point me towards a corpus of on-line dialogues? I mean
> dialogues between humans using a computer as the medium of
> transmission... like talk in UNIX or chat rooms on the Internet?
> I would prefer English text, don't care what the subject matter is.
> The more the merrier - even better if it includes typos.

inviting replies to me directly.

Here is a summary of the replies that I got.

David Roger <3dmr5@qlink.queensu.ca> is doing a PhD thesis in French
linguistics; having searched unsuccessfully for such a corpus himself,
he is currently capturing text (in French) from an IRC chat room.

"Alex Collier" <alex@rdues.liv.ac.uk> gives the following URL for a
list of such sites:
http://www.cs.cf.ac.uk/User/Andrew.Wilson/MUDlist/

Eric Atwell <eric@scs.leeds.ac.uk> says that Gavin Churcher at BT labs is
collecting such a `corpus', but would like to hear about any others.

Nicolas Nicolov <nicolas@cogs.susx.ac.uk> and John McNaught
<jock@ccl.umist.ac.uk> both mention the MAP TASK project at Edinburgh,
which involved recording the interactions of two people trying to
negotiate directions based on map reading. See
http://www.hcrc.ed.ac.uk/dialogue/maptask.html

Simeon J. Yates <s.j.yates@open.ac.uk> has a number of such corpora,
which he has collected for his own research.

"Andrea Mulloni" <MULLONI@bued29.kfunigraz.ac.at> pointed me to the
following site, which contains corpora of transcribed telephone
conversations. This is not exactly what I was after, but others may
find it useful: www.cse.ogi.edu/CSU/corpora/corpus_list.html

Thanks, everyone, for your information.