Re: Corpora: Santa Barbara Corpus

From: Chris Manning (manning@CS.Stanford.EDU)
Date: Mon Aug 07 2000 - 17:50:28 MET DST

Next message: Lou Burnard: "Re: Corpora: Santa Barbara Corpus"

Previous message: Lou Burnard: "Re: Corpora: Santa Barbara Corpus"
In reply to: Lou Burnard: "Re: Corpora: Santa Barbara Corpus"
Next in thread: Lou Burnard: "Re: Corpora: Santa Barbara Corpus"
Reply: Lou Burnard: "Re: Corpora: Santa Barbara Corpus"

On 7 August 2000, Lou Burnard wrote:
> Hmm. So instead of using pre-existing standards which at least have a
> chance of being implemented across different computer platforms, it's
> better to make up an entirely arbitrary set of codes of your own for
> which *everyone* has to write their own software?

This is a little harsh. The transcription format used has existed and
been developed for many years in the conversational/discourse analysis
community -- and versions of it can be found in books such as Edwards'
Talking Data: Transcription and Coding in Discourse Research or
Schiffrin's Approaches to Discourse.

At most, the LDC could be faulted for leaving the data in such a format
-- one clearly designed more for human reading than for easy computer
manipulation -- rather than converting it to a more computer-friendly
standard markup.

Chris Manning

This archive was generated by hypermail 2b29 : Mon Aug 07 2000 - 17:48:41 MET DST