Corpora: New Corpus

From: LDC Office (ldc@unagi.cis.upenn.edu)
Date: Wed Mar 08 2000 - 21:37:28 MET


********************************************************
Santa Barbara Corpus of Spoken American English - Part I
********************************************************

LDC is pleased to announce the availability of the
Santa Barbara Corpus of Spoken American English -
Part I. This release contains 14 speech files from
the Santa Barbara Corpus of Spoken American English,
which was collected by the University of California,
Santa Barbara Center for the Study of Discourse under
the direction of John W. Du Bois. Associate Editors
were Wallace L. Chafe (UCSB), Charles W. Meyer (UMass,
Boston), and Sandra A. Thompson (UCSB). The Santa
Barbara Corpus of Spoken American English is part of
the International Corpus of English (Charles W.
Meyer, Director), representing the American Component.

The Santa Barbara Corpus of Spoken American English
is based on hundreds of recordings of natural speech
from all over the United States, representing a wide
variety of people of different regional origins,
ages, occupations, and ethnic and social backgrounds.
It reflects many ways that people use language in
their lives: conversation, gossip, arguments,
on-the-job talk, card games, city council meetings,
sales pitches, classroom lectures, political
speeches, bedtime stories, sermons, weddings, and
more.

Each speech file is accompanied by a transcript in
which phrases are time stamped with respect to the
audio recording. Personal names, place names, phone
numbers, etc., in the transcripts have been altered to
preserve the anonymity of the speakers and their
acquaintances, and the audio files have been filtered
to make these portions of the recordings
unrecognizable.

For the latest information on this corpus, please refer to
the UCSB and Linguistic Data Consortium (LDC) web sites
devoted to it:

http://humanitas.ucsb.edu/depts/linguistics/research/csae/
http://www.ldc.upenn.edu/Publications/SBC/

These sites may also offer software or revised
versions of the data for download.

Institutions that have membership in the LDC during
the 2000 Membership Year will be able to receive this
corpus free of charge. Nonmembers may purchase the
Santa Barbara Corpus of Spoken American English -
Part I for $75.

If you would like to order a copy of this corpus,
please email your request to <ldc@ldc.upenn.edu>. If
you need additional information before placing your
order, or would like to inquire about membership in
the LDC, please send email or call (215) 898-0464.

