Brown Corpus computers

SONODA Katsuhide (ksonoda@ilcs.hokudai.ac.jp)
Thu, 26 Sep 1996 18:10:53 +0900

A few days ago I asked:

> I have a question just out of curiosity. What were the hardware and
> software that were used in the earliest days of the Brown Corpus,
> i.e., around 1965?
>
> Prof. Kajita, who used the Brown Corpus extensively for his 1967
> Princeton dissertation, once said to me that the computer center of
> Princeton University complained to the linguistics department that he
> used too much CPU time! Was it the famous IBM 360 and Fortran, or
> a still earlier model? I would appreciate any info about the
> computational environment of the mid-60s that corpus linguists enjoyed.

A definitive answer came the following day from Prof. Henry Kucera. I
was very surprised to learn that the first Brown Corpus computer had
less than 40 KB of core memory, that the programming language was an
assembly language, that the Brown Corpus text was stored on about
100,000 punched cards, etc. I realized what an ambitious and
challenging project the Brown Corpus was.

I'd like to thank Prof. Henry Kucera and share his description with
the members of this mailing list.

> The first computer used for the Brown Corpus project was an IBM
> 7070, installed at Brown University in 1960. It was a "decimal"
> machine, i.e., internally binary, of course, but presenting to the
> user a decimal facade (not hexadecimal). It had, if I remember
> correctly, 10,000 "words" of core memory (a "word" consisted of five
> 6-bit strings), an extremely modest capacity by today's
> standards. Input was via punched cards, output via printer and/or
> punched cards. Fortunately the machine also had 6 high-capacity tape
> drives (7 channels), so that sorting, etc. was possible. The
> initial sort of the one million records of the Brown Corpus took 17
> hours of uninterrupted processing, for which I had to reserve the
> machine for the entire weekend. The initial programming language was IBM
> Autocoder, essentially an assembly language of the first level.
>
> Around 1964 or so, the 7070 was replaced by an IBM 360 and the
> language to which we switched was PL/I, which we then used for most
> of the rest of the project. We did not use Fortran.
>
> At Brown, we received excellent cooperation from the Computer
> Center and its directors. When I gave a talk, I was often introduced
> jokingly by my computer science colleagues as a "heavy user" (which
> was moderately funny since I then weighed some 180 pounds. It would
> wishes,
>
>
> Henry Kucera

And he adds:

> the original corpus text was on punched cards (about 100,000 of
> them, since several words fit on a card) but once the text was
> punched it could be transferred to tapes and processed further that
> way. But the initial input was indeed via cards.
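
As a rough check on the figures quoted above, here is a small Python
sketch of the arithmetic. It is only an illustration: the word layout
(five 6-bit strings) is as Prof. Kucera recalls it, and I am assuming
8-bit bytes and that each of the "one million records" corresponds to
one word of running text. The constant and function names are my own.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the message.
# All constants come from the quoted description; the 8-bit byte and the
# "one record = one corpus word" reading are assumptions for illustration.

CORE_WORDS = 10_000        # "10,000 'words' of core memory"
STRINGS_PER_WORD = 5       # "five 6-bit strings" per word (as recalled)
BITS_PER_STRING = 6
CORPUS_RECORDS = 1_000_000 # "one million records"
CARDS = 100_000            # "about 100,000" punched cards
SORT_HOURS = 17            # duration of the initial sort


def core_memory_bytes(words=CORE_WORDS,
                      strings=STRINGS_PER_WORD,
                      bits=BITS_PER_STRING):
    """Core memory size expressed in 8-bit bytes."""
    return words * strings * bits / 8


def words_per_card(records=CORPUS_RECORDS, cards=CARDS):
    """Average number of corpus words held on one punched card."""
    return records / cards


def sort_rate_per_second(records=CORPUS_RECORDS, hours=SORT_HOURS):
    """Average number of records handled per second during the sort."""
    return records / (hours * 3600)


if __name__ == "__main__":
    print(core_memory_bytes())              # 37500.0 bytes, i.e. under 40 KB
    print(words_per_card())                 # 10.0 words per card
    print(round(sort_rate_per_second(), 1)) # roughly 16 records per second
```

Under these assumptions the core memory comes to 37,500 bytes, which
agrees with "less than 40 KB", and each card held about ten words,
which agrees with "several words fit on a card".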

-- Sonoda Katsuhide