RE: Corpora: Using a relational database to store conc pointers

From: Chris Brew (cbrew@ling.ohio-state.edu)
Date: Fri Mar 31 2000 - 20:09:57 MET DST


    There's a discussion of compressing inverted indices
    in the excellent "Modern Information Retrieval" by Baeza-Yates and
    Ribeiro-Neto, pp. 184 ff.
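
    For anyone who has not seen the general idea, here is a minimal
    sketch of one common way to compress an ordered list of occurrence
    positions: store the gaps between consecutive positions and write
    each gap with a variable-length byte code. This is only an
    illustration under my own assumptions. It is not necessarily the
    method in the Moffat paper mentioned in the quoted message, nor the
    one in the book, and the function names are made up for the example.

```python
def encode_positions(positions):
    """Gap-encode a strictly increasing list of non-negative ints,
    then write each gap as a variable-length byte sequence.

    Illustrative sketch only; not the scheme from any particular paper.
    """
    out = bytearray()
    prev = -1
    for pos in positions:
        gap = pos - prev  # >= 1 whenever the input is strictly increasing
        if gap <= 0:
            raise ValueError("positions must be strictly increasing")
        prev = pos
        # Emit 7 bits per byte, low-order group first.
        # A set high bit means "more bytes follow for this gap".
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def decode_positions(data):
    """Inverse of encode_positions: rebuild the original position list."""
    positions = []
    prev = -1
    gap = 0
    shift = 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # continuation byte: keep accumulating
        else:
            prev += gap         # last byte of this gap: emit a position
            positions.append(prev)
            gap = 0
            shift = 0
    return positions
```

    Small gaps, which dominate for frequent words, fit in a single byte
    here, while a fixed-width layout spends four bytes on every
    position. A byte-aligned code like this is coarser than a bit-level
    one, so it should not be expected to reproduce the 18-bit average
    quoted below.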

     
    >
    > Hi Mickel,
    >
    > can you post the reference to that Moffat article? Sounds interesting.
    >
    > Thanks & regards,
    > Jochen
    >
    > > It is also possible to considerably reduce the above-mentioned array
    > > by compressing ordered lists of occurrence positions. I found a
    > > paper by Alistair Moffat at the Dept. of Computer Science of Univ.
    > > of Melbourne describing a method for compressing ordered lists of
    > > numbers.
    > [...]
    > > 18 bits on average, which is almost half of the 32 bits you would need
    > > when storing such a list of numbers in the obvious way.
    > > If you want, I can send you the program to have a look at it.
    >
     


