Re: [Corpora-List] summary: free sentencizers ; test different sentencizers with cgi script

From: Joerg Schuster (js@cis.uni-muenchen.de)
Date: Mon Mar 10 2003 - 10:05:03 MET

    > 1. The test passes the text using GET method and does not "escape" the
    > text before sent to the server. This can easily crash your test program.

    I will improve this, but it may take some time, because our system
    administrators will install a new operating system on our web server
    tomorrow. (And I am not sure whether, or how well, things will work after that.)
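
    For reference, "escaping" here means percent-encoding the text before it
    is appended to the query string of a GET request. The following is only a
    minimal sketch using core Perl: the function name escape_query_value, the
    example.org URL and the "text" parameter are assumptions made for
    illustration, not the actual test script or its interface.

```perl
use strict;
use warnings;

# Percent-encode every byte outside the RFC 3986 "unreserved" set
# (letters, digits, '-', '.', '_', '~') so the text is safe in a query string.
sub escape_query_value {
    my ($text) = @_;
    # Work on UTF-8 bytes if the string holds decoded characters.
    utf8::encode($text) if utf8::is_utf8($text);
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical CGI URL and parameter name, for illustration only.
my $url = 'http://example.org/cgi-bin/sentencizer.cgi?text='
        . escape_query_value('Is this one sentence? Yes & no.');
print "$url\n";
```

    Without this step, characters such as "&", "?" or "#" in the submitted
    text would be read as part of the URL syntax rather than as data. For
    long texts, sending the data with POST instead of GET may be the more
    robust choice.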

    > 3. As I am also on this mailing list, I'd be happy to accept bug-reports and
    > feature requests and further develop this software. Hopefully, if there is
    > enough interest it will grow to be good enough so everyone can use
    > it.

    I think one of the disadvantages of your program is that it holds
    all data in main memory. You have to write something like

     my $sentences=get_sentences($in);

    Though this is very convenient when dealing with small files, I would
    rather write something like

    while (<>) {
        print_sentences($_);
    }

    Then huge files could easily be sentencized, too.
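
    A rough sketch of how such a streaming interface could work is given
    below. It is not the program under discussion: make_sentence_stream is a
    hypothetical name, and the boundary rule (".", "!" or "?" followed by
    whitespace) is a deliberately naive assumption that will split wrongly
    after abbreviations. The point is only that the unfinished tail of the
    input is the sole text kept in memory.

```perl
use strict;
use warnings;

# Returns a closure that takes one chunk of text per call and returns the
# sentences completed so far. Calling it with undef flushes the remainder.
sub make_sentence_stream {
    my $buffer = '';
    return sub {
        my ($chunk) = @_;
        my @done;
        if (defined $chunk) {
            $buffer .= $chunk;
            # Naive rule (an assumption): ., ! or ? followed by whitespace.
            while ($buffer =~ s/\A\s*(.+?[.!?])\s+//s) {
                (my $sentence = $1) =~ s/\s+/ /g;   # join wrapped lines
                push @done, $sentence;
            }
        }
        else {
            # End of input: emit whatever is left, trimmed.
            $buffer =~ s/\A\s+|\s+\z//g;
            push @done, $buffer if length $buffer;
            $buffer = '';
        }
        return @done;
    };
}

# Demo on an in-memory file; a sentence may span several input lines.
my $text = "This is one sentence. Here is\na second one! And a third?\n";
open my $in, '<', \$text or die "cannot open in-memory file: $!";
my $feed = make_sentence_stream();
while (my $line = <$in>) {
    print "$_\n" for $feed->($line);
}
print "$_\n" for $feed->(undef);
close $in;
```

    To process real files or standard input, the in-memory handle can be
    replaced by the usual while (<>) loop. Memory use then depends on the
    length of the longest sentence, not on the size of the file.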

    Jörg


