Corpora: TiMBL 4.0 - new release of Tilburg Memory-Based Learner

From: Antal van den Bosch (Antal.vdnBosch@kub.nl)
Date: Tue Aug 14 2001 - 14:05:00 MET DST

  • Next message: Veronika Koller: "Corpora: summary Brill's vs. CLAWS"

    ----------------------------------------------------------------------
     Software release: TiMBL 4.0
                            Tilburg Memory Based Learner
                            ILK Research Group, http://ilk.kub.nl
                            CNTS - Language Technology Group
    ----------------------------------------------------------------------

    Apologies for double postings.

    The ILK (Induction of Linguistic Knowledge) Research Group at Tilburg
    University, The Netherlands, and the CNTS - Language Technology Group,
    at the University of Antwerp, Belgium, announce the release of a new
    version of TiMBL, the "Tilburg Memory Based Learner", version 4.0.

    TiMBL is a machine learning program implementing a family of
    Memory-Based Learning techniques. TiMBL stores a representation of the
    training set explicitly in memory (hence `Memory Based'), and
    classifies new cases by extrapolating from the most similar stored
    cases.

    TiMBL is being developed with a focus on classification tasks with
    symbolic data, large numbers of features and values, and very large
    case bases, as typically found in natural language processing. However,
    TiMBL can be applied to any machine learning or data mining task for
    which labeled examples with fixed numbers of features are available.
    The main features of the system are:

    - Support for symbolic, numeric and binary features.
    - Automatic feature weighting. Information Gain, Gain Ratio,
      Chi-squared, and Shared Variance weighting are provided for dealing
      with features of differing importance.
    - Stanfill & Waltz's / Cost & Salzberg's (Modified) Value Difference
      metric for making graded guesses of the match between two
      different symbolic values.
    - Speed up optimizations that enhance the underlying k-nearest
      neighbor classifier kernel: Conversion of the flat instance memory
      into a decision tree, and inverted indexing of the instance memory,
      both yielding faster classification.
    - Further compression and pruning of the decision tree, guided by
      feature information gain differences, for even larger speed-ups
      (the IGTREE and TRIBL learning algorithms).
    - Verbose output to enable the monitoring of the process of
      extrapolation from nearest neighbors.
    - A multithreaded TiMBL server that can be used as a classification
      agent.
    - Fast leave-one-out testing.

    Version 4.0 offers a number of new features:

    - Class voting weighted by distance (inverse, linear, or decayed
      exponentially) or by user-defined exemplar weights.
    - Emulation of the IB2 algorithm, an incremental editing variant of
      IB1 (Aha, Kibler and Albert, 1991).
    - Internal n-fold cross-validation testing.
    - Various additional verbosity options, bug-fixes and code improvements.

    For more information: The reference guide ("TiMBL: Tilburg
    Memory-Based Learner, version 4.0, Reference Guide.", Walter
    Daelemans, Jakub Zavrel, Ko van der Sloot, and Antal van den
    Bosch. ILK Technical Report 01-04) can be downloaded separately and
    directly from

           http://ilk.kub.nl/downloads/pub/papers/ilk0104.ps.gz

    -[download]-----------------------------------------------------------

    You are invited to download the TiMBL package for educational or
    non-commercial research purposes. When downloading the package you are
    asked to register, and express your agreement with the license
    terms. TiMBL is *not* shareware or public domain software. If you have
    registered for a previous version, please be so kind to re-register
    for the upgrade. TiMBL can be downloaded from

        http://ilk.kub.nl/

    by following the `Software' link.

    The TiMBL package contains:

    - Source code (C++) with a Makefile.
    - A reference guide containing descriptions of the incorporated
      algorithms, detailed descriptions of the commandline options,
      and a brief hands-on tutorial.
    - Some example datasets.
    - The text of the license agreement.

    The package should be easy to install on most UNIX systems.

    -[contact]---------------------------------------------------------

    For comments and bugreports relating to TiMBL, please send mail to

           Timbl@kub.nl

    ----------------------------------------------------------------------



    This archive was generated by hypermail 2b29 : Tue Aug 14 2001 - 13:59:40 MET DST