Wednesday, 01.03.06

Released Lucenemodule 1.7.9

I've released version 1.7.9 of my pet project, the MMBase Lucenemodule. These are the main changes:

New Features:

  1. Extractor plugins
    Content extraction from PDF and Word documents is now handled by Extractors, which makes the module less dependent on external jars. An Extractor has to implement an interface and has to be declared in the definition xml file. The Lucenemodule picks the Extractor to run by mimetype, and you can register the same Extractor for several mimetypes if you wish. The Word and PDF extraction that was used before is now included automatically as the default, and can be overruled by defining an Extractor with the same mimetype.
  2. Support for Excel and RTF
    These formats are now supported through Extractors.
  3. StandardCleaningAnalyzers
    These Analyzers, available for Dutch and English, are useful in most cases where you want to index and search words with special characters.
    A word like 'één' can then be found both in its original form 'één' and in its base form 'een'.
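To make the first and third feature a bit more concrete, here is a minimal sketch in plain Java. It is not the module's actual code: the names `Extractor`, `ExtractorRegistry` and `AccentFolding` are made up for illustration, and the real interface and the real Analyzers may well look different. It only shows the two ideas: looking up an extractor by mimetype (where a later registration overrules an earlier one), and reducing a word with accents to its base form.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical interface, not the real Lucenemodule one:
// turns the raw bytes of a document into indexable text.
interface Extractor {
    String extract(byte[] content);
}

// Hypothetical registry that selects an extractor by mimetype.
class ExtractorRegistry {
    private final Map<String, Extractor> byMimeType = new HashMap<>();

    // Registering a mimetype a second time replaces the earlier extractor,
    // which is how a default extractor could be overruled.
    void register(String mimeType, Extractor extractor) {
        byMimeType.put(mimeType.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns the extracted text, or an empty string for an unknown mimetype.
    String extract(String mimeType, byte[] content) {
        Extractor extractor = byMimeType.get(mimeType.toLowerCase(Locale.ROOT));
        return extractor == null ? "" : extractor.extract(content);
    }
}

// Assumed approach to the "base form": decompose the characters and
// drop the combining marks. The real Analyzers may do this differently.
class AccentFolding {
    static String baseForm(String word) {
        String decomposed = Normalizer.normalize(word, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }
}

// A trivial extractor for plain text, usable for several mimetypes.
class PlainTextExtractor implements Extractor {
    @Override
    public String extract(byte[] content) {
        return new String(content, StandardCharsets.UTF_8);
    }
}
```

The same `PlainTextExtractor` instance could be registered for more than one mimetype, and indexing both the original word and the result of `baseForm` is what would let a search match either form.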

Improvements:

  1. Cache cleanup
    The MMBase cache is now cleaned while nodes are being indexed, to keep memory usage low. This simulates the behaviour of the HugeNodeListIterator from MMBase 1.8.
  2. SearchTag custom queries
    Custom queries in the search tag are no longer supported; use the taglib's match tag instead.
  3. re-use-index
    This option in the module xml file keeps the index intact, so that a restart of MMBase only updates the existing index. Use it in combination with a large interval time to keep the module from re-indexing.
  4. Usage of Lucene 1.9
    The module now runs on Lucene 1.9 and has been moved to the new Lucene 2.0 API.

Bugfixes:

  1. Various bugs were fixed with the help of Alban Hertroys and others.
wouter - 14:04:43 - Events