Wednesday, September 23, 2009

Why is Lucene MoreLikeThis Final?

Lucene has a class, MoreLikeThis, which can be used to build a search query that finds documents similar to a given example document.

This is very useful; in my RSS reader I allow users to find articles similar to one they find interesting.

I'd like to do something a bit more advanced: given an instance of class A, find similar instances of class B. The two classes are stored in separate Lucene indices; the vast majority of the time they are queried separately.

For some reason the Lucene developers have chosen to make MoreLikeThis final, and I can't see any reason for this. I'd like to be able to extend it to add a new public Query like... method, but no dice. The primary methods I need to call (createQuery()) are private, so there doesn't seem to be a way around it:

I'll need to make my own version of the whole class. Ick. Looking at it positively, this will encourage me to learn the internals of the class instead of using it as a black box.
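As a starting point for that, here is a rough sketch of the core idea as I understand it: score each term of the source document by tf * idf and keep the best ones as query terms. To go from class A to class B, the idf is computed against the target (B) index rather than the source one. This is plain Java with no Lucene dependency; the class name, the method name, the crude tokenizer, and the docFreq map standing in for the index statistics are all made up for illustration. The idf formula is the one I believe Lucene's DefaultSimilarity uses, and the real class applies more filtering (minimum term and document frequencies, stop words) than this does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: selects query terms for a
// "more like this" search that runs against a different index.
public class CrossIndexTerms {

    /**
     * Returns up to maxTerms terms from sourceText, ordered by descending
     * tf * idf, where idf comes from the TARGET index's statistics.
     */
    public static List<String> interestingTerms(String sourceText,
                                                Map<String, Integer> targetDocFreq,
                                                int targetNumDocs,
                                                int maxTerms) {
        // Term frequency inside the source document (the class A instance).
        Map<String, Integer> tf = new HashMap<String, Integer>();
        for (String token : sourceText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue;
            }
            Integer old = tf.get(token);
            tf.put(token, old == null ? 1 : old + 1);
        }

        // Score only the terms that actually occur in the target (class B) index.
        final Map<String, Double> score = new HashMap<String, Double>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            Integer df = targetDocFreq.get(e.getKey());
            if (df == null || df == 0) {
                continue; // the term can never match anything in the target index
            }
            // Assumed idf formula: ln(numDocs / (docFreq + 1)) + 1
            double idf = Math.log((double) targetNumDocs / (df + 1)) + 1.0;
            score.put(e.getKey(), e.getValue() * idf);
        }

        // Highest score first; ties are broken alphabetically for stable output.
        List<String> terms = new ArrayList<String>(score.keySet());
        terms.sort((a, b) -> {
            int byScore = Double.compare(score.get(b), score.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<String>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

In a real version the selected terms would then be OR-ed together into a query against the class B index, with the document frequencies read from that index's reader instead of a map.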

Saturday, September 19, 2009

Upgrading to EhCache 1.6.2 with Hibernate to enable JMX Monitoring

I couldn't (easily) find this answer, so here goes: Yes, it is completely OK to use EhCache 1.6.2 with Hibernate (3.3.2 in my case).

For whatever reason, the maven package hibernate-ehcache uses an old version of ehcache, version 1.2.3. This version doesn't support JMX (which is why I wanted to upgrade) but supposedly there are significant performance improvements in newer ehcache versions as well.

My app is Maven/Spring/Hibernate. First, in your Spring config, make sure to use the Singleton version of the ehcache provider; otherwise you won't be able to turn on JMX:

<entry key="hibernate.cache.provider_class" value="net.sf.ehcache.hibernate.SingletonEhCacheProvider" />


Then in maven, exclude the old version and bring in the new one:

        <dependency>
            <groupId>org.hibernate</groupId>
            <artifactId>hibernate-ehcache</artifactId>
            <version>3.3.2.GA</version>
            <exclusions>
                <exclusion>
                    <groupId>net.sf.ehcache</groupId>
                    <artifactId>ehcache</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>net.sf.ehcache</groupId>
            <artifactId>ehcache</artifactId>
            <version>1.6.2</version>
        </dependency>


Finally, you will need to register the JMX MBeans for your caches when your app starts:

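        // CacheManager is net.sf.ehcache.CacheManager; ManagementService is net.sf.ehcache.management.ManagementService
        CacheManager manager = CacheManager.getInstance();
        MBeanServer mBeanServer = ManagementFactory.getPlatformMBeanServer();
        // Flags, in order: cache manager, caches, cache configurations, cache statistics (only statistics here)
        ManagementService.registerMBeans(manager, mBeanServer, false, false, false, true);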


Now you can see how many items are in each cache region using JConsole. Enjoy!
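The same numbers can also be read in code, which is handy for a log line or a health check. The following is a minimal sketch using only the JDK's JMX API; the class and method names are made up for illustration, and the ObjectName pattern and the ObjectCount attribute are my assumptions about how the ehcache statistics MBeans are named, so check the actual names in JConsole first.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical helper: dumps one attribute from every MBean matching a name pattern.
public class JmxAttributeDump {

    /** Reads the given attribute from each MBean whose name matches namePattern. */
    public static Map<String, Object> readAttribute(MBeanServer server,
                                                    String namePattern,
                                                    String attribute) {
        Map<String, Object> values = new TreeMap<String, Object>();
        try {
            for (ObjectName name : server.queryNames(new ObjectName(namePattern), null)) {
                values.put(name.getCanonicalName(), server.getAttribute(name, attribute));
            }
        } catch (JMException e) {
            // Bad pattern, missing attribute, or an MBean that vanished mid-query.
            throw new IllegalStateException("Could not read " + attribute, e);
        }
        return values;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed pattern and attribute name for the ehcache statistics MBeans.
        Map<String, Object> counts =
                readAttribute(server, "net.sf.ehcache:type=CacheStatistics,*", "ObjectCount");
        for (Map.Entry<String, Object> e : counts.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

If nothing matches the pattern (for example, because the MBeans were never registered), the map is simply empty, so an empty printout is a hint to re-check the registration step.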

Friday, September 18, 2009

Hibernate Search FTW

In praise of Hibernate Search:

-Hibernate Search is both easy to learn and easy to use
-A great book, "Hibernate Search in Action", is available to tame the learning curve
-It is very well designed, exposing just the right amount of the underlying Lucene extensibility. I am using Filters, programmatic construction of complex queries, FieldBridges to search custom types, and a custom Scorer for result ranking

Just wanted to put this out there: If you're thinking of using Hibernate Search, jump in. To the Hibernate Search team: thank you!