Zend_Search_Lucene: Not enterprise-ready
Posted on November 7th, 2008 in PHP, Zend Framework | 1 Comment »
Zend Framework has been attracting more and more attention from the PHP community lately, and while it lacks certain things (like code generation) that other frameworks (like Rails) have implemented to great effect, Zend Framework 2.0 is slowly taking shape and it looks like it will be the framework of choice for startups and enterprises alike. (Yes, it will even have code generation.)
But despite having several “enterprise-ready” components, I’ve found that one in particular is not: Zend_Search_Lucene, Zend Framework’s native PHP implementation of Apache Lucene, written in Java.
Don’t get me wrong; Zend_Search_Lucene is great for a small site or blog. However, from extensive personal experience, it is not appropriate for a site with a medium or large index. I think this should be noted upfront in the documentation.
Against my better judgment, the company I work for migrated our previous search solution to Zend_Search_Lucene. On pretty heavy-duty hardware, indexing a million documents took several hours, and searches were relatively slow. The indexing process consumed vast amounts of memory, and the indexes frequently became corrupted (using 1.5.2). A single wild card search literally brought the web server to its knees, so we disabled that feature. Memory usage was very high for searches, and as a result requests per second necessarily declined heavily as we had to reduce the number of Apache child processes.
We have since moved to Solr (a Lucene-based Java search server) and the difference is dramatic. Indexing now takes around 10 minutes and searches are lightning fast. What a difference a language makes.