I just returned from Solr Summit in Frankfurt, a half-day mini-conference about Solr, the search server based on Apache Lucene. It was a really worthwhile event with a lot of insight into large-scale implementations of Solr.
In the first half of the conference, Marc Krellenstein, a co-founder of Lucid Imagination, presented trends in enterprise search as well as Lucid's commercial Solr environment, Lucid Works Enterprise. After outlining the history of search systems, he presented different characteristics of a successful search system. Although the talk was given by someone who is obviously biased towards Solr and Lucene, he also summarized where commercial search systems like Autonomy and Fast have their strengths. It is good to get some insight into competing systems.
Afterwards, Oliver Schönherr and Thomas Kwiatkowski spoke about how Solr is used at Immobilienscout24, where it powers full-text search. Solr was selected after an evaluation period in which commercial as well as non-commercial systems were compared. The way Solr is used there is probably not a common use case. IS24 uses a custom-built search system for its structured search, where you basically refine the search results using different form fields. Solr is used to search within this result list by intersecting the Solr search results with the results of the legacy system. They are using a plain Solr 1.4 without any patches and only two additional components: a scheduler for the data import handler that indexes a database, and a component that provides fast access to only the IDs of documents, because that is all that is needed for the intersection.
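To make the intersection idea a bit more concrete, here is a minimal sketch in Python. It is not how IS24 implements it (the talk did not show code); the function name and the sample IDs are made up for illustration, and I am assuming that both systems simply return lists of document IDs and that the order of the structured search should be kept.

```python
def intersect_preserving_order(structured_ids, fulltext_ids):
    """Keep only the structured-search hits that the full-text search also matched.

    Both arguments are iterables of document IDs (hypothetical data; the real
    systems would deliver them from the legacy search and from Solr).
    The order of structured_ids is preserved.
    """
    # A set gives constant-time membership checks on average.
    fulltext_set = set(fulltext_ids)
    return [doc_id for doc_id in structured_ids if doc_id in fulltext_set]


# Hypothetical result lists: one from the legacy structured search,
# one containing only the IDs returned by the full-text search.
legacy_hits = [101, 205, 307, 412, 518]
solr_hits = [412, 999, 101, 640]

print(intersect_preserving_order(legacy_hits, solr_hits))  # [101, 412]
```

This also shows why a component that returns nothing but IDs is attractive: the intersection never looks at any other stored field.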
The last talk was given by Olaf Zschiedrich of eBay Kleinanzeigen, formerly known as Kijiji. eBay Kleinanzeigen seems to use nearly all the features that Solr has to offer, most notably faceting, autocompletion and more-like-this for displaying related articles. The site is being developed by a relatively small team and seems to be blazing fast even though there are lots of hits on Solr, 1500 requests/s at peak times. Of course this is only possible because Solr is designed to be scalable by means of its replication features, its internal caching and the external caching support through ETags. At eBay Kleinanzeigen there are 12 Solr instances that are used for searching, but according to Olaf, 8 would still be enough to keep the resource consumption under 50%.
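A quick back-of-the-envelope calculation puts these numbers into perspective. This is only a rough sketch under my own assumptions, none of which were stated in the talk: the peak load is spread evenly across the search instances, resource consumption grows linearly with the request rate, and caching effects are ignored.

```python
def per_instance_load(peak_rps, instances):
    """Requests per second each instance sees if load is spread evenly (assumption)."""
    return peak_rps / instances


PEAK_RPS = 1500  # peak request rate mentioned in the talk

load_12 = per_instance_load(PEAK_RPS, 12)  # current setup: 125.0 requests/s
load_8 = per_instance_load(PEAK_RPS, 8)    # reduced setup: 187.5 requests/s

# If 8 instances would stay under 50% consumption, then, under the linear
# assumption, one instance could handle at least twice its share of the load.
implied_capacity = load_8 / 0.5            # 375.0 requests/s

# Upper bound on the utilisation of the current 12 instances at peak,
# again only valid under the assumptions above.
utilization_12 = load_12 / implied_capacity

print(load_12, load_8, implied_capacity, round(utilization_12, 2))
```

So with these simplifications each of the 12 instances would run at roughly a third of its capacity at peak, which leaves a comfortable margin for failures and traffic spikes.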
All of the talks were really interesting, with Olaf Zschiedrich's being the one that got the most laughs. I have learned a lot and appreciate the time and money Lucid Imagination and its partners have invested to make this event possible.