11/12/2014

Liferay 7: Elasticsearch vs. Solr

Liferay 7 will switch from Lucene to Elasticsearch as the standard search provider. But what is Elasticsearch, and why is it a good idea to have it in your Liferay installation? And how does it differ from Solr, which is already a good alternative and can be integrated into Liferay pretty easily?
Let's take a look at the Google search queries from the last few years to see how many people care about which search technology.



As you can see, Elasticsearch surpassed Solr in the last year. The question is: why? Let's talk a little about both technologies to see whether there is an actual technological explanation for what we see here. Note: There are so many ways to compare the two technologies that there is a dedicated site just for that. If you want to dig really deep, I suggest taking a look here: http://solr-vs-elasticsearch.com/

This video is a nice introduction to Solr:




And this one is a good one about Elasticsearch:





Suppose your Liferay search has to meet more requirements than just providing a search field. You may need to specify how the search behaves and which data types are indexed, and you may need a clustered, fail-safe environment that holds a lot of data.

If you have a lot of data, you need "shards". A shard is a self-contained partial index (technically a Lucene index of its own) that your search provider queries together with the others. Solr and Elasticsearch both let you split an index into shards, and both let you define a schema per index. If your data falls into separate groups, you can keep one index for the users, another one for documents and a third one for car information (Solr calls these cores or collections). Each index can have different attributes that you can search on. Those attributes are either detected out of the box (Solr and Elasticsearch both support this), or you can write your own fine-grained schema file. This approach is my choice for huge Liferay projects.
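To make this a bit more concrete, here is a minimal sketch of the JSON body you could send when creating a "users" index in Elasticsearch (for example with PUT /users). It only builds and prints the body; nothing is sent to a server. The index name, the field names and the helper build_index_request are made up for illustration, and the mapping layout shown is the typeless one used by recent Elasticsearch versions (older versions nest the mapping under a type name and use "string" instead of "text"/"keyword").

```python
import json


def build_index_request(shards, replicas, fields):
    """Build the JSON body for a create-index request.

    `fields` maps a (hypothetical) field name to an Elasticsearch field type.
    """
    return {
        "settings": {
            # How many primary shards the index is split into.
            "number_of_shards": shards,
            # How many copies of each primary shard are kept.
            "number_of_replicas": replicas,
        },
        "mappings": {
            # The "schema": one entry per searchable attribute.
            "properties": {
                name: {"type": field_type} for name, field_type in fields.items()
            }
        },
    }


# Hypothetical schema for a users index.
users_index = build_index_request(
    shards=3,
    replicas=1,
    fields={"screenName": "keyword", "emailAddress": "keyword", "fullName": "text"},
)

print(json.dumps(users_index, indent=2))
```

A documents index or a car index would get its own call with different fields, so each one can expose different attributes to search on.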

You can even distribute your shards over several machines. Elasticsearch does this out of the box; Solr offers SolrCloud, which supports distribution.
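The idea behind this distribution can be sketched in a few lines. Elasticsearch picks the shard for a document by hashing a routing value (the document ID by default) and taking it modulo the number of primary shards. The sketch below follows that formula, but it uses CRC32 as a stand-in hash and a simple round-robin placement of shards on nodes, so it is an illustration under those assumptions and not what either product does internally. The function and node names are hypothetical.

```python
import zlib


def shard_for(doc_id, num_primary_shards):
    """Pick a shard as hash(routing value) modulo the number of primary shards.

    CRC32 is only a stand-in here; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def place_shards(num_shards, nodes):
    """Spread shards over nodes round-robin (a simplification of real allocation)."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


placement = place_shards(3, ["node-a", "node-b"])

for doc_id in ["user-1", "user-2", "user-3"]:
    shard = shard_for(doc_id, 3)
    print(doc_id, "-> shard", shard, "on", placement[shard])
```

Because the shard number depends on the number of primary shards, that number cannot simply be changed later without reindexing, which is why it is worth thinking about it up front.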

Elasticsearch also lets you define replica shards and thus, in my view, protects your application's search data a little better than Solr does (SolrCloud supports replicas as well). Both search providers allow you to process JSON documents; Solr also allows you to work with XML or CSV.
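To show what these formats look like, here is a small sketch that serializes one and the same document as JSON, as a Solr-style XML update message (an add element containing doc and field elements) and as CSV with a header row. The document and its field names are invented for illustration, and the sketch only produces the text; it does not talk to a search server.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical document; the field names are for illustration only.
doc = {"id": "doc-1", "title": "Quarterly report", "author": "jdoe"}


def to_json(document):
    """JSON, accepted by both Elasticsearch and Solr."""
    return json.dumps(document, sort_keys=True)


def to_solr_xml(document):
    """Solr-style XML update message: <add><doc><field name="...">...</field></doc></add>."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in document.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def to_csv(document):
    """CSV with a header row naming the fields, followed by one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(document), lineterminator="\n")
    writer.writeheader()
    writer.writerow(document)
    return buffer.getvalue()


print(to_json(doc))
print(to_solr_xml(doc))
print(to_csv(doc))
```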

Elasticsearch is made for the cloud, and it supports big data integration through the "ELK stack" (Elasticsearch, Logstash, Kibana) as a big data analysis solution. Visit the following sites to learn more:

Logstash: http://www.elasticsearch.com/products/logstash/
Kibana: http://www.elasticsearch.com/products/kibana/

Going through all the features that both search providers have, one could get the impression that there is no killer argument for either side of the "search battle". Both Solr and Elasticsearch can be used to do the same things. While it is a very good idea from the Liferay crew to integrate Elasticsearch instead of a simple Lucene index, there is no technological reason not to use Solr. Elasticsearch seems to be a little easier to use out of the box, and it is definitely a trending topic. Those might be the two key points that led to Liferay's decision to integrate it instead of Solr.

So the bottom line is: Take the technology you prefer. There is no need to migrate to Elasticsearch as long as you don't need one of the central features it offers that Solr doesn't have. For all those of you who never thought about the search technology Liferay uses under the hood: You will be able to distribute your indexes and have a much faster search than before.

Since Elasticsearch is based on Lucene, it will be interesting to see whether and how a migration will be possible.


Do you need expertise in Solr, Elasticsearch and/or Liferay?
Just contact me!

If you have any questions, feel free to leave a comment.