Autocompletion / autosuggestion using Solr

There are several ways to get autocompletion/autosuggestion working with Solr. Below is an overview of these options with their advantages and disadvantages. If you have a question about one of these techniques, please feel free to leave a comment (I just added DISQUS as the comment system for my blog).

I will write another blog post in the coming days about the “Combine faceting and query of (Edge)NGram filtered text” approach. This technique is not described in the resources below, and it is actually the one I currently use.

(Edge)NGram filtering + text query

Pro:

  • Supported by stable Solr (1.4.x)
  • Results sorted by best match
  • Prefix query support using EdgeNGram filter
  • Infix query support using NGram filter

Contra:

  • Duplicates are very likely to occur for non-unique source data
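
To make the prefix/infix distinction concrete, here is a minimal Python sketch of the tokens the two filter types produce at index time. It is only an illustration of the idea, not Solr's implementation; the function names and the size limits are my own assumptions.

```python
def edge_ngrams(token, min_size=1, max_size=15):
    """Prefixes of the token, similar to an edge n-gram filter anchored at the front."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def ngrams(token, min_size=1, max_size=15):
    """Every substring of the token whose length lies between min_size and max_size."""
    grams = []
    for n in range(min_size, min(max_size, len(token)) + 1):
        for start in range(len(token) - n + 1):
            grams.append(token[start:start + n])
    return grams


word = "solr"
print(edge_ngrams(word, min_size=2))  # ['so', 'sol', 'solr']
print(ngrams(word, min_size=2))       # ['so', 'ol', 'lr', 'sol', 'olr', 'solr']
```

A typed fragment such as "ol" only matches the NGram tokens, which is why infix queries need the NGram filter, while the EdgeNGram filter is enough for prefixes and produces far fewer tokens.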

Faceting

Pro:

  • Supported by stable Solr (1.4.x)
  • No duplications

Contra:

  • Only prefix queries supported
  • Only sorted by usage count
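
A faceting request for suggestions could be built roughly like this. This is a sketch under assumptions: the base URL, the `/select` handler path and the field name `title_exact` are placeholders for your own setup; the `facet.*` parameters are the standard Solr faceting parameters.

```python
from urllib.parse import urlencode


def facet_prefix_url(base_url, field, typed, limit=10):
    """Build a facet-only request that returns field values starting with `typed`."""
    params = [
        ("q", "*:*"),             # match all documents, only the facets matter
        ("rows", 0),              # no documents in the response
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", typed),  # compared against the indexed terms
        ("facet.mincount", 1),
        ("facet.limit", limit),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core URL and field name.
print(facet_prefix_url("http://localhost:8983/solr", "title_exact", "sol"))
```

Each distinct value appears once in the facet list, which is where the "no duplications" advantage comes from. Because `facet.prefix` is compared against the indexed terms, the typed text has to be normalized (for example lowercased) the same way as the field.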

Combine faceting and query of (Edge)NGram filtered text

Pro:

  • Supported by stable Solr (1.4.x)
  • Supports prefix queries (using EdgeNGram filter)
  • Supports infix queries (using NGram filter)

Contra:

  • Doesn’t work with multi-valued fields
  • Only sorted by usage count
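
As a rough preview of the idea (not necessarily the exact setup I use), the request queries the n-gram field and facets on an untokenized copy of the same data. The field names `title_ngram` and `title_exact`, the base URL and the handler path are assumptions for illustration, and the typed text is assumed to contain no query syntax characters that would need escaping.

```python
from urllib.parse import urlencode


def facet_on_ngram_query_url(base_url, ngram_field, facet_field, typed, limit=10):
    """Query the n-gram field, but read the suggestions from the facet counts."""
    params = [
        ("q", f"{ngram_field}:{typed}"),  # prefix or infix match via the n-gram tokens
        ("rows", 0),                      # documents are not needed
        ("facet", "true"),
        ("facet.field", facet_field),     # untokenized values, one entry per distinct value
        ("facet.mincount", 1),
        ("facet.limit", limit),
    ]
    return f"{base_url}/select?{urlencode(params)}"


print(facet_on_ngram_query_url(
    "http://localhost:8983/solr", "title_ngram", "title_exact", "olr"))
```

With a multi-valued facet field, every value of a matching document is counted, including values that do not contain the typed text, which explains the first limitation above.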

Terms

  • Easy query syntax
  • Prefix support out of the box in Solr 1.4.x
  • Sorted by usage count
  • Infix (using regular expression matching) may be slow
  • Infix only supported by Solr 3.x and above
  • No filter query (fq) to restrict collection to search
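
A sketch of both query variants against the TermsComponent follows. The `/terms` handler path, the base URL and the field name are assumptions about the configuration; `terms.fl`, `terms.prefix`, `terms.regex` and `terms.limit` are TermsComponent parameters. Using Python's `re.escape` for the typed text is my own simplification, since Solr evaluates the pattern as a Java regular expression.

```python
import re
from urllib.parse import urlencode


def terms_url(base_url, field, typed, infix=False, limit=10):
    """Build a TermsComponent request for prefix or (regex based) infix matching."""
    params = [("terms", "true"), ("terms.fl", field), ("terms.limit", limit)]
    if infix:
        # Regex matching needs Solr 3.x or above and may be slow on large term lists.
        params.append(("terms.regex", ".*" + re.escape(typed) + ".*"))
    else:
        params.append(("terms.prefix", typed))
    return f"{base_url}/terms?{urlencode(params)}"


print(terms_url("http://localhost:8983/solr", "title", "sol"))
print(terms_url("http://localhost:8983/solr", "title", "olr", infix=True))
```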

Suggester

Pro:

  • Created for exactly this autosuggestion task
  • Results sorted by best match

Contra:

  • No filter query (fq) to restrict collection to search
  • Needs to build its own index ('/suggest?spellcheck.build=true')
  • Only available in Solr 3.x branch (or above)
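
The two kinds of requests involved could be sketched like this, assuming the Suggester is registered under a `/suggest` handler as in the build URL above. The base URL is a placeholder, and whether `spellcheck=true` and `spellcheck.count` must be sent explicitly depends on the defaults configured for the handler.

```python
from urllib.parse import urlencode


def suggest_build_url(base_url):
    """Request that (re)builds the suggester's own index."""
    return f"{base_url}/suggest?{urlencode([('spellcheck.build', 'true')])}"


def suggest_query_url(base_url, typed, count=10):
    """Request that asks for up to `count` suggestions for the typed text."""
    params = [("q", typed), ("spellcheck", "true"), ("spellcheck.count", count)]
    return f"{base_url}/suggest?{urlencode(params)}"


print(suggest_build_url("http://localhost:8983/solr"))
print(suggest_query_url("http://localhost:8983/solr", "sol"))
```

The build request has to be repeated after the main index changes, unless the component is configured to rebuild on commit.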

Field collapsing

Pro:

  • Results sorted by best match
  • No duplications
  • Support for prefix queries (using EdgeNGram filter)
  • Support for infix queries (using NGram filter)

Contra:

  • Only available in development branch of Solr (4.x branch)
  • No multi-valued fields supported yet (but planned)
  • Some performance issues (work in progress)
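
A request using field collapsing might look like the following sketch. It uses the `group` and `group.field` parameters described on the FieldCollapsing wiki page; since the feature is still under development, the parameter names may change. The field names, base URL and handler path are again assumptions, and the typed text is assumed to need no query escaping.

```python
from urllib.parse import urlencode


def grouped_suggest_url(base_url, ngram_field, group_field, typed, rows=10):
    """Query the n-gram field and collapse hits that share the same exact value."""
    params = [
        ("q", f"{ngram_field}:{typed}"),  # scored match, so best matches come first
        ("group", "true"),
        ("group.field", group_field),     # one group per distinct value, hence no duplicates
        ("group.limit", 1),               # a single document per group is enough
        ("rows", rows),                   # number of groups to return
        ("fl", group_field),
    ]
    return f"{base_url}/select?{urlencode(params)}"


print(grouped_suggest_url(
    "http://localhost:8983/solr", "title_ngram", "title_exact", "sol"))
```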

Resources

  • http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
  • http://lucidworks.lucidimagination.com/display/LWEUG/Spell+Checking+and+Automatic+Completion+of+User+Queries
  • http://solr.pl/en/2010/10/18/solr-and-autocomplete-part-1/
  • http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/
  • http://solr.pl/en/2010/11/29/solr-and-autocomplete-part-3/
  • http://www.mattweber.org/2009/05/02/solr-autosuggest-with-termscomponent-and-jquery/
  • http://www.packtpub.com/article/faceting-in-solr-1.4-enterprise-search-server
  • http://wiki.apache.org/solr/FieldCollapsing
  • http://wiki.apache.org/solr/Suggester
  • http://wiki.apache.org/solr/TermsComponent