Recently I had the pleasure of implementing a REST endpoint that offers automatic search suggestions based on the user’s input. This is often also referred to as “type-ahead search”, “autocompletion” or “search as you type”.

The technology stack consisted of Apache Solr 7, Java 8 and Spring Boot. I had some trouble connecting the dots to get everything working. This is partly because some things have changed in recent Solr versions (e.g. the schema configuration), so there is a lot of outdated information floating around on the internet.

Prepare Solr

To start, let’s run a Solr server inside a Docker container using the following command: $ sudo docker run -p 8983:8983 -it solr:alpine sh. Next, create a new core by executing $ solr-create -c mars_core.

Note: In Solr, the term core is used to refer to a single index and associated transaction log and configuration files (including the solrconfig.xml and Schema files, among others).

After the core has been created, a Solr server will be started. Add some documents using the HTTP API of Solr:

curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/mars_core/update' --data-binary '
{
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "1", "title": "the book of life"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "2", "title": "the book of eli"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "3", "title": "orange is the new black"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "4", "title": "a clockwork orange"}
  }
}'

Note: Repeating the same key (add) within one JSON object is unusual, and many JSON libraries would keep only the last entry, but that’s how Solr expects this input. This Stack Overflow post deals with this.
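
If you would rather generate such a request body from code than type it by hand, a minimal sketch could look like the following. It uses only the JDK, because a typical JSON library would refuse to emit the repeated add key. The class and method names (UpdatePayload, addCommand, build) are made up for this illustration, and the escaping covers only quotes and backslashes, which is enough for the sample titles but not for arbitrary input.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
final class UpdatePayload {

  // Escapes only backslashes and double quotes (an intentional simplification).
  static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  // Builds one "add" command for a document that has an id and a title.
  static String addCommand(String id, String title) {
    return "\"add\": {\"commitWithin\": 1000, \"overwrite\": false, "
        + "\"doc\": {\"id\": \"" + escape(id) + "\", \"title\": \"" + escape(title) + "\"}}";
  }

  // Joins all commands with commas inside one JSON object, repeating the "add" key.
  static String build(Map<String, String> titlesById) {
    StringBuilder body = new StringBuilder("{");
    String separator = "";
    for (Map.Entry<String, String> entry : titlesById.entrySet()) {
      body.append(separator).append(addCommand(entry.getKey(), entry.getValue()));
      separator = ", ";
    }
    return body.append("}").toString();
  }

  public static void main(String[] args) {
    Map<String, String> docs = new LinkedHashMap<>();
    docs.put("1", "the book of life");
    docs.put("2", "the book of eli");
    System.out.println(build(docs));
  }
}
```

The printed string can be passed to curl via --data-binary in the same way as the hand-written body.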

We did not have to define any schema before adding our test documents because Solr starts in schemaless mode when the default configuration is used. This is not recommended for production use. If you want to know how to define your schema, be sure to read the notes about it at the end of the post (section “Recent changes in Solr schema configuration”).

After this, switch to the terminal that is connected to the running Solr container and stop the Solr process by sending it a SIGINT signal (just press Ctrl + C). Then open the file solr/server/solr/mars_core/solrconfig.xml and add the suggest search component at the end of the file (just before the </config> tag):

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">default</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="field">title</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>

Then add the corresponding request handler:

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Further documentation about the suggest search component can be found in the Solr Reference Guide. Please be aware that in earlier versions of Solr automatic suggestions were often implemented using its spell check component (see e.g. here). When searching for other tutorials dealing with automatic suggestions, make sure that they are about the dedicated SuggestComponent of Solr.

Now you can start Solr again using the $ solr-create -c mars_core command.
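
Before writing the REST endpoint, it can be useful to call the new /suggest handler directly and look at the raw response. The sketch below only builds the request URL, so you can paste the result into curl or a browser. The class name SuggestUrl is made up, and the host, port and core name are assumed to match the setup above. suggest.q and suggest.dictionary are parameters of Solr's SuggestComponent; the dictionary name default matches the name given to the suggester in solrconfig.xml.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; it builds a URL and sends nothing.
final class SuggestUrl {

  // Assumption: Solr runs locally and the core is called mars_core, as above.
  static final String CORE_URL = "http://localhost:8983/solr/mars_core";

  // Returns the URL that asks the "default" suggester for completions of the term.
  static String forTerm(String term) {
    try {
      return CORE_URL + "/suggest?suggest.dictionary=default&wt=json&suggest.q="
          + URLEncoder.encode(term, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // Every JVM is required to support UTF-8, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(forTerm("the book"));
  }
}
```

URL-encoding the term matters as soon as users type spaces or special characters; URLEncoder turns a space into a plus sign, which is how spaces are written in query strings.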

Implementing the REST API

Implementing the REST API is as simple as shown in the following code snippets:

File SearchController.java:

package net.mars.auth.rest.search;

import java.io.IOException;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SuggesterResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import net.mars.commons.domain.exceptions.ApplicationException;

@RestController
@RequestMapping("/snpapi/suggest")
public class SearchController {

  @Autowired
  SolrClient solrClient;
  
  @GetMapping
  public List<String> suggest(@RequestParam(name="q") String term) throws ApplicationException, SolrServerException, IOException {
    SolrQuery query = new SolrQuery();
    query.setRequestHandler("/suggest");
    query.setQuery(term);
    
    QueryResponse queryResponse = solrClient.query(query);
    SuggesterResponse suggesterResponse = queryResponse.getSuggesterResponse();
    List<String> suggestions = suggesterResponse.getSuggestedTerms().get("default");
    return suggestions;
  }

}
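
One thing I would guard against in a real service is a missing entry for the dictionary name, in which case the lookup with get("default") returns null instead of a list. The following is a minimal, defensive sketch of that idea. It is plain JDK code: the Map parameter merely stands in for the result of getSuggestedTerms(), and the class and method names are made up. I have not checked for every SolrJ version under which conditions the map or an entry can be missing, so treat this as a precaution rather than a documented requirement.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the Map stands in for SuggesterResponse.getSuggestedTerms().
final class SuggestedTerms {

  // Returns the terms of the given dictionary, or an empty list if there are none.
  static List<String> forDictionary(Map<String, List<String>> suggestedTerms, String dictionary) {
    if (suggestedTerms == null) {
      return Collections.emptyList();
    }
    List<String> terms = suggestedTerms.get(dictionary);
    return terms == null ? Collections.<String>emptyList() : terms;
  }

  public static void main(String[] args) {
    Map<String, List<String>> fromSolr = new HashMap<>();
    fromSolr.put("default", Arrays.asList("orange"));
    System.out.println(forDictionary(fromSolr, "default"));
    System.out.println(forDictionary(fromSolr, "missing"));
  }
}
```

Returning an empty list keeps the endpoint's JSON response an array in every case, which is usually easier for a type-ahead widget to handle than null.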

File SolrConfiguration.java:

package net.mars.migration;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.solr.core.SolrTemplate;
import org.springframework.data.solr.repository.config.EnableSolrRepositories;

@Configuration
@EnableSolrRepositories(basePackages = "net.mars")
public class SolrConfiguration {

  @Value("${solr.server.host}")
  private String solrUrl;
  
  @Bean
  public SolrClient solrClient() {
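      // The HttpSolrClient(String) constructor is deprecated; the Builder is the supported way to create a client.
      return new HttpSolrClient.Builder(solrUrl).build();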
  }

}

Spring Boot config file:

# baseURL The URL of the Solr server. 
#
solr.server.host=http://localhost:8983/solr/mars_core

Testing

Finally a test:

$ curl 'http://localhost:5000/snpapi/suggest?q=ora' && echo ""
[ "orange" ]
$ curl 'http://localhost:5000/snpapi/suggest?q=orage' && echo ""
[ "orange" ]
$

Did you notice the typo in the second request (“orage” instead of “orange”)? The suggest component tolerates such typos because we configured the FuzzyLookupFactory as its lookup implementation.
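
To get a feeling for what “fuzzy” means here, the following sketch matches a query against titles by edit distance: a title is suggested if some prefix of it can be turned into the query with at most a given number of single-character edits. This is only an illustration of the idea and not how Solr does it internally (Lucene's fuzzy suggester works with automata over a prebuilt index structure). The class name, the method names and the limit of one edit are assumptions of this example, and unlike the output above it returns whole titles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only; a naive stand-in for fuzzy prefix lookup, not Solr's algorithm.
final class FuzzyPrefixSketch {

  // Smallest Levenshtein distance between the query and any prefix of the candidate.
  static int prefixDistance(String query, String candidate) {
    int n = query.length();
    int m = candidate.length();
    int[] prev = new int[m + 1];
    for (int j = 0; j <= m; j++) {
      prev[j] = j; // distance between the empty query and a prefix of length j
    }
    for (int i = 1; i <= n; i++) {
      int[] cur = new int[m + 1];
      cur[0] = i; // distance between the first i query characters and the empty prefix
      for (int j = 1; j <= m; j++) {
        int cost = query.charAt(i - 1) == candidate.charAt(j - 1) ? 0 : 1;
        cur[j] = Math.min(Math.min(prev[j] + 1, cur[j - 1] + 1), prev[j - 1] + cost);
      }
      prev = cur;
    }
    int best = prev[0];
    for (int j = 1; j <= m; j++) {
      best = Math.min(best, prev[j]); // pick the best-matching prefix length
    }
    return best;
  }

  // Returns every title that has a prefix within maxEdits of the query.
  static List<String> suggest(String query, List<String> titles, int maxEdits) {
    List<String> hits = new ArrayList<>();
    for (String title : titles) {
      if (prefixDistance(query, title) <= maxEdits) {
        hits.add(title);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    List<String> titles = Arrays.asList("the book of life", "the book of eli",
        "orange is the new black", "a clockwork orange");
    System.out.println(suggest("ora", titles, 1));
    System.out.println(suggest("orage", titles, 1));
  }
}
```

With the four test documents, both “ora” and “orage” match only “orange is the new black”: the first is an exact prefix, the second is one inserted “n” away from the prefix “orange”.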

Recent changes in Solr schema configuration

When searching for information about how to configure the Solr schema of a core, please bear in mind that since version 6 Solr uses a managed schema, stored in the file managed-schema (without an .xml extension in Solr 6 and 7), instead of the schema.xml file. Here are the key points to keep in mind:

  • The schema editor in the WebUI can only be used when using a managed schema.
  • You can edit the managed-schema file by hand. But you have to restart Solr afterwards (see here).
  • The preferred way to change the managed-schema file is through Solr’s API.
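
As a rough sketch of the API route: Solr's Schema API accepts JSON commands such as add-field via POST to the schema endpoint of a core. The example below builds such a command and sends it only if a URL is passed as the first argument (for instance http://localhost:8983/solr/mars_core/schema, assuming the setup from this post). The class and method names are made up, and the field type text_general is an assumption that holds for Solr's default configset but may not exist in yours. The command string is built without escaping, so it is only suitable for simple names.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; uses only the JDK.
final class SchemaApiSketch {

  // Builds an add-field command for the Schema API (no escaping, simple names only).
  static String addFieldCommand(String name, String type, boolean stored) {
    return "{\"add-field\": {\"name\": \"" + name + "\", \"type\": \"" + type
        + "\", \"stored\": " + stored + "}}";
  }

  // Posts the JSON command to the given schema URL and returns the HTTP status code.
  static int post(String schemaUrl, String json) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) new URL(schemaUrl).openConnection();
    connection.setRequestMethod("POST");
    connection.setRequestProperty("Content-Type", "application/json");
    connection.setDoOutput(true);
    try (OutputStream out = connection.getOutputStream()) {
      out.write(json.getBytes(StandardCharsets.UTF_8));
    }
    return connection.getResponseCode();
  }

  public static void main(String[] args) throws IOException {
    String command = addFieldCommand("title", "text_general", true);
    System.out.println(command);
    if (args.length > 0) {
      // Only talk to Solr when a schema URL was passed on the command line.
      System.out.println("HTTP " + post(args[0], command));
    }
  }
}
```

If the field already exists, for example because schemaless mode created it when the test documents were added, Solr will reject add-field; the Schema API offers a replace-field command for that case.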
