Implementing automatic search suggestions for your web app using Solr 7 and Java
Programming Estimated reading time: ~4 minutes
Recently I had the pleasure of implementing a REST endpoint that offers automatic search suggestions based on the user’s input. This is often also referred to as “type-ahead search”, “autocompletion” or “search as you type”.
The technology stack consisted of Apache Solr 7, Java 8 and Spring Boot. I had some trouble connecting the dots to get everything working, partly because some things have changed in recent Solr versions (e.g. the schema configuration), so there is a lot of outdated information floating around on the internet.
Prepare Solr
To start, let’s run a Solr server inside a Docker container using the following command:
$ sudo docker run -p 8983:8983 -it solr:alpine sh
Next, create a new core by executing:
$ solr-create -c mars_core
Note: In Solr, the term core is used to refer to a single index and associated transaction log and configuration files (including the solrconfig.xml and schema files, among others).
After the core has been created, a Solr server will be started. Add some documents using the HTTP API of Solr:
curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/mars_core/update' --data-binary '
{
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "1", "title": "the book of life"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "2", "title": "the book of eli"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "3", "title": "orange is the new black"}
  },
  "add": {
    "commitWithin": 1000, "overwrite": false,
    "doc": {"id": "4", "title": "a clockwork orange"}
  }
}'
Note: Although repeating the same key (add) is unusual for JSON and many parsers will not handle it, that’s how Solr expects this input. This Stack Overflow post deals with this.
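If the test documents come from code rather than from a hand-written curl call, the request body with its repeated add key can be assembled as a plain string. The following is only a minimal sketch: the class UpdatePayloadBuilder and its methods are hypothetical names that are not part of Solr or SolrJ, and the escaping covers only quotes and backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Solr or SolrJ): builds the body for a POST to /update.
class UpdatePayloadBuilder {

    // Minimal escaping for this sketch: only backslashes and double quotes are handled.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Emits one "add" command per document, separated by commas.
    static String build(Map<String, String> titlesById) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : titlesById.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            sb.append("\"add\":{\"commitWithin\":1000,\"overwrite\":false,\"doc\":{\"id\":\"")
              .append(escape(e.getKey()))
              .append("\",\"title\":\"")
              .append(escape(e.getValue()))
              .append("\"}}");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "the book of life");
        docs.put("2", "the book of eli");
        System.out.println(build(docs));
    }
}
```

A LinkedHashMap is used so that the documents are emitted in insertion order. For anything beyond a quick test, a proper JSON library or the SolrJ client is the safer choice.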
We did not have to define any schema before adding our test documents because when the default configuration is used Solr starts in schemaless mode. This is not recommended for production use. If you want to know how to define your schema be sure to read the notes about it at the end of the post (section “Recent changes in Solr schema configuration”).
After this, switch to the terminal that is connected to the running Solr container and stop the Solr process by sending it a SIGINT signal (just press Ctrl + C). Then open the file solr/server/solr/mars_core/solrconfig.xml and add the suggest search component at the end of the file (just before the closing </config> tag):
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">default</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="suggestAnalyzerFieldType">string</str>
    <str name="field">title</str>
    <str name="buildOnStartup">true</str>
  </lst>
</searchComponent>
Then add the corresponding request handler:
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">5</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
Further documentation about the suggest search component can be found in the Solr Reference Guide. Please be aware that in earlier versions of Solr, automatic suggestions were often implemented using its spell check component (see e.g. here). When searching for other tutorials dealing with automatic suggestions, make sure that they are about the dedicated SuggestComponent of Solr.
Now you can start Solr again by running the same command as before:
$ solr-create -c mars_core
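Once Solr is running again, it can be worth checking the new /suggest handler directly before writing any Spring code. The sketch below only builds the URL for such a request. The class name SuggestUrlBuilder is hypothetical, the base URL assumes the mars_core core on localhost:8983 as used in this post, and suggest.dictionary and suggest.q are the suggester's request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the URL for a direct request to the /suggest handler.
class SuggestUrlBuilder {

    // coreUrl is assumed to look like http://localhost:8983/solr/mars_core
    static String build(String coreUrl, String dictionary, String term) {
        try {
            return coreUrl + "/suggest"
                    + "?suggest.dictionary=" + URLEncoder.encode(dictionary, "UTF-8")
                    + "&suggest.q=" + URLEncoder.encode(term, "UTF-8")
                    + "&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every JVM, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr/mars_core", "default", "ora"));
    }
}
```

The printed URL can be pasted into a browser or passed to curl. Because the request handler already sets suggest=true as a default, the parameter does not need to be repeated here.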
Implementing the REST API
Implementing the REST API is as simple as shown in the following code snippets:
File SearchController.java:
package net.mars.auth.rest.search;

import java.io.IOException;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SuggesterResponse;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import net.mars.commons.domain.exceptions.ApplicationException;

@RestController
@RequestMapping("/snpapi/suggest")
public class SearchController {

    @Autowired
    SolrClient solrClient;

    // Returns the suggestions of the "default" suggester for the given term.
    @GetMapping
    public List<String> suggest(@RequestParam(name = "q") String term)
            throws ApplicationException, SolrServerException, IOException {
        SolrQuery query = new SolrQuery();
        // Route the query to the /suggest request handler defined in solrconfig.xml
        query.setRequestHandler("/suggest");
        query.setQuery(term);
        QueryResponse queryResponse = solrClient.query(query);
        SuggesterResponse suggesterResponse = queryResponse.getSuggesterResponse();
        // "default" is the suggester name configured in the search component
        return suggesterResponse.getSuggestedTerms().get("default");
    }
}
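The controller returns whatever list the map lookup yields, which is null if no suggester with the requested name is present in the response. A slightly more defensive variant could look like the sketch below. The class SuggestionExtractor is a hypothetical helper, not part of SolrJ. It works on the Map<String, List<String>> returned by getSuggestedTerms(), falls back to an empty list, removes duplicates and caps the result size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of SolrJ): post-processes the suggested terms.
class SuggestionExtractor {

    static List<String> extract(Map<String, List<String>> termsByDictionary,
                                String dictionary, int limit) {
        List<String> terms = termsByDictionary == null ? null : termsByDictionary.get(dictionary);
        if (terms == null) {
            // Unknown suggester name or empty response: return an empty list instead of null
            return Collections.emptyList();
        }
        // LinkedHashSet drops duplicates while keeping the order delivered by Solr
        List<String> unique = new ArrayList<>(new LinkedHashSet<>(terms));
        return new ArrayList<>(unique.subList(0, Math.min(limit, unique.size())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> terms = new HashMap<>();
        terms.put("default", Arrays.asList("orange", "orange", "orange is the new black"));
        System.out.println(extract(terms, "default", 5));
        System.out.println(extract(terms, "unknown", 5));
    }
}
```

In the controller this would replace the direct get("default") call, so that clients always receive a JSON array.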
File SolrConfiguration.java:
package net.mars.migration;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.solr.core.SolrTemplate;
import org.springframework.data.solr.repository.config.EnableSolrRepositories;

@Configuration
@EnableSolrRepositories(basePackages = "net.mars")
public class SolrConfiguration {

    // URL of the Solr core, read from the Spring Boot config file
    @Value("${solr.server.host}")
    private String solrUrl;

    @Bean
    public SolrClient solrClient() {
        // SolrJ 7 creates clients through the builder instead of a public constructor
        return new HttpSolrClient.Builder(solrUrl).build();
    }
}
Spring Boot config file:
# baseURL The URL of the Solr server.
#
solr.server.host=http://localhost:8983/solr/mars_core
Testing
Finally a test:
$ curl http://localhost:5000/snpapi/suggest?q="ora" && echo ""
[ "orange" ]
$ curl http://localhost:5000/snpapi/suggest?q="orage" && echo ""
[ "orange" ]
$
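Did you notice the typo in the second request (“orage” instead of “orange”)? Because the suggester is configured with the FuzzyLookupFactory, it tolerates small typos like this one and still returns the right suggestion.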
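To get a feeling for why “orage” still matches, it helps to look at the edit distance between the two words: a single inserted character turns one into the other. The sketch below computes the classic Levenshtein distance. It is only an illustration of the idea. Solr’s fuzzy lookup is built on Lucene and does not compute distances this way internally, and the number of edits it tolerates is a configuration matter covered in the Solr Reference Guide. The class name EditDistance is hypothetical.

```java
// Hypothetical illustration: classic Levenshtein distance with two rolling rows.
class EditDistance {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            // Swap the rows so that prev always holds the last completed row
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("orage", "orange")); // prints 1
    }
}
```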
Recent changes in Solr schema configuration
When searching for information about how to configure the Solr schema of a core, please bear in mind that since version 6 Solr uses a managed schema by default, stored in the managed-schema file, instead of the hand-edited schema.xml file. Here are the key points to keep in mind:
- The schema editor in the web UI can only be used when using a managed schema.
- You can edit the managed-schema file by hand, but you have to restart Solr afterwards (see here).
- The preferred way to change the managed-schema file is through Solr’s API.
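As a rough illustration of the API route, the sketch below builds an add-field command for Solr’s Schema API and posts it to the schema endpoint of a core. Several things here are assumptions: the class SchemaApiPayload and its methods are hypothetical names, the field type text_general is the one shipped with Solr’s default configset, and the core URL is the mars_core core used in this post. Field names and types are assumed to be plain identifiers, so no JSON escaping is done.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds and sends an "add-field" command to the Schema API.
class SchemaApiPayload {

    // name and type are assumed to be plain identifiers (no escaping is applied)
    static String addField(String name, String type, boolean stored) {
        return "{\"add-field\":{"
                + "\"name\":\"" + name + "\","
                + "\"type\":\"" + type + "\","
                + "\"stored\":" + stored + "}}";
    }

    // coreUrl is assumed to look like http://localhost:8983/solr/mars_core
    static String schemaEndpoint(String coreUrl) {
        return coreUrl + "/schema";
    }

    // Sends the payload and returns the HTTP status code; needs a running Solr server.
    static int post(String endpoint, String payload) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) {
        // Only prints the request; call post(...) once a Solr server is reachable
        System.out.println(schemaEndpoint("http://localhost:8983/solr/mars_core"));
        System.out.println(addField("title", "text_general", true));
    }
}
```

Defining the title field explicitly like this is one way to stop relying on schemaless mode, which the post already flags as unsuitable for production.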