Using TensorFlow in a Solr Query Parser
06-19, 11:10–11:50 (Europe/Berlin), Frannz Salon

A tutorial on writing a Solr query parser that uses TensorFlow for Java to augment queries.


Typically, when you need to expand a query through a model - for example, to do entity recognition or query tagging - you'd use a separate service. While this architecture is perfectly valid, the extra network hops to the "query expansion microservice" add to query latency.

For autocomplete and other low-latency use-cases, you might want to trade some complexity for speed by implementing a custom query parser. In this talk, we'll show a working example:
- we'll build a model using TensorFlow in Python that does query expansion
- we'll load it with TensorFlow for Java in a Solr query parser
- now we can run queries and get them expanded directly in Solr
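
To make the last step of the list above more concrete, here is a minimal sketch of what "expanding a query" could look like once the parser has model output in hand. It is not the code from the talk: the `TermExpander` interface is a hypothetical stand-in for the TensorFlow model, the class and method names are made up for illustration, and the sketch uses no Solr or TensorFlow classes at all. It also assumes the terms need no escaping for the query syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class QueryExpansionSketch {

    /** Hypothetical stand-in for the model: returns related terms for one query term. */
    public interface TermExpander {
        List<String> expand(String term);
    }

    /**
     * Rewrites a whitespace-separated user query so that each term is OR-ed
     * with the alternatives the expander suggests, and terms are AND-ed together.
     * Assumption: terms contain no characters that need escaping.
     */
    public static String expandQuery(String userQuery, TermExpander expander) {
        List<String> clauses = new ArrayList<>();
        for (String term : userQuery.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty input yields a single empty token
            }
            List<String> alternatives = new ArrayList<>();
            alternatives.add(term);
            alternatives.addAll(expander.expand(term));
            clauses.add(alternatives.size() == 1
                    ? term
                    : "(" + String.join(" OR ", alternatives) + ")");
        }
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        // A fixed lookup table plays the role of the model in this sketch.
        Map<String, List<String>> fakeModel = Map.of("laptop", List.of("notebook"));
        TermExpander expander = term -> fakeModel.getOrDefault(term, List.of());

        // prints: cheap AND (laptop OR notebook)
        System.out.println(expandQuery("cheap laptop", expander));
    }
}
```

In a real query parser, the lookup table would be replaced by a call into the loaded model, and the expanded clauses would more likely be built as query objects than as a string.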

You can use this talk and the resources we'll share to implement a query parser for your own use-case. We'll also expand on the architecture trade-offs. For example, because the model runs inside Solr, adding nodes and replicas to handle more query throughput also adds capacity for query expansion. Should you need to scale these separately, you can use coordinator nodes.

See also: Slides (527.3 KB)

Radu Gheorghe works mainly as a search consultant at Sematext, helping clients of all sizes with their Elasticsearch, OpenSearch and Solr projects. He is also a trainer and does production support for these search engines.

Sometimes he helps out with the development of Sematext Cloud (an observability SaaS), mostly when it comes to Elasticsearch and log shippers (e.g. Logstash, rsyslog...). He also writes on the Sematext blog or helps others publish new articles.

He co-authored a book (Elasticsearch in Action, Manning), recorded a video tutorial (Working with Elasticsearch, O'Reilly) and has spoken at a number of conferences, such as Berlin Buzzwords, Lucene/Solr Revolution (later Activate) and KubeCon.

Software engineer, trainer, consultant and, from time to time, author - some would say that he is an all-in-one battle weapon concentrated mostly on Lucene, Solr and Elasticsearch. He is currently an Engineering Lead at Archipelo. However, he also likes all the other cool stuff that is happening in the IT world. He likes to share his knowledge by giving talks at various meetups and conferences.