October 28th, 2020
The field of information retrieval and text search has come a long way since its inception several decades ago. Join us for this session, where we will discuss modern text search practice with Elasticsearch, the Lucene-based search engine server and today's de facto standard for full-text search applications. We will start with basic keyword search: analyzers, term normalization, stemming and morphological properties. We will, of course, discuss its common challenges, such as boosts, synonyms, ontologies and phrases, and how to deal with them. Continuing from there, we will review modern and future approaches to full-text search, from vector search to embedding methods like BERT, and how they come into play. We will also discuss how to improve precision and recall using judgment lists, clickstreams and search logs.