The future of search: Conversational, semantic, and vectorized models
Enterprise search is at the nexus of the most recent and impactful developments shaping our contemporary knowledge ecosystem. It’s a primary beneficiary of the rapid advancements AI has made in natural language queries—and natural language generation of results—in an intelligent, conversational manner.
Simultaneously, however, search typifies the surplus of unstructured data that is overwhelming organizations, which reinforces the need to swiftly transform such cumbersome data into timely information. By spanning both internal and external systems (including the internet), search operates at the greatest scale possible. Implicit in this consideration is the mounting distribution of data assets inside and outside enterprise boundaries, across on-prem, hybrid, and multi-cloud locations.
That today’s search mechanisms are able to surmount these obstacles and expediently retrieve the information on which business action depends is a testament to their effectiveness. It also implies that a composite of search techniques, almost none of which are mutually exclusive in their application, is necessary to meet relevance goals.
A litany of these approaches includes cognitive search, semantic search, vector-based search, federated search, and similarity search, as well as corollaries such as search relevance and relevance rankings. Yet the individual and collective merit of these approaches depends on what is arguably the nucleus of enterprise search: the relevance model.
“Some relevance models are built entirely on human-defined taxonomies,” explained Matt Riley, Elastic’s general manager of enterprise search. “Other relevance models can be entirely mathematically or statistically defined. And then, as new techniques have emerged in AI and machine learning, there are new models coming out all the time that are mathematical and statistical models, but they just take a different form than the ones that have preceded it.”
Relevance models
Relevance models are designed to ensure searchers retrieve results that accurately answer queries. They’re determined by a host of factors that vary across users, applications, datasets, and organizational goals. The concept of context, which some consider distinct from that of formal relevance, has implications for delivering the most relevant search results. According to Qlik’s CTO Mike Potter, “Relevancy is a mathematical approach, and it often loses out on the notion of experience or perspective. That’s just as important, or harder, to capture.”
Nonetheless, the readily quantifiable nature of relevance models has several benefits. These models lend themselves to objective measures of search usefulness via user behavior analytics and reporting. User behavior analytics, in particular, enables developers “to capture the search queries that their users are making, the things [searches] are returning, and whether or not users are clicking on them,” Riley noted.
Credible reporting capabilities have downstream consequences for adjusting relevance models too. Examples include “the list of the top 100 most common search terms that went through your search application,” Riley specified. “And then, given those, which ones are getting the most click-throughs on the results that came back.” Such quantifiable metrics are critical for adjusting relevance models in order to improve search results and efficiency.
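The kind of report Riley describes can be illustrated with a minimal Python sketch. It is not Elastic's implementation or any vendor's API: the event format (a query string paired with a flag for whether the user clicked a result) and the function name top_queries_with_ctr are assumptions made purely for illustration.

```python
from collections import Counter


def top_queries_with_ctr(events, n=100):
    """Return the n most common queries with their click-through rates.

    `events` is an iterable of (query, clicked) pairs. This log format is a
    hypothetical one chosen for the example, not a real product's schema.
    """
    searches = Counter()  # how many times each query was issued
    clicks = Counter()    # how many of those searches led to a click

    for query, clicked in events:
        # Normalize case and whitespace so trivial variants count together.
        term = query.strip().lower()
        searches[term] += 1
        if clicked:
            clicks[term] += 1

    # most_common(n) orders queries by search volume, highest first.
    return [
        (term, count, clicks[term] / count)
        for term, count in searches.most_common(n)
    ]


if __name__ == "__main__":
    sample_events = [
        ("vpn setup", True),
        ("VPN setup", False),
        ("expense policy", False),
        ("vpn setup ", True),
    ]
    for term, count, ctr in top_queries_with_ctr(sample_events, n=100):
        print(f"{term}: {count} searches, {ctr:.0%} click-through")
```

A query that is searched often but rarely clicked is a natural candidate for relevance tuning, since it suggests the results returned are not what users were looking for.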