Enterprise Search in the Age of AI
In this special guest feature, Aleksandar, Chief Technology Officer at Supplyframe, poses the question – is there a place for Enterprise Search as a stand-alone category in the age of modern, AI-powered big data systems? Supplyframe is a Big Data platform for the electronics industry, powering innovative user experiences that transform the day-to-day workflows of millions of engineers and procurement professionals worldwide. Before joining Supplyframe, Aleksandar was the Chief Data Architect at Vast.com, a white label vertical search technology platform powering search experiences for large consumer categories on websites such as Bing, Yahoo, and AOL.
Enterprise Search is a product category that has historically been a generator of big new ideas and paradigms, both in technology and user experience. From traditional information retrieval systems to big data analytics, concepts that emerged within the context of Enterprise Search eventually branched out into categories and markets of their own. Inverted indices became search engines; facet computation turned into real-time analytics; and document frequency models formed the basis for modern machine learning systems. In this process of evolution, however, Enterprise Search ended up losing a lot of its identity. Its core problem domain, enterprise data, is now served by an ever-growing list of new product categories, from log analytics and business intelligence to data lakes and content management systems. So, the question is: Is there a place for Enterprise Search as a stand-alone category in the age of modern, AI-powered big data systems?
My opinion is that, if Enterprise Search is to regain a significant share of the business tools market, it can only do so by refocusing on its core value proposition: search. When it comes to the public web, we might feel that there’s little room left for improvement in the search space, but I believe that there’s a lot more ground to explore on the enterprise side of things. Part of the reason for claiming this comes from the insight that our needs seem to almost universally follow Pareto’s law, at least when it comes to the public web. For the most part, we keep searching for the same things by posing similar queries and land on the same websites. The fact that the corpus of all web documents is immense presents more of a problem than an opportunity, as most of it is irrelevant to us. Google understands this well, which is why, over the last decade, it hasn’t been investing in expanding its search experience, but has instead been slowly reducing it to merely providing the “one true answer,” personalized for each user.
Enterprise data, however, is entirely different. Whereas on the public web, we might not care about 99% of the content, in the enterprise context, every single bit of information might be a key to unlocking a business opportunity. While contemporary analytics and data warehousing platforms do an excellent job of enabling on-the-fly aggregates and data visualizations, the ability to fully explore the space of raw enterprise data leaves a lot to be desired. Part of it seems related to the Pareto’s law mindset that comes from the world of consumer tools – by optimizing for extracting “signal” from “noise,” we lose the ability to produce the best possible tool for dealing with cases where all the data is pure “signal.”
The search paradigm has a lot to offer in this regard. By thinking of search results as representing raw data entries (for example – business-related transactions), the information retrieval model provides a proven, highly efficient way of navigating the entire enterprise data corpus. However, the problem of search fatigue remains – our ability to think creatively often shuts down when faced with a search box and our queries reduce to simple single-term searches. This issue is at the heart of why natural language search platforms haven’t seen greater success and why we tend to prefer looking at graphs and dashboards over actively engaging with our big data tools.
In my opinion, the path forward is in the reversal of the search paradigm – though the ability to efficiently process search queries is still essential, asking questions should no longer be the primary way users interface with search systems. Instead, enterprise search systems should focus on generating the most valuable questions that users can query them with. Recent advances with models such as GPT and BERT are certainly making such functionality seem within reach. Having search systems capable of generating questions based on their indexed data enables a whole different entry point for engagement. Instead of thinking of Enterprise Search as another standalone tool that we use only for a specific task, the notion of question generation allows search to become ubiquitous and embedded throughout the entire enterprise tool stack. In this way, the promise of AI-enabled search doesn’t end with the “one true result,” but instead opens the door for much deeper engagement with the vast universe of raw enterprise data.