Has the data warehouse lost its luster? Have dashboards fallen out of fashion? It has and they have, according to Eldad Farkash, the CEO and co-founder of Firebolt. What’s driving sales of the humble data warehouse these days, he says, is a whole new ballgame: serving clean and correct data to AI models.
After leaving his previous startup, embedded BI provider Sisense, in 2018, Farkash founded Firebolt in 2019 with the goal of creating the ultimate distributed data store for high-speed, low-latency data analytics. He developed a third-generation data warehouse that had the capability to ingest large amounts of fast and complex data while executing large numbers of concurrent SQL queries, ostensibly to keep real-time dashboards updated and ad hoc queriers happy.
But somewhere along the way, the name of the game shifted. Instead of using the data warehouse to serve real-time analytics for massive online games, for instance, or to aggregate huge amounts of telemetry data to spot performance issues (two common use cases for real-time analytics in the big data era), customers wanted to do something else.
For starters, the search for the perfect dashboard was over. “Nobody’s looking for a dashboard anymore,” Farkash says. “Everyone is using the same dashboards, but as embeddings within answers, versus starting with the dashboard and looking for the right data.”
In other words, customers still want answers to their questions, but they don’t want to go through their dashboards to get them. Instead, customers are initiating SQL queries via natural language prompts to GPT-3.5, Gemini, Perplexity, and other large language models (LLMs).
“People receive the answer with a set of widgets that we used to call one dashboard,” Farkash tells BigDATAwire. “Some BI analyst previously used to own that dashboard. Now there is only data ownership, metadata ownership, and the AI. Nothing in between.”
You can call it the deconstruction of the dashboard. Companies that placed the data warehouse and dashboard at the center of their enterprise analytics universe were suddenly reconsidering their assumptions and their options.
“There is not a single data team out there or a single engineer out there that isn’t sitting right now and rethinking their whole stack,” Farkash says. “And I think one of the big changes is the role of the data warehouse within that framework.”
Previously, the data warehouse was the most important of many components within a big, multilayered stack, the Firebolt CEO says. Now many of those layers are being removed and replaced with a more streamlined stack that consists of the AI model and the database, or multiple databases, depending on the need.
“Basically, AI engineers today are mostly focused on how do we connect to data, enterprise-grade data,” Farkash continues. “What is that? Cleansed, nurtured, owned. It’s correct data. It’s great data. There is no hallucination. There is no making mistakes.”
Early in the generative AI revolution, administrators were hesitant to let LLMs loose on their most prized asset: structured data describing customers and transactions. LLMs were lousy at writing SQL, and would make errors or hallucinate the responses to questions. Business intelligence and analytics companies either sought to supercharge the data analyst herself with AI, or created their own intermediate models to compensate for the shortcomings of the language model itself.
But that has changed, Farkash says. LLMs have demonstrated remarkable growth in their capabilities. The SQL itself is much-improved, he says (although you still wouldn’t let an LLM write a massive 50-page SQL query hitting multiple databases). Everyday business questions are capably handled by AI models that can understand nuances in questions, turn natural language into SQL code, submit the SQL to the data warehouse, reason about the results, and then generate a natural language response. LLMs aren’t joining data, Farkash says, but in some cases, they’re logically joining results together after the fact.
None of this would work without the transparency that today’s LLMs are showing, which is critical for validating the results, Farkash says.
“You see the SQL and you see the data set,” he says. “So as an engineer, you understand completely what the data behind the answer is, how the data was generated, the query that received it…All the SQL gets stored. It’s traceable. There’s no hidden things that the AI can run on your data that you can’t control or see or understand.”
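The question-to-answer loop described here can be illustrated with a minimal sketch. This is not Firebolt's implementation or any vendor's API: `question_to_sql` is a hypothetical stand-in for an LLM call, SQLite stands in for the data warehouse, and the `sales` table and `audit_log` structure are invented for illustration. The point is only that the generated SQL and the returned rows are stored next to the answer, so an engineer can trace what ran.

```python
import sqlite3

# Hypothetical stand-in for an LLM: a real system would prompt a model here.
def question_to_sql(question: str) -> str:
    canned = {
        "What were total sales by region?":
            "SELECT region, SUM(amount) AS total "
            "FROM sales GROUP BY region ORDER BY region",
    }
    return canned[question]

def answer(conn: sqlite3.Connection, question: str, audit_log: list) -> str:
    sql = question_to_sql(question)          # natural language -> SQL
    rows = conn.execute(sql).fetchall()      # SQL -> warehouse (SQLite here)
    # Store the question, the exact SQL, and the rows so the answer is traceable.
    audit_log.append({"question": question, "sql": sql, "rows": rows})
    # A real system would have the model phrase the reply; this just formats rows.
    summary = ", ".join(f"{region}: {total}" for region, total in rows)
    return f"Total sales by region: {summary}"

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100), ("EMEA", 50), ("APAC", 70)],
)

audit_log = []
reply = answer(conn, "What were total sales by region?", audit_log)
print(reply)                 # the natural language response
print(audit_log[0]["sql"])   # the SQL behind it, available for review
```

Keeping the log entry alongside the reply is one simple way to get the "you see the SQL and you see the data set" property Farkash describes; a production system would presumably persist it somewhere more durable than a Python list.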
While the LLM is responsible for turning questions into queries and then making sense of the answers, there’s still a need for a fast SQL engine in this new stack. In fact, the need may be even greater, Farkash says, because a single question asked of the LLM may generate 50 or more individual queries that need to be executed within a couple of seconds. If you make the user wait for too long to get the answer, they will lose faith in the system.
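To make the fan-out pattern concrete, here is a rough sketch of issuing 50 queries concurrently and checking the total time against a latency budget. Everything in it is an assumption for illustration: `run_query` simulates a warehouse round trip with a short sleep rather than calling a real engine, and the two-second budget and the worker count of 16 are arbitrary choices, not figures from Firebolt.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_BUDGET_S = 2.0  # assumed budget for one user question

# Hypothetical stand-in for a warehouse call; sleeps to simulate a round trip.
def run_query(sql: str) -> str:
    time.sleep(0.02)
    return f"result of {sql}"

def fan_out(queries, max_workers=16):
    """Run all queries concurrently; return results in order plus elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_query, queries))
    return results, time.perf_counter() - start

# One question from the user becomes 50 generated queries.
queries = [f"SELECT {i}" for i in range(50)]
results, elapsed = fan_out(queries)
within_budget = elapsed <= LATENCY_BUDGET_S
print(len(results), within_budget)
```

Run sequentially, the same 50 simulated queries would take about a second; run concurrently, the total is bounded by the slowest batch, which is why engine concurrency matters so much in this usage pattern.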
This AI-driven usage pattern is what’s giving data warehousing back its mojo, Farkash says.
“Data warehousing was not sexy anymore. Data warehousing lost its shine,” he says. “Now we have purpose for the data warehouse, which is great data for AI, which is to feed AI with correct data, and replace all of our dashboards. This is it. This is the new reality. This is what people are mostly working on now when it comes to data.”
It’s also helping to drive sales for Firebolt, which has raised $270 million in venture capital and has 150 employees located around the world. The core Firebolt offering is open source, while commercial offerings are available on all three major clouds (it’s available as a fully managed service on AWS). Farkash says the packaging of Firebolt also gives it an advantage over competing data warehouses, particularly when a customer is scaling up its data warehouse usage by 5x to 10x.
“The demand is exploding, and we want to make sure we’re leading with the new thing that’s happening with data warehousing,” he says. “We’re opening three or four new offices around the globe, moving fast, moving aggressive, moving big.”
The post Data Warehousing for the (AI) Win appeared first on BigDATAwire.