About six years ago, Alex Kvamme and his co-founder, CTO Trey Doig, started Pathlight to provide companies with insights into the conversations they have with their customers. They adopted the most powerful natural language processing (NLP) technologies available at the time, but those tools left much to be desired.
“We were leveraging early machine learning frameworks to do keyword and topic detection, sentiment,” CEO Kvamme says. “None of them were very good and required tons of training and work ahead of time. And just honestly, they were throwaway features for us, because they weren’t really moving the needle. They just weren’t going that deep.”
The needle started twitching when large language models (LLMs) like OpenAI’s Generative Pretrained Transformer (GPT) hit the market. And when OpenAI launched ChatGPT 10 months ago, Kvamme knew it would be a game-changer.
“That was the immediate kind of light bulb,” he tells Datanami in an interview. “We had already built a product for customers to manually review conversations within our platform and so we just kind of rebuilt it from scratch to be based on the LLM to automatically review it.”
The success of that first project led Pathlight to do a significant amount of research and development into LLMs over the past year. During the process of integrating LLMs into its conversation intelligence platform, the company learned a lot about how to work with the new technology, and it also developed a significant amount of its own tooling.
One of the most important lessons for Kvamme was the importance of adopting a hybrid or multi-LLM strategy, which gave Pathlight the flexibility to change LLM models and providers as needed.
“Summarization might go to ChatGPT. Tagging might go to a Llama-2 internally hosted. Custom questions might go to Anthropic,” he says. “Our perspective is, we would rather get really good at being multi-LLM and LLM-agnostic today, because that’s a superpower of ours. That’s what allows us to scale and create more consistency.”
ChatGPT might be working fine today, but tomorrow it might start giving crazy answers. Similarly, some customers are allergic to the idea of sending any piece of data to OpenAI. That’s fine, because Pathlight’s engineers can simply reroute the requests to another LLM and provider.
“They actually never give us a good reason, but it’s more like, I just don’t trust OpenAI,” Kvamme says. “And so in that case, we have to find the right kind of model situation for them.”
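That flexibility is easiest to see in code. Below is a minimal sketch of the kind of task-based routing Kvamme describes; the backend names and stub functions are hypothetical illustrations, not Pathlight’s actual implementation.

```python
# Hypothetical sketch of multi-LLM routing; the provider stubs below stand in
# for real API calls and are not Pathlight's actual code.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Backend:
    name: str
    complete: Callable[[str], str]  # prompt in, completion out

def openai_complete(prompt: str) -> str:
    return f"[openai stub] {prompt}"     # a real hosted-API call would go here

def llama2_complete(prompt: str) -> str:
    return f"[llama-2 stub] {prompt}"    # internally hosted model endpoint

def anthropic_complete(prompt: str) -> str:
    return f"[anthropic stub] {prompt}"  # Anthropic API call

# Each task type maps to a backend; rerouting a customer away from one
# provider is a one-line change to this table.
ROUTES = {
    "summarization": Backend("openai-gpt", openai_complete),
    "tagging": Backend("llama-2-internal", llama2_complete),
    "custom_question": Backend("anthropic", anthropic_complete),
}

def run_task(task: str, prompt: str) -> str:
    return ROUTES[task].complete(prompt)

print(run_task("tagging", "Label this support call transcript."))
```

Keeping the routing table as data rather than hard-coded calls is what makes the “LLM-agnostic” posture cheap to maintain: swapping providers touches configuration, not product code.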
It took lots of work to build that level of flexibility into the Pathlight offering. The company also built its own tools to automate common tasks like model provisioning, hosting, testing, and deployment. Some jobs need batch processing, so Pathlight built a layer for job queuing, retry processing, and logging. It developed tools for prompt engineering. It made tools for interacting with AI agents at the customer layer.
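As a rough illustration of that batch layer, the sketch below combines a job queue with retry processing and logging. It is a toy built on assumed job shapes, not Pathlight’s tooling.

```python
# Toy sketch of a batch layer with job queuing, retries, and logging.
# The job format (a dict with an "id" key) and handler are assumptions.
import logging
import time
from queue import Queue

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-batch")

def process_with_retry(job: dict, handler, max_attempts: int = 3,
                       base_delay: float = 1.0):
    """Run handler(job), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(job)
        except Exception as exc:  # LLM APIs fail transiently, so retry
            log.warning("job %s attempt %d failed: %s", job["id"], attempt, exc)
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

def drain(jobs: Queue, handler) -> None:
    """Process every queued job, logging successes and surfacing failures."""
    while not jobs.empty():
        job = jobs.get()
        try:
            process_with_retry(job, handler)
            log.info("job %s done", job["id"])
        finally:
            jobs.task_done()
```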
“The layers that we’re building, these layers exist in normal SaaS,” Kvamme says. “They just haven’t existed in LLMs yet.”
The company didn’t set out to build its own tools for integrating GenAI into its business. It’s just that the tools haven’t been built yet. Or sometimes, the tools are available, but they’re so immature that you might as well roll your own.
“It’s always like the three guys in a garage type of thing,” Kvamme says. “So it’s kind of a question of, do we want those three guys in the garage, or our three guys, the three engineers on our side, to build it?”
Compute infrastructure is supposed to be a solved problem in the world of SaaS. Need some more CPUs? Just dial up your EC2 capacity on AWS. If your offering is serverless, it will automatically scale to consume the CPUs needed at peak processing, then scale back to save you dough when demand drops. Easy, peasy.
That’s not the way the GenAI world works. Demand for high-end GPUs is so high, and compute expenses are so great, that SaaS veterans like Kvamme have been forced to become bean counters again.
“I’ve been doing SaaS for a while. I’ve never had to think this hard about unit economics,” Kvamme says. “I’ve had to do more thinking than I have had to do in many years on the actual unit economics, how much to charge for this, so we don’t lose money from the transaction cost.”
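To make the point concrete, here is a back-of-the-envelope version of that math; the per-token prices and token counts are made-up assumptions, not quoted rates or Pathlight figures.

```python
# Hypothetical unit-economics arithmetic: per-token API pricing turns every
# feature into a cost calculation. All numbers below are illustrative.
PRICE_PER_1K_INPUT = 0.01   # assumed $/1K input tokens
PRICE_PER_1K_OUTPUT = 0.03  # assumed $/1K output tokens

def cost_per_conversation(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# A 20-minute call transcript might be ~8K tokens in, ~500 tokens out:
print(f"${cost_per_conversation(8000, 500):.4f} per conversation")  # $0.0950
```

Multiply a figure like that by millions of conversations and the transaction cost stops being a rounding error, which is why pricing has to be worked out before the feature ships.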
The San Francisco-based company also stood up its own internal LLM to analyze a massive amount of raw audio data. Pathlight could never have gotten enough time in the cloud to analyze more than 2 million hours of audio in a timely manner, so it built its own internally hosted Llama-2 system to do the job.
Fitting the right model to the right job is an important part of building a profitable business with GenAI. Pathlight, like other early adopters of GenAI, has learned this the hard way.
“It feels like right now, we’re using the Ferrari to drive to the grocery store for a lot of the jobs to be done,” Kvamme says.
The good news is that, as the technology improves on both the hardware and the software fronts, businesses won’t have to rely on the sports car of GPUs, or on all-knowing but expensive “God models” like GPT-4, to do everything.
“I certainly see a path where LLMs are going to be evolving much closer to just commodity hardware,” Doig, the CTO, says. “So getting off of this extreme high-end GPU requirement in order to do anything at scale, I think it’s going to be sort of a relic of the past.”
The industry is moving forward with new methods, such as quantization, that will reduce the size of models down to something that can be run on an Apple M2 chip, he says. This will coincide with a fragmentation of the LLM market, providing more and better GenAI options for businesses like Pathlight.
“You might have LLMs that are really good at text generation. You’re already seeing it with code generation,” Doig says. “I think that that fragmentation, that specialization of models, is going to continue, and as a result they’ll get smaller and more capable of running on the ridiculous amounts of CPU power that we have available today.”
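The core trick behind the quantization Doig mentions is simple to show: store weights as 8-bit integers plus a scale factor instead of 32-bit floats, cutting memory roughly 4x. Production schemes (such as the 4-bit formats used to run models on laptops) are more elaborate; the sketch below is only an illustration of the principle.

```python
# Minimal symmetric int8 quantization: one scale factor per weight matrix.
# Illustrative only; real LLM quantization uses finer-grained schemes.
import numpy as np

def quantize_int8(weights: np.ndarray):
    scale = np.abs(weights).max() / 127.0  # map the largest weight to +/-127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)  # one layer's weights
q, s = quantize_int8(w)
print(w.nbytes / q.nbytes)                 # 4.0: a quarter of the memory
print(np.abs(w - dequantize(q, s)).max())  # small reconstruction error
```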
In the final analysis, GenAI is an extremely powerful technology that holds a lot of promise for doing more with the huge amounts of unstructured text out there. It gives us another interface to computers, which is shaking up markets. But actually incorporating GenAI into a functioning business is easier said than done.
“The underlying truth is that it’s never been easier to build a demo,” Kvamme says. “A really cool demo. But it’s been harder and much more complex to scale. That’s sort of the interesting creative tension that we’ve seen.”
“I think it’s more fun than frustrating,” he continues. “It’s like we are building on quicksand at any point. These things are changing so quickly. And so it’s another thing when I talk to our customers who might consider building some stuff themselves. Customers are always doing that. And again, it’s very easy to build the demo.”
Related Items:
GenAI and the Future of Work: ‘Magic and Mayhem’
DSPy Puts ‘Programming Over Prompting’ in AI Model Development