Businesses are investing hundreds of billions of dollars in generative AI with the hope that it will improve their operations. However, the majority of these companies have yet to see a return on their investment in large language models and the emerging GenAI stack, outside of a few use cases. So what’s keeping us from achieving the big GenAI payoff that’s been promised?
“There is something going on,” Nvidia CEO Jensen Huang declared in his GTC keynote last month. “The industry is being transformed, not just ours…The computer is the single most important instrument in society today. Fundamental transformations in computing affects every industry.”
Nvidia sits at the epicenter of the GenAI industry, which emerged practically overnight on November 30, 2022, when OpenAI launched ChatGPT into the world. Suddenly, everyone seemed to be talking about the new AI product that mimics human communication to an astounding degree. Whether it’s chatting about sports, answering customer service calls, or rhyming like Shakespeare, ChatGPT seemed to do it effortlessly.
Since then, the GenAI business has taken off, and tech giants have been its biggest cheerleaders. Microsoft invested $13 billion into OpenAI while Amazon recently topped off its investment in Anthropic with $2.75 billion, bringing its total investment to $4 billion. Google has made a $2 billion investment of its own in Anthropic, Databricks bought MosaicML for $1.3 billion, and SAP has invested $1 billion across a series of LLM providers.
While the software stack for GenAI is blossoming, the hardware side has primarily benefited one company. Nvidia owns more than 90% of the market for training LLMs. That has been quite good for the firm, which has seen its revenues explode and its total valuation shoot above the $2-trillion level.
Frothy Parrots
Most of the GenAI action has been in software and services. Practically overnight, hundreds of software vendors that build data and analytics tools pivoted their wares to be part of the emerging GenAI stack, while venture capitalists have flooded billions into innumerable AI startups.
It’s gotten rather frothy, what with so many billions floating around. But the hope is those billions today turn into trillions tomorrow. A McKinsey report from June 2023 estimated that GenAI “could add the equivalent of $2.6 trillion to $4.4 trillion annually” across a few dozen use cases. The majority of the benefits will come from just four use cases, McKinsey says: customer operations, marketing and sales, software engineering, and R&D.
Not surprisingly, private businesses are moving quickly to seize the new business opportunity. A KPMG survey of business leaders last month found that 97% plan to invest in GenAI in the next 12 months. Out of that cohort, nearly 25% are investing between $100 million and $249 million, 15% are investing between $250 million and $499 million, and 6% plan to invest more than $500 million.
There are valid reasons for the excitement around GenAI and huge sums being invested to exploit it. According to Silicon Valley veteran Amr Awadallah, today’s large language models represent a fundamental shift in how AI models work and what they can do.
“What they are being trained on is to understand and reason and comprehend and being able to parse English or French or Chinese and understand the concepts of physics, of chemistry, of biology,” said Awadallah, who co-founded a GenAI startup called Vectara in 2020. “They’ve been trained for understanding, not for memorization. That’s a key point.”
LLMs don’t just repeat words like stochastic parrots, but have shown they can apply learnings to solve novel problems, said Awadallah, who also co-founded Cloudera. That capability to learn is what has people so excited and is what’s driving the investment in LLMs, he said.
“This random network of weights and parameters inside of the neural network strains evolves in a way that makes it go beyond just repeating words. It actually understands. It literally understands what the world is about,” he told Datanami. “They’re only going to get smarter and smarter. There’s no question. Everybody in the industry concurs that by 2029 or 2030, we’re going to have LLMs that exceed our intelligence as humans.”
However, there are several issues that are preventing LLMs from working as advertised in the enterprise, according to Awadallah. Those include a tendency to hallucinate (or make things up); a lack of visibility into how the model generated its results; copyright issues; and prompt attacks. These are issues that Vectara is tackling with its GenAI software, and other vendors are tackling them, too.
Regulatory Maw
Ethical, legal, and regulatory concerns are also hampering the GenAI rollout. The European Union voted to officially adopt the AI Act, which outlaws some forms of AI and requires companies to get prior approval for others. Google pulled the plug on the image-generating feature of its new Gemini model following concerns over historically inaccurate images.
OpenAI last week announced its new Voice Engine could clone a person’s voice after only a 15-second sample. However, don’t expect to see Voice Engine become publicly available anytime soon, as OpenAI has no plans to release it yet. “We recognize that generating speech that resembles people’s voices has serious risks, which are especially top of mind in an election year,” the company wrote in a blog post.
For the most part, the computing community has yet to come to grips with ethical issues of GenAI and LLMs, said İlkay Altıntaş, a research scientist at UC San Diego and the chief data science officer at the San Diego Supercomputer Center.
“You don’t need a data scientist to use them. That’s the commoditization of data science,” she said. “But I think we’re still in the ‘how do I interact with AI, and trustworthiness and ethical use’ period.”
There are ethical checks and ethical techniques that should be used with GenAI applications, Altıntaş said. But figuring out exactly in what situations those checks and techniques should be applied is not easy.
“You might have an application that actually looks pretty kosher in terms of how things are being applied,” she told Datanami. “But when you put two techniques or two data sets or multiple things together, the integration pushes it to a point of not being private, not being ethical, not being trustworthy, or not being accurate enough. That’s when it starts needing those technical tools.”
Hardware and Latency
Another issue hampering the arrival of the GenAI promised land is an acute lack of compute.
Once the GenAI gold rush started, many of the biggest LLM developers snapped up available GPUs to train their massive models, which can take months to train. Other tech firms have been hoarding GPUs, whether running on-prem or in the cloud. Nvidia, which contracts with TSMC to manufacture its chips, has been unable to make enough GPUs to satisfy demand, and the result has been a “GPU Squeeze” and price escalation.
Nvidia’s hardware competitors have sensed an opportunity, and they are charging hard to fill the demand. Intel and AMD are busy working on their AI accelerators, while other chipmakers, such as Cerebras and Hailo, are also bringing out new chips. All of the public cloud providers (AWS, Azure, and Google Cloud) also have their own AI accelerators.
But in the future, it’s doubtful that all GenAI workloads will run in the cloud. A more likely future is that AI workloads will be pushed out to run on edge devices, which is a bet that Luis Ceze, the CEO and founder of OctoAI, is making.
“There’s definitely clear opportunities now for us to enable models to run locally and then connect it to the cloud, and that’s something that we’ve been doing a lot of public research on,” Ceze said. “It’s something that we are actively working on, and I see a future where this is just unavoidable.”
In addition to GenAI workloads running in a hybrid manner, the LLMs themselves will be composed and executed in a hybrid manner, according to Ceze.
“If you think about the potential here, it’s that we’re going to use generative AI models for pretty much every interaction with computers today,” he told Datanami. “Rarely it’s just a single model. It’s a collection of models that talk to each other.”
To really take full advantage of GenAI, companies will need access to the freshest possible data. That requirement is proving to be a boon for database vendors that specialize in high-volume data ingestion, such as Kinetica, which develops a GPU-powered database.
“Right now, we’re seeing the most momentum in real-time RAG [retrieval-augmented generation], basically taking these real time workloads and being able to expose them so that generative solutions can take advantage of that data as it’s getting updated and growing in real time,” Kinetica CEO Nima Negahban told Datanami at the recent GTC show. “That’s been where we’ve seen the most momentum.”
Cracks in the GenAI Balloon
Whether the computing community will come together to address all of these challenges and fulfill the massive promise of GenAI has yet to be seen. Cracks are starting to appear that suggest the tech has been oversold, at least up to this point.
For instance, according to a story in the Wall Street Journal last week, a presentation by the venture capital firm Sequoia estimated that only $3 billion in revenue was obtained by AI players who had invested $50 billion on Nvidia GPUs.
Gary Marcus, an NYU professor who testified on AI before Congress last year, cited that WSJ story in a Substack blog published earlier this year. “That’s obviously not sustainable,” he wrote. “The entire industry is based on hype.”
Then there is Demis Hassabis, head of Google DeepMind, who told the Financial Times on Sunday that the billions flowing into AI startups “brings with it a whole attendant bunch of hype and maybe some grifting.”
At the end of the day, LLMs and GenAI are very promising new technologies that have the potential to radically change how we interact with computers. What isn’t yet known is the extent of the changes and when they will occur.
Related Items:
Rapid GenAI Progress Exposes Ethical Concerns
EU Votes AI Act Into Law, with Enforcement Starting By End of 2024
GenAI Hype Bubble Refuses to Pop
The post What’s Holding Up the ROI for GenAI? appeared first on Datanami.