Snowflake today took the wraps off Arctic, a new large language model (LLM) that is available under an Apache 2.0 license. The company says Arctic’s unique mixture-of-experts (MoE) architecture, combined with its relatively small size and openness, will enable companies to use it to build and train their own chatbots, co-pilots, and other GenAI apps.
Instead of building a generalist LLM that’s sprawling in size and takes enormous resources to train and run, Snowflake decided to use an MoE approach to build an LLM that’s smaller than massive LLMs but can offer a similar level of language understanding and generation with a fraction of the training resources.
Specifically, Snowflake researchers, who hail from the Microsoft Research team that built DeepSpeed, used what they call a “dense-MoE hybrid transformer architecture” to build Arctic. This architecture routes training and inference requests to one of 128 experts, which is significantly more than the eight to 16 experts used in other MoEs, such as Databricks’ DBRX and Mixtral.
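The core idea behind MoE routing can be sketched in a few lines. This is a generic top-k gating illustration, not Snowflake's actual implementation: the expert count (128) matches the article, but the gating details, top-k value, and dimensions here are illustrative assumptions.

```python
import numpy as np

def moe_route(hidden, gate_w, top_k=2):
    """Toy top-k gating for a mixture-of-experts layer.

    Each token is sent to only a small subset of the available experts,
    which is why an MoE model can have many parameters but touch few of
    them per token. Arctic's dense-MoE hybrid routing is more involved;
    this only shows the basic mechanism.
    """
    logits = hidden @ gate_w                         # (tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of chosen experts
    sel = np.take_along_axis(logits, top, axis=-1)   # logits of chosen experts
    # softmax over only the selected experts' logits
    weights = np.exp(sel - sel.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return top, weights

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 64))    # 4 tokens, hidden size 64 (assumed)
gate = rng.standard_normal((64, 128))    # gating matrix for 128 experts
experts, weights = moe_route(tokens, gate)
# each token activates only 2 of the 128 experts
```

With 128 experts to choose from, the router has a much finer-grained pool than an eight- or 16-expert MoE, which is the design point the Snowflake team highlights.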
Arctic was trained on what Snowflake calls a “dynamic data curriculum” that sought to duplicate the way that humans learn by changing the mix of code versus language over time. The result was a model that displayed better language and reasoning skills, said Samyam Rajbhandari, a principal AI software engineer at Snowflake and one of the DeepSpeed creators.
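A data curriculum of this kind can be pictured as a sampling schedule that shifts over the course of training. The actual mix and schedule Snowflake used are not disclosed here; the weights below are purely hypothetical and only illustrate the “changing mix over time” idea.

```python
def data_mix(step, total_steps):
    """Hypothetical curriculum schedule: shift sampling weight from
    generic web text toward code as training progresses. The 10%-to-50%
    ramp is an invented example, not Arctic's published recipe."""
    frac = step / total_steps
    code_weight = 0.1 + 0.4 * frac   # code share grows from 10% to 50%
    return {"web_text": 1.0 - code_weight, "code": code_weight}

start = data_mix(0, 100)     # early training: mostly web text
end = data_mix(100, 100)     # late training: half code
```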
In terms of capabilities, Arctic scored similarly to other LLMs, including DBRX, Llama3 70B, Mixtral 8x22B, and Mixtral 8x7B, on GenAI benchmarks. These benchmarks measured enterprise use cases like SQL generation, coding, and instruction following, as well as academic use cases like math, common sense, and knowledge.
All told, Arctic is equipped with 480 billion parameters, only 17 billion of which are used at any given time for training or inference. This approach helped to decrease resource usage compared to similar models: Arctic consumed 16x fewer training resources than Llama3 70B, and 8x fewer than DBRX.
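The back-of-the-envelope math on those figures shows just how sparse the model's activation is, using the parameter counts stated above:

```python
# Fraction of Arctic's weights active per token, from the stated figures.
total_params = 480e9    # total parameters
active_params = 17e9    # parameters used at any given time
active_fraction = active_params / total_params
print(f"{active_fraction:.1%} of weights active per token")  # about 3.5%
```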
That frugality was intentional, said Yuxiong He, a distinguished AI software engineer at Snowflake and one of the DeepSpeed creators. “As researchers and engineers working on LLMs, our biggest dream is to have unlimited GPU resources,” He said. “And our biggest struggle is that our dream never comes true.”
Arctic was trained on a cluster of 1,000 GPUs over the course of three weeks, which amounted to a $2 million investment. But customers will be able to fine-tune Arctic and run inference workloads on a single server equipped with eight GPUs, Rajbhandari said.
“Arctic achieves the state-of-the-art performance while being incredibly efficient,” said Baris Gultekin, Snowflake’s head of AI. “Despite the modest budget, Arctic not only is more capable than other open source models trained with a similar compute budget, but it excels at our enterprise intelligence, even when compared to models that are trained with a significantly higher compute budget.”
The debut of Arctic is the biggest product launch to date for new Snowflake CEO Sridhar Ramaswamy, the former AI product manager who took the top job from Frank Slootman after Snowflake showed poor financial results. The company was expected to pivot more strongly to AI, and the launch of Arctic shows that. But Ramaswamy was quick to note the importance of data and to reiterate that Snowflake is a data company at the end of the day.
“We’ve been leaders in the space of data now for many years, and we are bringing that same mentality to AI,” he said. “As you folks know, there is no AI strategy without a data strategy. Good data is the fuel for AI. And we think Snowflake is the most important enterprise AI company on the planet because we are the data foundation. We think the house of AI is going to be built on top of the data foundation that we are creating.”
Arctic is being released with a permissive Apache 2.0 license, enabling anybody to download and use the software any way they like. Snowflake is also releasing the model weights and providing “research cookbooks” that allow developers to get more out of the LLM.
“The cookbook is designed to expedite the learning process for anyone looking into the world class MoE models,” Gultekin said. “It offers high level insights as well as granular technical details to craft LLMs like Arctic, so that anyone can build their desired intelligence efficiently and economically.”
The openness that Snowflake has shown with Arctic is commendable, said Andrew Ng, the CEO of Landing AI.
“Community contributions are key in unlocking AI innovation and creating value for everyone,” Ng said in a press release. “Snowflake’s open source release of Arctic is an exciting step for making cutting-edge models available to everyone to fine-tune, evaluate and innovate on.”
The company will be sharing more about Arctic at its upcoming Snowflake Data Cloud Summit, which is taking place in San Francisco June 3-6.
Related Items:
Databricks Versus Snowflake: Comparing Data Giants
It’s a Snowday! Here’s the New Stuff Snowflake Is Giving Customers
Snowflake: Not What You May Think It Is
The post Snowflake Touts Speed, Efficiency of New ‘Arctic’ LLM appeared first on Datanami.