As infrastructure costs rise and businesses start looking for concrete results from their AI investments over the past two years, Red Hat believes it has the answer: open-source software libraries that make large language models (LLMs) run more efficiently.
The trick, it believes, is to cut the cost of generating AI outputs and reduce the dependence on Nvidia’s much-sought-after graphics processing units (GPUs) that do much of today’s AI heavy lifting.
Through open-source software libraries that help LLMs run faster – even on competing hardware from AMD and Intel – Red Hat is betting that it can improve efficiency enough to overcome today’s bottlenecks and boost AI adoption.
Previously, IBM (Red Hat’s parent company) had been advising customers to opt for smaller AI models, said Brian Stevens, the AI chief technology officer (CTO) for Red Hat.
However, businesses can now rely on larger models because they won’t have to worry as much about the cost of GPUs to get the job done, he told Techgoondu in an interview in Singapore last week.
“How do we get existing customers to be more efficient? We dropped 30 per cent of inference costs… so they can start a platform for innovation,” he said.
In March, Red Hat launched its AI Inference Server, which promises to let businesses generate AI outputs more efficiently.
It packages software libraries from an open-source project called vLLM, which are used to run AI models on different hardware, including custom chips made by Google and Amazon Web Services.
It improves inference performance partly by cutting back on GPU memory usage and better allocating resources across different workloads.
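As a rough sketch of what this looks like in practice (the model name and flag values below are illustrative assumptions, not details from the interview), vLLM ships an OpenAI-compatible server, and its memory behaviour can be tuned from the command line:

```shell
# Launch vLLM's OpenAI-compatible API server (model name is an example).
# --gpu-memory-utilization caps the fraction of GPU memory the engine
# pre-allocates for model weights and KV cache; lowering it leaves
# headroom for other workloads sharing the same card.
# --max-model-len limits context length, shrinking KV-cache memory needs.
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --gpu-memory-utilization 0.90 \
    --max-model-len 8192
```

Once the server is up, any OpenAI-style client library can be pointed at its local endpoint, which is part of what makes swapping the underlying hardware or model relatively painless.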
Perhaps more importantly, Red Hat promises that it runs well on non-Nvidia hardware too, so there are more hardware choices for, say, a bank to pick from if it is building its own AI infrastructure.
Nvidia’s powerful CUDA software tools, which accelerate the company’s GPUs to run AI workloads, have been instrumental in keeping it in the lead over the past couple of years.
However, if other platforms and accelerators use Red Hat’s software tools to achieve good performance at a more efficient cost, they could turn out to be stronger alternatives in future.
“This frees up organisations from worrying about AI infrastructure,” said Stevens. “Instead, you think about how to build your agentic AI app or reasoning service… you don’t have to worry about the platform.”
Nvidia also works with Red Hat on vLLM, he noted, and the development teams have several meetings every week. “We will make it the best interface for Nvidia.”
Could the current AI gold rush turn out like the dotcom boom of more than 20 years ago? Back then, Sun Microsystems was the only one making the powerful servers needed to handle the high traffic volumes of any popular website.
However, it stumbled when cheap servers running commodity Intel chips proved just as powerful, essentially delivering the early cloud computing model that enabled anyone to run a website cheaply.
Could more cheaply available AI servers deliver the same impact now? Stevens, who worked for 14 years at Digital Equipment Corp, a Sun rival, said this could be the way forward.
Doing more with less is good for businesses seeking to unlock the potential of AI, which has been elusive for many because of the costs involved, he explained.
A more efficient way forward will benefit those looking to adopt new AI models, such as Meta’s Llama 4 and DeepSeek, which are coming out fast, he noted.
A year from now, inference – the generation of AI outputs and analyses – will be cheaper and easier, he noted, because the technology will be more “democratised and commoditised”.

