Fermyon First to Make Enterprise AI Apps 100x Faster to Run With Game-Changing WebAssembly Compute Innovation
Fermyon Serverless AI Efficiently Timeshares Each GPU for Thousands of AI Developers Concurrently
Fermyon Technologies, the serverless WebAssembly company, announced Fermyon Serverless AI, a new capability aimed at AI, the technology industry’s largest paradigm shift in decades. Serverless AI is now available on Fermyon Cloud’s free tier, showcasing Fermyon’s supersonic startup time for AI inferencing with large language models (LLMs).
“Enterprises wishing to build AI applications that go beyond simple chat services face a largely insurmountable dilemma: it’s either cost prohibitive or it’s abysmally slow, and therefore they often abandon plans to build AI apps. Fermyon has used its core WebAssembly-based cloud compute platform to run fast AI inferencing workloads. It achieves this by using its technology to only use the GPU for the duration of the inferencing request, thus multiplexing thousands of requests onto a single GPU,” said Omdia analyst Roy Illsley.
Inferencing on large language models (LLMs) is one of the most popular workloads in computing today. Demand for GPUs is high, but the equipment itself is scarce and expensive. As a result, developers tasked with building and running enterprise AI apps on LLMs like LLaMA2 face a 100x compute expense for access to GPUs at $32/instance-hour and upwards. Alternatively, they can use on-demand services, but then they experience abysmal startup times. This makes it impractical to deliver enterprise AI apps affordably.
Fermyon Serverless AI has solved this problem by offering 50-millisecond cold start times, over 100x faster than other on-demand AI infrastructure services. This breakthrough is made possible by the serverless WebAssembly technology powering Fermyon Cloud, the fastest, most secure, most flexible and most affordable serverless solution on the market. Fermyon Cloud is architected for sub-millisecond cold starts and high-volume time-slicing of compute instances, which has been shown to increase compute densities by a factor of 30. Extending this runtime profile to GPUs makes Fermyon Cloud the fastest AI inferencing infrastructure service.
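The economics of time-sharing a GPU can be illustrated with a rough back-of-the-envelope sketch. The only figure taken from the announcement is the $32/instance-hour GPU price; the request volume, the GPU seconds per request, the function names, and the assumption that a time-shared GPU is billed pro rata at the same hourly rate are all hypothetical and do not describe Fermyon's actual pricing or scheduler.

```python
# Rough cost comparison: a dedicated GPU instance vs. a time-shared GPU.
# Only GPU_HOURLY_USD comes from the announcement; everything else is an
# illustrative assumption, not Fermyon's pricing model.

GPU_HOURLY_USD = 32.0      # $32/instance-hour, as quoted in the text
SECONDS_PER_HOUR = 3600.0


def dedicated_cost_per_request(requests_per_hour: int) -> float:
    """Cost per request when one app rents the whole GPU for the hour."""
    if requests_per_hour <= 0:
        raise ValueError("requests_per_hour must be positive")
    return GPU_HOURLY_USD / requests_per_hour


def timeshared_cost_per_request(gpu_seconds_per_request: float) -> float:
    """Cost per request when the app pays only for the GPU seconds it uses.

    Assumes (hypothetically) pro-rata billing at the same hourly rate.
    """
    if gpu_seconds_per_request < 0:
        raise ValueError("gpu_seconds_per_request must be non-negative")
    return GPU_HOURLY_USD / SECONDS_PER_HOUR * gpu_seconds_per_request


def gpu_utilization(requests_per_hour: int, gpu_seconds_per_request: float) -> float:
    """Fraction of the hour the GPU is actually busy for a single app."""
    return requests_per_hour * gpu_seconds_per_request / SECONDS_PER_HOUR


if __name__ == "__main__":
    # Hypothetical workload: 100 requests per hour, 2 GPU seconds each.
    requests, seconds = 100, 2.0
    print(f"dedicated:   ${dedicated_cost_per_request(requests):.4f} per request")
    print(f"time-shared: ${timeshared_cost_per_request(seconds):.4f} per request")
    print(f"utilization: {gpu_utilization(requests, seconds):.1%} of the hour")
```

Under these assumed numbers, the dedicated GPU sits idle for most of the hour, which is the gap that multiplexing many applications onto one GPU is meant to close. Real savings depend on scheduling overhead and actual pricing, which this sketch ignores.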
“At Fermyon, we set out to build the next wave of cloud computing by squeezing every last bit of efficiency out of CPU utilization. With the boom in AI interest, we extended this same performance profile to high-end GPUs. GPUs are essential to AI. But compared to CPUs, GPUs are massively more expensive. The solution is to improve efficiency and time-sharing of GPU usage. And we do that with a WebAssembly-powered serverless platform that boasts supersonic startup speed, a strong security sandbox, and most of all, platform neutrality that extends beyond just OS and CPU to GPU architecture as well. Fermyon’s new Serverless AI is the easiest, fastest and cheapest way to build enterprise AI inferencing apps,” said Matt Butcher, co-founder and CEO of Fermyon.
Fermyon Serverless AI brings a new tool to the full-stack developer’s toolbox. Combined with Fermyon’s NoOps SQL Database and Key Value Storage, it lets developers quickly build advanced AI-enabled serverless applications without needing external vector databases or storage.
Fermyon Serverless AI has been added to both Fermyon Cloud and Spin and is currently in private beta. Developers can work locally with the AI inferencing technology in Spin, the popular open source product that is the easiest way for developers to build WebAssembly serverless apps, which has more than 3,900 GitHub stars and over 105,000 downloads. With one command, they can then deploy the application to Fermyon Cloud, taking advantage of powerful AI-grade GPUs. Developers can sign up to join Fermyon’s private beta.