
Cerebras Systems and National Energy Technology Laboratory Set New Milestones for High-Performance, Energy-Efficient Field Equation Modeling

The Cerebras CS-2 and its Wafer-Scale Engine Outperform a Leading Supercomputer by Over Two Orders of Magnitude in Time to Solution, Delivering Performance Impossible for CPUs and GPUs

Cerebras Systems, the pioneer in high-performance artificial intelligence (AI) compute, announced record-breaking performance on the scientific compute workload of forming and solving field equations. In collaboration with the Department of Energy’s National Energy Technology Laboratory (NETL), Cerebras demonstrated that its CS-2 system, powered by the Wafer-Scale Engine (WSE), was as much as 470 times faster than NETL’s Joule supercomputer in field equation modeling, delivering speeds beyond what either CPUs or GPUs are currently able to achieve.



The workload under test was field equation modeling using a simple Python API that enables wafer-scale processing for much of computational science, achieving gains in performance and usability that cannot be obtained on conventional computers and supercomputers. This domain-specific, high-level programmer’s toolset is called the WSE Field-equation API, or WFA. The WFA outperforms OpenFOAM® on NETL’s Joule 2.0 supercomputer by over two orders of magnitude in time to solution. While this performance is consistent with hand-optimized assembly codes, the WFA provides an easy-to-use, high-level Python interface that allows users to form and solve field equations with little effort. This new WFA toolset has the potential to change the way computers are used in engineering in a positive and fundamental way.
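The article does not include any WFA source code, so the snippet below is not the WFA interface. It is only a minimal, self-contained NumPy sketch of the class of problem the toolset targets: forming a field equation on a structured grid and stepping it forward in time, here a 2-D heat equation solved with an explicit 5-point stencil. All names, sizes, and parameters are illustrative.

```python
# Illustrative only: a plain NumPy explicit solver for the 2-D heat equation,
# u_t = alpha * (u_xx + u_yy), showing what "forming and solving a field
# equation" on a structured grid involves. This is NOT the Cerebras WFA API.
import numpy as np

def solve_heat_2d(n=256, steps=1000, alpha=1.0, dx=1.0):
    dt = 0.2 * dx * dx / alpha        # below the explicit stability limit dx^2 / (4 * alpha)
    u = np.zeros((n, n))
    u[n // 2, n // 2] = 100.0         # point heat source in the middle of the grid

    for _ in range(steps):
        # 5-point Laplacian stencil evaluated on the interior cells
        lap = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
               - 4.0 * u[1:-1, 1:-1]) / (dx * dx)
        u[1:-1, 1:-1] += dt * alpha * lap   # forward-Euler time step
    return u

field = solve_heat_2d()
print(field.max())
```

Every sweep of a stencil like this reads far more data than it computes on, which is why such codes are memory bound on conventional machines; the claim for the WFA is that updates of this kind can be expressed at a similar level of simplicity while running at wafer-scale speed.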

This work demonstrates the fastest known time-to-solution for field equations in computing history at scales up to several billion cells. The speed was achievable because the WSE provides memory and point-to-point bandwidths high enough that there are no communication bottlenecks for tensor instructions, and computation proceeds at or faster than the clock rate. In the past, field equations have been memory bound, and in distributed systems they are limited by node-to-node communication bandwidth. These limitations create a need for memory hierarchies and complex programming methods to ensure maximum possible utilization. All of this complexity is eliminated by the exceptionally high bandwidths afforded on the WSE, and tensor instructions can proceed at rates not possible with conventional, distributed von Neumann architectures.
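To make the node-to-node bottleneck concrete, here is a hypothetical mpi4py sketch of the halo exchange a distributed stencil solver must perform on every time step; the rank layout, array sizes, and stencil are invented for illustration and are not taken from the NETL benchmark. As described above, this per-step exchange, and the programming complexity around it, is what the WSE’s on-wafer fabric makes unnecessary.

```python
# Illustrative only: a 1-D halo exchange in a distributed stencil solver,
# the node-to-node communication that limits field-equation codes on clusters.
# Run with, e.g.: mpirun -n 4 python halo_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 1024                            # interior cells owned by this rank
u = np.full(n_local + 2, float(rank))     # one ghost (halo) cell at each end

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

for _ in range(100):                      # every time step needs a halo exchange
    # send the rightmost interior cell right, receive into the left ghost cell
    comm.Sendrecv(u[-2:-1], dest=right, sendtag=0,
                  recvbuf=u[0:1], source=left, recvtag=0)
    # send the leftmost interior cell left, receive into the right ghost cell
    comm.Sendrecv(u[1:2], dest=left, sendtag=1,
                  recvbuf=u[-1:], source=right, recvtag=1)
    # 3-point averaging stencil on the interior cells
    u[1:-1] = 0.5 * u[1:-1] + 0.25 * (u[:-2] + u[2:])
```

At scale, each rank spends a growing share of every step waiting on these exchanges, which is the limit the article attributes to distributed systems.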

“NETL and Cerebras collaborated to develop a new programming methodology with a team of three people on never-before-seen hardware with a unique instruction set, all in less than 18 months. To put that in perspective, efficient distributed computing efforts often take years to decades of work with very large groups of developers,” said Dr. Brian J. Anderson, Lab Director at NETL. “By using innovative new computer architectures, such as the Cerebras WSE, we were able to greatly accelerate speed to solution, while significantly reducing energy to solution on a key workload of field equation modeling. This work combining the power of supercomputing and AI will deepen our understanding of scientific phenomena and greatly accelerate the potential of fast, real-time, or even faster-than-real-time simulation.”


The Joule 2.0 supercomputer is the 139th fastest supercomputer in the world as ranked by TOP500.org, and contains 84,000 CPU cores and 200 GPUs. By bringing together exceptional memory performance, massive bandwidth, low-latency inter-processor communication, and an architecture optimized for high-bandwidth computing, the CS-2 achieved superior time-to-solution and lower energy consumption than the Joule supercomputer when running a standardized multi-dimensional, time-variant field equation test problem.

The CS-2 proved as much as 470 times faster in time-to-solution than the largest cluster of CPUs that NETL’s Joule 2.0 supercomputer could allocate to a problem of this size. It was also found to be more than two orders of magnitude more energy efficient than the distributed system.

“Cerebras is proud of our collaboration with NETL. Together we have produced extraordinary results in advancing foundational workloads in scientific compute,” said Andrew Feldman, co-founder and CEO, Cerebras Systems. “Conventional supercomputers consume enormous amounts of energy, are complex to set up, time consuming to program and as demonstrated by our work, slower to produce answers than the Cerebras CS-2. Through our partnership with NETL, the CS-2 proves that wafer-scale integration is a viable solution to many of the leading scientific problems in high performance computing—in fact, we showed that the CS-2 produces results that are hundreds of times faster than the biggest supercomputers, while using hundreds of times less energy.”

The research was led by Dr. Dirk Van Essendelft, Machine Learning and Data Science Engineer at NETL; Robert Schreiber, Distinguished Engineer at Cerebras Systems; and Michael James, co-founder and Chief Architect for Advanced Technologies at Cerebras. The results came after months of work and continue the close collaboration between the Department of Energy’s NETL laboratory scientists and Cerebras Systems. In November 2020, Cerebras and NETL announced a new compute milestone on the key scientific workload of Computational Fluid Dynamics (CFD).

With every component optimized for AI work, the CS-2 delivers more compute performance in less space and with less power than any other system. It does this while radically reducing programming complexity, wall-clock compute time, and time to solution. Depending on workload, from AI to HPC, the CS-2 delivers hundreds or thousands of times more performance than legacy alternatives. A single CS-2 replaces clusters of hundreds or thousands of GPUs that consume dozens of racks, use hundreds of kilowatts of power, and take months to configure and program. At only 26 inches tall, the CS-2 fits in one-third of a standard data center rack.

