NVIDIA Blackwell Platform Promises GenAI on Trillion-Parameter LLMs
What Is The News About?
Artificial intelligence guided by quantum mechanics. Pharmaceutical research and development. Power from fusion reactors. Rapid advancements in computing power and artificial intelligence are setting the stage for the next great leaps in scientific computing and physics-based simulation, with profound implications for many areas of human endeavor. At GTC in March, NVIDIA introduced the NVIDIA Blackwell platform, which it says can run generative AI on trillion-parameter LLMs at up to 25 times lower energy usage and cost than the NVIDIA Hopper architecture.
Beyond its far-reaching effects on AI workloads, Blackwell's technological advances can speed the delivery of results across scientific computing, from conventional numerical simulation to newer fields. Accelerated computing and AI also advance eco-friendly computing by lowering energy costs, and numerous scientific computing applications stand to benefit. According to NVIDIA, digital twin simulations use 58 times less energy and cost 65 times less than on typical CPU-based systems, while weather simulations use 300 times less energy and cost 200 times less.
Why Is This News Important?
Scientific computing and simulations grounded in computational physics commonly rely on double-precision (FP64) floating-point formats. Compared to Hopper, Blackwell GPUs deliver 30% better FP64 and FP32 FMA (fused multiply-add) performance. Physics-grounded simulation is essential when creating new products: researchers and developers save billions of dollars by designing and testing products in simulation rather than in the physical world. This applies to a wide range of industries, including transportation, software, medicine, silicon chips, and bridges.
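As background on why FMA throughput is a standard figure of merit: a fused multiply-add computes a*b + c with a single rounding at the end, rather than rounding once after the multiply and again after the add. A minimal Python sketch (the `fma_exact` helper below is illustrative only — it emulates the single rounding with exact rational arithmetic, whereas hardware FMA does it in one instruction):

```python
from fractions import Fraction

def fma_exact(a, b, c):
    # Emulate a fused multiply-add: compute a*b + c exactly,
    # then round once when converting back to float.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2**-52            # 1 + eps
b = 1.0 - 2**-52            # 1 - eps, so a*b = 1 - eps**2 exactly
c = -1.0

naive = a * b + c           # a*b rounds to 1.0 first, so the result is 0.0
fused = fma_exact(a, b, c)  # keeps the tiny -eps**2 term: -2**-104
```

The unfused version loses the entire answer to intermediate rounding, while the fused version retains it — one reason FP64 FMA rates matter so much for numerically sensitive scientific workloads.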
A new age in HPC has begun with the release of the NVIDIA GB200. Its architecture includes a second-generation Transformer Engine optimized to accelerate inference workloads for LLMs, delivering a 30x speedup over the H100 generation on resource-intensive applications such as the 1.8-trillion-parameter GPT-MoE (generative pretrained transformer, mixture of experts) model. This unlocks new opportunities for high-performance computing: to speed scientific discovery, HPC applications can harness LLMs' ability to ingest and interpret massive volumes of scientific data.
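For context on the "MoE" in GPT-MoE: a mixture-of-experts layer routes each input through only a small subset of expert subnetworks, chosen by a learned gate, so just a fraction of the model's parameters are active per token. A toy Python sketch of top-k gating (the linear gate and expert functions here are illustrative stand-ins, not the GPT-MoE implementation):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, gate_weights, experts, k=2):
    """Route input x to the top-k experts by gate score and mix their outputs."""
    scores = softmax([sum(w * xi for w, xi in zip(row, x)) for row in gate_weights])
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    norm = sum(scores[i] for i in top)   # renormalize over the chosen experts
    return [sum(scores[i] / norm * experts[i](x)[j] for i in top)
            for j in range(len(x))]

x = [1.0, 0.0]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]     # hypothetical learned gate
experts = [lambda v: [2 * t for t in v],    # stand-in expert networks
           lambda v: [3 * t for t in v]]
y = moe_forward(x, gate_weights, experts, k=1)  # only the best expert runs
```

Because only k experts execute per input, a trillion-parameter MoE model can be far cheaper per token than a dense model of the same parameter count — the property the Transformer Engine's inference speedups build on.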
Modern application-specific integrated circuits (ASICs) are almost always designed on central processing units (CPUs), in a lengthy and intricate process that includes analog analysis to determine voltages and currents. That is changing. The Cadence SpectreX simulator, an analog circuit design solver, is one example: SpectreX circuit simulations are expected to complete 13x faster on a GB200 Grace Blackwell Superchip, which pairs Blackwell GPUs with Grace CPUs, than on a conventional CPU. GPU-accelerated computational fluid dynamics (CFD) is another key technology, used by engineers and equipment designers to forecast how designs will behave. With Cadence Fidelity, CFD simulations on GB200 systems can run up to 22x faster than on conventional CPU-powered systems. With 30 TB of memory per GB200 NVL72 rack and parallel scalability, flow details can be captured to an unprecedented level.
Benefits
1. Energy Efficiency: NVIDIA Blackwell platform promises AI generation with significantly reduced energy usage, making scientific computing more eco-friendly and cost-effective.
2. Improved Performance: Blackwell GPUs offer enhanced performance in double-precision formats, crucial for scientific simulations, aiding researchers in various industries.
3. Cost Savings: Simulating designs with Blackwell GPUs can save billions in product development costs across industries like transportation, medicine, and technology.
4. Accelerated Simulations: With Blackwell GPUs, simulations such as digital twins and weather forecasts run significantly faster, saving time and resources for engineers and researchers.
5. Enhanced High-Performance Computing (HPC): NVIDIA GB200 unlocks new possibilities in HPC, speeding up inference workloads and enabling faster scientific discoveries with LLMs.
Cadence Reality digital twin software has many potential uses, one of which is modeling an entire data center virtually, down to the servers, cooling systems, and power supply. Such a virtual model lets engineers try out configurations and scenarios before committing to them in the real world, saving time and money. Physics-based algorithms in Cadence Reality simulate a data center's heat, airflow, and power consumption, helping engineers and data center operators manage capacity, anticipate operational issues, and optimize the facility's layout and operation for greater efficiency and capacity utilization. On Blackwell GPUs, these simulations are expected to run up to 30x faster than on CPUs, delivering shorter turnaround times and better energy economy.
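The kind of physics these simulations rest on can be shown with a deliberately tiny sketch: one explicit finite-difference scheme for 1-D heat diffusion, a toy stand-in for the far richer 3-D heat and airflow solvers a data center digital twin would run (all names here are illustrative):

```python
def diffuse(temps, alpha=0.1, steps=100):
    """Explicit finite-difference heat diffusion on a 1-D row of nodes.
    End nodes are held at a fixed temperature, like cooled inlet air."""
    t = list(temps)
    for _ in range(steps):
        nxt = t[:]
        for i in range(1, len(t) - 1):
            # Each node moves toward the average of its neighbours.
            nxt[i] = t[i] + alpha * (t[i - 1] - 2 * t[i] + t[i + 1])
        t = nxt
    return t

# A hot middle node between two cooled ends relaxes toward the ends' temperature.
cooled = diffuse([20.0, 80.0, 20.0])
```

Real digital twin solvers apply the same idea — local update rules swept over millions of grid cells, time step after time step — which is exactly the embarrassingly parallel pattern GPUs accelerate.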
An LLM copilot for parallel programming is being built at Sandia National Laboratories. LLMs are adept at generating basic serial code, but they struggle with the parallel code that high-performance computing (HPC) applications demand. In an ambitious endeavor, researchers at Sandia are taking on this problem directly. They are using Kokkos, a C++ programming model for performance portability developed jointly by several national labs, to generate parallel code that can execute across the tens of thousands of processors found in the world's most powerful supercomputers. Sandia is employing retrieval-augmented generation (RAG), which integrates language generation models with information retrieval capabilities, and is using it to build an AI database aligned with Kokkos.
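The RAG pattern described above can be sketched minimally: retrieve the stored Kokkos snippets most relevant to the request, then prepend them to the prompt so the model grounds its answer in real examples. All helper names below are hypothetical, and a toy lexical-overlap score stands in for real embedding similarity:

```python
def score(query, doc):
    # Toy relevance score: word overlap (a real system would use embeddings).
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query, corpus, k=2):
    # Pull the k most relevant stored snippets for this request.
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query, corpus):
    # Prepend retrieved Kokkos examples so the LLM grounds its answer in them.
    context = "\n---\n".join(retrieve(query, corpus))
    return f"Reference Kokkos examples:\n{context}\n\nTask: {query}"

corpus = ["Kokkos parallel_for example over a 1-D range",
          "plain serial loop in C++"]
prompt = build_prompt("write a Kokkos parallel_for kernel", corpus)
```

The retrieval step is what makes the approach practical for a niche API like Kokkos: the base model need not memorize the library, because curated examples are injected at generation time.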
Wrapping Up
The first findings are encouraging. RAG methods developed at Sandia have shown they can produce working Kokkos code for parallel computing. By removing obstacles to AI-based parallel code development, Sandia aims to open up new opportunities in high-performance computing (HPC) on the world's top supercomputers. Climate science, drug discovery, and renewable energy research are a few of the fields that stand to benefit.
For numerous fields, including fusion energy, climate research, and drug development, quantum computing promises transformative leaps forward. To create and test quantum algorithms at an unprecedented pace, researchers are working to simulate future quantum computers using software and systems built on NVIDIA GPUs. Thanks to its unified programming model for CPUs, GPUs, and QPUs (quantum processing units), the NVIDIA CUDA-Q platform makes it possible to simulate quantum computers and build hybrid quantum-classical applications.
CUDA-Q is already accelerating NERSC's quantum chemistry simulations, Stony Brook's high-energy and nuclear physics workflows, and BASF's chemical workflows. NVIDIA's Blackwell architecture will propel quantum simulations to unprecedented levels of performance: NVLink, NVIDIA's latest multi-node interconnect technology, moves data between GPUs faster, which in turn speeds up quantum simulations.
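What "simulating a quantum computer" means in practice is tracking the state vector of n qubits — 2^n complex amplitudes — and applying each gate as a linear transformation over that vector, which is the workload GPU-based simulators accelerate. A stdlib-Python sketch of a single gate (illustrative only, not the CUDA-Q API):

```python
import math

def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to `qubit` of a state vector of 2**n amplitudes."""
    h = 1 / math.sqrt(2)
    step = 1 << qubit
    new = state[:]
    for i in range(len(state)):
        if i & step == 0:                # pair each |...0...> with its |...1...>
            a, b = state[i], state[i | step]
            new[i] = h * (a + b)
            new[i | step] = h * (a - b)
    return new

state = apply_hadamard([1.0, 0.0], 0)    # H|0>: an equal superposition
```

The state vector doubles with every added qubit (simulating ~40 qubits already needs terabytes), which is why GPU memory capacity and NVLink bandwidth are the limiting resources for these simulations.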