NVIDIA and Google Unleash Game-Changing AI Optimizations for Gemma
Optimizations for Gemma on AI Platforms
Gemma, Google’s new family of lightweight large language models with 2 billion and 7 billion parameters, can run anywhere, reducing costs and speeding development for domain-specific use cases. NVIDIA and Google have launched optimizations for Gemma across all NVIDIA AI platforms.
The two companies’ teams worked closely to accelerate Gemma using NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference, when running on NVIDIA GPUs in the data center, in the cloud, and locally on workstations with NVIDIA RTX GPUs or PCs with GeForce RTX GPUs. Gemma is built from the same research and technology as the Gemini models. With these optimizations, developers can target the more than 100 million high-performance AI PCs worldwide that are equipped with NVIDIA RTX GPUs.
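To make that workflow concrete, here is a minimal sketch of generating text from a Gemma checkpoint with TensorRT-LLM’s high-level Python LLM API. It is illustrative only: it assumes a recent TensorRT-LLM release that ships this API, a supported NVIDIA GPU with enough memory, and local access to the Gemma weights; the model name, prompt, and sampling settings are placeholders rather than values published by NVIDIA or Google.

# Sketch: run a Gemma checkpoint through TensorRT-LLM's high-level LLM API.
# Assumes a recent tensorrt_llm release and a supported NVIDIA GPU;
# the model name and sampling settings below are illustrative.
from tensorrt_llm import LLM, SamplingParams

# Point the LLM wrapper at a Gemma checkpoint; TensorRT-LLM compiles an
# optimized inference engine for the local GPU behind the scenes.
llm = LLM(model="google/gemma-2b-it")

# Basic sampling settings for a short completion.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

prompts = ["Summarize in one sentence why optimized inference lowers serving costs."]

# generate() runs batched inference on the built engine and returns one
# result object per prompt.
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)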
NVIDIA and Google Supercharge AI Platforms for Gemma
Developers can also run Gemma on NVIDIA GPUs in the cloud, including on Google Cloud’s A3 instances, which are built on the H100 Tensor Core GPU, and soon on NVIDIA’s H200 Tensor Core GPUs, which offer 141GB of HBM3e memory and 4.8 terabytes per second of memory bandwidth. To further tune Gemma and deploy the optimized model in production applications, enterprise developers can leverage NVIDIA’s extensive ecosystem of tools, including NVIDIA AI Enterprise with the NeMo framework and TensorRT-LLM.
Find out how TensorRT-LLM is speeding up inference for Gemma, along with additional details for developers. Several Gemma checkpoints, including an FP8-quantized version of the model, have all been optimized with TensorRT-LLM.
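For readers curious how an FP8 path might be requested programmatically, the sketch below extends the same high-level LLM API with a quantization config. The QuantConfig and QuantAlgo names follow recent TensorRT-LLM LLM API examples and may vary between releases, so treat the exact import path and arguments as assumptions; FP8 also requires a GPU generation that supports it, such as Hopper or Ada Lovelace.

# Sketch: ask TensorRT-LLM to quantize a Gemma checkpoint to FP8 when
# building the engine. QuantConfig/QuantAlgo follow recent LLM API
# examples and may differ across releases; FP8 needs a supporting GPU
# (e.g., Hopper or Ada Lovelace).
from tensorrt_llm import LLM, SamplingParams
from tensorrt_llm.llmapi import QuantConfig, QuantAlgo

# Request FP8 weights for the engine build.
quant_config = QuantConfig(quant_algo=QuantAlgo.FP8)

llm = LLM(model="google/gemma-7b-it", quant_config=quant_config)

sampling_params = SamplingParams(temperature=0.7, top_p=0.9)

for output in llm.generate(["What does FP8 quantization change at inference time?"],
                           sampling_params):
    print(output.outputs[0].text)

Since an FP8-quantized Gemma checkpoint has already been optimized with TensorRT-LLM, developers may be able to start from that artifact rather than quantizing locally.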