NVIDIA and Google Unleash Game-Changing AI Optimizations for Gemma

Optimizations for Gemma on AI Platforms

Gemma, Google’s new family of lightweight large language models available in 2 billion- and 7 billion-parameter versions, can be run anywhere, reducing costs and speeding up innovative work for domain-specific use cases. NVIDIA and Google have launched optimizations for Gemma across all NVIDIA AI platforms.


The companies’ teams worked closely together to accelerate Gemma’s performance using NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference, on NVIDIA GPUs in the data center, in the cloud, and locally on workstations with NVIDIA RTX GPUs or PCs with GeForce RTX GPUs. Gemma is built from the same research and technology used to create the Gemini models. Thanks to these optimizations, developers can target the more than 100 million high-performance AI PCs worldwide that are equipped with NVIDIA RTX GPUs.
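
To make the local-inference path concrete, here is a minimal sketch of loading Gemma on a single NVIDIA GPU. The Hugging Face transformers library and the google/gemma-2b-it checkpoint are illustrative choices, not part of the announcement, and downloading the checkpoint requires accepting Gemma’s license on Hugging Face.

```python
# A minimal sketch of running Gemma locally on an NVIDIA RTX GPU.
# Assumes the Hugging Face `transformers` library and the published
# "google/gemma-2b-it" checkpoint; neither is prescribed by the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned 2B variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits consumer RTX VRAM
    device_map="cuda",
)

prompt = "Explain LLM inference optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```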

On top of that, developers can run Gemma on NVIDIA GPUs in the cloud, such as Google Cloud’s A3 instances built on the H100 Tensor Core GPU and, soon, NVIDIA’s H200 Tensor Core GPUs, which offer 141 GB of HBM3e memory and 4.8 terabytes per second of memory bandwidth. To further fine-tune Gemma and deploy the optimized model in their production applications, enterprise developers can leverage NVIDIA’s extensive ecosystem of tools, including NVIDIA AI Enterprise with the NeMo framework and TensorRT-LLM.
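
As an illustration of what the TensorRT-LLM path could look like, here is a sketch that assumes a recent TensorRT-LLM release shipping the high-level LLM Python API; earlier releases instead used separate checkpoint-conversion and engine-build scripts, so treat this as a sketch under that assumption rather than the announced workflow.

```python
# A minimal sketch of serving Gemma through TensorRT-LLM's high-level
# Python API. Assumes a recent tensorrt_llm release that includes the
# LLM/SamplingParams interface; the model ID and sampling values are
# illustrative, not taken from the announcement.
from tensorrt_llm import LLM, SamplingParams

# Builds (or loads) an optimized TensorRT engine for the model.
llm = LLM(model="google/gemma-2b-it")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
for output in llm.generate(["What makes FP8 inference fast?"], params):
    print(output.outputs[0].text)
```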

Developers can find out more about how TensorRT-LLM is accelerating Gemma inference, along with additional resources. Multiple Gemma checkpoints, including an FP8-quantized version of the model, have been optimized with TensorRT-LLM.
