Expedera NPUs Run Large Language Models Natively on Edge Devices

By PRNewswire On Jan 8, 2024

Expedera NPU IP adds native support for LLMs, including Stable Diffusion

Expedera, Inc., a leading provider of customizable Neural Processing Unit (NPU) semiconductor intellectual property (IP), announced that its Origin NPUs now support generative AI on edge devices. Designed to handle both classic AI and generative AI workloads efficiently and cost-effectively, Origin NPUs offer native support for large language models (LLMs), including Stable Diffusion. In a recent performance study using Llama-2 7B, the open-source foundational LLM from Meta AI, Origin IP demonstrated performance and accuracy on par with cloud platforms while achieving the energy efficiency necessary for edge and battery-powered applications.

LLMs bring a new level of natural language processing and understanding capabilities, making them versatile tools for enhancing communication, automation, and data analysis tasks. They unlock new capabilities in chatbots, content generation, language translation, sentiment analysis, text summarization, question-answering systems, and personalized recommendations. Due to their large model size and the extensive processing required, most LLM-based applications have been confined to the cloud. However, many OEMs want to reduce reliance on costly, overburdened data centers by deploying LLMs at the edge. Additionally, running LLM-based applications on edge devices improves reliability, reduces latency, and provides a better user experience.
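To give a rough sense of the "large model size" mentioned above, the sketch below estimates how much memory the weights of a 7-billion-parameter model (the size class of the Llama-2 7B model used in the study) would occupy at several numeric precisions. The round parameter count and the precisions listed are illustrative assumptions; the announcement does not state which precision Origin NPUs used, and the estimate ignores activations, caches, and runtime overhead.

```python
# Back-of-the-envelope weight footprint for a "7B" model.
# Assumption: 7e9 is a nominal round figure, not an exact parameter count.
NOMINAL_PARAMS = 7_000_000_000

# Assumption: illustrative precisions only; the announcement names none.
BYTES_PER_WEIGHT = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(params: int, bytes_per_weight: float) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_weight / 1e9


footprints = {
    name: weight_footprint_gb(NOMINAL_PARAMS, size)
    for name, size in BYTES_PER_WEIGHT.items()
}

for name, gigabytes in footprints.items():
    print(f"{name}: {gigabytes:.1f} GB")
```

Under these assumptions the weights alone range from several gigabytes to tens of gigabytes, which suggests why such models have mostly stayed in data centers.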

“Edge AI designs require a careful balance of performance, power consumption, area, and latency,” said Da Chuang, co-founder and CEO of Expedera. “Our architecture enables us to customize an NPU solution for a customer’s use cases, including native support for their specific neural network models such as LLMs. Because of this, Origin IP solutions are extremely power-efficient and almost always outperform competitive or in-house solutions.”

Expedera’s patented packet-based NPU architecture eliminates the memory sharing, security, and area penalty issues that conventional layer-based and tiled AI accelerator engines face. The architecture is scalable to meet performance needs from the smallest edge nodes to smartphones to automobiles. Origin NPUs deliver up to 128 TOPS per core with sustained utilization averaging 80%—compared to the 20-40% industry norm—avoiding dark silicon waste.
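The utilization figures above can be turned into a simple comparison of sustained throughput, taken here as peak throughput multiplied by utilization. This is a minimal sketch: the 128 TOPS peak and the 80% and 20-40% utilization figures come from the announcement, while applying the same 128 TOPS peak to a hypothetical "industry norm" accelerator is an assumption made only so the numbers are comparable. The function name is illustrative.

```python
def effective_tops(peak_tops: float, utilization: float) -> float:
    """Sustained throughput: peak TOPS scaled by the fraction actually used."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_tops * utilization


PEAK_TOPS = 128.0  # "up to 128 TOPS per core", per the announcement

origin = effective_tops(PEAK_TOPS, 0.80)     # 80% average utilization
norm_low = effective_tops(PEAK_TOPS, 0.20)   # low end of the 20-40% norm
norm_high = effective_tops(PEAK_TOPS, 0.40)  # high end of the 20-40% norm

# Assumption: the comparison accelerator has the same peak rating.
print(f"Origin core at 80% utilization: {origin:.1f} TOPS sustained")
print(f"Same peak at 20-40% utilization: {norm_low:.1f} to {norm_high:.1f} TOPS sustained")
print(f"Ratio: {origin / norm_high:.1f}x to {origin / norm_low:.1f}x")
```

On that assumption, the same peak rating delivers two to four times more sustained work at 80% utilization than at 20-40%, which is the sense in which low utilization leaves silicon idle.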

To repurpose or use any of the content or material on this and our sister sites, explicit written permission needs to be sought.

Copyright © 2025 AiThority. All Rights Reserved. Privacy Policy