FriendliAI Announces New System to Hike Up the Serving Efficiency of Large-scale AI Models at OSDI 2022

By AIT News Desk On Jul 13, 2022

The New System ‘Orca’ reduces the cost of using large-scale generative models, making them affordable for a wide range of users

At OSDI 2022, FriendliAI unveiled ‘Orca’, a serving system that dramatically improves the serving efficiency of large-scale generative models. FriendliAI is a startup that provides PeriFlow, a platform for developing large-scale AI models.

Orca is a serving system that enables the efficient operation of large-scale AI models. It removes the delays that make existing serving systems inefficient by using two core techniques: ‘iteration-level scheduling’ and ‘selective batching.’

To see the idea, imagine a group of friends who would like to ride a four-seat tandem bicycle along the Hudson River. Some want to bike for only 10 minutes, whereas others want a full hour. Under existing serving systems, once a ride has begun, all riders are forced to keep biking until the rider with the longest target time is satisfied, and no other friends can join until that group has finished and returned to the starting point.

Orca solves this problem with a ‘shuttle run’ system in which the group returns to the starting point every 10 minutes, so riders can hop off individually soon after they meet their goals, and latecomers can join the ride without a long wait. This shuttle run system corresponds to iteration-level scheduling. Orca's second technique, selective batching, groups riders who originally could not be grouped before the biking starts.
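The bicycle analogy can be turned into a small simulation. The Python sketch below is only a toy model and not Orca's actual implementation: the `Request` class, the function names `request_level` and `iteration_level`, the rider list, and the idea of counting time in 10-minute "laps" (one lap standing in for one model iteration) are all assumptions made for illustration. It compares a scheduler that holds the whole group until the longest ride ends with one that lets riders leave and join after every lap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    arrival: int  # lap at which the rider shows up (hypothetical unit)
    length: int   # number of laps the rider wants to ride


def request_level(requests, seats=4):
    """Existing-style batching: a group rides until its longest member is done."""
    pending = sorted(requests, key=lambda r: r.arrival)
    finish, clock = {}, 0
    while pending:
        clock = max(clock, pending[0].arrival)
        batch = [r for r in pending if r.arrival <= clock][:seats]
        pending = [r for r in pending if r not in batch]
        clock += max(r.length for r in batch)  # everyone waits for the longest ride
        for r in batch:
            finish[r.name] = clock
    return finish


def iteration_level(requests, seats=4):
    """Shuttle-run style: riders may leave or join after every single lap."""
    pending = sorted(requests, key=lambda r: r.arrival)
    running, finish, clock = {}, {}, 0
    while pending or running:
        # Fill free seats with riders who have already arrived.
        while pending and pending[0].arrival <= clock and len(running) < seats:
            r = pending.pop(0)
            running[r.name] = r.length
        if not running:  # bike is empty: skip ahead to the next arrival
            clock = pending[0].arrival
            continue
        clock += 1  # ride one lap with the current group
        for name in list(running):
            running[name] -= 1
            if running[name] == 0:  # this rider is satisfied and hops off
                finish[name] = clock
                del running[name]
    return finish


# Illustrative riders: two short rides, two long rides, one latecomer.
riders = [
    Request("A", 0, 1), Request("B", 0, 1),
    Request("C", 0, 6), Request("D", 0, 6),
    Request("E", 1, 1),
]

if __name__ == "__main__":
    print("request-level finish laps:  ", request_level(riders))
    print("iteration-level finish laps:", iteration_level(riders))
```

With these made-up riders, the short-ride friends A and B finish after 1 lap under the shuttle-run scheduler instead of 6, and latecomer E finishes at lap 2 instead of lap 7, while the long-ride friends C and D still finish at lap 6.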

With Orca, large-scale models such as GPT-3 175B can perform their generative tasks tens of times faster than with existing serving systems. In addition, the cost of using large-scale models like GPT-3 is a hundredth smaller with Orca. The challenges of using large-scale models fade away with the new serving system. Orca can serve these models to a much wider range of users through a new level of accessibility.

The research on Orca, “Orca: A Distributed Serving System for Transformer-Based Generative Models”, was presented on July 12th at OSDI 2022 (16th USENIX Symposium on Operating Systems Design and Implementation), a top-tier conference in the field of computer systems. Orca is already being used in production.

“Not only is it important to acquire data to improve model learning, but maximizing the efficiency of the serving system itself allows users to make use of large-scale generative models like OpenAI’s GPT-3,” said Byung-Gon Chun, the CEO of FriendliAI. “I expect this research will increase opportunities to use large-scale models on a variety of products.”

[To share your insights with us, please write to sghosh@martechseries.com]

To repurpose or use any of the content or material on this and our sister sites, explicit written permission needs to be sought.

Copyright © 2025 AiThority. All Rights Reserved. Privacy Policy
