AiThority Interview with Glenn Jocher, Founder & CEO, Ultralytics
Glenn Jocher, Founder & CEO of Ultralytics, chats about the importance of democratizing AI in this AiThority interview:
__________
Hi Glenn, tell us about your journey in tech…
My journey began in particle physics, seeking to understand the universe’s fundamental building blocks. That work taught me how to approach complex problems systematically, but I eventually found myself drawn to a different kind of challenge: making powerful technology usable in the real world.
I moved into machine learning and computer vision because I saw a gap between what was possible in research labs and what people could actually deploy. The tools were fragmented, the barriers to entry were high, and too much incredible work stayed locked behind academic papers or proprietary systems. That frustration led me to found Ultralytics and build the YOLO ecosystem: tearing down the walls around vision AI and making it accessible to anyone with the drive to use it. YOLO is our accessible computer vision system that brings the latest object detection technology from the lab into the real world.
What started as an open-source project became a global movement with millions of developers, 125,000 GitHub stars, and 2.5 billion daily model inferences across robotics, healthcare, manufacturing, and beyond. Throughout this journey, I’ve focused on removing friction, making models faster, easier to deploy, and ready for real-world constraints. AI shouldn’t require permission or massive budgets to use. It should just work.
We’d love to know more about Ultralytics and your latest release, YOLO26?
Ultralytics is the global leader in open-source vision AI. Our mission has always been to democratize AI by making it accessible and performant anywhere, from cloud to edge.
YOLO26 is the most advanced and deployable YOLO model we’ve ever built. We designed it from the ground up for edge and low-power devices, the environments where vision AI runs in the real world. The breakthrough is our end-to-end, NMS-free architecture, which eliminates the post-processing complexity that has held back production deployment. No more fragile cleanup steps or platform-specific workarounds. The model outputs final predictions directly.
What does that mean practically? Up to 43% faster CPU inference compared to YOLO11. Better small-object detection through Progressive Loss Balancing and STAL. Dramatically simplified export across hardware platforms. YOLO26 removes the barriers between research and production; it’s faster to integrate, more reliable to deploy, and built for constrained environments like robotics, embedded systems, and edge accelerators.
This is vision AI designed for where it needs to perform, not only where it’s easy to benchmark. We’ve removed the friction that traditionally slows teams down, so they can move from experimentation to real-world impact faster.
What are some of the key complexities that have often held back edge deployment over the years?
The biggest barrier has been that traditional object detection pipelines were designed for cloud GPUs, not real-world edge constraints. These systems rely on heavy post-processing steps, particularly Non-Maximum Suppression, that run outside the neural network and don’t map well to edge hardware. NMS adds latency, requires custom implementations for every runtime, and introduces fragility. What works in development often breaks or behaves differently in production.
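To make the post-processing problem concrete, here is a minimal pure-Python sketch of the greedy Non-Maximum Suppression step that traditional detection pipelines run outside the network, and that end-to-end, NMS-free models are designed to eliminate. The boxes, scores, and threshold are illustrative values, not output from any real model.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping rivals."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard every remaining box that overlaps the kept one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# Two near-duplicate detections of one object, plus one distinct detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]: the duplicate (index 1) is suppressed
```

Even this tiny version shows the issue: it is sequential, data-dependent control flow that sits outside the neural network graph, which is why it must be reimplemented per runtime and maps poorly onto edge accelerators. An NMS-free model moves duplicate suppression inside the network so the exported graph is the entire pipeline.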
Edge devices operate under constraints most research ignores: limited memory, tight power budgets, no active cooling, and heterogeneous hardware mixing CPUs with specialized accelerators. Models optimized for cloud GPUs perform poorly in these environments. Post-processing that’s negligible on a server becomes a bottleneck on a robot or embedded camera.
There is also an integration problem. Exporting models for edge deployment has historically meant maintaining separate codebases for different platforms, dealing with inconsistent behavior across runtimes, and writing glue code to bridge the gap between what the model outputs and what production systems need. Each deployment target becomes its own project.
YOLO26 eliminates these complexities by solving the root problem. It removes NMS entirely: the model performs end-to-end inference, is optimized for CPU-first performance from day one, and produces final predictions directly. No external cleanup. No platform-specific workarounds. What you export is what you run. That’s how you deploy at scale.
How can modern tech and IT teams deploy edge AI faster and with scale?
Focus on models designed for predictable deployment. When what you test in development matches what runs in production, you eliminate the integration headaches that slow teams down.
Optimize for real hardware such as CPUs and edge accelerators, not only cloud GPUs. Unify your workflow with platforms that handle training through deployment in a single framework, so teams aren’t constantly switching tools or writing custom integration code.
Can you talk about the various ways in which AI and tech innovators should simplify and democratize AI for the greater good?
It starts with open source as a real commitment, not marketing. When models and techniques are freely available, anyone can build production systems without gatekeeping or prohibitive costs. But accessibility also requires simplicity, which means having tools that work out of the box, without requiring specialized expertise or weeks of custom code.
Edge-first design matters because the most impactful applications run where connectivity is limited or latency is critical: factory floors, agricultural fields, remote clinics. If models only run efficiently on expensive cloud infrastructure, you’re excluding most real-world use cases.
Our view at Ultralytics is AI should serve people, not the other way around. That means building technology that works in production, not purely in controlled research environments. Background and budget shouldn’t determine who gets to innovate.
Five thoughts you’d leave our SaaS and tech readers with before we wrap up?
- Ship faster by removing complexity, not adding abstraction. The teams that win are the ones who keep it simple and eliminate unnecessary steps rather than build elaborate systems to manage them.
- Optimize for where your users actually are. Edge, mobile, constrained environments: that’s where most real-world value gets created. Cloud-only thinking leaves impact on the table.
- Build in the open. A global community will stress-test your work in environments you’d never access alone and solve problems you haven’t thought of yet. No internal team can match that collective intelligence.
- Design for production from day one. The gap between “it works in the notebook” and “it works in the field” is where most projects die. Close that gap early.
- Stay close to the problem. The best technology decisions come from deeply understanding real constraints (power budgets, latency requirements, integration complexity), not from chasing benchmarks that don’t reflect reality.
Ultralytics is the company behind YOLO (You Only Look Once), the world’s most widely adopted open-source computer vision models.
Glenn Jocher is the Founder and CEO of Ultralytics, the company behind YOLO (You Only Look Once), the world’s most widely adopted open-source computer vision models. On a mission to democratize vision AI, Glenn has spent over 20 years spanning machine learning, geospatial intelligence, and vision systems. He has dedicated his career to making AI accessible, efficient, and easy to use.