Agentic Ops Is Coming — Is Your Infrastructure Ready?

Network and infrastructure management has always been deeply personal and manual. Traditionally, network and infrastructure engineers were like esoteric wizards who knew the right incantations to make their infrastructure do the thing that they wanted it to do. They worked by enacting those incantations very directly on the network: logging into devices, knowing protocols and configuration languages by heart.

But over the past decade or so, scale, complexity, and criticality have all compounded. We’ve gone from networks enabling cool new ideas to networks running the world. Networks and the infrastructure supporting them became more important and more complicated at the same time, and the bespoke approach of the 1980s to the 2000s simply won’t work in the age of AI and automation.

As agents proliferate and increasingly become part of day-to-day workflows in network and infrastructure management, three critical factors will dictate successful and safe agentic adoption: (1) structured context to automate from, (2) software that is optimized for both the human and the agent experience (what we like to call “two front doors”), and (3) clear guardrails, audit trails, and validation that are established early in the adoption process.

Structured Context

Automation is impossible without a structured representation of what the network and infrastructure should look like. While visibility into the operational state of your network and infrastructure can be useful, it gains significance when it’s compared to the intended state.

When it comes to automation, it’s awfully hard to automate something you don’t have a plan for. The key word in defining intent is should: “How should my network be behaving today? What topology should it be in? What should be allowed to talk to what?” or “How should my datacenter fabric be cabled in my new site? Which interfaces should this cable terminate at? Which cable run should it be part of?”

However, most network engineers and IT teams know that once you have intent defined, reality immediately diverges from it. The real world and intent are never aligned because reality is noisy and imperfect: backhoes cut fiber, and technicians plug cables into the wrong port. That gap, known as drift, is where risk lives, and it should be the direct driver of automation. When you know “I want my network to look like this, it actually looks like this, and here’s the difference,” what do you automate? You automate away that difference.
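The comparison of intended and observed state can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any product's API: the cabling dictionaries, the `Drift` record, and the `compute_drift` function are hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

MISSING = "<missing>"  # hypothetical sentinel for a key absent on one side


@dataclass(frozen=True)
class Drift:
    """One difference between intended and observed state."""
    key: str
    intended: object
    observed: object


def compute_drift(intended: dict, observed: dict) -> list[Drift]:
    """Return every key whose intended value differs from what was observed."""
    drifts = []
    for key in sorted(set(intended) | set(observed)):
        want = intended.get(key, MISSING)
        have = observed.get(key, MISSING)
        if want != have:
            drifts.append(Drift(key, want, have))
    return drifts


# Hypothetical cabling plan: local interface -> far-end interface.
intended = {
    "leaf1:eth1": "spine1:eth1",
    "leaf1:eth2": "spine2:eth1",
}
# What was actually discovered: one cable landed on the wrong port.
observed = {
    "leaf1:eth1": "spine1:eth1",
    "leaf1:eth2": "spine2:eth2",
}

for d in compute_drift(intended, observed):
    print(f"{d.key}: intended {d.intended}, observed {d.observed}")
```

The list returned by `compute_drift` is the work queue: each entry is a difference that automation, or a technician, needs to close.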

Two Front Doors

When CIOs evaluate new software today, they should ensure it is built with “two front doors”: a “user experience” (UX) optimized for humans and an “agent experience” (AX) optimized for AI agents. The human UX unlocks strategic insight and exploration; this is where human operators will spend their time as agents pick up more of the tactical work. It should include visualizations, AI copilots that allow for natural language investigation, and analysis capabilities that let teams present information in the most impactful and digestible way. Future-proofed AX requires every capability of the software and every element of the underlying data to be addressable by agents via MCP, APIs, skills, and other emerging agentic standards.

Guardrails, Audit Trails and Validation

The important thing about guardrails is that they’re not just protection. They’re what let you move faster. Agents that can autonomously refine their outputs based on self-service validation drive real velocity by “shifting validation left”. In addition, a team that has clear boundaries for what agents can do autonomously vs. what requires human-in-the-loop (HITL) review and approval will delegate more to those agents over time. A team without guardrails will second-guess every automated action, and eventually stop trusting automation altogether.

Not every agent action carries the same risk. Updating a device description is different from modifying a routing policy. As agentic infrastructure management capabilities improve, our comfort with autonomy in lower-risk situations will grow, but we will always need human review and approval workflows when agents are involved in high-impact changes, just as we require reviewers and approvals for high-impact human work.
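One way to encode that distinction is a small policy that classifies a proposed change by the fields it touches. The sketch below is illustrative only: the field names, the `Risk` tiers, and the rule that anything unrecognized counts as high risk are assumptions, not an established standard.

```python
from __future__ import annotations

from enum import Enum


class Risk(Enum):
    LOW = "low"    # agent may apply autonomously
    HIGH = "high"  # requires human review and approval


# Hypothetical allowlist of fields considered safe for autonomous edits.
LOW_RISK_FIELDS = {"description", "tags", "comments"}


def classify(changed_fields: set[str]) -> Risk:
    """Fail closed: a change is LOW only if every field is on the allowlist."""
    if changed_fields and changed_fields <= LOW_RISK_FIELDS:
        return Risk.LOW
    return Risk.HIGH


def requires_human_approval(changed_fields: set[str]) -> bool:
    return classify(changed_fields) is Risk.HIGH


print(requires_human_approval({"description"}))
print(requires_human_approval({"routing_policy"}))
```

An allowlist means a new or unknown field defaults to human review, so the boundary widens only when a team decides to widen it.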

In practice, infrastructure changes proposed by agents should land in an isolated workspace, not directly in production. Change management workflows should do this through branching: proposed changes are isolated, reviewed against defined policies, and validated by automated checkers and custom validators that enforce organizational rules, such as IP addressing constraints or naming conventions, before merge. This way agents can self-validate and humans can approve the high-risk changes that matter.
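A pre-merge validation step of this kind might look like the following sketch. It uses only the Python standard library; the naming pattern, the allowed prefix, and the shape of a "change" dictionary are hypothetical and stand in for whatever rules an organization actually enforces.

```python
from __future__ import annotations

import ipaddress
import re

# Hypothetical organizational rules.
ALLOWED_PREFIX = ipaddress.ip_network("10.0.0.0/8")
NAME_PATTERN = re.compile(r"[a-z]{3}\d{2}-(leaf|spine)\d{2}")


def validate_name(change: dict) -> list[str]:
    """Enforce the naming convention, e.g. 'nyc01-leaf03'."""
    name = str(change.get("name", ""))
    if NAME_PATTERN.fullmatch(name):
        return []
    return [f"name {name!r} violates the naming convention"]


def validate_address(change: dict) -> list[str]:
    """Enforce that the address parses and sits inside the allowed prefix."""
    raw = str(change.get("address", ""))
    try:
        iface = ipaddress.ip_interface(raw)
    except ValueError:
        return [f"address {raw!r} is not a valid IP interface"]
    if iface.ip not in ALLOWED_PREFIX:
        return [f"address {raw!r} is outside {ALLOWED_PREFIX}"]
    return []


VALIDATORS = [validate_name, validate_address]


def validate_branch(changes: list[dict]) -> list[str]:
    """Run every validator over every proposed change; empty list means mergeable."""
    errors = []
    for change in changes:
        for validator in VALIDATORS:
            errors.extend(validator(change))
    return errors


proposed = [
    {"name": "nyc01-leaf03", "address": "10.1.2.3/24"},
    {"name": "Leaf_3", "address": "192.168.1.1/24"},
]
for error in validate_branch(proposed):
    print(error)
```

Because the validators return plain error messages, an agent can run them itself, read the failures, and revise its proposal before a human ever sees the branch.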

Audit Trails

When an agent changes infrastructure, there needs to be a permanent, attributable record of what changed, when, which agent made the change, and in what context. If something breaks at 2 AM and the on-call engineer needs to reconstruct what happened, the audit trail is the first thing they reach for.

This matters even more as agents get faster and more autonomous. The volume of changes will outpace anyone’s ability to watch in real time. The audit trail is what gives human teams the ability to stay in control without being in the loop on every action. Agents move at machine speed. Humans review at human speed. The audit trail bridges the gap.

Look for built-in change management workflows, along with built-in changelogs and audit logging, to provide traceability when agents drive change in your infrastructure intent. Branch changes should be preserved after merge so the history of proposed changes is retained. Reverts should be logged explicitly. A changelog entry tells you what changed. A change request tells you why, who approved it, and what else was part of the same batch.
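The split between a changelog entry and a change request can be made concrete with two small record types. This is a sketch under assumptions: the class names, fields, and in-memory `AuditLog` are hypothetical, and a real system would persist entries in durable, tamper-resistant storage.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRequest:
    """The 'why': reason and approver for a batch of changes."""
    request_id: str
    reason: str
    approved_by: str


@dataclass(frozen=True)
class ChangelogEntry:
    """The 'what': one attributable change to one object."""
    object_id: str
    before: dict
    after: dict
    actor: str                 # which agent or human made the change
    change_request_id: str     # links the entry to its batch
    action: str = "update"
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class AuditLog:
    """Append-only log: entries are added, never edited or deleted."""

    def __init__(self) -> None:
        self._entries: list[ChangelogEntry] = []

    def record(self, entry: ChangelogEntry) -> None:
        self._entries.append(entry)

    def revert(self, entry: ChangelogEntry, actor: str) -> ChangelogEntry:
        """Log a revert as its own entry instead of erasing the original."""
        reverted = ChangelogEntry(
            object_id=entry.object_id,
            before=entry.after,
            after=entry.before,
            actor=actor,
            change_request_id=entry.change_request_id,
            action="revert",
        )
        self.record(reverted)
        return reverted

    def batch(self, request_id: str) -> list[ChangelogEntry]:
        """Everything that was part of the same change request."""
        return [e for e in self._entries if e.change_request_id == request_id]


request = ChangeRequest("CR-1", "Raise MTU on fabric links", "alice")
log = AuditLog()
entry = ChangelogEntry("leaf1:eth1", {"mtu": 1500}, {"mtu": 9000},
                       actor="agent-7", change_request_id=request.request_id)
log.record(entry)
log.revert(entry, actor="bob")
print([e.action for e in log.batch("CR-1")])
```

Keeping the revert as a second entry means the on-call engineer sees both the original change and its reversal, in order, with an actor attached to each.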

Lastly, validation of the operational infrastructure is a critical safety valve for ensuring the running infrastructure remains configured in line with intent. Without it, neither guardrails nor audit trails confirm that the real infrastructure actually matches the intended state.

In an agentic world, the rate of change in infrastructure will outpace the ability of human teams to audit operational configuration for alignment with intent. An agent that provisions 200 devices in an afternoon generates more changes than a team could review in a week. It is increasingly critical that teams get automated assurance tooling in place to quickly find and fix operational drift.

This is where agents themselves become part of the solution. When validation is automated and connected to the infrastructure model, agents can detect their own drift, flag their own mistakes, and trigger correction workflows without waiting for a human to notice. That’s the real unlock. Not agents that execute blindly, but agents that validate continuously. This validation encourages teams to accelerate agentic AI adoption and move more quickly with confidence.
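A continuous validation loop of this kind can be sketched as follows. The function and callback names (`reconcile`, `read_observed`, `apply_fix`) and the retry limit are assumptions made for illustration; the point is the shape of the loop: validate, correct, re-validate, and hand off to a human when correction does not converge.

```python
from __future__ import annotations

from typing import Callable


def reconcile(
    intended: dict,
    read_observed: Callable[[], dict],
    apply_fix: Callable[[dict], None],
    max_attempts: int = 3,
) -> str:
    """Validate observed state against intent and correct drift.

    Returns "in_sync" once no drift remains, or "escalate_to_human"
    if drift persists after max_attempts corrections.
    """
    for _ in range(max_attempts):
        observed = read_observed()
        drift = {k: v for k, v in intended.items() if observed.get(k) != v}
        if not drift:
            return "in_sync"
        apply_fix(drift)  # push only the differing settings

    # Final check after the last correction attempt.
    observed = read_observed()
    if all(observed.get(k) == v for k, v in intended.items()):
        return "in_sync"
    return "escalate_to_human"


# Hypothetical device state held in memory for the example.
device = {"mtu": 1500}
status = reconcile({"mtu": 9000}, lambda: dict(device), device.update)
print(status, device)
```

Bounding the number of attempts matters: an agent that cannot close the gap should stop and surface the problem, not keep pushing changes.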

The Window Is Now

Every team that has started using AI agents in software development knows this pattern. The tools get powerful fast. The organizations that adopted CI/CD and automated testing early didn’t just avoid failures. They shipped faster than everyone else. The same dynamic is playing out in infrastructure.

Every company should be building for both sides of this. The agent experience, giving automated systems a complete infrastructure model to reason against, guardrails to operate within, and validation loops for self-correction. And the human experience, giving infrastructure teams the visibility, control, and confidence to let agents do more, not less.

The teams that put guardrails, audit trails, and validation in place now won’t just avoid the next outage. They’ll be the ones running truly agentic infrastructure operations while everyone else is still debating whether it’s safe to start.

About The Author Of This Article

Kris Beevers is the co-founder and CEO of NetBox Labs, where he is building the future of network automation and infrastructure management. An engineer by background, Kris has spent over 15 years helping organizations design, deploy, and scale some of the world’s most demanding network and infrastructure environments – from AI datacenters and neoclouds to global enterprises and industrial OT systems. He is also the former co-founder and CEO of NS1, the managed DNS provider acquired by IBM in 2023. Deeply rooted in open-source and community-driven innovation, Kris remains actively engaged with the NetBox community and is passionate about empowering engineers through practical, scalable infrastructure automation, believing that the best technology is built not just for users, but with them.

 

About NetBox Labs

NetBox Labs is making networks easier than ever to operate, observe, automate, and secure.
