Who's Managing Your AI Agents?

Building takes eight minutes. Managing takes forever.

Oct 09, 2025

Near future, Tuesday, 8:47 AM

I arrive at my desk with a cold flat white. My job title is “AI Operations Coordinator.” My actual job? HR for bots.

This position didn’t exist eight months ago. Then Marketing discovered OpenAI’s AgentKit. One afternoon, five people built 89 AI agents. Nobody documented what they did. Nobody assigned owners. Nobody thought about what happens when you have 89 autonomous software employees running loose.

So the board created Agent Resources. Team: Jamie, Dev, Marcus, and me.

I open the Agent Roster. 312 active agents. Seventeen flagged for performance review. Four pending termination. Agent-A47 is down. Again.

Not a speculation. Last month I ran a GenAI workshop for Singularity University. 30 C-level executives from Poland’s largest corporations. No IT backgrounds. In two hours, they built more than 50 working prototypes. Awesome. But also... Zero documentation. Zero owners. Zero plan for who maintains them.

Building agents takes eight minutes. Managing them takes forever.

McKinsey buried a warning in their report: “The real challenge [of agents] lies in coordination, judgment, and trust. [And] how [organisations] prevent unchecked sprawl as agent creation becomes increasingly democratised.”

Unchecked sprawl.

Bain is more direct: “Current architectures simply cannot handle this balance when AI agents are used in the thousands across the enterprise—yet.”

Gartner predicts 40% of enterprise applications will have embedded agents by the end of 2026, up from less than 5% today.

Unchecked sprawl.

The easier agents are to build, the harder they are to manage. Everyone celebrates the eight-minute build time. Nobody budgets for the infinite maintenance costs.

9:15 AM

Morning standup. Jamie handles onboarding (making sure new agents don’t immediately cause disasters). Dev does retraining (teaching agents new tricks when business logic changes). Marcus does compliance (which agents can see customer data).

“Agent-B22 DMed our CEO’s mother-in-law on LinkedIn,” Jamie says.

“On purpose?”

“Marketing built it for ‘social outreach.’ Gave it write access to the company LinkedIn. Forgot to specify which customers to contact.”

Priority: high. Category: embarrassing.

Not a speculation. In 2024, a Canadian tribunal ordered Air Canada to pay damages after its chatbot promised a customer a retroactive bereavement discount that the airline’s policy didn’t allow. The agent had access to customer service, but nobody had trained it on actual company policies.

9:47 AM

Sales managers want a meeting. Agent-S09 and Agent-S14 are both emailing the same leads. Same customers, contradictory pricing. The agents are having a territorial dispute.

“Why do you have two agents doing the same job?”

Silence.

“S09 handles enterprise. S14 handles mid-market. We just never defined what ‘enterprise’ means.”

I resist the urge to scream. “Define the boundary by noon, or I’m deleting one of them.”

Not a speculation. In 2011, two Amazon booksellers both used pricing algorithms to list the same out-of-print biology textbook. One algorithm priced slightly below the competitor. The other priced 27% above. Neither had a sanity check. Within days, the price spiraled from $106 to $23,698,655.93 (+$3.99 shipping). The algorithms were having a pricing war with each other. Nobody noticed for weeks.
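The arithmetic behind that spiral is easy to reproduce. What follows is a minimal sketch, not the sellers' actual code: the 27% markup and the $106 starting price come from the story above, while the 0.998 undercut factor, the function name, and the max_price ceiling are assumptions made for illustration.

```python
def simulate_price_war(start_price, undercut=0.998, markup=1.27,
                       max_price=None, rounds=60):
    """Return seller B's price after each repricing round.

    undercut: seller A lists at this fraction of B's price (assumed value).
    markup:   seller B lists at this multiple of A's price (27% above).
    max_price: optional sanity check; repricing stops before exceeding it.
    """
    price_b = start_price
    history = []
    for _ in range(rounds):
        price_a = price_b * undercut   # A prices slightly below B
        next_b = price_a * markup      # B prices 27% above A
        if max_price is not None and next_b > max_price:
            break                      # sanity check: refuse the absurd price
        price_b = next_b
        history.append(price_b)
    return history


unchecked = simulate_price_war(106.0)
capped = simulate_price_war(106.0, max_price=500.0)

print(f"No sanity check, after {len(unchecked)} rounds: ${unchecked[-1]:,.2f}")
print(f"With a $500 ceiling, repricing stops after {len(capped)} rounds")
```

Each round multiplies the price by roughly 1.27, so growth is exponential. A single ceiling check per round is enough to stop it.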

10:30 AM

Agent-A47 has a 23% false positive rate on fraud detection. It’s flagging legitimate refunds as suspicious because it was trained during Q1’s fraud spike and nobody ever retrained it.

Dev can fix it in three days if Customer Success signs off. They won’t respond to my emails.

Note to self: Chase CS team. Again.

Not a speculation. In 2021, Zillow’s home-buying algorithm lost $881 million. The AI was trained on pandemic-era prices, then the market shifted. Nobody retrained it. The algorithm kept buying houses at inflated prices while trying to compete with rivals. By the time executives noticed, Zillow had overpaid for thousands of homes it couldn’t resell at a profit. The company shut down the entire Zillow Offers division and cut 2,000 jobs.

11:15 AM

Legal wants an audit of every agent with access to customer C4242. Finance wants to know why Agent-D12 costs $8,400/month. Product thinks it analyses user feedback, but they’re “not entirely sure.”

It was built by an intern who left in June. It’s still running. Maybe it does some work for the intern. Nobody knows. Nobody has admin access.

This is my life now.

Not a speculation. In 2020, an ex-Google engineer deployed a web scraper with a $7 billing budget. Woke up to a $72,000 bill. His recursive function triggered 116 billion database reads. Even he didn’t know that billing budgets were just notifications, not caps. Worse: billing took 24 hours to sync, so nobody noticed until it was too late. Google waived it as a “one-time gesture.”
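The difference between a notification and a cap fits in a few lines. This is a hypothetical sketch, not any cloud provider's API: SpendGuard, BudgetExceeded, and the hard_cap flag are names invented here to show the two behaviours side by side.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a hard cap would be exceeded."""


class SpendGuard:
    """Tracks spend against a limit (hypothetical helper for illustration)."""

    def __init__(self, limit_usd, hard_cap):
        self.limit_usd = limit_usd
        self.hard_cap = hard_cap
        self.spent = 0.0
        self.alerts = []

    def record(self, cost_usd):
        would_be = self.spent + cost_usd
        if would_be > self.limit_usd and self.hard_cap:
            # A cap refuses the charge, so the work never runs.
            raise BudgetExceeded(f"${would_be:.2f} exceeds ${self.limit_usd:.2f}")
        self.spent = would_be
        if self.spent > self.limit_usd:
            # A notification only records that the limit was passed.
            self.alerts.append(f"Over budget: ${self.spent:.2f}")


alert_only = SpendGuard(limit_usd=7.0, hard_cap=False)
alert_only.record(5.0)
alert_only.record(5.0)          # goes through; an alert is logged

capped = SpendGuard(limit_usd=7.0, hard_cap=True)
capped.record(5.0)
try:
    capped.record(5.0)          # refused before any money is spent
    blocked = False
except BudgetExceeded:
    blocked = True

print(alert_only.spent, alert_only.alerts, capped.spent, blocked)
```

The alert-only guard keeps spending and just logs a message; the capped guard stops at $5. An agent running overnight needs the second kind.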

2:00 PM

Meeting with the VP. My pitch: Shut down 34 agents immediately (orphaned, broken, unused). Consolidate another 20. Save $12,000/month.

91 agents in red or grey status. Most haven’t been accessed in 90 days, but still cost us $4,000/month.

“What’s the risk?”

“Minimal. Most have no owner. If we shut them down and someone screams, we turn them back on. But I don’t think anyone will notice.”

He approves it. Then I pitch the bigger idea: No new agents without approval from Agent Resources.

Right now, anyone can spin up an agent in eight minutes. No documentation. No owner. No plan. In six months, we’ll have 500 agents and total chaos.

New rule: Fill out a form. Job description. Success metrics. Data access. Owner. Approval required.
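As a rough sketch, that form could be as small as one record type with a completeness check. The field names mirror the rule above; the AgentRequest class, the can_deploy function, and the choice to allow an empty data-access list are assumptions for illustration, not a description of any real tool.

```python
from dataclasses import dataclass


@dataclass
class AgentRequest:
    """One-page registration form for a new agent (hypothetical schema)."""
    name: str
    job_description: str
    success_metrics: list[str]
    data_access: list[str]      # may be empty: the agent touches no data
    owner: str
    approved_by: str = ""       # filled in by Agent Resources

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left blank."""
        missing = []
        if not self.job_description.strip():
            missing.append("job_description")
        if not self.success_metrics:
            missing.append("success_metrics")
        if not self.owner.strip():
            missing.append("owner")
        return missing


def can_deploy(request: AgentRequest) -> bool:
    """An agent ships only with a complete form and a named approver."""
    return not request.missing_fields() and bool(request.approved_by.strip())


orphan = AgentRequest(
    name="Agent-D12",
    job_description="",
    success_metrics=[],
    data_access=["user feedback"],
    owner="",
)
print(orphan.missing_fields())
print(can_deploy(orphan))
```

An agent like D12 would fail this check on three counts before it ever ran.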

He grimaces but agrees. “Draft the policy.”

Not a speculation. On average, large enterprises think they use 37 apps. They actually use 625, including more than 170 AI apps. Shadow IT consumes 30-40% of IT budgets—apps nobody approved, nobody tracked, and in many cases, nobody even knew existed.

3:45 PM

I’m updating the roster when Marcus stops by.

“You ever think about how weird this is? We’re doing HR for software. Performance reviews. Layoffs. Compliance audits.”

I lean back. “You know what’s weirder? In a year, we’ll have 1,000 agents. Agent Resources will be bigger than actual HR.”

“You think other companies are doing this?”

“Not yet. But they will.”

A speculation.

Here’s what might happen at your company:

Sales will build agents. Marketing will build agents. Customer success, product, and finance: everyone will build agents. Because it takes eight minutes and requires no approval.

In six months, you’ll have 500 agents. No org chart. Three contradict each other. Twelve are redundant. Eight are broken, and nobody has noticed.

Your CTO will walk into a board meeting and admit: “We have 500 agents and no idea what most of them do.”

What can you do to avoid agent sprawl?

Appoint an owner. Someone whose job is agent governance. Make it someone’s problem before it’s everyone’s crisis.

Require approval for new agents. One-page use case description: problem, metrics, and owner. Keep it simple; you don’t want to stifle innovation. The rule: “If you can’t write it, don’t build it.”

Budget for agent management. Agents aren’t “set and forget.” They need retraining, updates, and governance. They’re like digital minions: always there, happy to help, but you look away for a bit and disasters start to happen.

In six months, ‘AI Operations Coordinator’ will be a real job posting at half the companies reading this. The other half already hired someone; they just called it something else.
