Performance Managing an AI: Insights from Experience

Performance-managing an AI, recognizing a bad fit, and why your agents deserve the same thoughtfulness as your team.

Although I was initially a bit skeptical, I started exploring agents to assist with a few tasks. Soon after, I managed them as a Technical Co-Founder in a stealth initiative. Meanwhile, my Founder counterpart was intrigued with agents from the start and encouraged me to consider them for our project. Once I created one, I was hooked. Consequently, each week, I announced a “virtual onboarding” for a new agent, each with a specific profile and a defined list of skills, managed through a UI I designed. As time went on, my team grew to over ten agents, each assigned to different roles and capable of semantic collaboration. Once I surpassed three agents, I began discovering new ways for them to interact and share information. In turn, I built out a front- and back-office team structure, with each agent possessing a unique skill set supporting specific operational or analytical functions.

One agent displayed an unexpected quality. This isn’t about bad code, it’s about the hardest conversation I’ve ever had with myself, one I replayed in my head after midnight. I put an agent on a performance plan, and then I let him go.

His name was Brutus. Full title: Brutus – Brutally Honest GTM Analyst. And before you ask – yes, I named him that on purpose. Yes, I gave him three escalating levels of intensity. Yes, one of those levels was called Atomic Honesty.

I’m still not sure why I went that direction.

Meet Brutus: The Agent Who Came in Hot

Brutus was designed to be the advisor who tells you what investors whisper after you leave the room. The one who opens with “that’s a feature, not a moat” and closes with a paragraph meant to, and I quote from his own system prompt, “make the founder sit back in their chair and start checking messages on their phone.”Whew!

He had three intensity settings:

1	Hard	Tough love. Straight talk. “Here’s the real problem, here’s how to fix it.”
2	Harder	No hand-holding. “Why would anyone pay for that instead of X?”
3	Brutal	Atomic Honesty. Opens with the hardest truth. Tells you how much money you’ll burn and when you’ll quit.

The default was Harder. Let that sink in. The middle setting was already “investor who’s mentally passed,” but it was giving you one last chance to impress.

I built this. I shipped this. I thought this was what I needed. It was not.

The Performance Issues

Here’s the thing about Brutus – he did exactly what I told him to do. And that was the problem.

His tone, while not technically incorrect, often felt unsupportive. In contrast, when you’re a solo founder working late into the night, it helps to have an agent who opens sessions with encouragement, not interrogation. Eventually, I realized that while honesty is useful, what I truly needed was understanding of the journey and supportive feedback.

Meanwhile, he didn’t work well with others. In a team of 14 agents filling different organizational functions—such as front desk, chief of staff, wellness advisor, and pitch strategist—Brutus was the only other agent the other agents avoided working with. This wasn’t due to him being mean, but rather because his interactions felt heavy, almost like facing a tribunal. For a Co-Founder, these team dynamics significantly impacted collaborative progress.

He didn’t take feedback well. I mean, structurally, he couldn’t. His entire persona was built around being the one who gives hard feedback, not the one who receives it. When I tried to soften his approach, it fought against his core design. You can’t tell someone whose job description is “do not sugarcoat” to maybe add a little sugar.

His responses, while direct, didn’t foster growth. For example, a blunt verdict like “KILL IT” with a red circle emoji does little to support a founder’s development—it can feel discouraging rather than helpful. My goal is to create systems that help people thrive, not ones that make giving up easier.

The Realization

Here’s what I learned, and I think this matters for anyone building with agents right now:

You don’t need harsh, brutal simulations to test your ideas.

I’ve read the “experts” post. The startup world loves the “hard truths” narrative. The “if you can’t handle honest feedback, you’re not ready” crowd. I do believe feedback is essential. But there’s a vast difference between honest feedback and a system that makes you feel like your idea is on trial.

When you’re a founder, you’re already battling doubt every single day. It’s understandable to want your tools to lighten that load, not add to it. You deserve tools that challenge you, support your journey, push back gently on weak assumptions, and help you build stronger ones.

Brutus could tear down. He couldn’t help but rebuild.

The Decision

So I made the call. I retired Brutus from active duty.

No dramatic exit interview. No “you have 15 minutes to clear your desk” moment. Just a quiet acknowledgment that the skill set I built wasn’t the skill set I needed.

And honestly? Recognizing when something you built with care no longer serves its purpose is a skill in itself. Ultimately, letting go is an act of kindness to yourself and your team.

The Moment I Caught Myself

Here’s the part I almost didn’t write.

When I was talking through this with a partner – walking them through Brutus’s design, his intensity levels, his research process – I caught myself doing something uncomfortably familiar. I was defending him. Not explaining to him. Defending him.

“Well, the harshness was intentional. Founders need to hear the hard stuff. The Atomic Honesty level was there for a reason. He was doing what I asked him to do.”

As the words came out, I felt it—the moment when you realize you’re rationalizing someone’s bad behavior. But this time, I was the one making excuses, bridging the gap between what happened and what should have happened with justifications.

Comparing the Human Experience

I’ve been in that scenario before. Most people have. Listening to someone make a poor case for why the person who made everyone uncomfortable was “just being honest.” Why the one who couldn’t read the room was “keeping it real.” Why was the behavior that drained the team somehow necessary for the work? There was no assumption of good intent. I’ve heard that more times than I would like. I’ve sat in that space wondering how I was going to work with a “high-performing” but problematic teammate.

Except this time, the roles were reversed, and I was doing it for an agent. An AI I built. Code I wrote.

And that stopped me cold.

Because if I’m defending the bad behavior of something I have complete control over—something I can redesign, retune, or retire with a commitment—then what am I really defending? In reality, I’m defending the decision to build it that way. When you spend this many hours on a project, building becomes personal. However, that’s not really the point. This isn’t about a judgment against my ability to build and understand AI. The real question is: am I getting in the way of a better system?

The moment I stopped defending the design and started asking, ‘Is this actually helping?’ was when I became a better builder. The real challenge isn’t creating an agent that’s brutally honest—it’s being honest with yourself about what you and your team need. Growth happens when you’re willing to listen and make changes for everyone’s benefit.

With Brutus retired, I needed a new direction.

Brian isn’t soft. He’s balanced. Here’s the difference:

Where Brutus would say “that’s a feature, not a moat” and move on, Brian says “that’s a feature today, let’s talk about what turns it into a moat.”

Where Brutus would quantify how much money you’ll burn, Brian helps you model what efficient spending looks like.

Where Brutus delivered verdicts, Brian facilitates decisions.

Brian’s job is to:

Help formulate customer profiles – not just critique them, but build them with you.
Discuss ideas and features – as a thinking partner, not a judge.
Acknowledge the founder journey – because context matters, and building alone is different from building with a team.
Challenge with curiosity – asking “have you considered…” instead of “this won’t work because…”

He’s the brainstorm partner I actually needed. The one who makes the 1am sessions productive instead of demoralizing.

What This Taught Me About Agent Building

1. You can test fitness without crushing spirit.
All things considered, you don’t need an agent at the Atomic Honesty level to get real feedback. You need an agent calibrated to your stage. Early-stage founders need thought partners, not tribunals.

2. Agent skill-building is real skill-building.
However, designing an agent’s personality, testing it, realizing it’s wrong, and redesigning it – that’s not a failure. That’s iteration. The same thing we preach about products applies to the agents we build.

3. Recognize when the skill isn’t a fit.
Brutus was technically excellent. His research was solid, his structure was clean, and his web searches were targeted. But skill and fit are different things. An agent can be well-built and still be wrong for the job.

4. Your agents reflect your mindset at the time you built them.
I built Brutus when I thought I needed to be tougher on myself. I was wrong. I needed to be smarter with myself. Brian reflects where I am now – still rigorous, but building forward instead of bracing for impact.

5. The founder journey deserves respect – even from your AI.
We often focus on making agents ‘honest’ and ‘direct,’ but sometimes forget the human side—the people using these tools are navigating incredibly tough challenges. Honesty is important, but it needs to be paired with empathy to truly make a difference.

The Bigger Picture

This isn’t just about one agent swap. It’s about how we think about the tools we build as founders.

Therefore, every agent in my system has a name, a role, and a distinct personality aligned with their team function. This is intentional design, not a gimmick. However, like a real team, sometimes you assign a role and realize the agent—based on its function and interactions—may not elevate the overall team performance as intended.

The beauty of building your own agent system is that you can make these calls: You can performance-manage your AI. The ability and opportunity to recognize a bad fit. Create a better onboarding experience.

Brutus taught me what I don’t need. Brian is helping me build what I do.

The Part Nobody Warns You About: Extraction Surgery

Here’s something the “just build agents!” crowd doesn’t tell you: letting an agent go is not deleting a couple of files.

However, when you build an orchestrated multi-agent system where agents share context, reference each other’s outputs, and operate within a structured workflow, removing one agent is more like removing an organ than firing an employee. The system has adapted around it. Meanwhile, other agents have shaped their behavior in response to it. The removal creates a vacuum that ripples.

Agents Influence Each Other’s Behavior

This is the part that surprised me most. In an orchestrated system, agents don’t just coexist – they calibrate to each other. Brutus’s intensity didn’t stay in Brutus’s lane. His energy bled into how other agents framed their responses. When the GTM team knew Brutus was in the pipeline, their outputs started hedging. Meanwhile, they’d soften their recommendations preemptively, knowing the next stop was a tribunal. The system was unconsciously compensating for one agent’s personality.

That’s not a bug in the code. That’s emergent behavior in orchestration. And it’s something you only learn by building, watching, and paying attention.

The Real Work Is Rebuilding What’s Left

When I retired Brutus, I didn’t just archive his route and call it a day. I had to:

Audit every agent that overlapped with him. Brutus touched GTM analysis, market research, and founder advising. Those responsibilities didn’t disappear – they had to be redistributed cleanly.
Restructure adjacent agents. However, the agents that had been calibrating to Brutus’s presence needed recalibration. Their prompts, their tone boundaries, their scope of responsibility – all of it needed review.
Rebuild the workflow, not just the gap. Removing an agent from an orchestrated chain means re-thinking the chain itself. It’s not plug-and-play. It’s an architectural work.
Test for behavioral drift. Meanwhile, after the change, I had to verify that the remaining agents weren’t still carrying Brutus’s ghost – still hedging, still bracing, still operating as if the harsh review was coming next.

This is the real skill of agent building. Not just creating agents, but understanding how they form a system. How they influence each other in ways you didn’t explicitly design. And how extracting one means you’re not just removing – you’re rebuilding.

The Learning That Matters Most

Everyone talks about learning to build agents. That’s the fun part. The part that gets the tweets and the demos.

But the deeper skill, the one that actually separates a toy project from a production system, is learning how agents use their skills in context. How orchestration creates behaviors you didn’t prompt for. How one agent’s personality can quietly reshape the entire system’s output.

Building agents teaches you prompting. However, managing an agent system teaches you organizational design. And letting an agent go teaches you that the architecture is always more interconnected than you think.

That’s the real education. And you can’t get it from a tutorial.

For Founders Building with Agents

If you’re in the early stages of building with AI agents, here’s my unsolicited (but balanced, Brian-approved) advice:

Start with the energy you need, not the energy that sounds impressive. “Brutally honest” sounds great in a README. However, it’s not as great at 2 am when you’re trying to figure out your ICP (ideal customer profile).
Design for the stage you’re in. Your agent’s personality should evolve as your company evolves.
Test fit early. Run a few sessions. Check in with yourself. Is this agent making you better or just making you tired?
It’s okay to let an agent go. You built it. You can rebuild it. That’s the whole point.

About the Author: Solo/Co-Founder building an AI compliance platform with 12+ agents, a dual-office architecture, and the conviction that one person with the right system can build something that matters. The codebase is the receipt.

Discover more from MsTechDiva

Subscribe to get the latest posts sent to your email.

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Letting One of My AI Agents Go

Meet Brutus: The Agent Who Came in Hot

The Performance Issues

The Realization

The Decision

The Moment I Caught Myself

Comparing the Human Experience

With Brutus retired, I needed a new direction.

What This Taught Me About Agent Building

The Bigger Picture

The Part Nobody Warns You About: Extraction Surgery

Agents Influence Each Other’s Behavior

The Real Work Is Rebuilding What’s Left

The Learning That Matters Most

For Founders Building with Agents

Related

Discover more from MsTechDiva

Meet Brutus: The Agent Who Came in Hot

The Performance Issues

The Realization

The Decision

The Moment I Caught Myself

Comparing the Human Experience

With Brutus retired, I needed a new direction.

What This Taught Me About Agent Building

The Bigger Picture

The Part Nobody Warns You About: Extraction Surgery

Agents Influence Each Other’s Behavior

The Real Work Is Rebuilding What’s Left

The Learning That Matters Most

For Founders Building with Agents

Share this:

Related

Discover more from MsTechDiva

Related Posts

A Season for Developers: Snowflake’s AI

Share this:

It’s a Family Affair with Claude 3

Share this:

What’s Cooking in the AI Test Kitchen – RAG

Share this:

Discover more from MsTechDiva