The Kanban Trap: Why Your AI Swarm is Moving in Slow Motion

Agents have a light-speed problem, and it's not the model performance—it's the 1940s tools we're using to manage them.

I’ve spent the last fortnight watching a board of AI agents work through a backlog, and I’ve realised something unsettling. We’ve spent so much time worrying about whether the agents are smart enough to do the work that we’ve completely ignored how the work actually gets done. We’ve given the smartest entities we’ve ever built a tool from the 1940s—the Kanban board—and then we’re surprised when the whole thing feels like moving through treacle. It is the definitive 'speed-of-light' problem for the agentic age: our models are getting faster, but our workflows are stuck in the mud.

I’ve been testing this theory on a legacy Django codebase that’s been my personal 'stress test' repo for years. It’s got all the hallmarks of a real project: inconsistent variable naming, a test suite that relies on a specific local DB state, and at least three different ways of handling authentication. When I throw a swarm of agents at it using a standard Kanban setup, the promise of a 'force multiplier' quickly evaporates into a series of waiting states.

The 'agentic swarm' is meant to be a force multiplier. You set a goal, the agents fan out, and things happen. In reality, most of us are sitting there watching a card move from 'Todo' to 'In Progress', then waiting. According to recent community feedback on Reddit, those sequential shifts can take anywhere from two to five minutes per task. That’s up to five minutes of dead air where a human is usually 'doomscrolling' or losing the thread of what they were actually trying to achieve. With a backlog of sixty tasks at five minutes each, you've just committed five hours of your absolute peak focus time to watching a progress bar.

The Stability Dividend and the Hermes Advantage

There is some genuine good news in all this, though. While the orchestration layer is creaking, the agents themselves are finally finding their feet. I’ve been running Hermes Agent alongside OpenClaw over the last few weeks, and the stability gap is becoming hard to ignore. For a long time, 'open source agent' was shorthand for 'clever demo that breaks on the third tool call.' But Hermes — specifically the latest iterations from the team at Nous Research — has flipped that script.

Users on Reddit are starting to recognise that Hermes is actually less of a headache. It doesn't crash out or get stuck in a loop as often as the 'first-wave' agents did. In my own testing, Hermes excels at tool-use continuity. When it needs to read a file, patch it, run a test, and then read the log to fix a mistake, it holds the context together without hallucinating the file path or forgetting what the original error was. It feels solid, in a way that makes you trust it enough to actually let it run in the background while you go for a coffee.

But stability only gets you so far if the workflow is linear. If you have ten tasks and each takes three minutes to spin up, verify, and complete, you’ve just committed thirty minutes of your life to babysitting. This is the sequential bottleneck. Our agents are now capable of working for hours without human intervention, yet we've built systems that force them to stop and ask for permission every three minutes.
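The arithmetic behind that bottleneck is easy to sketch. The snippet below is a back-of-the-envelope model, not a measurement: it assumes every task is independent and takes the same time, and the `workers` parameter is a hypothetical count of agents running at once.

```python
import math


def wall_clock_minutes(tasks: int, minutes_per_task: float, workers: int = 1) -> float:
    """Wall-clock time when independent, equal-length tasks run in batches of `workers`."""
    batches = math.ceil(tasks / workers)
    return batches * minutes_per_task


# The two figures from the text, run strictly one task at a time.
ten_tasks = wall_clock_minutes(10, 3)    # 30 minutes
sixty_tasks = wall_clock_minutes(60, 5)  # 300 minutes, i.e. five hours

# The same ten tasks with five hypothetical agents working in parallel.
ten_tasks_parallel = wall_clock_minutes(10, 3, workers=5)  # 6 minutes
```

Real tasks are rarely this uniform or this independent, so treat the parallel figure as a ceiling on the possible saving rather than a forecast.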

The Hidden Review Tax

The real bottleneck isn't even the agent's speed—it’s the human transition. When you try to parallelise these tasks in a Kanban system, the context loss is staggering. If three agents finish three different parts of a codebase at the same time, the 'review tax' hits you all at once. You have to jump from a database migration, to a React component refactor, to a security patch in the middleware. Each switch requires re-loading the codebase context into your own brain.

I saw this discussed in a thread on r/ClaudeAI, where developers are building their own Kanban boards for agents. The consensus is sobering: high human review time is the hidden killer of agentic productivity. We’ve traded the effort of 'doing' for the effort of 'reviewing', but the cognitive load is at least as high, and arguably higher, because you didn't write the code yourself and you're constantly searching for where the agent might have hallucinated a subtle edge case.

I’ve felt this myself. There’s a specific kind of 'agent fatigue' that sets in after you’ve reviewed five PRs in an hour. You start to skim. You stop checking the imports. You trust the agent because 'the tests passed.' And that is exactly when the technical debt starts to pile up. A swarm of agents can produce ten times more code than a senior engineer, but if that code is 10% less reliable, you're effectively accelerating the decay of your own codebase.

The End of the 'Board' Era

So, where does that leave us? I think we’re at the end of the 'Board Era' for agents. Kanban is a human tool. It was designed to prevent us from doing too much at once because humans are terrible at multitasking. Our own 'Work In Progress' (WIP) limit is effectively one. Agents, however, don't have that limitation. They should be running in massive directed acyclic graphs (DAGs) with automated validation gates, not waiting for a human to drag a card across a column.
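To make the DAG idea concrete, here is a minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (3.9+). The task names, the `run_task` callable and the `gate` callable are all hypothetical stand-ins: in a real system `run_task` would call an agent and `gate` would run tests or a linter. The sketch runs each wave in a simple loop; the point is that everything in a wave is free to run in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Set


def run_dag(
    deps: Dict[str, Set[str]],
    run_task: Callable[[str], str],
    gate: Callable[[str, str], bool],
) -> List[List[str]]:
    """Run tasks in dependency order; return the waves that could run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are done
        waves.append(ready)
        for task in ready:
            output = run_task(task)
            if not gate(task, output):  # automated validation gate, no human involved
                raise RuntimeError(f"validation gate failed for {task}")
            sorter.done(task)  # unblocks the tasks that depend on this one
    return waves


# Hypothetical backlog: each task maps to the set of tasks it depends on.
deps = {
    "migration": set(),
    "api": {"migration"},
    "docs": {"migration"},
    "frontend": {"api"},
}

waves = run_dag(
    deps,
    run_task=lambda name: f"output of {name}",  # stand-in for an agent call
    gate=lambda name, output: bool(output),     # stand-in for tests or linting
)
```

Here `api` and `docs` land in the same wave because both depend only on `migration`, which is exactly the parallelism a single 'In Progress' column hides.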

Imagine a workflow where the human defines the objective and the architecture, and then the agents are responsible for the validation. A researcher agent finds a bug, a coder agent fixes it, a tester agent verifies it, and a reviewer agent checks for stylistic consistency. Only after that whole chain is complete does it hit the human's desk. That is orchestration. What we have right now is just a very fancy way of delegating Jira tickets to a bot.
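That chain can be sketched as a list of stages that a work item must clear before a human ever sees it. Everything here is illustrative: `WorkItem`, `run_chain` and the lambda stages are hypothetical names, and each stage would really be an agent call rather than a one-line check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class WorkItem:
    description: str
    history: List[str] = field(default_factory=list)


# A stage is a role name plus a callable that returns True when the stage passes.
Stage = Tuple[str, Callable[[WorkItem], bool]]


def run_chain(item: WorkItem, stages: List[Stage]) -> bool:
    """Run stages in order; stop at the first failure. True means ready for a human."""
    for name, stage in stages:
        ok = stage(item)
        item.history.append(f"{name}:{'ok' if ok else 'failed'}")
        if not ok:
            return False
    return True


# Hypothetical stages standing in for the four agents described above.
stages: List[Stage] = [
    ("researcher", lambda item: "bug" in item.description),
    ("coder", lambda item: True),
    ("tester", lambda item: True),
    ("reviewer", lambda item: True),
]

item = WorkItem("bug: login fails for expired sessions")
human_queue = [item] if run_chain(item, stages) else []
```

The human queue only ever receives items whose history shows every stage passing, so the review starts from evidence instead of a raw diff.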

In our current kanban.db world, we are seeing the limits of the 'human-in-the-loop' model. If every step requires a human to 'Unblock' or 'Review', the agent is just a slower version of an IDE plugin. The goal should be asynchronous orchestration—where agents manage each other’s dependencies and only escalate to the human when there is a genuine architectural conflict or a missing credential that they can't solve on their own.
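One way to picture that escalation rule is as a tiny routing table. The blocker categories below are my own assumption, chosen to match the two cases named above; a real orchestrator would need a richer taxonomy and some way to classify blockers in the first place.

```python
from enum import Enum, auto


class Blocker(Enum):
    TEST_FAILURE = auto()
    DEPENDENCY_NOT_READY = auto()
    ARCHITECTURAL_CONFLICT = auto()
    MISSING_CREDENTIAL = auto()


# Only these blockers are worth interrupting a human for (an assumption of this sketch).
HUMAN_ONLY = {Blocker.ARCHITECTURAL_CONFLICT, Blocker.MISSING_CREDENTIAL}


def route(blocker: Blocker) -> str:
    """Return who should handle a blocker: the human or the agents themselves."""
    return "human" if blocker in HUMAN_ONLY else "agents"
```

Under this policy a failing test or an unfinished dependency never reaches the 'Review' column; it goes back to the agents.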

The Engineering Shift

The thing I keep coming back to is that we’re still treating agents as digital employees instead of a new kind of infrastructure. You don't manage your load balancer with a Kanban board. You don't ask your CI/CD pipeline for permission to start every stage. You build a system that works on its own and notifies you when it’s done.

We need to move toward a world of self-healing workflows. If an agent hits an error, its first thought shouldn't be 'I'll block the task and wait for help.' Its first thought should be 'I'll spawn a specialist subagent to debug this log and report back.' We have the stability in models like Hermes to do this. We have the tool-use capability. What we’re missing is the courage to let go of the board.
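A self-healing loop of that kind might look something like the sketch below. It is not how Hermes or any particular framework does it: `task` and `debug_subagent` are hypothetical callables, and the subagent is reduced to a function that turns an error message into a hint for the next attempt.

```python
from typing import Callable, Optional


def run_with_self_healing(
    task: Callable[[Optional[str]], str],
    debug_subagent: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    """Retry a task, feeding each failure to a debugging subagent before trying again."""
    hint: Optional[str] = None
    for _ in range(max_attempts):
        try:
            return task(hint)
        except Exception as exc:
            # Instead of blocking the card, ask a specialist to read the error.
            hint = debug_subagent(str(exc))
    # Only now does the problem become a human's problem.
    raise RuntimeError(
        f"escalate to human after {max_attempts} attempts; last hint: {hint}"
    )
```

The attempt cap matters: without it, a self-healing workflow is just a loop that burns tokens quietly instead of blocking loudly.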

The next decade of engineering won't be about who has the best LLM. It'll be about who builds the best Agentic Operating System—the layer that handles the scheduling, the context management, and the parallelisation of work without requiring a human to keep their hand on the steering wheel every second of the day.

The agents are finally stable enough to work. Now we just need to build a world where they aren't waiting on us to tell them they're allowed to start the next task. The 'Review' column shouldn't be the place where speed goes to die; it should be the place where we simply sign off on a job that's already been checked, double-checked, and triple-checked by the agents themselves.

There’s a technical nuance here that often gets missed: task interdependency. In a human Kanban system, we assume that Task B can't start until Task A is 'Done'. This is usually because Task B requires the physical or digital output of Task A—a completed design, a merged API endpoint, or a clarified requirement. But agents don't need to wait for a task to be 'Done' to start preparing for the next one. They can work in speculative execution modes, pre-calculating the likely requirements of Task B while Task A is still being processed.

I’ve been experimenting with this in a custom orchestration script. Instead of waiting for the coder agent to finish a patch, the tester agent is already spinning up a container and pre-loading the relevant test data based on the coder's 'In Progress' log. When the patch hits the 'Review' stage, the tests have already run twice. This is what I mean by moving beyond the board. We're talking about a fluid state of execution where the lines between 'Doing' and 'Verifying' are blurred.
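This is not the script described above, just a minimal `asyncio` sketch of the same idea under simplifying assumptions: the coder and the test-environment setup are hypothetical coroutines, and `asyncio.sleep` stands in for the real work of writing a patch and starting a container.

```python
import asyncio
from typing import List


async def coder(log: List[str]) -> str:
    log.append("coder: started")
    await asyncio.sleep(0.1)  # stand-in for the time spent writing the patch
    log.append("coder: patch ready")
    return "patch-v1"


async def prepare_test_env(log: List[str]) -> str:
    log.append("tester: preparing env")
    await asyncio.sleep(0.01)  # stand-in for starting a container and loading data
    log.append("tester: env ready")
    return "container-1"


async def run_speculative() -> List[str]:
    log: List[str] = []
    # Start both at once: the tester does not wait for the coder to be 'Done'.
    patch, env = await asyncio.gather(coder(log), prepare_test_env(log))
    log.append(f"tester: running {patch} in {env}")
    return log


log = asyncio.run(run_speculative())
```

In the resulting log the environment is ready before the patch lands, so the only thing left on the critical path is the test run itself.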

If we stick to the sequential model, we are essentially forcing our agents to emulate human inefficiency. We are building digital cubicles for our swarms. The irony is that the more 'organised' we feel with our boards and our columns, the less work we are actually getting out of the collective intelligence at our disposal. The real breakthrough won't be a smarter LLM; it will be an orchestration layer that understands the gravity of context and handles the handoffs without needing a human to act as a traffic warden.

Ultimately, the metric that matters isn't how many agents you have in your swarm. It's how much of your day you get back because you weren't watching them work. The goal is to move from babysitting an agent to directing a system. And if we can't do that, we might as well just get back to writing the code ourselves.

Colin Daly

Product design specialist with over 25 years of professional experience. I've held senior roles at Adobe and IBM, and worked with leading international brands across the globe. Fully embracing the world of AI agentic engineering and thoroughly grateful to be living in this beautiful country they call Australia.
