Agerix

The advisor pattern: what Claude Code teaches us about delegation

22 April 2026 | Eric Lamy | 13 min read

The Claude Code advisor tool is not a feature. It is a delegation pattern that questions the architecture of your business applications.

In any mature organization, expertise isn’t distributed evenly. A senior consultant isn’t consulted for every minor decision. They’re called upon for the truly worthwhile trade-offs. This economy of rare expertise is nothing new: it’s the hallmark of well-functioning teams, firms, and research offices. It rests on a principle as old as management itself—you start at the lowest possible level of expertise and escalate to the next level when justified.

On April 9, 2026, Anthropic released a new tool for its AI agents: the advisor tool. Behind a technical announcement presented as a simple API tweak lies something more interesting. This organizational principle—the targeted intervention of scarce resources—has now been encoded into a native software pattern. A lightweight model performs the task and consults a powerful model on an ad hoc basis for decisions that are beyond its capabilities.

The thesis of this article fits in one sentence: the advisor tool is not a feature but a governance pattern which, once held up against our application architectures, exposes a systemic over-provisioning that we had stopped seeing.

We will explore it in four parts. First, the mechanics as designed by Anthropic. Then, the figures that make it a serious pattern and not a marketing gimmick. Next, its direct link to how mature organizations already delegate. Finally, the reflection it offers on our business applications, and three questions that a CTO of an SME or mid-sized company should ask themselves after finishing this article.

The pattern Anthropic chose

To understand why the advisor tool breaks with the classic multi-agent logic, we must first recall the model it reverses. In the orchestrator-sub-agent architecture that has dominated until now, a heavyweight model coordinates the work. It breaks down the task, distributes the sub-tasks to lighter models, aggregates the results, arbitrates conflicts, and produces the final output. The resource-intensive model is constantly in use, by design.

The advisor pattern reverses this distribution. The executor—Sonnet 4.6 or Haiku 4.5 in the Anthropic nomenclature—drives the task from start to finish. It calls the tools, reads the results, and iterates toward the solution. When it encounters a decision point it cannot confidently resolve, it consults the advisor, typically Opus. The advisor accesses the shared context, returns a short plan, a correction, or a stop signal, and then the executor resumes its work.

The key detail: the advisor never produces user output and doesn’t call any tools itself. It only provides advice. Technically, everything happens in a single API request, without any additional round trips or context management required from the developer.
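The control flow described above can be sketched in a few lines. This is a local simulation only: it does not call the Anthropic API, and every name in it (run_task, Advice, the confidence function, the threshold) is a hypothetical stand-in. In the real tool the executor model decides for itself when to consult the advisor; the numeric threshold here simply makes that decision visible.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Advice:
    kind: str  # "plan", "correction" or "stop"
    text: str


@dataclass
class RunLog:
    steps_done: list[str] = field(default_factory=list)
    advisor_calls: int = 0


def run_task(
    steps: list[str],
    confidence: Callable[[str], float],
    advisor: Callable[[str, list[str]], Advice],
    threshold: float = 0.6,  # hypothetical stand-in for the executor's own judgment
) -> RunLog:
    """Executor loop: drive every step, consult the advisor only on low confidence."""
    log = RunLog()
    context: list[str] = []  # shared context the advisor is allowed to read
    for step in steps:
        if confidence(step) < threshold:
            log.advisor_calls += 1
            advice = advisor(step, context)  # the advisor only advises, it never acts
            context.append(f"advice[{advice.kind}]: {advice.text}")
            if advice.kind == "stop":
                break
        # The executor resumes and performs the step itself.
        context.append(f"done: {step}")
        log.steps_done.append(step)
    return log


steps = ["read ticket", "choose migration strategy", "write patch", "run tests"]
conf = {"read ticket": 0.9, "choose migration strategy": 0.3,
        "write patch": 0.8, "run tests": 0.95}
log = run_task(steps, conf.get, lambda s, ctx: Advice("plan", f"short plan for {s}"))
print(log.advisor_calls, log.steps_done)
```

Only one of the four steps triggers a consultation, and the advisor never appears in steps_done: it returns text, and the executor keeps the wheel.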

The emerging decision-making economy differs from what we’re used to. By default, we pay the cost of lightweight capabilities—fast, efficient, and inexpensive. The cost of heavyweight capabilities only arises when justified by genuine difficulty. What was previously the domain of systems engineering—building custom escalation logic between models, managing context transfers independently—is becoming a native API feature. According to Anthropic, the advisor typically generates only 400 to 700 tokens per query, just long enough to return a short plan. The executor, on the other hand, handles the entire output at its lower cost.
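To make this economy concrete, here is a minimal cost sketch. The prices are placeholder units, not Anthropic's price list, and the token volumes are invented for illustration; only the 700-token advisor reply comes from the range quoted above. The model counts output tokens only and ignores input tokens, so it shows the shape of the trade-off rather than reproducing any published figure.

```python
def task_cost(
    executor_out_tokens: int,
    advisor_calls: int,
    advisor_tokens_per_call: int,
    executor_price: float,  # placeholder units per million output tokens
    advisor_price: float,   # placeholder units per million output tokens
) -> float:
    """Output-token cost of one task: executor volume plus targeted advisor replies."""
    executor_cost = executor_out_tokens * executor_price / 1_000_000
    advisor_cost = advisor_calls * advisor_tokens_per_call * advisor_price / 1_000_000
    return executor_cost + advisor_cost


# Hypothetical task: 20,000 output tokens, two consultations of 700 tokens each,
# with the heavy model assumed to be five times the price of the light one.
with_advisor = task_cost(20_000, 2, 700, executor_price=1.0, advisor_price=5.0)
heavy_only = task_cost(20_000, 0, 0, executor_price=5.0, advisor_price=5.0)
print(f"{with_advisor:.3f} vs {heavy_only:.3f}")
```

Under these assumptions the bulk of the bill is paid at the executor rate, and the advisor adds a small, bounded surcharge per consultation.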

This shift is not a mere implementation detail. It’s an architectural stance. Anthropic is betting that the majority of agentic tasks do not require a frontier model to be constantly mobilized, and that the value lies in the ability to invoke that frontier capability precisely when it changes something.

Why it outperforms the classical orchestrator

The figures published by Anthropic at the launch of the advisor tool do not, on their own, demonstrate the superiority of the pattern. They constitute an indicator. The indicator is nonetheless solid.

On the SWE-bench Multilingual benchmark, which measures an agent’s ability to solve software development issues in nine programming languages, Sonnet 4.6 coupled with Opus as an advisor scores 2.7 points higher than Sonnet alone. This isn’t surprising—help from a smarter model logically improves the score. The real surprise is that this same configuration simultaneously costs 11.9% less per agent task. Better and cheaper. That’s rare.

On BrowseComp, an agentic web browsing benchmark, the effect becomes spectacular for the lightest model. Haiku 4.5 alone achieves 19.7%. Haiku 4.5 with Opus as an advisor climbs to 41.2%, more than double. This configuration still scores 29% below Sonnet alone, but at 85% less cost per task. A doubling of performance on a lightweight model for a fraction of the price of a mid-range model: the trade-off becomes difficult to ignore. These figures and their methodology (scores averaged over five trials, 300 problems for SWE-bench Multilingual, 1,266 problems for BrowseComp) are detailed in Anthropic’s official blog post of April 9, 2026.

Why these gains, and why this simultaneous cost saving? The intuition lies in a phenomenon well known to multi-agent system practitioners. The orchestrator becomes overwhelmed by planning. The more sub-agents there are to coordinate, the more energy the central model expends dividing, arbitrating, rewriting instructions, and synthesizing feedback. This cognitive overload ultimately erodes the very quality of the decisions it is asked to make.

The executor sits in a different position. It is inside the task, not above it. It moves forward, and it calls on higher-level expertise only when doing so changes the trajectory. The contrast is the one between the project manager who wants to decide everything, a guaranteed source of delays, and the project manager who delegates effectively and makes timely decisions.

The emerging rule is not “one model does everything”. It is: each level of expertise comes into play where it has an impact.

What this pattern reproduces of human delegation

Anthropic’s engineers did not invent this logic. They transposed it.

In a well-functioning engineering department, a junior engineer writes the code while a senior engineer advises. The senior engineer isn’t consulted on every line of code, every variable choice, or every commit. They are consulted for the major decisions: architecture, technical debt, structural choices, and other high-impact choices. Their scarcity isn’t a problem to be solved, but a constraint to be respected. A senior engineer consulted on everything inevitably becomes a senior engineer who no longer truly advises: they exhaust themselves on minor decisions and lack the cognitive capacity for the important ones. The value of rare expertise is destroyed by its own trivialization.

The same principle applies everywhere. In a rotating operations team, the manager doesn’t approve every ticket; tickets are escalated to them when needed. In a law firm, the partner doesn’t work on every case; they decide on points where professional judgment makes the difference. In a hospital, the head of department doesn’t examine every patient; they are called in when the diagnosis or treatment decision requires it. The model is so universal that we no longer see it. Yet it deserves to be named. It’s the principle of technical subsidiarity: we handle things at the lowest possible level of expertise and escalate to the appropriate level when justified.

What the advisor tool encodes is this principle. It encodes it natively for the first time in the API of a frontier model, with all the mechanics that well-executed human delegation requires: shared context, targeted call, task resumption, bounded return. Anthropic observed how mature organizations use their rare experts and decided that its AI agents would function similarly.

The most revealing detail lies in the constraint placed on the advisor. It does not produce user output, does not take control of the tools, and does not replace the executor. It provides advice and then relinquishes control. This is precisely what is expected of a good internal advisor: to advise, not to take over. The pattern reflects organizational maturity, not just cost optimization.

The advisor pattern: less complexity, better decisions

The mirror it holds up to our application architectures

Holding this pattern up against our current application architectures produces a disturbing mirror effect. Many business applications we encounter at our clients’ sites operate according to the exact opposite pattern.

A CRM that loads its entire logic of rules, permissions, and segmentation to display a single field in a contact record. An ERP that launches a complete workflow engine, with all its cross-functional consistency checks, for a status change that should only involve three objects. An HR platform that queries all its modules—payroll, training, time, leave, contracts—to determine if an employee can take a day off the following week. These applications aren’t poorly designed. They’re often technically sound, functionally comprehensive, and they work. That’s precisely the problem. They work well enough that we don’t question them, and they cost more with each interaction than they should.

The cost is rarely visible on a line item. It’s paid elsewhere. In perceived response time for the user—the form that takes two seconds instead of two hundred milliseconds. In infrastructure consumption that forces over-provisioning of production environments to handle the load. In technical debt that accumulates because each change must traverse the entire logic, even when it only concerns a narrow and isolated path. In operational costs at scale that become prohibitive as the user base grows, when the initial architectural model didn’t allow for it.

The alternative is hardly revolutionary. It has a well-established name: composition. Lightweight modules handle the majority of cases, such as reading a field, displaying a list, or updating a simple status. Targeted calls to heavyweight logic are made when justified, for complex business validation, inter-module arbitration, or high-stakes rules requiring real reasoning. This is precisely the efficiency that the advisor pattern encodes in the Anthropic API, transposed to application design.
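Transposed to the HR example mentioned earlier, composition could look like the sketch below. Everything in it is hypothetical (the LeaveRequest fields, the two-day rule, the heavy_validation stand-in); the point is only the shape: a cheap path that settles most requests, and one named call to the expensive logic for the rest.

```python
from dataclasses import dataclass


@dataclass
class LeaveRequest:
    employee_id: str
    days: int
    balance: int
    overlaps_payroll_close: bool = False


heavy_calls: list[str] = []  # records which requests reached the heavy path


def heavy_validation(req: LeaveRequest) -> bool:
    """Stand-in for the expensive cross-module check (payroll, contracts, ...)."""
    heavy_calls.append(req.employee_id)
    return req.days <= req.balance and not req.overlaps_payroll_close


def can_take_leave(req: LeaveRequest) -> bool:
    # Lightweight path: obvious refusals and obvious approvals are settled here.
    if req.days > req.balance:
        return False
    if req.days <= 2 and not req.overlaps_payroll_close:
        return True
    # Targeted escalation: only ambiguous or high-stakes cases pay the heavy cost.
    return heavy_validation(req)


print(can_take_leave(LeaveRequest("e1", days=1, balance=10)))
print(can_take_leave(LeaveRequest("e2", days=5, balance=10, overlaps_payroll_close=True)))
print(heavy_calls)
```

The first request never touches the heavy logic; the second does, and the call leaves a trace that can be counted later.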

The challenge for a CTO or application architect isn’t to copy the pattern line by line. It’s to use the advisor tool as a framework for analyzing their own applications. If Anthropic modeled its AI agents on well-structured human teams, it is because this structure is more efficient, more cost-effective, and more scalable. The question this pattern poses to us is simple: why do our business applications still resemble intelligent monoliths that can do everything and pay full price for every decision?

This is precisely what we address in our article on sustainable architecture that secures your software investment, and what you will find, applied to a very specific type of business application, in our anatomy of a modern CRM and its five architectural pillars.

Figure: Comparison between the classical orchestrator and the advisor pattern. Left, classical orchestrator: the heavy model coordinates at all times and delegates to three lightweight sub-agents; high default cost, planning saturation, every decision routes through the centre. Right, advisor pattern: the executor (Sonnet or Haiku) drives the task end-to-end and consults the advisor (Opus) only when needed, receiving a brief plan; default cost at executor rate, Opus consultation of 400 to 700 tokens, Sonnet + advisor at -11.9% per task vs Sonnet alone. Figures: Anthropic, official post “The advisor strategy”, 9 April 2026.

Three questions a CTO should ask themselves

Three questions follow logically from this reflection. They are not abstract. They come up concretely, application by application, module by module, and they structure an architectural audit that a design office can conduct methodically.

The first question: where are your applications over-provisioning their processing? In other words, where is heavy processing power being used by default when light processing would suffice? These issues are rarely visible bugs. They are legacy architectural choices that have never been reconsidered because nothing forced them to be. They can be detected through measurement—response time, infrastructure consumption, call tracing—and by observing the most frequent user journeys. If the 80% most common interactions use the same heavy processing logic as the 20% of complex cases, there is likely over-provisioning that needs to be addressed.
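A first measurement can be very simple. The sketch below assumes you can export call traces as (route, used_heavy_path) pairs; that trace format, the route names, and the function name are all hypothetical. It finds the routes that make up the most common 80% of traffic and reports, for each, the share of calls that went through the heavy logic.

```python
from collections import Counter


def overprovisioning_report(
    traces: list[tuple[str, bool]], coverage: float = 0.8
) -> dict[str, float]:
    """For the most frequent routes covering `coverage` of traffic,
    return the fraction of calls that used the heavy path."""
    counts = Counter(route for route, _ in traces)
    heavy = Counter(route for route, used_heavy in traces if used_heavy)
    total = len(traces)
    report: dict[str, float] = {}
    covered = 0
    for route, n in counts.most_common():
        if covered / total >= coverage:
            break  # the remaining routes belong to the long tail
        report[route] = heavy[route] / n
        covered += n
    return report


# Invented sample: a simple view that always triggers the heavy engine.
traces = (
    [("contact.view", True)] * 70
    + [("contact.list", False)] * 20
    + [("deal.validate", True)] * 10
)
print(overprovisioning_report(traces))
```

A frequent, simple route with a heavy-path share close to 1.0 is a candidate for over-provisioning; a rare, complex route with the same share is probably legitimate.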

The second question is more structural: are your escalations formalized or implicit? In the advisor pattern, an escalation is an explicitly named call, an identified invocation of the expert resource. In a well-architected business application, escalations to the heavyweight logic (complex business validation, inter-module arbitration, high-stakes rules requiring real reasoning) should be just as explicit. They should be tracked, governed, and measured. When the real decisions of an application are diluted in a tangle of implicit rules, you don’t know where they are made. Therefore, you can neither optimize them, nor protect them, nor govern them. The RACI matrix, which we presented in an article dedicated to clarifying project roles, is the equivalent organizational tool: it makes explicit who decides what, when, and with what delegation.
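One lightweight way to make escalations explicit in code is to name them at the point of definition. The decorator below is a sketch under our own assumptions: the registry, the escalation name, and the credit-limit rule are hypothetical, and a real system would send these measurements to its monitoring stack rather than to an in-memory dictionary.

```python
import functools
import time
from collections import defaultdict

# Registry of named escalations: name -> list of call durations in seconds.
ESCALATIONS: dict[str, list[float]] = defaultdict(list)


def escalation(name: str):
    """Mark a function as a named escalation to heavy logic and time each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the heavy logic raises.
                ESCALATIONS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@escalation("credit_limit_arbitration")
def arbitrate_credit_limit(order_total: float, customer_limit: float) -> bool:
    # Hypothetical stand-in for a costly inter-module arbitration.
    return order_total <= customer_limit


arbitrate_credit_limit(500, 1000)
arbitrate_credit_limit(5000, 1000)
print({name: len(calls) for name, calls in ESCALATIONS.items()})
```

Once every heavy decision carries a name, counting, timing, and reviewing escalations becomes a query over the registry rather than an archaeology exercise.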

The third question concerns the trajectory: what would your application look like if restructured according to this principle? The answer doesn’t call for a complete overhaul. It calls for a gradual reorganization, module by module, which falls directly under the umbrella of technical debt reduction. Each iteration that separates the lightweight path from the heavyweight path, each module that is invoked on demand rather than by default, is a step that improves performance, cost at scale, and maintainability. This strategy is precisely the one we describe in our article on technical debt, which absorbs 20 to 40% of the IT budget.

These three questions cannot be answered in a single meeting. They structure an approach that produces an inventory of over-provisioning, a formalization of escalation points, and a phased reorganization plan over several quarters.

What you need to remember

The advisor tool isn’t a technological breakthrough. It’s a signal. Anthropic didn’t invent the pattern it encodes—it made it visible by coding it in an API. The lesson for a CIO or CTO isn’t in the feature itself. It’s in the reflection.

For business applications, efficiency is no longer found in uniform power. It lies in the precision of delegation. The highest-performing systems of the coming decade will not be those that are the most intelligent everywhere. They will be those that have been able to place intelligence in the right place, escalate at the right time, and let lightweight processes do what they do best without burdening them with a skill they don’t use.

This pattern, as a pattern, resolves the architectural issue. However, it raises a related question, which becomes pressing as soon as AI agents move beyond the development phase and into autonomous production. How do you govern a system capable of making decisions, escalating, and acting without immediate human supervision? This is a separate topic that deserves its own discussion.

