Which stages of the business application lifecycle can agentic AI automate without undue risk?

Two stages mainly: corrective maintenance and continuous monitoring. In both cases the problem being handled is bounded, errors made by the agent are visible and reversible at low cost, and the situations encountered most often belong to known categories. Initial development belongs with assistance — one lets AI produce, but the framing of what should be produced stays human. Legacy modernisation belongs with protection: assistance is useful, but the decision remains with a team that knows the usages, the implicit rules and the sedimented compromises in the system.

Why is corrective maintenance a better candidate for automation than initial development?

Three reasons compound. A bug ticket defines an observed behaviour, an expected behaviour and the gap between the two — that is a clean definition, unlike a functional specification which always stays partly open. The fix lives on a Git branch before reaching production, which makes the agent's error visible and repairable. Finally, bugs often fall into reproducible categories: an agent that handles many of them learns to recognise recurring configurations. Initial development offers none of these three conditions: the problem is open, a framing error costs a great deal later, and every specification remains partly singular.

Is continuous monitoring really a stage in its own right, or just an extension of maintenance?

Before routines it was a passive extension: tools raised alerts, a human reacted when available. Routines turn this function into an active capability that inspects, contextualises, diagnoses, sometimes remediates. This is no longer a difference of degree but a difference of nature. Monitoring stops being something one endures and becomes something one instruments — on equal footing with development or maintenance. Treating it as a stage in its own right allows a dedicated governance to apply to it, rather than diluting it within maintenance.

Why shouldn't agentic AI be left to modernise a legacy system autonomously?

Because a legacy system is not just code. It is a set of sedimented decisions — technical compromises, hard-coded business rules, behaviours no one can justify any longer but on which customers or internal processes depend. An agent refactoring this system reads the code; it does not read the undocumented heritage surrounding it. The risk is not producing technically inferior code but producing technically superior code that has silently broken an expected behaviour. That latent cost typically surfaces weeks later, when the regression hits a real user. Assistance stays useful — mapping, duplication detection, regression tests — but the decision of what to preserve, transform or remove must remain human.

What typical return on investment can be expected from automating corrective maintenance?

Giving an absolute figure would be dishonest: the return depends on the codebase, team maturity, quality of framing and the scope entrusted to the agent. An order of magnitude can however be given. Technical debt, of which corrective maintenance is a significant component, represents between 20 and 40% of the IT budget according to published studies. Even partial automation of a fraction of this load represents a substantial net gain, provided the governance framework (scope, permissions, traceability, responsibility) is set up upfront. Without that framework, the nominal gain is absorbed by rework and untraced incidents.

How can I build a decision grid to arbitrate which SDLC stages to automate?

Two axes are enough. On the horizontal axis, the effective value delivered by automation — the net gain once risks and rework are factored in, not raw speed. On the vertical axis, reversibility of errors — the capacity to detect and fix an agent's error before it reaches a user or business process. Their intersection yields three postures: automate when value is high and reversibility is high; assist when value is real but reversibility drops; protect when reversibility is low and the error can stay invisible for a long time. The grid is not an absolute rule — it is a discipline of reading that rebalances the arbitrations.

Do Claude Code routines replace a junior developer in an SME or mid-market company?

No, and posing the question in those terms leads to poor decisions. A junior developer is a trajectory: they learn the business domain, they contribute to understanding the implicit rules that are written nowhere, they become a senior within a few years. An agent performs tasks, well, sometimes very well, but does not accumulate lasting domain knowledge. Replacing a junior with an agent means saving a budget line today while compromising in-house expertise in five years. The right question is different: how do we redirect the junior's time, now that part of the tasks historically assigned to them can be automated? The answer belongs to management and training, not to substitution.

Where should I start if I want to bring agentic AI into the lifecycle of my business applications?

With the stage where an error costs the least: corrective maintenance. Delimit a narrow scope — for example a category of recurring bugs or alert triage on a specific service — set an explicit RACI framework that identifies who validates, who supervises and who repairs, and demand full traceability of interventions. This high-reversibility environment is the best learning ground for the organisation itself: one learns to frame an autonomous agent before extending it to less permissive stages. Extending without having learned on corrective maintenance first is like sending a new employee into production without having seen them work on a low-stakes project.

Claude Code 4.7 in the business application lifecycle

15 May 2026 | Eric Lamy | 15 min read

Agentic AI and the lifecycle of a business application: which SDLC stages benefit from automation, which demand irreplaceable human judgement.

Third and final article in the Claude Code 4.7 series. Routines now make agentic AI a resource capable of intervening throughout the entire lifecycle of a business application. This technical scope, however, says nothing about the value it brings at each stage. Four stages, three postures, one reading discipline, and a conclusion that closes the series without restating its thesis.

A new capability has value only where it is relevant. Claude Code routines, and the operational autonomy they introduce, now make it possible to act across the entire lifecycle of a business application, from the first line of code to the modernisation of a fifteen-year-old system. This technical reach is real. It says nothing, however, about the effective value agentic AI brings at each stage.

The first article in this series laid out the architectural plane: what the advisor pattern teaches us about delegation and technical subsidiarity. The second laid out the governance plane: why routines shift Claude Code from the developer's workstation to the infrastructure, and which questions that puts to a CIO before any deployment. This third article comes down to the ground, that of the business application lifecycle, and puts forward a thesis that is simple but uncomfortable for anyone who would automate on principle: the value agentic AI delivers is not spread evenly across this cycle. It amplifies some stages, barely touches others, and becomes a risk on one precise category of interventions where human judgement remains irreplaceable.

Four parts follow. A reminder of the four stages that structure the lifecycle of a business application. A reading of what agentic AI brings, or fails to bring, to each of them. A two-axis decision grid for deciding stage by stage. And, to close, a perspective that ties together the three planes of this series.

The four stages of a business application lifecycle

A business application does not have a linear lifecycle. It travels through phases whose nature, timing and actors differ fundamentally. We identify four here, not as an exhaustive taxonomy, but as the minimal grid that allows us to read what agentic AI changes.

The lifecycle of a business application facing agentic AI: four stages (initial development, corrective maintenance, continuous monitoring, legacy modernisation) paired with the appropriate postures (assist, automate, automate, protect).

Initial development, first — the construction phase, from specification to the earliest production releases. It is the most visible stage, the most instrumented, the most discussed. Yet it rarely accounts for more than a minority share of an application’s total cost over its useful life.

Corrective maintenance, next — everything that keeps a system in working order: fixing reported bugs, applying security patches, adjusting behaviour on cases that were never anticipated. It is the quietest stage and, often, the most resource-consuming over time.

Continuous monitoring, which we treat here as a stage in its own right. This is a choice: historically, monitoring was not seen as a lifecycle phase but as a passive function, carried by alerting tools and human availability. Agentic AI changes that reading, and we shall see why.

Legacy modernisation, finally — that particular stage where an existing system must be rebuilt, migrated or deeply transformed. It combines a technical part and a business part, and it is precisely in that combination that its irreducibility plays out.

These four stages are not handled the same way. They do not draw on the same skills, they do not carry the same risks, and they do not reward the same tools. Successfully developing a business web application depends on distinguishing these phases and allocating adapted means to each. Agentic AI is a new capability to be allocated across this existing grid — not a uniform lever that applies with equal intensity everywhere.

Initial development: speed without judgement is not value

Agentic AI shines, at the initial development stage, wherever the problem is well posed. Boilerplate generation, writing unit tests against specified behaviour, local refactoring, syntax migration from one framework to another, applying established patterns — on all these tasks, the acceleration is real, measurable, and it converts into recoverable developer hours.

This very self-evidence is what misleads. The ambient discourse presents initial development as the natural terrain for AI, because it is the most visible and the most spectacular to demonstrate. But for a business application, the value of initial development almost never lies in typing speed. It lies in the quality of upstream analysis: what are the real business rules, which ones are implicit, which ones are never stated because everyone assumes them. Fast development that has misread the business produces an application that is fast to ship, slow to fix and fragile to evolve. The apparent saving at the front end is repaid, with interest, across the three following stages of the lifecycle.

What agentic AI does not do well, at this stage, is precisely what matters most: arbitrating between two possible readings of a need, questioning the requester about a rule that seems too simple, detecting that an implicit specification will contradict an existing usage. Understanding business processes and their rules remains human work — not because a model cannot analyse a text, but because the raw material of that analysis does not yet exist in written form at the moment it is needed. It is built in an exchange, and that exchange presupposes an interlocutor capable of thinking with you, not merely producing for you.

The right posture at initial development is therefore one of assistance. Agentic AI accelerates what is well specified; it does not dispense you from specifying well. The organisations that invert this logic — letting production run fast in the hope that specification will catch up — discover in maintenance the real cost of that drift.

Corrective maintenance: where routines gain the most

Corrective maintenance is the stage where agentic AI finds its best operational terrain. Three reasons for this, and they compound.

The first is that the problem is bounded. A bug ticket describes an observed behaviour, an expected behaviour and the gap between the two. That is a clean definition, unlike a functional specification which always stays partly open. An agent can read the ticket, inspect the relevant code, reproduce the problem and propose a fix. The whole path fits within a delimited scope.

The second is reversibility. A bug fix lives on a Git branch before reaching production. An agent that gets it wrong produces a commit a human can read, reject or modify. The error is visible and repairable at low cost — which is exactly the condition required to authorise automation. The first article in this series set out the logic of technical subsidiarity: delegate what can be delegated, escalate where judgement is required. Corrective maintenance is the paradigmatic illustration of that principle. The agent executes, and the advisor pattern lets it escalate when the fix moves outside its zone of confidence.
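The escalation logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the advisor pattern's actual implementation or a Claude Code API: `ProposedFix`, `route`, the `billing/` scope and the thresholds are all hypothetical names and values. Both outcomes keep a human in the loop, since a pull request is still read before it is merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedFix:
    ticket_id: str
    files_touched: tuple[str, ...]
    tests_pass: bool
    confidence: float  # hypothetical self-reported score in [0, 1]


# Hypothetical policy values; a real team would set its own.
ALLOWED_PREFIXES = ("billing/",)  # scope delegated to the agent
MAX_FILES = 3                     # larger fixes leave the bounded zone
MIN_CONFIDENCE = 0.8


def route(fix: ProposedFix) -> str:
    """Open a pull request for review, or escalate to a human before any commit."""
    in_scope = all(path.startswith(ALLOWED_PREFIXES) for path in fix.files_touched)
    bounded = 0 < len(fix.files_touched) <= MAX_FILES
    if fix.tests_pass and in_scope and bounded and fix.confidence >= MIN_CONFIDENCE:
        return "open_pull_request"
    return "escalate_to_human"
```

The design choice worth noting is that escalation is the default: the agent has to satisfy every condition to proceed, and any doubt sends the ticket back to a person.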

The third is reproducibility. Bugs in a business application are not all unique. A large share of them fall into known categories — edge-case handling, date formats, encoding, error handling on an external dependency. An agent that processes a thousand bugs in a codebase learns, within the limits of its session memory, to recognise recurring configurations. The marginal productivity of each fix rises with accumulated experience.

The gain, at this stage, is substantial — and it is often underestimated because it is so quiet. Corrective maintenance absorbs a significant share of the IT budget and a significant share of developer time. Reference figures on technical debt give an order of magnitude: a McKinsey survey estimates that technical debt accounts for roughly 40% of IT balance sheets, with 30% of surveyed CIOs reporting that more than 20% of their technology budget nominally dedicated to new products is in fact diverted to resolving it. Stripe’s Developer Coefficient study places the share of developer time spent on debt and bad code in a comparable range. A non-negligible part of this cost is pure corrective maintenance — exactly the category of work that Claude Code routines can automate or semi-automate. This is not a promise to replace teams; it is an opportunity to redirect their time towards higher-value work. Technical debt, the hidden cost that absorbs 20 to 40% of the IT budget, finds part of its concrete answer here: the automation of a fraction of its corrective dimension.

Continuous monitoring: a stage that agentic AI brings into existence

Monitoring of a business application was, until recently, a passive function. Tools raised alerts, humans received them, and the chain of action began when an operator opened the console. Between the appearance of an incident and its handling, minutes could go by, sometimes hours, occasionally a whole night — depending on perceived severity and team availability.

Claude Code routines turn this function into an active capability. A trigger can be a webhook emitted by a monitoring tool, an application event, an infrastructure alert. The agent does not merely receive: it inspects, it contextualises, it diagnoses, and in some cases it remediates. Alert triage — that ungrateful task of separating noise from signal, grouping related alerts, characterising real severity — becomes an activity that does not sleep.
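As a rough illustration of what alert triage involves, the sketch below groups related alerts, discards groups that are pure noise and ranks the rest by severity. It assumes a hypothetical `Alert` shape and a hypothetical 1-to-4 severity scale; it does not reproduce any monitoring tool's payload format or any Claude Code interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str
    signature: str  # hypothetical field, e.g. a normalised error message
    severity: int   # hypothetical scale: 1 = info, 4 = critical


def triage(alerts, noise_threshold=2):
    """Group related alerts, drop low-severity groups, rank the remainder."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert.service, alert.signature)].append(alert)

    ranked = []
    for (service, signature), members in groups.items():
        worst = max(a.severity for a in members)
        # A group counts as noise only if every alert in it is below the threshold.
        if worst < noise_threshold:
            continue
        ranked.append({
            "service": service,
            "signature": signature,
            "count": len(members),
            "severity": worst,
        })

    # Most severe first, then the most frequent.
    ranked.sort(key=lambda g: (-g["severity"], -g["count"]))
    return ranked
```

What the sketch leaves out is contextualisation and diagnosis, which is where an agent would add value beyond a static rule.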

This is a change of nature, not simply an acceleration. Monitoring stops being a stage one endures and becomes a stage one instruments. It takes its place in the lifecycle, alongside development and maintenance. It can even, if properly framed, become the stage at which the application improves continuously — every anomaly detected feeding a prioritised backlog, every incident pattern enriching knowledge of how the system actually behaves.

Two points of vigilance all the same. The first concerns governance. Automated monitoring inherits every governance question raised in the previous article: decision scope, traceability, responsibility. An agent that fixes a fault at three in the morning without an explicit record of its intervention creates a new form of debt, exactly the one the second article in this series named operational debt. A system that self-repairs without anyone knowing what was repaired is not a system under control; it is an opaque system.

The second point of vigilance concerns the boundary. Monitoring is not decision-making. An agent that detects a feature is producing many errors does not decide whether it should be fixed, redesigned or retired — that decision belongs to functional arbitration, which is to say, to the business. Agentic AI can at most document the signal. It must never stand in for the deliberation that follows.
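The traceability requirement from the first point of vigilance can be made concrete with a minimal sketch: every intervention produces one structured record before it counts as done. The field names and the `record_intervention` helper are assumptions chosen for illustration, not a format defined by Claude Code or by the earlier articles.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class InterventionRecord:
    timestamp: str    # UTC, ISO 8601
    trigger: str      # what woke the agent, e.g. an alert identifier
    action: str       # what the agent actually did
    target: str       # service or resource affected
    reversible: bool  # whether the action can be undone at low cost
    reviewer: str     # human accountable for reviewing the intervention


def record_intervention(trigger, action, target, reversible, reviewer):
    """Serialise one intervention as a JSON line for an append-only log."""
    record = InterventionRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        trigger=trigger,
        action=action,
        target=target,
        reversible=reversible,
        reviewer=reviewer,
    )
    return json.dumps(asdict(record), sort_keys=True)
```

Naming a reviewer in each record is deliberate: it ties the log to responsibility rather than leaving it as a technical trace nobody reads.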

Legacy modernisation: where human judgement remains irreplaceable

A legacy system is not just old code. It is a sedimentation. Each layer carries a decision — an assumed technical compromise, an architectural choice that was defensible at the time, a business rule hard-coded to meet a specific need, a temporary workaround that became permanent. The application works today because the stack of these layers has stabilised. Modernising means undoing that stability without knowing precisely which layers underpin it.

What makes modernisation irreducible is not the technical complexity of refactoring. It is the invisible.

Business rules that are written nowhere. Behaviours no one can justify but which customers or processes rely on in daily use. Implicit dependencies between modules whose original teams have long since left the company. An agent refactoring this system has no access to that undocumented heritage. It reads the code, not the intentions that produced it, still less the usages that have attached themselves to it.

The risk of automated modernisation is not to deliver technically inferior code. The risk is to deliver technically superior code that has silently broken an expected behaviour — and to discover the regression only weeks later, when a customer or an internal process notices that their use case no longer passes. This latent cost far exceeds any gain in rewriting speed.

This does not condemn agentic AI at this stage. It can assist, and on specific tasks it delivers real value: mapping dependencies, detecting duplication, proposing refactoring candidates, generating regression tests against observed behaviour. Assistance is useful. The decision, however, remains human. It is the judgement of an architect who knows the application, a business owner who knows the usages, a CIO who knows the risks, that arbitrates what must be preserved, transformed or removed. The choice between off-the-shelf software and custom development often replays itself at that moment — and it is a strategic arbitration, not a refactoring.
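Regression tests against observed behaviour, sometimes called characterisation tests, can be sketched as follows. The `legacy_discount` function is an invented stand-in for legacy code carrying an undocumented rule, and `refactored_discount` is an invented cleaner rewrite that silently drops it; neither comes from a real system.

```python
def legacy_discount(amount, customer_type):
    # Invented stand-in for legacy code, with an undocumented large-order rule.
    if customer_type == "partner" and amount >= 1000:
        return round(amount * 0.85, 2)
    if customer_type == "partner":
        return round(amount * 0.90, 2)
    return float(amount)


def refactored_discount(amount, customer_type):
    # Invented rewrite: tidier, but the large-order rule has disappeared.
    rate = 0.90 if customer_type == "partner" else 1.0
    return round(amount * rate, 2)


def capture_behaviour(func, cases):
    """Record what the existing code does today, whether or not it is 'right'."""
    return {case: func(*case) for case in cases}


def find_regressions(baseline, candidate):
    """Return the cases where the candidate departs from recorded behaviour."""
    return [case for case, expected in baseline.items() if candidate(*case) != expected]


cases = [(500, "partner"), (1000, "partner"), (1000, "retail")]
baseline = capture_behaviour(legacy_discount, cases)
regressions = find_regressions(baseline, refactored_discount)
```

The limit of the technique is the limit the section describes: the baseline only covers the cases someone thought to sample, and knowing which cases matter is exactly the undocumented knowledge that stays with the people who know the usages.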

A decision grid: value delivered against reversibility of errors

Two axes are enough to organise the four stages. On the horizontal axis, the value that automation actually delivers — meaning not raw speed, but net gain once risks and rework are accounted for. On the vertical axis, the reversibility of errors the agent can make — meaning the capacity to detect and fix an error before it reaches a user or a business process.

A two-axis decision grid: effective value on the horizontal axis, reversibility of errors on the vertical axis. Corrective maintenance and continuous monitoring sit in the Automate quadrant, initial development in the Assist quadrant, legacy modernisation in the Protect quadrant.

The intersection of these two axes produces three postures. Automate, when value is high and reversibility is high — this is the case for corrective maintenance and continuous monitoring. These two stages live in an environment where errors stay visible and recoverable. They gain from automation. Assist, when value is real but reversibility drops — this is the case for initial development. One lets AI produce, but the decision of what to produce remains human, precisely because a framing error at this stage is no longer reversible at low cost. Protect, when reversibility is low and the error can stay invisible for a long time — this is the case for legacy modernisation. One uses AI to equip judgement, never to replace it.

This grid is not an absolute rule. It is a discipline of reading. It reminds us that before asking what AI can do, one must ask what one of its errors would cost if it went unnoticed. That simple question rebalances most arbitrations.
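For readers who prefer to see the grid as logic, here is one possible encoding. The 0-to-1 scores and the two thresholds are hypothetical conventions introduced for the sketch; the article defines the axes and the three postures, not any numeric scale.

```python
from enum import Enum


class Posture(Enum):
    AUTOMATE = "automate"
    ASSIST = "assist"
    PROTECT = "protect"


def posture(value, reversibility, *, high=0.7, low=0.4):
    """Map two scores in [0, 1] to a posture; thresholds are hypothetical."""
    # Low reversibility dominates: an error that stays invisible outweighs any gain.
    if reversibility < low:
        return Posture.PROTECT
    if value >= high and reversibility >= high:
        return Posture.AUTOMATE
    # Real value, but errors are no longer cheap to undo.
    return Posture.ASSIST


# Illustrative scores only, chosen to reproduce the placement described above.
stages = {
    "corrective maintenance": (0.8, 0.9),
    "continuous monitoring": (0.8, 0.8),
    "initial development": (0.6, 0.5),
    "legacy modernisation": (0.5, 0.2),
}
placement = {name: posture(v, r) for name, (v, r) in stages.items()}
```

Checking reversibility first mirrors the question the grid asks: what would an unnoticed error cost, before asking how much automation would gain.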

A programmable resource calls for choices

This series has run to three articles. The first laid out an architectural plane: the advisor pattern encodes a logic of technical subsidiarity that good organisations have always practised, and it prompts each CTO to consider whether their own applications follow this principle or its opposite. The second laid out a governance plane: routines shift agentic AI from the developer's workstation to the enterprise infrastructure, and this shift raises four questions (permissions, decision scope, traceability and responsibility) that are no longer the developer's concern alone but a matter for IT leadership. The third, this one, sets out an operational plane: across the lifecycle of a business application, the value of agentic AI does not spread evenly, and the discipline consists in distinguishing what to automate, what to assist, what to protect.

These three planes are not three competing perspectives. They are three depths of the same question. Agentic AI is neither the tool one pictured in 2023, nor the infrastructure one is starting to describe in 2026. It is in the process of becoming a programmable resource, and a programmable resource calls for choices — of architecture, of governance, of use. Those who make these choices deliberately will build a capability that scales. Those who let the capability deploy itself without a framework will discover, as always, that the absence of a choice is itself a choice — the choice of disorder, acknowledged too late.

The question has never been what AI can do. It is what we want it to do, at each of the three levels at which it now commits us.

Eric Lamy

Published on 15 May 2026