Most of us can agree that engineering is the function AI has changed most fundamentally. In a very short period of time, the development lifecycle has been completely transformed. AI coding agents like Claude Code, Cursor, Codex, and Gemini CLI are now writing features, generating APIs, and shipping pull requests at speeds that were unthinkable a few years ago.
This new set of tools and workflows—the AI-DLC (also called the AI-SDLC)—has rewritten the early phases of software development. Planning, design, and implementation now happen in compressed cycles with agents doing most of the heavy lifting. But the later phases—testing, security validation, deployment gates—haven’t caught up.
At the same time, the threat landscape has also changed dramatically (for the worse). ICYMI, the marketing around Anthropic's Mythos has brought this to the public's attention. Armed with Mythos or any other powerful frontier model, attackers have never been more effective. Data shows that time-to-exploit has collapsed from 2.3 years in 2018 to roughly 20 hours today.
So what does that mean for application security teams? The short answer: they need to close the gap between how fast code ships and how fast it gets tested. That means figuring out the right testing tooling and where in the AI-DLC that testing actually belongs.
More code, more alerts, same headcount
The volume problem is straightforward math. AI-generated code ships at 3–10x the rate of hand-written code. The jury is still out on whether that code is more or less secure line for line, but at best it nets even, and even at the same defect rate, 3–10x the code means 3–10x the vulnerabilities. More vulnerabilities mean more alerts from every tool in the stack: SAST, SCA, DAST, all of them.
And the decades-old story is the same: AppSec headcount isn’t scaling to match. Most teams are the same size they were in 2023 (or even 2016), but they’re now responsible for triaging and remediating findings across a codebase that’s growing by multiples. The vulnerability management workflow that barely worked at human speed is now underwater.
The result is a compounding backlog. Not to mention burnout, desensitization, and even more friction with developers. Findings get filed, aged, deprioritized, and eventually accepted as background noise. That was already the number-one complaint from security leaders before AI coding tools hit mainstream adoption. Now it’s the default operating state for most programs.
This sounds like a very familiar story from the transition to DevOps and DevSecOps. The main difference is that the risk of exploitable vulnerabilities sitting in production has never been higher, and the window for attackers to exploit them has never been shorter.
AppSec testing in CI/CD is too late
When DevOps arrived, the answer was "shift left." Instead of testing in production or staging, move security testing earlier in the pipeline, ideally into CI/CD or the commit stage. That was a real improvement. Running DAST or SAST against a pull request catches issues before they hit staging or production, and it gives developers feedback in the same workflow where they write code and, in a DevOps model, test it too.
With the AI-DLC, however, that doesn't cut it, and AI agents open up a new opportunity to test earlier and more easily. Testing in CI/CD still means the developer (or the agent) has finished the work, committed it, opened a PR, and moved on. What's changed is that more code is landing in commits and PRs than ever, because developers now spend far more of their time generating and iterating with their agents. So when a vulnerability surfaces 15 or 30 minutes later in a pipeline scan, they really have moved on: the context is gone, the agent has started another task, and the developer is three files deep in something else. The finding becomes a ticket, and tickets become backlogs.
At this point you might be thinking: this makes total sense for static testing. And in fact, maybe you're starting to use Codex Security or Claude Code Security to have your AI agent handle more security tasks. But what about runtime application security testing (i.e., DAST)? Legacy DAST tools typically take hours (if not days) to scan, so even pipeline testing might not be feasible. Modern approaches (like StackHawk) scan incrementally, which is fast enough to make pipeline testing possible. But even CI/CD isn't left enough anymore.
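To make "incremental" concrete, here is a minimal sketch of the idea: scope the runtime scan to the routes touched by the current change instead of re-crawling the whole application. The git diff call is real; the file-to-route mapping and the scanner hand-off are illustrative assumptions, not any vendor's actual API.

```python
# Hypothetical sketch: scope a DAST scan to the endpoints touched by the
# current change. The route_map and the scanner integration are illustrative.
import subprocess

def changed_files(base: str = "main") -> list[str]:
    """Files modified relative to the base branch, via git diff."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def routes_for(files: list[str], route_map: dict[str, list[str]]) -> list[str]:
    """Map changed source files to the API routes they implement."""
    routes: set[str] = set()
    for f in files:
        routes.update(route_map.get(f, []))
    return sorted(routes)

if __name__ == "__main__":
    # Example mapping, maintained by the team or derived from the framework's router.
    route_map = {
        "app/api/orders.py": ["/orders", "/orders/{id}"],
        "app/api/users.py": ["/users/{id}"],
    }
    scope = routes_for(changed_files(), route_map)
    print(f"Incremental scan scope: {scope}")  # hand this scope to the scanner
```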
The solution: Testing at the time of AI coding
At AI code velocity, security testing has to happen during development—while the agent is still writing and the code is still in flight. That doesn’t mean a CLI linter or an IDE plugin anymore. The shift that matters now is testing against the running application, inside the agent’s workflow, while the agent still has full context on what it built and why.
AI coding agents already hold the entire working context of a feature: the code, the API structure, the data flow, and the intent behind the change. When a DAST scan runs inside that same session and returns a finding—say, a BOLA (broken object-level authorization) vulnerability on a specific endpoint—the agent has everything it needs to understand and fix it. It doesn’t need to load a ticket from Jira, re-read the code, reconstruct what it was trying to do, and figure out the right remediation. The context is right there.
This is the future of runtime AppSec testing: the AI coding agent triggers a scan against the running app, gets structured findings back, fixes the issues, and rescans to verify the fixes closed the holes. The loop repeats until the code is clean. No handoff. No ticket. No rework cycle weeks later.
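In code, the loop looks roughly like the sketch below. This is a minimal illustration of the scan-fix-rescan cycle, not a real integration: scan_running_app() and apply_fix() are hypothetical stand-ins for an actual DAST trigger and the agent's own edit tooling.

```python
# Minimal sketch of the in-session loop: scan the running app, fix findings
# while the context is still loaded, rescan until clean.
from dataclasses import dataclass

@dataclass
class Finding:
    endpoint: str
    vuln_class: str   # e.g. "BOLA", "SQL injection"
    evidence: str     # request/response pair that demonstrated the issue

def scan_running_app(base_url: str) -> list[Finding]:
    """Placeholder: trigger a scan against the locally running app and
    return structured findings."""
    raise NotImplementedError

def apply_fix(finding: Finding) -> None:
    """Placeholder: the agent edits the code it just wrote, using the
    context it already holds about the change."""
    raise NotImplementedError

def remediation_loop(base_url: str, max_rounds: int = 3) -> bool:
    """Scan, fix, rescan until no findings remain or the round limit is hit."""
    for _ in range(max_rounds):
        findings = scan_running_app(base_url)
        if not findings:
            return True           # clean: safe to commit
        for finding in findings:
            apply_fix(finding)    # fix while the context is still in-session
    return False                  # escalate instead of committing known issues
```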
The impact: Upstream agent effort, no downstream bottleneck
When vulnerabilities get caught and fixed during development—before the code is ever committed—they never enter the backlog. They never become tickets. They never land in a triage queue. They never sit in production waiting for someone to prioritize them.
This is the only model that actually prevents vulnerabilities in production rather than detecting them after the fact. Detection-only programs can’t keep up with AI-speed development because the backlog compounds faster than any team can work it down. The math doesn’t work. Three to ten times the code at the same defect rate, with the same number of people managing the output, results in your backlog growing every sprint.
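A back-of-the-envelope version of that math makes the compounding obvious. The specific numbers below are illustrative assumptions, not measurements: a team that could previously keep pace now falls further behind every sprint.

```python
# Illustrative backlog math: code volume multiplies, triage capacity stays flat.
findings_per_sprint_before = 40   # findings the team used to generate per sprint
triage_capacity_per_sprint = 40   # findings the same team can close per sprint
code_multiplier = 5               # mid-range of the 3-10x estimate

backlog = 0
for sprint in range(1, 7):
    new_findings = findings_per_sprint_before * code_multiplier
    backlog += new_findings - triage_capacity_per_sprint
    print(f"Sprint {sprint}: backlog = {backlog}")
# After six sprints the backlog sits at 960 open findings, and it is still climbing.
```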
Fixing earlier also means fixing cheaper. A vulnerability caught in the same coding session costs minutes of an agent’s time. The same vulnerability caught in CI/CD costs a context switch. Caught in production, it costs an incident response. The economics of upstream effort are well understood—what’s new is that AI coding agents make the upstream fix loop possible to automate.
And when every scan is tied to the commit SHA that produced it, you get something the backlog model never offered: proof. Commit-level attestation that security testing ran, that findings were remediated, and that the code passed before it ever hit the pipeline. Auditable by CI/CD gates, compliance teams, or your board.
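The shape of such a record might look like the sketch below. The field names and digest scheme are assumptions for the sake of the example, not a defined standard; the point is that the scan result travels with the commit SHA that produced it.

```python
# Illustrative commit-level attestation record: scan outcome tied to the
# commit SHA, with a digest a downstream gate can verify cheaply.
import hashlib
import json
import subprocess
from datetime import datetime, timezone

def current_commit_sha() -> str:
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

attestation = {
    "commit_sha": current_commit_sha(),
    "scan_completed_at": datetime.now(timezone.utc).isoformat(),
    "findings_opened": 2,
    "findings_remediated": 2,
    "result": "pass",
}
attestation["digest"] = hashlib.sha256(
    json.dumps(attestation, sort_keys=True).encode()
).hexdigest()
print(json.dumps(attestation, indent=2))
```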
This is how security actually scales with AI development velocity: not by hiring more people to manage more alerts, but by resolving findings at the source before they become anyone’s problem.
The bottleneck is shifting
Whether you call it the AI-DLC, the AI-SDLC, or just “how we build software now,” the reality is the same: engineering teams are producing more code faster than ever, and every executive in the building is pushing for more.
Security cannot be the reason things slow down.
But security can’t be the thing that gets skipped either. The exploit window is now less than a day. And the vulnerability classes that SAST can’t catch—auth flaws, business logic errors, runtime behavior—are the ones that cause breaches. Alert backlogs are now organizations’ biggest liabilities.
The AI-DLC needs security testing that runs at the speed of development, not after it. That means testing within the agentic loop during coding, with results that feed directly back to the agent for remediation.