VulDB

It's all about Vulnerabilities


Artificial Intelligence as a Challenge for Vulnerability Management

This episode explores how artificial intelligence is reshaping the vulnerability management landscape – not just as a defensive tool, but as a new source of risk. Alex, Vanessa, and Daniel break down real-world examples such as the February 2026 Patch Tuesday zero-days, AI-assisted exploit development, LLM-discovered 0-days, misconfigured Copilot-style agents, and noisy "AI slop" exploit traffic. They discuss what actually works today in AI-powered cyber defense versus hype, and share practical guidance for vulnerability management teams on how to adapt processes, metrics, and controls for an AI-driven threat environment.


Chapter 1

From Vulnerability Scanner to AI Arms Race

Alex

Welcome back to "It’s all about Vulnerabilities". I’m Alex, and today we’re digging into why artificial intelligence is not just helping defenders, but also making life a lot harder for anyone running a vulnerability management program.

Vanessa

Hey everyone, Vanessa here. If you thought vuln management was already a bit of a treadmill, AI has just turned the speed up a couple of notches. We’re seeing brand new vulnerability classes, a lot more noise, and a very different kind of attacker activity to deal with.

Daniel

And I’m Daniel. I’ve watched this space since the days when we passed around exploit code on floppy disks. The shift we’re in now, with AI affecting both discovery and exploitation, is easily one of the biggest changes I’ve seen in decades.

Alex

Let’s ground this in something very current. Take Microsoft’s Patch Tuesday in February 2026. More than fifty vulnerabilities fixed in one go, six of them zero-days actively exploited. That alone would keep any vuln management team busy.

Alex

You had security feature bypasses in Windows Shell, MSHTML, and Word, things like CVE-2026-21510 and -21513, where just clicking a malicious link or file could quietly bypass SmartScreen and other protections. Then a couple of privilege escalations, like the Desktop Window Manager and Remote Desktop Services bugs, and a denial-of-service in Remote Access Connection Manager that could disrupt VPNs.

Vanessa

That’s the “classic” part of the story: lots of CVEs, different impact types, your usual prioritisation headache. But buried in that same Patch Tuesday, Microsoft also fixed several remote code execution vulnerabilities in GitHub Copilot and IDE integrations like VS Code, Visual Studio, and JetBrains products.

Vanessa

Those weren’t just boring buffer overflows. They came from a command-injection flaw that could be triggered by prompt injection. So you trick the AI assistant in the IDE into executing malicious code or commands. Suddenly, the thing that’s supposed to help your developers becomes an attack surface.

Daniel

And that’s an important inflection point. Historically, vulnerability management was about operating systems, browsers, big server software, maybe some network gear. Now we’ve got AI-assisted tools, copilots, and agents that live inside those environments. They have access to source code, secrets, and CI pipelines. When they’re vulnerable, the blast radius can be enormous.

Alex

Exactly. We used to talk about, say, Remote Desktop as a high-value component. Now “developer plus Copilot plus cloud API keys” is a much juicier target. Kev Breen at Immersive pointed out that developers are high-value targets because they often sit on API keys, secrets, and infra credentials. Tie that to an AI component that can be steered by prompts, and you’ve got a really nasty threat model.

Vanessa

From the vuln management angle, that means your scope is expanding. It’s not just “is this Windows box patched,” it’s “which AI-powered tools are present on this asset, what data do they see, and are there known AI-specific vulnerabilities or misconfigurations we need to treat like any other CVE?”

Daniel

And just to be clear, AI is also helping defenders. We’ll get into that later. But we can’t gloss over the fact that every new class of AI-enabled feature is also a fresh attack surface that the vulnerability management team needs to understand, track, and prioritise.

Chapter 2

AI Agents, Copilots, and Misconfigurations as Vulnerabilities

Vanessa

Let’s talk about AI agents and copilots, because that’s where a lot of the really interesting – and frankly scary – exposure is showing up right now.

Vanessa

We’re seeing organisations roll out agents built on things like Microsoft Copilot Studio. These agents plug into internal data sources, they can send email, they can call HTTP APIs. In theory they’re just “helpers” automating tasks. In practice, the way they’re configured often looks a lot like a new vulnerability class.

Alex

Microsoft’s own security blog laid out a top ten list of common Copilot Studio agent misconfigurations, and honestly it reads like a greatest-hits compilation of “how to make your environment exploitable without writing a single line of malware.”

Alex

You’ve got agents shared with the entire organisation, or even without authentication at all. Agents that make raw HTTP requests to arbitrary endpoints, including non-HTTPS or internal services. Agents that send emails where the recipient is chosen dynamically by the model. And of course, agents running with the creator’s personal privileges, or containing hard-coded API keys.

Daniel

That “maker authentication” pattern in particular is one I really dislike. The person who built the agent uses their own account to wire up a connector, and then never changes it. Suddenly every ordinary user who interacts with that agent is, under the hood, operating with the maker’s rights. That’s textbook privilege escalation and a violation of least privilege.

Vanessa

Let me paint a concrete scenario that I’ve actually seen in consulting. Someone in support builds a “Helpdesk Copilot” agent. They connect it to the customer database using an MCP tool, give it the ability to send emails, and for convenience they disable authentication because “it’s only internal.” Then they share it to “everyone in the company” so adoption is easy.

Vanessa

Now imagine a malicious insider or a compromised account. They can prompt the agent to pull sensitive customer records and email them out to any address they can get the model to accept. Or, if that agent also has HTTP actions, they can steer it to call internal services it was never meant to touch.

Alex

From a vulnerability management point of view, you have to treat that configuration as if it was a software flaw. The risk is: unauthenticated, widely shared agent with access to sensitive systems and the ability to exfiltrate data. Whether the root cause is code or misconfiguration doesn’t matter much to an attacker.

Daniel

And the tricky bit is: these agents often don’t show up in your traditional asset or vulnerability scans. They live in SaaS control planes. So part of adapting vulnerability management to the AI era is expanding your asset inventory to include agents and copilots, with owners, scopes, and permissions documented.

Vanessa

Exactly. When I talk to VM teams about this, I tell them: treat agents as first-class assets. For each one you want to know: who owns it, what data it can see, what connectors it uses, how authentication works, and who it’s shared with. That’s the starting point for risk assessment and for deciding whether you’ve basically minted yourself an internal zero-day.
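The inventory fields Vanessa lists can be captured in a very small record. Here is a minimal sketch of that idea; the class name, field names, and the draft agent are all illustrative assumptions, not any vendor's schema:

```python
from dataclasses import dataclass, fields

# Hypothetical inventory record for an AI agent treated as a first-class
# asset, following the fields named in the episode. All names are
# illustrative, not a real product schema.
@dataclass
class AgentAsset:
    name: str
    owner: str          # accountable human owner
    data_scopes: tuple  # data sources the agent can see
    connectors: tuple   # e.g. ("email", "http", "mcp")
    auth_mode: str      # "user", "maker", or "none"
    shared_with: str    # e.g. "team", "org-wide"

def missing_fields(record: dict):
    """Which inventory fields are still unknown for this agent?"""
    return sorted(f.name for f in fields(AgentAsset) if not record.get(f.name))

# A half-documented agent: the gaps are exactly what risk assessment needs.
draft = {"name": "Helpdesk Copilot", "connectors": ("email", "mcp")}
print(missing_fields(draft))  # -> ['auth_mode', 'data_scopes', 'owner', 'shared_with']
```

The point of the sketch is the completeness check: an agent whose owner, data scope, or auth mode is unknown cannot be risk-assessed yet, so those gaps themselves become backlog items.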

Chapter 3

AI Slop, Noisy PoCs, and Fake Exploits

Alex

On the flip side of that, AI is also generating a ton of noise. Not every scary-looking payload in your logs is actually a working exploit. Sometimes it’s what folks have started calling “AI slop” – half-baked PoCs and auto-generated attack traffic that borrows ingredients from real bugs but doesn’t quite work.

Daniel

Johannes Ullrich from the SANS Internet Storm Center had a really nice write-up on this around Oracle WebLogic, CVE-2026-21962. He was hunting for real-world exploitation and came across this weird HTTP request targeting the WebLogic ProxyServlet, with headers like “wl-proxy-client-ip” set to 127.0.0.1 followed by a base64 string that decodes to “cmd:whoami.”

Daniel

On paper it looked like an attempt at command injection, maybe trying to get the server to decode and execute that header. But the details were off: the way the IPs were separated, the mixing of encodings… it smelled more like something an LLM had pieced together from existing exploit write-ups than a carefully engineered attack.
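The "cmd:whoami" string Daniel describes is plain base64, which is easy to check during triage. A minimal sketch of that decoding step follows; the sample header value is reconstructed here for illustration, not copied from real traffic:

```python
import base64

# Illustrative triage helper: try to base64-decode the trailing token of a
# suspicious header value (like the "wl-proxy-client-ip" example in the
# episode). The sample payload below is reconstructed for demonstration.
def decode_suffix(header_value: str) -> str:
    token = header_value.split()[-1]  # last whitespace-separated token
    try:
        return base64.b64decode(token).decode("utf-8", errors="replace")
    except Exception:
        return ""  # not valid base64 -> nothing to report

sample = "127.0.0.1 Y21kOndob2FtaQ=="
print(decode_suffix(sample))  # -> cmd:whoami
```

Decoding the payload is only the first step; as the hosts note, the surrounding details (encoding mix, separator choices) are what reveal whether it is an engineered exploit or stitched-together slop.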

Vanessa

Yeah, and when Johannes asked a few AI models whether that was a real exploit or nonsense, they gave completely different answers, which is kind of perfect. The point is, we’re seeing more of this traffic: scanners or scripts clearly written or “improved” by AI, throwing odd combinations of headers and payloads at servers.

Vanessa

From a SOC and VM perspective, your logs fill up with “possible exploit attempts,” but a good chunk of them are these non-functional experiments. They still burn analyst time, and they can still trigger IDS signatures, but they’re not all equal in terms of actual risk.

Alex

Meanwhile, vendors like Arctic Wolf are putting out solid advisories on CVE-2026-21962 as a maximum-severity auth bypass in WebLogic’s proxy plug-in. That one really is serious: unauthenticated remote attackers can create, delete, and modify critical data behind the reverse proxy. So you’ve got this dual reality: genuine high-impact vulns that absolutely need patching and monitoring – and a growing cloud of AI-generated junk that looks dangerous at first glance.

Daniel

And that’s where prioritisation discipline becomes even more important. Vulnerability management teams need to lean on good threat intelligence, reputable advisories, and telemetry from actual exploitation, not just “somebody posted a PoC on GitHub that might have been written by an LLM.”

Vanessa

I’ve seen teams get dragged into endless debates over whether some exotic header combination indicates a new zero-day, when really they hadn’t yet rolled out the vendor patch for the very public CVE with a CVSS of 10.0. AI slop makes it emotionally harder to ignore the weird stuff, but the fundamentals haven’t changed: fix the known criticals, monitor for credible exploit chains, and don’t let noisy PoCs derail your patch plans.

Alex

And maybe one practical tip: maintain a living “noise catalogue.” If your SOC repeatedly sees the same bogus WebLogic payloads or malformed Copilot-related requests, document them, tag them, and feed that context back into your detection and triage logic so the VM and SOC teams don’t keep re-investigating the same AI-generated non-issues.
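A noise catalogue like the one Alex suggests can start as nothing fancier than tagged patterns. This is a sketch under assumptions: the pattern, tag, and log line are all made up for illustration:

```python
import re

# Sketch of a living "noise catalogue": known non-functional, AI-generated
# payload patterns, tagged so triage can auto-annotate repeat sightings
# instead of re-investigating them. Entries here are illustrative.
NOISE_CATALOGUE = [
    {"tag": "weblogic-ai-slop",
     "pattern": re.compile(r"wl-proxy-client-ip:.*Y21kOn", re.I),
     "note": "malformed header/encoding mix; PoC never worked"},
]

def annotate(log_line: str):
    """Return catalogue tags matching this log line, if any."""
    return [e["tag"] for e in NOISE_CATALOGUE if e["pattern"].search(log_line)]

line = "GET /console wl-proxy-client-ip: 127.0.0.1 Y21kOndob2FtaQ=="
print(annotate(line))  # -> ['weblogic-ai-slop']
```

Feeding those tags back into SIEM enrichment means a repeat sighting arrives pre-labelled, so analysts spend their time on payloads the catalogue has never seen.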

Chapter 4

LLM-Discovered 0-Days and the Patch Pipeline

Daniel

So far we’ve mostly talked about AI making more vulnerabilities exploitable, or creating new misconfiguration risks. But there’s another side: AI that’s actually finding zero-days, including in code bases we thought were pretty well hammered by fuzzers and human testing.

Alex

Anthropic’s work with Claude Opus 4.6 is a good example. They put the model into a sandbox with access to open source projects and standard tools – compilers, debuggers, fuzzers – but didn’t give it any special tricks. Out of the box, the model managed to find and help validate more than 500 high-severity vulnerabilities in widely-used libraries.

Alex

And these weren’t all obvious bugs. In one case, in GhostScript, the model dug through the git history, found a past commit that added bounds checks for certain font handling code, and reasoned that there might be other call paths where similar checks were missing. It then generated a proof-of-concept crash input to demonstrate the bug.


Vanessa

Another nice one was in OpenSC, the smart card utility. Claude looked for typical unsafe patterns like chained “strcat” calls into fixed-size buffers, recognised that this could overflow, and homed in on code that traditional fuzzers hadn’t really exercised because of all the preconditions needed to reach it. That’s very human-like reasoning, but done at machine scale.

Vanessa

And in CGIF, a GIF library, the model understood enough about the LZW compression scheme to realise that under certain conditions the “compressed” data could actually be larger than expected, leading to a buffer overflow. That’s not trivial; it requires understanding both the algorithm and the implementation assumptions.

Daniel

What this means for vulnerability management is that even mature software – the kind that underpins printers, PDF renderers, authentication clients – can still hide nasty bugs that AI will find faster and more systematically than we’re used to. And we’re not just talking about one or two issues a year. We’re talking hundreds of high-severity findings in a relatively short campaign.

Alex

There’s also a disclosure dynamic. Anthropic did the right thing: they validated each bug with humans, coordinated with maintainers, and submitted patches. But imagine a less responsible actor doing similar large-scale scanning with an LLM and sitting on the results. Your patch pipeline might suddenly have to cope with bursts of dozens of serious issues in critical components.

Vanessa

And the standard “90-day disclosure window” starts to look a bit optimistic when models can keep churning out new bugs faster than vendors and enterprises can realistically test and deploy fixes. Even if everyone is acting in good faith, the volume alone can overwhelm existing change management processes.

Daniel

So I think vulnerability management teams need to plan for a world where: one, zero-day discovery is less exceptional and more continuous; two, a significant fraction of those bugs are found and reported with the help of AI; and three, attackers can use similar capabilities to find less-publicised issues or to weaponise patched bugs against lagging environments.

Alex

That might mean building more flexible patch pipelines, with better automation for regression testing. It might mean updating how you triage: not just “CVSS score” and “is there an exploit,” but also “is this in a class of bugs AI is very good at finding and chaining,” like memory corruption in widely deployed libraries.

Vanessa

And maybe most importantly, it means accepting that “we fuzzed this for years” doesn’t equal “we’re done.” In the AI era, code that was considered well-tested can come back onto your critical vuln list because someone’s model – friendly or not – has just found a new way to break it.

Chapter 5

Using AI Inside Vulnerability Management – What Works Today

Alex

We’ve painted a fairly daunting picture so far, so let’s flip the lens and talk about how vulnerability management teams can use AI on their side, in ways that actually work today and aren’t just vendor fairy tales.

Daniel

One useful framing comes from some work MicroSolved published – they laid out a taxonomy of practical AI use cases in security. For our purposes, three matter most: detection and triage, automated enrichment, and assisted workflows. All three map very neatly onto vulnerability management.

Vanessa

On detection and triage, think about the classic problem: your scanners dump thousands of findings across endpoints, servers, containers, cloud resources. Many are duplicates, some are false positives, and only a subset are truly important given your environment.

Vanessa

An ML model can cluster similar findings, learn which types of issues you historically ignore or accept because of context, and help you surface the “top hundred” vulns that combine high severity with exploitable exposure and critical assets. That’s not science fiction – there are already products doing this with reasonable success, especially when you feed them good asset and threat data.
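The core of that triage idea can be sketched without any ML at all: combine severity with exploitation, exposure, and asset criticality to rank findings. The weights and field names below are illustrative assumptions, not a standard formula:

```python
# Minimal sketch of risk-based finding triage: combine severity with
# exploitation status, exposure, and asset criticality to surface a
# "top N" list. Weights and field names are illustrative assumptions.
def triage_score(finding: dict) -> float:
    severity = finding.get("cvss", 0.0)                   # 0..10
    exploited = 2.0 if finding.get("exploited_in_wild") else 1.0
    exposure = 1.5 if finding.get("internet_facing") else 1.0
    criticality = finding.get("asset_criticality", 1.0)   # 1..3
    return severity * exploited * exposure * criticality

findings = [
    {"id": "F1", "cvss": 9.8, "exploited_in_wild": True,
     "internet_facing": True, "asset_criticality": 3},
    {"id": "F2", "cvss": 7.8, "exploited_in_wild": False,
     "internet_facing": False, "asset_criticality": 1},
]
top = sorted(findings, key=triage_score, reverse=True)
print([f["id"] for f in top])  # -> ['F1', 'F2']
```

Real products add clustering and learned suppression on top, but the design choice is the same: exposure and asset context multiply raw severity rather than sitting in a separate spreadsheet column.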

Alex

Automated enrichment is the next layer. Instead of a ticket that just says “CVE-2026-21516 on host X,” your AI helper pulls in that this host belongs to the development VDI pool, has Copilot enabled, and accesses production Azure subscriptions. It cross-references threat intel and sees active exploitation in the wild. Suddenly that ticket comes pre-ranked as genuinely urgent, not just another 7.8 on a spreadsheet.

Daniel

Assisted workflows are where some of the large language models shine. They can draft remediation plans, change requests, or communication to stakeholders in human language. “Here’s why this Copilot IDE vulnerability matters for your team, here’s what’s going to happen in the next maintenance window, here’s what we need you to test.” The human still reviews and approves, but the repetitive writing work gets faster.

Vanessa

I worked with one organisation that also used AI to deal with alert noise in their SOC and vulnerability pipeline. They set explicit “alert quality” budgets – essentially saying: we won’t accept more than a certain number of non-actionable alerts per analyst hour. Any new rule or feed that blew that budget had to be tuned or rolled back.

Vanessa

They used machine learning to measure which rules and scanners were producing a lot of dead-end tickets, and then worked systematically to consolidate tools and tune signatures. That freed up a big chunk of time that they then re-invested into reviewing AI-related exposures specifically – things like misconfigured agents and developer tools with Copilot vulnerabilities.
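The "alert quality budget" Vanessa describes boils down to one rate per rule. Here is a minimal sketch of that measurement; the threshold, rule names, and counts are illustrative:

```python
# Sketch of an "alert quality budget": measure non-actionable alerts per
# analyst hour for each rule and flag the rules that blow the budget.
# Rule names, counts, and the budget threshold are illustrative.
def over_budget(rule_stats: dict, budget_per_hour: float, hours: float):
    """Return rule names whose non-actionable alert rate exceeds budget."""
    flagged = []
    for rule, stats in rule_stats.items():
        non_actionable = stats["total"] - stats["actionable"]
        if non_actionable / hours > budget_per_hour:
            flagged.append(rule)
    return sorted(flagged)

stats = {
    "weblogic-proxy-anomaly": {"total": 120, "actionable": 5},
    "copilot-agent-http":     {"total": 20,  "actionable": 15},
}
print(over_budget(stats, budget_per_hour=10, hours=8))
# -> ['weblogic-proxy-anomaly']
```

A rule that exceeds the budget gets tuned or rolled back, which is exactly the feedback loop that freed up analyst time in the example.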

Alex

It’s important, though, to be honest about what’s not ready. Fully autonomous SOCs or vulnerability programs where an AI agent both detects and remediates without human checkpoints are, in my view, not something you want in production yet. Ditto for “predictive AI” that claims to foresee unknown attacks with high confidence. Interesting research, but not something I’d bet my patch strategy on.

Daniel

Agreed. For now, the sweet spot is AI as an accelerator and assistant, not a replacement. Let the models chew through log volumes and vuln data that humans can’t reasonably handle, but keep people in the loop for prioritisation and changes that could break systems or affect safety and compliance.

Chapter 6

Adapting Vulnerability Management for an AI-Driven Future

Daniel

So where does all of this leave a vulnerability management team that wants to be realistic, not paranoid? I’d boil it down to a few concrete adjustments.

Daniel

First, expand your asset universe. AI systems and agents – whether it’s Copilot Studio bots, IDE plugins, LLM APIs used in production, or security copilots inside your tools – need to appear in your asset inventory. Each should have an owner, a description of what data it can see, what actions it can perform, and how it authenticates.

Alex

Second, update your risk model. Today most VM tooling talks in terms of CVSS, exploitability, and maybe whether there’s a Metasploit module. In the AI era, you want to add dimensions like: is this vulnerability in an AI component or something widely exposed to AI agents? Could it be triggered by prompt injection or malicious input crafted by a model? Does it grant access to developer secrets or critical pipelines if exploited?

Vanessa

Third, put some governance around AI-related misconfigurations. For example, you might define a simple policy: no unauthenticated AI agents in production, no agents shared org-wide if they touch sensitive data, no maker credentials in high-risk environments, all HTTP actions must use HTTPS and be limited to approved endpoints. Those become findings in your vuln management backlog just like missing patches.
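Policies like these only become backlog findings if something checks them. A minimal sketch of such a lint, with hypothetical config fields and check names:

```python
# Sketch of turning the example agent policy into automated checks that
# emit findings for the VM backlog. Config fields and check names are
# hypothetical; a real control plane would expose its own schema.
POLICY_CHECKS = [
    ("unauthenticated agent in production",
     lambda a: a["env"] == "prod" and a["auth"] == "none"),
    ("org-wide sharing with sensitive data",
     lambda a: a["shared"] == "org-wide" and a["sensitive_data"]),
    ("maker credentials in high-risk environment",
     lambda a: a["auth"] == "maker" and a["env"] == "prod"),
    ("HTTP action not using HTTPS",
     lambda a: any(not u.startswith("https://") for u in a["endpoints"])),
]

def lint_agent(agent: dict):
    """Return the policy violations this agent configuration triggers."""
    return [name for name, check in POLICY_CHECKS if check(agent)]

agent = {"env": "prod", "auth": "none", "shared": "org-wide",
         "sensitive_data": True, "endpoints": ["http://internal.svc"]}
print(lint_agent(agent))
```

Each violation the lint returns lands in the same queue as a missing patch, which keeps AI misconfigurations inside the normal remediation workflow instead of a side channel.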

Vanessa

And don’t forget email as an exfil path. If an agent can send arbitrary emails with content the model decides at runtime, that’s a data exfiltration vector you need to treat seriously.

Daniel

Fourth, strengthen collaboration. AI-related vulnerabilities sit at the intersection of application security, identity, data protection, and infrastructure. Vulnerability management can’t deal with them in a silo. You need regular touchpoints with whoever owns your AI platforms, your IAM team, and your data governance folks.

Alex

Fifth, build some AI literacy into the VM team itself. People don’t need to become ML researchers, but they should understand the basics: what prompt injection is, what an AI agent can and can’t do, why misconfigurations like broad sharing or maker authentication are dangerous in this context. That makes it much easier to recognise which tickets really matter.

Vanessa

And finally, watch your own use of AI as a team. Using models to summarise advisories or help deduplicate scanner output is great. Pasting proprietary vulnerability data or customer configs into random public chatbots without guardrails is not great. So have some guidelines for “AI in the VM workflow” too.

Daniel

If I zoom out, I’m actually cautiously optimistic. Yes, AI is generating new vulnerabilities and new kinds of noise. But it’s also giving defenders tools that, used wisely, can help us see and fix issues faster than before. The organisations that will fare best are the ones that treat AI as just another powerful, dual-use technology.

Daniel

You don’t panic, you don’t pretend it doesn’t exist, and you don’t hand it the keys to the kingdom without supervision. You expand your asset list, you update your risk models, you tighten your configs, and you use AI where it really adds value.

Alex

Nicely put. From my side, if you’re running a vulnerability management programme today, I’d encourage you to start small but concrete: add AI agents and developer tools to your asset inventory, review one or two high-risk misconfigurations – like unauthenticated agents or maker credentials – and pilot an AI-based triage helper on a subset of your findings.

Vanessa

And if you’re listening and thinking, “This is a lot,” you’re not alone. It is a lot. But the fundamentals still matter most: know your assets, patch the big things promptly, reduce unnecessary exposure, and create feedback loops between your VM team, your SOC, and your developers. AI doesn’t change those basics – it just raises the stakes.

Daniel

We’ll wrap it there. Thanks for joining us on "It’s all about Vulnerabilities". I’m Daniel.

Alex

I’m Alex.

Vanessa

And I’m Vanessa. Take care, patch smart, and we’ll talk to you in the next episode.