AI in Cybersecurity: What It Actually Does and Where It Fails

IBM Puts a Number on AI’s Impact: $2.2 Million

IBM’s 2024 Cost of a Data Breach Report found that organizations using AI and automation in their security operations saved an average of $2.2 million per breach compared to those that did not. Measured against the report’s global average breach cost of $4.88 million, that saving is equivalent to roughly 45% of the cost of an average breach.

That number is worth examining carefully. It does not mean AI prevented breaches. It means AI-augmented teams identified and contained breaches faster, reducing dwell time and limiting damage. The savings come from speed — and speed matters more than ever. CrowdStrike’s 2026 Global Threat Report puts average breakout time (the window between initial compromise and lateral movement) at 29 minutes. The fastest observed: 27 seconds.

When attackers move in seconds, the question is not whether to adopt AI in security operations. It is where AI delivers genuine value and where it creates false confidence.

Where AI Delivers Measurable Value

Vulnerability Prioritization That Actually Works

In 2024, 40,009 new CVEs were published: roughly 109 per day, a 38% increase year-over-year. No human team can evaluate each one for relevance to their environment. But the raw numbers are misleading: research from Kenna Security and the EPSS model shows that only 2-5% of CVEs are ever exploited in the wild, and 62% have less than a 1% probability of exploitation within 30 days.

The prioritization problem is not volume — it is signal extraction. AI excels here by combining multiple data sources that humans struggle to correlate manually:

EPSS probability scores — statistical likelihood of exploitation in the next 30 days, updated daily
CISA KEV status — confirmed active exploitation, which overrides any statistical model
Asset context — is the vulnerable system internet-facing, handling sensitive data, or sitting in an isolated dev environment?
Compensating controls — does a WAF rule, network segmentation, or disabled feature mitigate the vulnerability?
Attack path position — is the vulnerable asset reachable from the perimeter, or buried behind three layers of controls?

Machine learning models trained on these signals consistently identify the small fraction of vulnerabilities, the 2-5% noted above, that represent real breach risk. Teams using multi-signal AI scoring report spending 60-70% less time on triage while catching more of the vulnerabilities that attackers actually exploit.
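To make the multi-signal idea concrete, here is a minimal sketch of how the five signals listed above might be combined into one priority score. It is a hand-weighted heuristic standing in for a trained model: the `Finding` fields, the `priority` function, and every weight are illustrative assumptions, not a calibrated scoring scheme or any vendor's method.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve_id: str
    epss: float                     # EPSS probability of exploitation in the next 30 days (0.0-1.0)
    in_kev: bool                    # listed in the CISA KEV catalog
    internet_facing: bool           # asset context
    handles_sensitive_data: bool    # asset context
    mitigated: bool                 # a compensating control covers this vulnerability
    reachable_from_perimeter: bool  # attack path position


def priority(f: Finding) -> float:
    """Return a priority score between 0 and 1. All weights are illustrative."""
    # Start from the statistical signal; confirmed exploitation overrides it.
    score = 1.0 if f.in_kev else f.epss
    # Scale by exposure: isolated or hard-to-reach assets matter less.
    exposure = 0.3
    if f.internet_facing:
        exposure += 0.4
    if f.reachable_from_perimeter:
        exposure += 0.2
    if f.handles_sensitive_data:
        exposure += 0.1
    score *= exposure
    # A compensating control lowers the priority but does not zero it out.
    if f.mitigated:
        score *= 0.25
    return round(score, 3)
```

Sorting findings by `priority` in descending order would put a KEV-listed, internet-facing finding ahead of a high-EPSS finding on an isolated, mitigated asset, which is the ordering the signals above argue for.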

Automated Triage at Machine Speed

False positives kill SOC productivity. Security teams investigating thousands of alerts per day — the majority of which are benign — experience alert fatigue that causes real threats to be missed or deprioritized.

AI addresses this through pattern recognition across historical triage decisions, environment baselines, and exploit validation results. When a scanner flags a vulnerability, AI enrichment determines:

Is the vulnerable function reachable through the application’s inputs?
Is the target behind a WAF that blocks the relevant exploit technique?
Does the vulnerability require a specific OS version, library version, or configuration that this environment does not have?
Has this exact finding been triaged as a false positive in similar environments?

This is not alert suppression. It is contextual enrichment that gives analysts a pre-scored finding with evidence for the assessment. The analyst still makes the final call, but they start from an informed position rather than a raw CVE ID and a CVSS score.
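The four enrichment questions above can be sketched as a function that attaches evidence to a finding instead of silently dropping it. This is a simplified illustration: `EnrichedFinding`, `enrich`, the verdict labels, and the "two pieces of evidence" threshold are all hypothetical, and in practice each boolean input would come from reachability analysis, WAF configuration, asset inventory, and triage history.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedFinding:
    cve_id: str
    verdict: str = "needs_review"
    evidence: list[str] = field(default_factory=list)


def enrich(cve_id: str, *, function_reachable: bool, waf_blocks_exploit: bool,
           preconditions_met: bool, prior_false_positives: int) -> EnrichedFinding:
    """Attach evidence to a scanner finding; the analyst still makes the final call."""
    result = EnrichedFinding(cve_id)
    if not function_reachable:
        result.evidence.append("vulnerable function not reachable from application inputs")
    if waf_blocks_exploit:
        result.evidence.append("WAF rule blocks the relevant exploit technique")
    if not preconditions_met:
        result.evidence.append("required OS/library version or configuration is absent")
    if prior_false_positives > 0:
        result.evidence.append(
            f"triaged as false positive {prior_false_positives} time(s) in similar environments"
        )
    # Illustrative thresholds: two or more mitigating facts suggest a benign finding,
    # none at all suggests the finding is exploitable as reported.
    if len(result.evidence) >= 2:
        result.verdict = "likely_false_positive"
    elif not result.evidence:
        result.verdict = "likely_exploitable"
    return result
```

The finding always survives with its evidence list intact, so the analyst can see why a verdict was suggested and overrule it.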

Cross-Scan Pattern Recognition

Individual scan results are data points. AI turns them into intelligence by recognizing patterns across scanners, time periods, and asset groups:

A single open port is a finding. The same port appearing across 40 assets in a subnet after a recent deployment suggests a systematic configuration error.
One SQL injection finding is a vulnerability. SQL injection findings across five different endpoints in the same application suggest a shared vulnerable data access pattern — fix the pattern, not five individual bugs.
A vulnerability that has been remediated and reintroduced three times in six months is not a patching problem. It is a deployment pipeline problem.

These patterns are invisible when reviewing scan results one at a time. ML models operating across the full dataset surface them automatically.
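Two of the patterns above (the same port across many assets, and a vulnerability that keeps coming back) can be illustrated with plain aggregation before any ML is involved. The sketch below assumes findings arrive as simple tuples; the function names, tuple layouts, and thresholds are hypothetical.

```python
from collections import Counter, defaultdict


def systemic_ports(findings, min_assets=10):
    """findings: iterable of (subnet, asset, port) tuples.

    Return {(subnet, port): asset_count} for ports open on at least
    min_assets distinct assets within one subnet.
    """
    assets = defaultdict(set)
    for subnet, asset, port in findings:
        assets[(subnet, port)].add(asset)
    return {key: len(hosts) for key, hosts in assets.items() if len(hosts) >= min_assets}


def recurring_vulns(history, min_reintroductions=2):
    """history: iterable of (vuln_id, event) in chronological order,
    where event is 'found' or 'fixed'.

    Return {vuln_id: count} for vulnerabilities that reappeared after
    being fixed at least min_reintroductions times.
    """
    fixed, reintroduced = set(), Counter()
    for vuln_id, event in history:
        if event == "fixed":
            fixed.add(vuln_id)
        elif event == "found" and vuln_id in fixed:
            reintroduced[vuln_id] += 1
            fixed.discard(vuln_id)
    return {v: n for v, n in reintroduced.items() if n >= min_reintroductions}
```

A production system would layer statistical or learned models on top of this kind of grouping, but the grouping itself is what turns isolated findings into a pattern.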

Where AI Falls Short — and Honesty Matters

The security industry has a credibility problem with AI claims. Every vendor promises “AI-powered” capabilities. Fewer are honest about limitations.

Novel Attack Discovery

AI models trained on known vulnerability patterns cannot reliably identify truly novel attack techniques. A zero-day that exploits a previously unknown class of vulnerability has no training data. AI will catch variants of known attacks — a new SQL injection technique, a novel deserialization chain using known gadgets. It will not discover a fundamentally new attack class.

This is a training data problem, not an algorithm problem. Until a new attack technique generates sufficient examples for model training, detection depends on human researchers, threat intelligence, and manual analysis.

Business Logic Flaws

AI can identify common vulnerability patterns from OWASP Top 10 categories — injection, broken access control, cryptographic failures. It struggles with application-specific logic flaws.

Example: an e-commerce API that lets users apply discount codes. A business logic flaw might allow applying the same code twice, or applying a percentage discount after a fixed discount to stack savings beyond the intended amount. No CVE exists for this. No scanner signature matches it. Understanding the flaw requires understanding the business rules — something that demands domain knowledge AI does not have.

Business logic testing remains a human skill. AI can assist by mapping application flows and identifying unusual data patterns, but the judgment of “this behavior violates business intent” requires understanding intent.
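The discount-code example can be made concrete with a short sketch. The checkout function below has no injection, no crash, and nothing a scanner signature would match; the flaw is only visible once the business rule is written down. The function names, the catalog layout, and the "one code per order" rule are all hypothetical.

```python
def apply_codes(subtotal: float, codes: list[str], catalog: dict) -> float:
    """Naive checkout: applies every submitted code in order, with no rule checks."""
    total = subtotal
    for code in codes:
        kind, value = catalog[code]
        if kind == "fixed":
            total -= value                 # fixed amount off
        else:
            total *= 1 - value / 100       # percentage off the running total
    return max(total, 0.0)


def violates_intent(codes: list[str]) -> bool:
    """Assumed business rule: at most one discount code per order."""
    return len(codes) > 1
```

With a catalog of `{"SAVE10": ("fixed", 10.0), "SPRING20": ("percent", 20.0)}`, submitting `["SAVE10", "SAVE10"]` or `["SAVE10", "SPRING20"]` runs without error and returns a stacked discount. Only `violates_intent`, which encodes knowledge that lives outside the code, flags the problem.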

The Adversarial Arms Race

Attackers use AI too. Polymorphic malware generated by LLMs evades signature-based detection. AI-crafted phishing emails bypass natural language classifiers trained on older phishing patterns. Adversarial inputs designed to fool ML models are a documented attack technique, catalogued in MITRE ATLAS, a knowledge base of adversary tactics and techniques against AI systems.

Any AI-based detection capability has a shelf life. The model’s effectiveness degrades as attackers adapt to its patterns. Continuous retraining on fresh data is not optional — it is the cost of maintaining detection accuracy.
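One way to notice that shelf life running out is to replay fresh, known-malicious samples (from threat intelligence or red team exercises) through the detector and track how many it still catches. The sketch below is a minimal illustration of that idea; the `DriftMonitor` class, the window size, and the recall threshold are assumptions chosen for the example.

```python
from collections import deque


class DriftMonitor:
    """Track detection recall over a sliding window of known-malicious samples."""

    def __init__(self, window: int = 200, min_recall: float = 0.8):
        self.outcomes = deque(maxlen=window)  # True = the detector caught the sample
        self.min_recall = min_recall

    def record(self, detected: bool) -> None:
        self.outcomes.append(detected)

    def recall(self) -> float:
        # With no data yet, report 1.0 so an empty monitor never raises an alarm.
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_retraining(self) -> bool:
        # Only judge once the window is full, to avoid reacting to a handful of samples.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.recall() < self.min_recall
```

The window keeps the estimate focused on recent attacker behavior, so a detector that performed well last quarter cannot hide a decline this month.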

The Right Model: Human-AI Collaboration

The $2.2 million savings IBM documented did not come from replacing analysts with AI. It came from augmenting analysts so they work faster and focus on the right problems.

A practical human-AI division of labor:

AI handles	Humans handle
Vulnerability prioritization across 40,000+ annual CVEs	Novel zero-day analysis and threat research
False positive reduction through contextual enrichment	Business logic testing and application-specific risk
Pattern recognition across scan results	Strategic security architecture decisions
Automated triage of known vulnerability classes	Incident response judgment calls
Continuous monitoring at scale	Adversary emulation and red team operations

Speed is AI’s advantage. Judgment is the human advantage. The organizations that get the best outcomes combine both instead of pretending AI can do everything or dismissing it as hype.

What to Demand From AI Security Tools

Three criteria separate useful AI capabilities from marketing:

Explainability. If the AI assigns a risk score, you need to see why. What signals contributed? What weight did each carry? Black-box scores erode analyst trust and make it impossible to validate accuracy. If you cannot explain the score to your CISO, the tool is not ready for production.
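As a rough illustration of what an explainable score looks like, the sketch below returns the score together with each signal's contribution, largest first. It assumes a simple weighted sum; the function name, signal names, and weights are hypothetical, and a real model would need its own attribution method.

```python
def explain_score(signals: dict[str, float], weights: dict[str, float]):
    """Return (score, contributions), with contributions sorted largest first."""
    contributions = {name: weights[name] * value for name, value in signals.items()}
    score = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return score, ranked


# Hypothetical weights and one finding's signal values.
weights = {"epss": 0.5, "internet_facing": 0.3, "sensitive_data": 0.2}
score, why = explain_score(
    {"epss": 0.8, "internet_facing": 1.0, "sensitive_data": 0.0}, weights
)
for name, part in why:
    print(f"{name}: {part:.2f}")
```

An analyst reading this output can see that exploitation likelihood and internet exposure drive the score, which is the kind of answer a CISO will ask for.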

Integration, not isolation. AI capabilities that create another dashboard, another alert queue, or another silo add complexity. AI should integrate into your existing continuous threat exposure management (CTEM) workflow: enriching findings in your scanner output, scoring vulnerabilities in your ticketing system, and prioritizing breach and attack simulation (BAS) results in your existing reporting.

Continuous learning. A static ML model trained on last year’s data degrades as the threat landscape evolves. Effective AI security tools retrain on your environment’s data — your triage decisions, your false positive patterns, your asset context. This is the difference between a classifier and a system that gets smarter as your team uses it.
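A minimal sketch of learning from a team's own triage decisions: keep a running false-positive estimate per finding signature and update it every time an analyst closes a ticket. The `TriageMemory` class and the signature strings are hypothetical, and the Beta-style prior is an assumption chosen so that a signature with no history starts at 50%.

```python
from collections import defaultdict


class TriageMemory:
    """Estimate per-signature false-positive rates from analyst decisions."""

    def __init__(self, prior_fp: float = 1.0, prior_tp: float = 1.0):
        # Pseudo-counts act as a prior: (1, 1) means "no opinion yet".
        self.counts = defaultdict(lambda: [prior_fp, prior_tp])

    def record(self, signature: str, was_false_positive: bool) -> None:
        # Index 0 counts false positives, index 1 counts true positives.
        self.counts[signature][0 if was_false_positive else 1] += 1

    def fp_rate(self, signature: str) -> float:
        fp, tp = self.counts[signature]
        return fp / (fp + tp)
```

After eight false-positive decisions and one true positive for the same signature, `fp_rate` moves from 0.5 to 9/11 (about 0.82), while signatures the team has never triaged stay at 0.5.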
