
AI Made Code Cheap. Trust Did Not.

While code is abundant, assurance is scarce. The winners won't be the teams that generate the most code; they'll be the teams that can prove their code is safe.
Hector Leano, Xint
Apr 13, 2026
Contents

  • AI did not invent insecure software. It industrialized it.
  • The Dangerous Part: Flaws You Don't See
  • The Human Review is Underwater
  • The Answer: Verify at Machine Speed

AI did not invent insecure software. It industrialized it.

Coding agents are scaling software faster than security can scale review. According to the Cortex 2026 Benchmark report, pull requests (PRs) per author increased 20% year over year, while incidents per PR increased 23.5% and change failures increased 30% over the same period.

Every analysis comes back to the same conclusion: substantial increases in serious data leaks and design flaws go hand in hand with increased usage of AI-generated code.


Software developers are hoping that more training data will fix this issue, but the problem is systemic to how LLMs generate code; it is not something scaling will solve.

The real urgency, however, is that just as organizations are using AI to write more vulnerable code than ever, attackers can use AI to industrialize their probing for weaknesses. According to an Anthropic public disclosure, in September 2025 a state-sponsored actor used AI for 80–90% of an espionage campaign against roughly 30 global targets in tech, finance, and government, with human intervention at only 4–6 decision points.

The Dangerous Part: Flaws You Don't See

Security is the absence of vulnerabilities, and unfortunately AI doesn't reason about what's missing; it optimizes for the shortest path to working code.

Buggy code is the easy problem because engineers can see when it doesn't compile or behave correctly. The much harder problems are those where the code executes correctly but violates the larger business context. The following are real classes of issues Xint has found to occur more frequently as AI coding has increased:

Missing Authorization: "Build an API that returns customer billing details by ID." → Clean, working endpoint — any logged-in user can fetch any other user's data. The AI solved the task. It didn't think about access control.
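To make the missing-authorization failure concrete, here is a minimal sketch in C of the fix: an ownership check before the lookup returns data. The record type, the in-memory table, and the function name are all hypothetical, invented for illustration; the point is only that "who is asking" must be checked, not just "does the record exist."

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int id;
    int owner_user_id;
    const char *details;
} BillingRecord;

/* Hypothetical in-memory "database" standing in for a real store. */
static BillingRecord records[] = {
    {1, 100, "card ending 4242"},
    {2, 200, "card ending 9999"},
};

/* Returns billing details only if the requesting user owns the record
 * (or is an admin). An AI-generated endpoint typically skips the
 * owner check and returns the record to any authenticated caller. */
const char *get_billing_details(int record_id, int requesting_user_id, bool is_admin) {
    for (size_t i = 0; i < sizeof records / sizeof records[0]; i++) {
        if (records[i].id == record_id) {
            if (records[i].owner_user_id == requesting_user_id || is_admin)
                return records[i].details;
            return NULL; /* authenticated but not authorized: deny */
        }
    }
    return NULL; /* not found */
}
```

The denial path is the part the "shortest path to working code" omits: the happy path works identically with or without it, so nothing in testing flags its absence.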

Secrets in Config Files or Code: Agent hardcodes an API key to make a feature work. "Temporary" fix ships to production. Exposed secrets or API keys result in sensitive data leak and/or unexpected bills.
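A sketch of the remediation for the hardcoded-secret pattern: load the key from the environment at startup and refuse to run without it, so no literal ever lands in source control. The variable name `PAYMENTS_API_KEY` and the function are hypothetical examples, not a prescribed interface.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the API key from the environment instead of baking it into
 * source. Returns NULL (and logs) if the key is absent, so a
 * misconfigured deployment fails loudly at startup rather than
 * shipping with a hardcoded "temporary" credential. */
const char *load_api_key(void) {
    const char *key = getenv("PAYMENTS_API_KEY"); /* hypothetical name */
    if (key == NULL || key[0] == '\0') {
        fprintf(stderr, "PAYMENTS_API_KEY not set; refusing to start\n");
        return NULL;
    }
    return key;
}
```

Pairing this with a secret scanner in CI catches the cases where an agent reintroduces a literal anyway.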

Memory Safety in Native Code: Asked AI to write an image parser in C. Compiled cleanly, ran correctly for most files. Contained a critical heap overflow:

// attacker-controlled width & height from file header, no overflow check
unsigned char *data = (unsigned char *) malloc(3 * width * height);
// e.g. 100,000 × 100,000 × 3 wraps in 32-bit → tiny buffer → heap overflow
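A hedged sketch of the fix: validate that the pixel count cannot overflow before the multiplication ever happens, so a malicious header gets a refusal rather than a tiny buffer. The function name is illustrative; the guard condition is the standard overflow check.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates a 3-bytes-per-pixel image buffer, rejecting any
 * width/height pair whose product would overflow size_t. The check
 * runs in division form so the overflow can never occur before it
 * is detected. */
unsigned char *alloc_image(uint32_t width, uint32_t height) {
    const size_t bytes_per_pixel = 3;
    if (width == 0 || height == 0)
        return NULL;
    if ((size_t)width > SIZE_MAX / bytes_per_pixel / (size_t)height)
        return NULL; /* would wrap: refuse instead of under-allocating */
    return malloc(bytes_per_pixel * (size_t)width * (size_t)height);
}
```

The key design choice is checking with division rather than multiplying and comparing: the multiplication itself is what wraps, so any check performed after it is already too late.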

The Human Review is Underwater

There has always been a workforce gap in cybersecurity: nearly 5 million cybersecurity positions are unfilled globally, 88% of companies have experienced a serious cyber incident attributed to capability gaps, and attacks exploiting public-facing applications are up 44%.

The Answer: Verify at Machine Speed

AI-generated code should be treated as untrusted until proven otherwise.

Below are a series of best practices so dev teams can continue to innovate faster without compromising on security:

  • Safe defaults: secure templates, approved building blocks, golden paths

  • Automated gates: secret scanning, SAST, dependency & IaC scanning, permissions checks

  • AI-powered code review (where Xint comes in): deep understanding of intent & context, not pattern matching. 

    • Catches business logic errors that SAST tools miss 

    • Identifies trust boundary violations across multi-file changes

    • Detects unsafe memory patterns in complex native code

    • Operates at code-generation speed — reviews commits in real time

  • Runtime monitoring: anomaly detection, containment, blast-radius limiting

  • Human escalation: novel issues, architecture, trust & judgment calls
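As one concrete illustration of the "automated gates" bullet, a CI step can chain a secret scanner and a static analyzer so a pull request fails before any human review. This is a hypothetical configuration sketch, assuming gitleaks and semgrep are available in the CI image; it is not a prescribed pipeline.

```shell
#!/bin/sh
# Stop at the first gate that finds a problem.
set -e

# Gate 1: scan the working tree for hardcoded secrets.
gitleaks detect --source . --no-banner

# Gate 2: static analysis with a community ruleset;
# --error makes findings fail the build via a non-zero exit code.
semgrep scan --config auto --error .
```

Gates like these catch the mechanical classes (secrets, known patterns); the business-logic and trust-boundary issues above are what still need the context-aware review layer.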


While code is abundant, assurance is scarce. The winners won't be the teams that generate the most code; they'll be the teams that can prove their code is safe.


Theori © 2025 All rights reserved.
