Codex Security vs Claude Code Security — Comparing the Two AI Security Agents

ZenChAIne · March 10, 2026

Tags: AI Security, Codex Security, Claude Code Security, OpenAI, Anthropic

Introduction

February and March 2026 brought major developments in AI-powered security. Anthropic launched Claude Code Security on February 20, and OpenAI followed with Codex Security on March 6 — both as research previews. The era of AI agents that automatically discover, verify, and propose fixes for code vulnerabilities has officially arrived.

Both launches are noteworthy because each tool debuted with results on real production code. Claude Code Security found 22 vulnerabilities in Firefox, while Codex Security flagged 792 critical and 10,561 high-severity issues across 1.2 million scanned commits.

This article provides a thorough comparison of these two AI security agents across release timelines, usage, detection capabilities, and pricing.

Overview

Aspect	Codex Security (OpenAI)	Claude Code Security (Anthropic)
Release date	March 6, 2026	February 20, 2026
Status	Research preview	Research preview (limited access)
Origin	Aardvark (internal tool)	Anthropic cybersecurity research (1+ years)
Base model	GPT-5.4 / GPT-5.3-Codex	Claude Opus 4.6
Target plans	Enterprise / Business / Edu	Enterprise / Team
OSS access	Free program available	Priority access for maintainers

Both are in research preview and not yet available to general users. However, OSS project maintainers can apply for priority access through dedicated programs.

Different Approaches

Codex Security — Reproducing Attacks in a Sandbox

Codex Security's defining feature is that it validates discovered vulnerabilities by actually exploiting them in a sandbox environment.

1. Connect a GitHub repository
2. Create an isolated copy of the repo inside a container
3. Automatically generate a threat model (identifying attack surfaces)
4. Detect vulnerabilities and run PoC exploits in the sandbox to confirm impact
5. Rank by severity and present fix code with natural-language explanations

Analysis can take anywhere from hours to several days depending on repository size. The approach trades speed for depth.
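The final ranking step can be pictured with a short Python sketch. The `Finding` record, its field names, and the severity scale are hypothetical illustrations, not Codex Security's actual data model or API. The sketch only shows one way findings confirmed by a sandbox PoC might be sorted ahead of unconfirmed ones.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the product's real levels are an assumption here.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str        # one of the keys in SEVERITY_ORDER
    poc_confirmed: bool  # True if a sandbox exploit reproduced the issue


def rank_findings(findings):
    """Sort PoC-confirmed findings first, then by severity (most severe first)."""
    return sorted(
        findings,
        key=lambda f: (not f.poc_confirmed, SEVERITY_ORDER[f.severity]),
    )


findings = [
    Finding("Verbose error page", "low", True),
    Finding("SQL injection in search", "critical", True),
    Finding("Possible SSRF in webhook", "high", False),
]
ranked = rank_findings(findings)
for f in ranked:
    print(f.severity, f.poc_confirmed, f.title)
```

Putting confirmation ahead of severity is a design choice made for this sketch: a reproduced low-severity issue is certain, while an unconfirmed high-severity one may still be a false positive.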

Claude Code Security — Reasoning Like a Human Security Researcher

Claude Code Security's defining feature is that it reasons about code rather than relying on rule-based detection.

1. Scan the codebase
2. Understand interactions between components
3. Trace data flows
4. Filter false positives through a multi-stage verification process
5. Report findings with severity + confidence ratings
6. Present results in a dashboard for human review and approval

Its strength is detecting complex vulnerabilities arising from cross-component interactions — the kind traditional static analysis tools miss.

Both tools follow a Human-in-the-Loop principle: no code is modified without human approval. The AI discovers and proposes — humans make the final call.
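The Human-in-the-Loop principle can be sketched as a simple approval gate. Everything below is a hypothetical illustration: the `ProposedFix` type, the 0.0 to 1.0 confidence scale, and the threshold are assumptions, not either vendor's API.

```python
from dataclasses import dataclass


@dataclass
class ProposedFix:
    finding: str
    severity: str           # e.g. "high", "medium", "low"
    confidence: float       # hypothetical 0.0-1.0 scale
    approved: bool = False  # set only by a human reviewer


def review_queue(fixes, min_confidence=0.5):
    """Keep findings confident enough to be worth a reviewer's time."""
    return [f for f in fixes if f.confidence >= min_confidence]


def apply_fixes(fixes):
    """Return the fixes that may be applied: only human-approved ones."""
    return [f.finding for f in fixes if f.approved]


fixes = [
    ProposedFix("Auth bypass in session refresh", "high", 0.9),
    ProposedFix("Unvalidated redirect", "medium", 0.3),
]
queue = review_queue(fixes)
applied_before = apply_fixes(queue)  # nothing has been approved yet
queue[0].approved = True             # a reviewer signs off
applied_after = apply_fixes(queue)
```

The point of the gate is that `apply_fixes` never looks at severity or confidence: however certain the agent is, nothing is applied until a person flips `approved`.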

Detection Track Records

Codex Security

- Scanned 1.2 million commits during beta
- Detected 792 critical and 10,561 high-severity issues
- Found 14 CVE-level vulnerabilities in popular OSS projects
- Reduced false positive rate by over 50% during beta
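To put the beta figures in proportion, the arithmetic below derives a rough findings rate. It assumes the critical and high-severity counts both refer to the same 1.2 million scanned commits, which the figures suggest but do not state outright.

```python
commits_scanned = 1_200_000
critical = 792
high = 10_561

# Combined critical + high findings reported during the beta.
total = critical + high

# Findings per 1,000 scanned commits.
per_thousand = total / commits_scanned * 1000

print(f"critical + high findings: {total}")
print(f"findings per 1,000 commits: {per_thousand:.2f}")
```

Under that assumption, roughly 9 to 10 of every 1,000 scanned commits carried a critical or high-severity finding.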

Claude Code Security

- Discovered 500+ vulnerabilities in production OSS codebases (including bugs that had gone undetected for decades)
- Partnered with Mozilla to find 22 vulnerabilities in Firefox in two weeks
  - 14 rated high severity, 7 medium, 1 low
  - Scanned approximately 6,000 C++ files and generated 112 unique vulnerability reports
  - CVE-2026-2796 (CVSS 9.8): a JIT miscompilation bug in the JavaScript engine's WebAssembly component
  - Discovery cost: approximately $4,000 (API credits)
- Mozilla used this analysis as a starting point to find an additional 90 bugs

The Firefox findings are particularly striking. Claude Opus 4.6 began exploring the JavaScript engine and found a use-after-free bug within just 20 minutes.
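The Firefox figures above can be cross-checked with a few lines of arithmetic. The ratio of vulnerabilities to reports assumes the 22 vulnerabilities came out of the 112 generated reports. That reading is an assumption made for illustration.

```python
by_severity = {"high": 14, "medium": 7, "low": 1}
reports_generated = 112

# The severity breakdown should add up to the 22 reported vulnerabilities.
confirmed = sum(by_severity.values())

# Share of the vulnerabilities rated high severity.
share_high = by_severity["high"] / confirmed

# Vulnerabilities per generated report (assumes the 22 are a subset of the 112).
per_report = confirmed / reports_generated

print(f"total vulnerabilities: {confirmed}")
print(f"high-severity share: {share_high:.1%}")
print(f"vulnerabilities per report: {per_report:.1%}")
```

The breakdown is internally consistent (14 + 7 + 1 = 22), and close to two thirds of the vulnerabilities were rated high severity.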

How Engineers Can Use These Tools

Using Codex Security

Codex Security is accessed through Codex Cloud (a web dashboard). It is a separate product from Codex CLI (the coding agent) — no CLI installation is required.

1. Connect a GitHub repository in the Codex Cloud web UI
2. Select repository, branch, and history range on the scan creation screen
3. Scanning starts automatically (initial scan takes hours to days depending on repo size)
4. Review threat models and findings in the web dashboard
5. Create PRs directly from the dashboard when fixes are needed

Eligible plans: ChatGPT Enterprise / Business / Edu (first month free)

Using Claude Code Security

Claude Code Security is accessed through Claude Code (web). It is currently not offered as a standalone command-line tool — scan results are reviewed and approved through a web dashboard interface.

Eligible plans: Claude Enterprise / Team (limited research preview)

Access request: claude.com/contact-sales/security

Usage Comparison

Aspect	Codex Security	Claude Code Security
Interface	Web (Codex Cloud dashboard)	Web (dashboard)
GitHub integration	Auto-scan on repository connection	Codebase scanning
Results	Web dashboard	Dashboard
Auto-fix	Generates fix code + natural-language explanation, can create PRs	Proposes fix patches (human approval required)

Pricing

Both are in research preview, so official pricing has not been announced.

Aspect	Codex Security	Claude Code Security
Initial cost	First month free (Enterprise/Business/Edu)	Undisclosed (Enterprise/Team)
OSS access	Free maintainer program	Priority access for maintainers
Official pricing	TBA	TBA

For reference, Anthropic spent approximately $4,000 in API credits on the Firefox vulnerability research (2 weeks, scanning 6,000 files + exploit verification). This gives a rough sense of costs for full scans of large codebases.
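A back-of-the-envelope calculation shows what that reference point implies per file and per vulnerability. The `estimate_cost` helper is hypothetical and assumes cost scales linearly with file count, which is a crude simplification: the $4,000 figure also covers exploit verification, and real costs depend on file size and how much exploration the model does.

```python
total_cost_usd = 4_000
files_scanned = 6_000
vulnerabilities_found = 22

cost_per_file = total_cost_usd / files_scanned
cost_per_vulnerability = total_cost_usd / vulnerabilities_found


def estimate_cost(file_count, rate=cost_per_file):
    """Naive linear extrapolation from the Firefox figures (assumption)."""
    return file_count * rate


print(f"cost per file: ${cost_per_file:.2f}")
print(f"cost per vulnerability: ${cost_per_vulnerability:.2f}")
print(f"naive estimate for 15,000 files: ${estimate_cost(15_000):,.0f}")
```

Treat the output as an order-of-magnitude guide only, since neither vendor has published pricing.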

Which One Should You Choose?

Since both are still in research preview, most teams cannot choose between them yet. The following framework applies once they reach general availability.

Codex Security is a better fit when:

- You already have a ChatGPT Enterprise / Business subscription
- You want continuous security scanning for GitHub repositories
- You need sandbox-based exploit verification
- You want to create PRs directly from findings to close the fix cycle

Claude Code Security is a better fit when:

- You already have a Claude Enterprise / Team subscription
- You want to find deep vulnerabilities that traditional static analysis misses
- You want a reasoning-based approach that catches complex cross-component issues
- You need analysis closer to security research depth

Neither tool catches every vulnerability. Using them alongside traditional SAST/DAST tools is recommended. AI security agents are best positioned as a complement — catching the complex vulnerabilities that humans and existing tools miss.
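One way to run an AI agent alongside existing scanners is to merge their findings and see which tool reported what. The sketch below is hypothetical: the finding dictionaries, the `(file, line, cwe)` key, and the tool names are assumptions, since neither product's export format is described here.

```python
def merge_findings(*sources):
    """Merge findings from several tools, keyed by (file, line, CWE).

    Each source is a (tool_name, findings) pair. The result maps each
    unique finding to the set of tools that reported it.
    """
    merged = {}
    for tool, findings in sources:
        for f in findings:
            key = (f["file"], f["line"], f["cwe"])
            merged.setdefault(key, set()).add(tool)
    return merged


# Illustrative data only.
sast = [{"file": "app/db.py", "line": 42, "cwe": "CWE-89"}]
ai_agent = [
    {"file": "app/db.py", "line": 42, "cwe": "CWE-89"},
    {"file": "app/auth.py", "line": 10, "cwe": "CWE-287"},
]

merged = merge_findings(("sast", sast), ("ai", ai_agent))

# Findings that only the AI agent reported, i.e. what it adds on top of SAST.
only_ai = [key for key, tools in merged.items() if tools == {"ai"}]
```

Findings reported by both tools are good candidates for fast-tracking, while the AI-only set shows what the agent contributes beyond the existing scanner.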

Summary

Aspect	Codex Security	Claude Code Security
Strengths	Large-scale scanning, sandbox verification, direct PR creation	Reasoning-based deep analysis, track record of 22 Firefox vulnerabilities
Weaknesses	Analysis can take days	Currently limited access
Best for	Large teams driving DevSecOps	Teams seeking research-grade deep analysis

The arrival of AI security agents is one answer to the industry-wide challenge of "not enough security experts." The fact that OpenAI and Anthropic released research previews almost simultaneously speaks to both the importance and competitiveness of this space.

This is not about which one "wins." The realistic future is both coexisting alongside traditional tools as part of a defense-in-depth strategy. Keep an eye on official releases and pricing announcements, and start evaluating which tool fits your team.

Bonus: GitHub Actions Anyone with an API Key Can Use

The Codex Security and Claude Code Security products covered above are both limited to eligible organization plans (Enterprise, Business, Edu, or Team). However, both companies also offer GitHub Actions that anyone with an API key can use.

anthropics/claude-code-security-review

Anthropic's claude-code-security-review is a GitHub Action that automatically reviews PR diffs for security issues using Claude.

yaml

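# .github/workflows/security-review.yml
name: Security Review
on:
  pull_request:

permissions:
  pull-requests: write   # lets the action post review comments on the PR
  contents: read

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2   # fetch one extra level of history so changes can be compared
      # @main tracks the latest code; pin a tag or commit SHA for reproducible runs
      - uses: anthropics/claude-code-security-review@main
        with:
          comment-pr: true   # post findings as PR comments
          claude-api-key: ${{ secrets.CLAUDE_API_KEY }}   # stored as a repository secret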

It detects SQL injection, XSS, hardcoded secrets, authentication/authorization flaws, and more, providing feedback as inline PR comments on the relevant code lines.

openai/codex-action

OpenAI's codex-action runs Codex CLI within GitHub Actions. While not security-specific, it can be used for security reviews with the right prompt.

yaml

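# .github/workflows/codex-review.yml
name: Codex Review
on:
  pull_request:

permissions:
  contents: read
  pull-requests: write   # only needed if a later step posts Codex's reply to the PR

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2   # fetch one extra level of history so Codex can diff the changes
      - uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}   # stored as a repository secret
          prompt: "Review the code changes in this PR from a security perspective"
      # Codex's reply is exposed as a step output; add a follow-up step
      # if it should be published as a PR comment.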

These GitHub Actions are separate products from the Codex Security / Claude Code Security enterprise offerings discussed in this article. They require no enterprise contract — anyone with an API key can integrate them into their CI/CD pipeline.
