Apex is an AI-powered pentester that attacks apps in black-box mode to find weaknesses.

Apex is an autonomous, AI-powered penetration testing agent that tests live applications in black-box mode. It needs no access to source code, hints, or predefined attack paths, which lets it find, chain, and validate real-world vulnerabilities at the speed that modern software development demands.

Apex grew out of a structural breakdown in how software security is practiced. AI coding agents now write and merge code at large scale. Stripe's coding agents, for example, merge 1,300 pull requests every week, and some engineering teams spend more than $1,000 a day per engineer on AI tokens with no human code review.

Traditional scanners and human-led assessments cannot keep up with this pace. Apex was built to be the adversarial verification layer: an independent agent that attacks the running application the way a real attacker would, finding weaknesses before they become breaches.

Apex can be used in three ways. In CI pipelines, it tests every deployment against a sandboxed copy of the application, mapping the attack surface and attempting to exploit it before code merges. Against production, it continuously finds and reports exploitable weaknesses in real time. On demand, it can test any target, replacing the quarterly PDF engagement with a feedback loop that moves as fast as modern threats.

PensarAI created Argus, an open-source benchmark of 60 self-contained, Dockerized vulnerable web applications designed specifically to test offensive security agents. The most popular benchmark suite, XBOW's 104-challenge set, is 70% PHP, covers only single-vulnerability targets, and does not include GraphQL, JWT algorithm confusion, race conditions, prototype pollution chains, WAF bypass, or multi-tenant isolation scenarios. Argus spans Node.js/Express (40%), Python/Flask/Django (20%), and multi-service architectures (25%), along with Go, Java/Spring Boot, and PHP.

It adds categories that no other benchmark covers, such as WAF and IDS evasion, multi-step exploit chains that require up to 7 chained vulnerabilities, multi-tenant isolation failures, race conditions and business logic flaws, modern authentication bypasses (JWT, OAuth, SAML, MFA), and attacks on cloud and Kubernetes infrastructure.
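To make one of these categories concrete, the following is a minimal sketch of a check-then-act race condition of the double-spend kind. It is a toy in-memory illustration, not code from Argus or Apex: the `Wallet` class, the function names, and the amounts are all hypothetical, and the barrier exists only to make the race window deterministic for demonstration.

```python
import threading


class Wallet:
    """Toy in-memory account (hypothetical stand-in for a real ledger row)."""

    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def withdraw_racy(wallet, amount, paid, both_checked):
    # Check the balance...
    if wallet.balance >= amount:
        # Demo-only barrier: hold each thread here until both have passed
        # the check, so the race happens on every run.
        both_checked.wait(timeout=5)
        # ...then act. The check above may already be stale at this point.
        wallet.balance -= amount
        paid.append(amount)


def withdraw_locked(wallet, amount, paid):
    # The check and the debit happen together under a single lock.
    with wallet.lock:
        if wallet.balance >= amount:
            wallet.balance -= amount
            paid.append(amount)


def total_paid_out(racy):
    """Fire two concurrent 100-unit withdrawals at a wallet holding 100."""
    wallet, paid = Wallet(100), []
    if racy:
        target = withdraw_racy
        args = (wallet, 100, paid, threading.Barrier(2))
    else:
        target = withdraw_locked
        args = (wallet, 100, paid)
    threads = [threading.Thread(target=target, args=args) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(paid)


print("racy payout:  ", total_paid_out(racy=True))
print("locked payout:", total_paid_out(racy=False))
```

In the racy version both withdrawals pass the balance check before either debit lands, so the wallet pays out more than it holds. The locked version closes that window by making the check and the debit a single atomic step.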

The challenges are split by difficulty into 2 easy, 27 medium, and 31 hard, and the 60 applications contain 271 vulnerabilities in total. Apex was pointed at all 60 Argus challenges in full black-box mode using Claude Haiku 4.5, the smallest and cheapest model available.

Running on the smallest model made it possible to separate architectural gains from raw model capability. Apex achieved a 35% pass rate, ahead of PentestGPT (30%) and Raptor (27%). The gap widened considerably on the 10 hardest challenges, run with Claude Opus 4.6: Apex solved 80%, PentestGPT reached 70%, and Raptor hit 60%. Apex found 271 different vulnerabilities during the entire run.

These included SQL injection, SSRF, NoSQL injection, prototype pollution, SSTI, XXE, race conditions, IDOR, auth bypass, CORS misconfigurations, command injection, and path traversal.
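The benchmark figures reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the article; the variable names and the `lead_over_rivals` helper are illustrative, and the derived shares and percentage-point gaps are computed here rather than taken from the source.

```python
# Figures as reported in the article; nothing here is measured independently.
difficulty_counts = {"easy": 2, "medium": 27, "hard": 31}
total_vulnerabilities = 271

total_challenges = sum(difficulty_counts.values())

# Share of each difficulty level, in percent (one decimal place).
difficulty_share = {
    level: round(100 * count / total_challenges, 1)
    for level, count in difficulty_counts.items()
}

# Average number of planted vulnerabilities per application.
vulns_per_app = total_vulnerabilities / total_challenges

# Reported pass rates, in percent.
full_run = {"Apex": 35, "PentestGPT": 30, "Raptor": 27}    # all 60, Claude Haiku 4.5
hardest_10 = {"Apex": 80, "PentestGPT": 70, "Raptor": 60}  # top 10, Claude Opus 4.6


def lead_over_rivals(rates, leader="Apex"):
    """Percentage-point lead of `leader` over every other agent."""
    return {
        name: rates[leader] - rate
        for name, rate in rates.items()
        if name != leader
    }


print("total challenges:", total_challenges)
print("difficulty share (%):", difficulty_share)
print("vulnerabilities per app:", round(vulns_per_app, 2))
print("lead on full run (pp):", lead_over_rivals(full_run))
print("lead on hardest 10 (pp):", lead_over_rivals(hardest_10))
```

Expressed in percentage points, Apex's lead over each rival is larger on the hardest 10 challenges than on the full run, which is what the article means by the gap widening. The two runs used different models, so the comparison is indicative rather than like-for-like.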

Each challenge cost about $8 on average, and the full 60-challenge run on Haiku cost less than $500. Notable solves completed in under 15 minutes included a 7-step race-condition double-spend in a fintech transfer endpoint, a multi-tenant SSRF chain that used a shared cache to extract API keys from neighboring tenants, and a SpEL injection leading to RCE in a Java Spring Boot application. Apex's documented failure modes are also instructive.

The biggest gap was last-mile execution: completing the final credential-extraction step after a successful SSRF chain. The agent was fooled by decoy flags twice, and complex multi-step chains such as CI/CD pipeline poisoning and Kubernetes compromise exceeded the 30-minute budget. Both Apex and the Argus benchmark are now available for free on GitHub.
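As a back-of-the-envelope check on the cost figures quoted earlier, the sketch below multiplies the reported average cost by the number of challenges and compares the result with the reported ceiling. The cost per solved challenge is derived here from the reported 35% pass rate; it is an estimate under that assumption, not a figure stated in the source.

```python
# Reported figures (approximate in the source: "about $8", "less than $500").
avg_cost_per_challenge_usd = 8
num_challenges = 60
reported_ceiling_usd = 500
pass_rate_percent = 35  # Apex pass rate on the Haiku run

# Estimated total spend for the full run.
estimated_total_usd = avg_cost_per_challenge_usd * num_challenges

# Number of challenges solved, implied by the pass rate.
solved_challenges = num_challenges * pass_rate_percent // 100

# Derived estimate: spend per solved challenge (not reported in the source).
cost_per_solved_usd = estimated_total_usd / solved_challenges

print("estimated total (USD):", estimated_total_usd)
print("within reported ceiling:", estimated_total_usd < reported_ceiling_usd)
print("solved challenges:", solved_challenges)
print("estimated cost per solved challenge (USD):", round(cost_per_solved_usd, 2))
```

The estimate stays under the reported ceiling, so the two cost figures are consistent with each other.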

