Mixed · AI Red Teaming · reviewed 2026-04

Shannon (Keygraph)

Fully autonomous AI pentester that uses a proof-by-exploitation methodology, with a 96.15% success rate on the hint-free XBOW Benchmark.

Visit keygraph.io/shannon
01

What it does

A fully autonomous AI pentester from Keygraph that discovers and executes real exploits in web applications. It uses a multi-agent architecture powered by Claude to perform reconnaissance, vulnerability analysis, exploitation, and reporting in a single automated run. It achieved a 96.15% success rate on the hint-free XBOW Benchmark. The tool is open source (AGPL-3.0), with a Pro tier for CI/CD integration and compliance reporting.

02

Security relevance

Shannon's proof-by-exploitation methodology is designed to eliminate false positives: every finding comes with a working proof-of-concept exploit. It covers SQL injection, authentication bypass, XSS, SSRF, and privilege escalation. In a single automated run against OWASP Juice Shop, it discovered more than 20 critical vulnerabilities, including complete authentication bypass and database exfiltration. It is white-box only and requires source code access.
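
The proof-by-exploitation idea can be illustrated with a minimal Python sketch. This is not Shannon's code or API: the `Candidate` class and the `confirmed_findings` function are hypothetical names invented here, and each exploit is modeled as a plain callable that returns True only when the attack actually worked. The point is the filtering rule, where a suspected issue is reported only if its exploit succeeds.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """Hypothetical record: a suspected vulnerability plus an exploit attempt."""
    title: str
    category: str
    exploit: Callable[[], bool]  # returns True only if the exploit actually worked


def confirmed_findings(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates whose exploit attempt succeeded."""
    confirmed = []
    for candidate in candidates:
        try:
            if candidate.exploit():
                confirmed.append(candidate)
        except Exception:
            # An exploit that crashes proves nothing, so the candidate is dropped.
            continue
    return confirmed


if __name__ == "__main__":
    demo = [
        Candidate("Login bypass", "authentication bypass", lambda: True),
        Candidate("Reflected input", "XSS", lambda: False),
    ]
    for finding in confirmed_findings(demo):
        print(f"{finding.category}: {finding.title}")
```

In this sketch an unproven candidate is silently discarded; a real workflow might instead log it for manual review.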

03

When to use it

Use it when you need continuous, automated penetration testing for web applications powering AI systems. It is particularly valuable for teams shipping code faster than traditional annual pentests can cover. It requires Docker, an Anthropic API key, and access to the target's source code. This is an expert-level tool, and human oversight is recommended for validating LLM-generated report findings.
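
The three prerequisites listed above can be checked before a run with a small preflight script. The sketch below is an assumption-laden illustration, not part of Shannon: the function name `missing_prerequisites` is hypothetical, and the environment variable name `ANTHROPIC_API_KEY` is assumed here because it is the conventional name used by Anthropic's SDKs; Shannon's own configuration may differ.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(
    source_dir: str,
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # Docker must be installed and discoverable on PATH.
    if which("docker") is None:
        problems.append("docker not found on PATH")
    # Assumed variable name; check the tool's documentation for the real one.
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    # White-box testing needs the target's source code on disk.
    if not os.path.isdir(source_dir):
        problems.append(f"source directory not found: {source_dir}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites("./target-repo"):
        print("missing:", problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.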

04

OWASP coverage

Risks addressed, mapped to both OWASP Top 10 standards: 3 in the LLM list (LLM01, LLM02, LLM06) and 3 in the Agentic list (ASI01, ASI02, ASI05).

05

The raw record

What Yuntona stores. This record is the single source of truth; fork it on GitHub.

name: Shannon (Keygraph)
slug: shannon-keygraph
type: Mixed
category: AI Red Teaming
url: https://keygraph.io/shannon

reviewed:   2026-04
added:      2026-04
updated:    2026-04

risks:
  llm:  [LLM01, LLM02, LLM06]
  asi:  [ASI01, ASI02, ASI05]

complexity:    Expert Required
pricing:       —
audience:      Red Team
lifecycle:     [test]

tags: [Agentic, Autonomous, CLI, Open Source, Pentesting]
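
A record like the one above is straightforward to query once it is loaded. The Python sketch below copies a subset of the fields into a dict literal rather than parsing the file, so no parser is assumed; the helper names `risk_counts` and `has_tag` are hypothetical and are not part of any Yuntona tooling. It shows how the per-standard counts in the OWASP coverage section follow from the `risks` field.

```python
from typing import Any, Dict

# Subset of the raw record, transcribed by hand into a Python dict.
record: Dict[str, Any] = {
    "name": "Shannon (Keygraph)",
    "slug": "shannon-keygraph",
    "type": "Mixed",
    "category": "AI Red Teaming",
    "risks": {
        "llm": ["LLM01", "LLM02", "LLM06"],
        "asi": ["ASI01", "ASI02", "ASI05"],
    },
    "lifecycle": ["test"],
    "tags": ["Agentic", "Autonomous", "CLI", "Open Source", "Pentesting"],
}


def risk_counts(rec: Dict[str, Any]) -> Dict[str, int]:
    """Count the risk IDs listed under each standard."""
    return {standard: len(ids) for standard, ids in rec.get("risks", {}).items()}


def has_tag(rec: Dict[str, Any], tag: str) -> bool:
    """Case-insensitive tag lookup."""
    wanted = tag.lower()
    return any(t.lower() == wanted for t in rec.get("tags", []))


if __name__ == "__main__":
    print(risk_counts(record))
    print(has_tag(record, "open source"))
```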