Purple Llama (Meta)
Open trust and safety tools for evaluating generative AI.
What it does
Meta's open-source trust and safety toolkit for evaluating generative AI systems. Includes CyberSecEval benchmarks for measuring LLM security, Llama Guard for content classification, and Code Shield for detecting insecure code generation.
Security relevance
CyberSecEval is one of the few standardised benchmarks for measuring LLM security posture. It tests for prompt injection susceptibility, insecure code generation, and cybersecurity knowledge. Llama Guard provides a practical content safety classifier that can be deployed as a guardrail layer.
When to use it
Use during model evaluation to benchmark security properties before deployment. CyberSecEval gives you comparable metrics across different models. Llama Guard is useful as a building block for content safety pipelines.
OWASP coverage
Risks addressed — mapped to both OWASP Top 10 standards. 4 in LLM, 2 in Agentic.
The raw record
What Yuntona stores. Single source of truth — fork it on GitHub.
name: Purple Llama (Meta) slug: purple-llama-meta type: Mixed category: AI Red Teaming url: https://ai.meta.com/blog/purple-llama-open-trust-safety-generative-ai reviewed: 2026-04 added: 2026-04 updated: 2026-04 risks: llm: [LLM01, LLM02, LLM07, LLM09] asi: [ASI01, ASI04] complexity: Guided Setup pricing: — audience: Red Team lifecycle: [develop] tags: [Eval, Meta, Open Source, Safety]