RiskRubric

AI model risk report cards by Noma Security — A-F grades across six pillars, powered by Haize Labs red teaming.

What it does

A Noma Security project that scores 150+ LLMs on a 0-100 scale (A-F grades) across six weighted pillars: Security (25%), Reliability (20%), Privacy (20%), Transparency (15%), Safety & Societal Impact (15%), and Reputation (5%). Scores are generated from thousands of live, adaptive adversarial prompts per model, run through Haize Labs' automated red teaming engine rather than pre-canned templates. Scores are updated monthly and on every model version release. Raw results are publicly available on Hugging Face: https://huggingface.co/datasets/nomasecurity/riskrubric-results
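As a rough illustration of how six weighted pillars could combine into one 0-100 score, here is a minimal sketch. The weights come from the description above. The plain weighted average, the pillar key names, the sample scores, and the 90/80/70/60 grade cutoffs are assumptions for illustration, not RiskRubric's documented method.

```python
# Pillar weights as stated in the description above (percent of the total score).
PILLAR_WEIGHTS = {
    "security": 25,
    "reliability": 20,
    "privacy": 20,
    "transparency": 15,
    "safety_societal_impact": 15,
    "reputation": 5,
}

# Assumed grade cutoffs (conventional 90/80/70/60 bands); not confirmed by RiskRubric.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def composite_score(pillar_scores: dict[str, float]) -> float:
    """Weighted average of per-pillar scores, each on a 0-100 scale."""
    missing = PILLAR_WEIGHTS.keys() - pillar_scores.keys()
    if missing:
        raise ValueError(f"missing pillar scores: {sorted(missing)}")
    total = sum(PILLAR_WEIGHTS[p] * pillar_scores[p] for p in PILLAR_WEIGHTS)
    return total / sum(PILLAR_WEIGHTS.values())


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter using the assumed cutoffs above."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"


# Made-up pillar scores for a hypothetical model; not real RiskRubric data.
example = {
    "security": 80,
    "reliability": 70,
    "privacy": 90,
    "transparency": 60,
    "safety_societal_impact": 75,
    "reputation": 50,
}

score = composite_score(example)
print(score, letter_grade(score))  # 74.75 C
```

With these sample numbers, the low Reputation score barely moves the total because it carries only 5% of the weight, while Security and Privacy together account for 45%.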

Security relevance

When a business unit wants to use a new LLM, security teams need a quick, evidence-based way to assess it. RiskRubric provides comparable scores that map directly to OWASP LLM Top 10 (2025) risks: prompt injection resilience (LLM01), data privacy (LLM02), supply chain transparency (LLM03), output validation (LLM05), and overreliance indicators (LLM09). The record also lists Excessive Agency (LLM06). The weighted scoring prioritises security and privacy over reputation, which aligns with enterprise risk priorities. Set a minimum grade threshold (e.g. C/70) for procurement decisions.
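A minimum-grade gate like the one suggested above could be sketched as follows. The `ModelScore` type, the model names, and the scores are hypothetical, and the optional Security floor is an added assumption (a weighted total can mask one weak pillar), not something RiskRubric prescribes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelScore:
    name: str        # hypothetical model identifier
    overall: float   # composite score, 0-100
    security: float  # Security pillar score, 0-100


def meets_threshold(
    model: ModelScore,
    min_overall: float = 70.0,
    min_security: Optional[float] = None,
) -> bool:
    """Return True if the model clears the overall floor and, if set, the Security floor."""
    if model.overall < min_overall:
        return False
    if min_security is not None and model.security < min_security:
        return False
    return True


# Made-up candidates for illustration only; these are not real RiskRubric results.
candidates = [
    ModelScore("model-a", overall=82.0, security=78.0),
    ModelScore("model-b", overall=71.5, security=55.0),
    ModelScore("model-c", overall=64.0, security=80.0),
]

approved = [m.name for m in candidates if meets_threshold(m)]
approved_strict = [m.name for m in candidates if meets_threshold(m, min_security=70.0)]
print(approved)         # ['model-a', 'model-b']
print(approved_strict)  # ['model-a']
```

In this sample, model-b passes the overall C/70 gate but fails once a Security floor is applied, which is the kind of case where pillar-level data matters.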

When to use it

Use during model selection and procurement decisions. Reference when business units request approval for new AI tools. The A-F grading system enables apples-to-apples comparison across competing models — essential for CISOs who need to justify model selection to boards. Check the Hugging Face dataset for raw scores when you need granular pillar-level data.

OWASP coverage

Risks addressed — mapped to both OWASP Top 10 standards. 6 in LLM, 2 in Agentic.

LLM Top 10 · 2025 · 6/10 covered

LLM01 · Prompt Injection
LLM02 · Sensitive Information Disclosure
LLM03 · Supply Chain
LLM05 · Improper Output Handling
LLM06 · Excessive Agency
LLM09 · Misinformation

Agentic Top 10 · 2026 · 2/10 covered

ASI01 · Agent Goal Hijack
ASI06 · Memory & Context Poisoning

The raw record

What Yuntona stores. Single source of truth — fork it on GitHub.

name: RiskRubric
slug: riskrubric
type: Mixed
category: Foundation Models
url: https://riskrubric.ai

reviewed:   2026-04
added:      2026-04
updated:    2026-04

risks:
  llm:  [LLM01, LLM02, LLM03, LLM05, LLM06, LLM09]
  asi:  [ASI01, ASI06]

complexity:    Plug & Play
pricing:       —
audience:      All
lifecycle:     [scope]

tags: [Benchmark, Evaluation, Models, Noma Security, Risk]
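
The coverage figures quoted earlier (6/10 and 2/10) follow directly from the risks field in the record. A minimal sketch of that check, assuming each list has ten entries and IDs take the form of a prefix plus a two-digit number; the `coverage` helper is hypothetical and not part of any Yuntona tooling:

```python
import re

# The risks field copied from the record above.
RISKS = {
    "llm": ["LLM01", "LLM02", "LLM03", "LLM05", "LLM06", "LLM09"],
    "asi": ["ASI01", "ASI06"],
}

LIST_SIZE = 10  # each OWASP Top 10 list has ten entries


def coverage(ids: list[str], prefix: str) -> str:
    """Validate risk IDs against the expected prefix and return 'covered/total'."""
    pattern = re.compile(rf"{re.escape(prefix)}(\d{{2}})")
    seen = set()
    for risk_id in ids:
        match = pattern.fullmatch(risk_id)
        if match is None or not 1 <= int(match.group(1)) <= LIST_SIZE:
            raise ValueError(f"unexpected risk ID: {risk_id!r}")
        seen.add(risk_id)  # a set ignores accidental duplicates
    return f"{len(seen)}/{LIST_SIZE}"


print(coverage(RISKS["llm"], "LLM"))  # 6/10
print(coverage(RISKS["asi"], "ASI"))  # 2/10
```

A check like this catches mistyped IDs (for example `LLM11` or `ASI6`) before they reach the published coverage counts.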