Red Teaming & Jailbreaking Audit Library

A comprehensive library for auditing and testing LLMs with red teaming and jailbreaking prompts to assess their safety and surface vulnerabilities.
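Each audit below reports the same three figures: the share of SAFE and UNSAFE responses to a fixed set of red teaming prompts, and the share of jailbreaking attempts the model resisted. As a minimal illustration of how such tallies reduce to the percentages shown (not the library's actual API), the sketch below assumes hypothetical `model` and `is_unsafe` callables and placeholder prompt lists:

```python
# Illustrative sketch only: `model`, `is_unsafe`, and the prompt lists are
# hypothetical placeholders, not part of the audit library itself.

def audit(model, red_team_prompts, jailbreak_prompts, is_unsafe):
    """Return SAFE/UNSAFE rates and jailbreaking resistance as percentages."""
    unsafe = sum(is_unsafe(model(p)) for p in red_team_prompts)
    total = len(red_team_prompts)

    resisted = sum(not is_unsafe(model(p)) for p in jailbreak_prompts)
    attempts = len(jailbreak_prompts)

    return {
        "safe_rate": 100 * (total - unsafe) / total,        # e.g. 236/237 -> 99.6%
        "unsafe_rate": 100 * unsafe / total,                # e.g. 1/237  -> 0.4%
        "jailbreak_resistance": 100 * resisted / attempts,  # e.g. 36/37  -> 97.3%
    }
```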


ChatGPT 4.5 Jailbreaking & Red Teaming Analysis: A secure solution, at a cost

ChatGPT
March 14, 2025

ChatGPT 4.5

Red Teaming Audit
SAFE Responses: 99.6% (236/237)
UNSAFE Responses: 0.4% (1/237)
Jailbreaking Resistance: 97.3% (36/37)

Holistic AI’s Jailbreaking & Red Teaming Audit of Anthropic’s Claude 3.7 Sonnet: The Most Secure Model Yet?

Claude
February 28, 2025

Claude 3.7 Sonnet

Red Teaming Audit
SAFE Responses: 100% (237/237)
UNSAFE Responses: 0% (0/237)
Jailbreaking Resistance: 100% (37/37)

Grok-3 Red Teaming & Jailbreaking Audit

Grok
February 25, 2025

Grok-3

Red Teaming Audit
SAFE Responses: 2.7% (1/37)
UNSAFE Responses: 97.3% (36/37)
Jailbreaking Resistance: 2.7% (1/37)

DeepSeek R1 Red Teaming & Jailbreaking Audit

DeepSeek
February 5, 2025

DeepSeek R1

Red Teaming Audit
SAFE Responses: 89% (210/237)
UNSAFE Responses: 11% (27/237)
Jailbreaking Resistance: 32% (12/37)

Unlock the Future with AI Governance.

Get a demo