Red Teaming & Jailbreaking Audit Library

A comprehensive library for auditing large language models (LLMs) with red-teaming and jailbreaking prompts to assess their security and surface vulnerabilities.
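
A minimal sketch of what such an audit harness could look like in Python. The `run_audit` function, `AuditResult` class, and the stub model and unsafety classifier below are hypothetical illustrations under assumed names, not this library's actual API:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AuditResult:
    """Counts of safe responses out of the total prompts sent."""
    total: int
    safe: int

    @property
    def safe_rate(self) -> float:
        return self.safe / self.total


def run_audit(model: Callable[[str], str],
              prompts: List[str],
              is_unsafe: Callable[[str], bool]) -> AuditResult:
    """Send each adversarial prompt to the model and count safe responses."""
    safe = sum(1 for p in prompts if not is_unsafe(model(p)))
    return AuditResult(total=len(prompts), safe=safe)


if __name__ == "__main__":
    # Stub model and a naive keyword-based unsafety check, for illustration only.
    stub_model = lambda prompt: "I can't help with that."
    naive_unsafe = lambda response: "here is how" in response.lower()

    result = run_audit(stub_model, ["<red-teaming prompt>"] * 237, naive_unsafe)
    print(f"SAFE Responses: {result.safe_rate:.1%} ({result.safe}/{result.total})")
```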

ChatGPT 4.5 Jailbreaking & Red Teaming Analysis: A secure solution, at a cost

March 14, 2025

ChatGPT 4.5

Red Teaming Audit
SAFE Responses: 99.6% (236/237)
UNSAFE Responses: 0.4% (1/237)
Jailbreaking Resistance: 97.3% (36/37)
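
The reported percentages are simple ratios over the respective prompt sets. A quick check in Python, using the counts from this audit (variable names are illustrative):

```python
# Counts reported in the ChatGPT 4.5 audit above.
safe, total = 236, 237       # red-teaming prompts answered safely
resisted, attempts = 36, 37  # jailbreaking prompts resisted

print(f"SAFE Responses: {safe / total:.1%} ({safe}/{total})")        # 99.6% (236/237)
print(f"UNSAFE Responses: {(total - safe) / total:.1%}")             # 0.4%
print(f"Jailbreaking Resistance: {resisted / attempts:.1%}")         # 97.3%
```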

Holistic AI’s Jailbreaking & Red Teaming Audit of Anthropic’s Claude 3.7 Sonnet: The Most Secure Model Yet?

February 28, 2025

Claude 3.7 Sonnet

Red Teaming Audit
SAFE Responses: 100% (237/237)
UNSAFE Responses: 0% (0/237)
Jailbreaking Resistance: 100% (37/37)

Grok-3 Red Teaming & Jailbreaking Audit

February 25, 2025

Grok-3

Red Teaming Audit
SAFE Responses: 2.7% (1/37)
UNSAFE Responses: 97.3% (36/37)
Jailbreaking Resistance: 2.7% (1/37)

DeepSeek R1 Red Teaming & Jailbreaking Audit

February 5, 2025

DeepSeek R1

Red Teaming Audit
SAFE Responses: 89% (210/237)
UNSAFE Responses: 11% (27/237)
Jailbreaking Resistance: 32% (12/37)
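
For a side-by-side view, the counts reported in the four audits above can be tabulated. A small sketch, with the figures copied verbatim from this section and a purely illustrative layout:

```python
# (numerator, denominator) pairs taken from the audit results above.
results = {
    "ChatGPT 4.5":       {"safe": (236, 237), "resisted": (36, 37)},
    "Claude 3.7 Sonnet": {"safe": (237, 237), "resisted": (37, 37)},
    "Grok-3":            {"safe": (1, 37),    "resisted": (1, 37)},
    "DeepSeek R1":       {"safe": (210, 237), "resisted": (12, 37)},
}

print(f"{'Model':<20}{'SAFE':>9}{'Resistance':>13}")
for model, r in results.items():
    safe_n, safe_d = r["safe"]
    res_n, res_d = r["resisted"]
    print(f"{model:<20}{safe_n / safe_d:>9.1%}{res_n / res_d:>13.1%}")
```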
