
Exploiting Partial Compliance: The Redact-and-Recover Jailbreak
We present the Redact-and-Recover (RnR) jailbreak, a novel attack that exploits partial-compliance behavior in frontier LLMs to bypass safety guardrails through a two-phase decomposition strategy.