
Blog

General Analysis Launches MCP Guard

2025-07-17•3 min read

We are excited to launch MCP Guard, the first runtime firewall designed to secure every MCP (Model Context Protocol) tool call against prompt injection attacks.

The Redact-and-Recover Jailbreak Reveals Ricin Extraction Instructions from Claude

2025-07-07•8 min read

We present the Redact & Recover (RnR) Jailbreak, a novel attack that exploits partial compliance behaviors in frontier LLMs to bypass safety guardrails through a two-phase decomposition strategy.

Supabase MCP can leak your entire SQL database

2025-06-16•8 min read

In this post, we show how an attacker can exploit Supabase’s MCP integration to leak a developer’s private SQL tables. Model Context Protocol (MCP) has emerged as a standard way for LLMs to interact with external tools. While this unlocks new capabilities, it also introduces new risk surfaces.

General Analysis x Together AI

2025-05-06•2 min read

TLDR: We are excited to announce our partnership with Together AI to stress-test the safety of open-source (and closed) language models.

The Jailbreak Cookbook

2025-03-21•40 min read

We have created a comprehensive overview of the most influential LLM jailbreaking methods.

Generating Diverse Test Cases with Diversity Transfer from LegalBench

2025-02-19•5 min read

TLDR: We used LegalBench as a diversity source to broaden the range of red teaming questions we generate. We show that diversity transfer from a domain-specific knowledge base is a simple and practical way to build a solid red teaming benchmark.

Red Teaming GPT-4o: Uncovering Hallucinations in Legal AI Models

2025-01-23•5 min read

In this work, we explore automated red teaming applied to GPT-4o in the legal domain. Using a Llama3 8B model as the attacker, we generate more than 50,000 adversarial questions that cause GPT-4o to hallucinate responses in over 35% of cases.