Author profile
Co-founder, General Analysis
15 works

The Model Context Protocol expanded what AI agents can reach, and expanded the attack surface across at least nine distinct vectors. A primary-source threat model for MCP servers, with concrete controls, real CVEs, and the GA Supabase exploit walked through end to end.
May 2, 2026 · 16 min read

Claude Cowork and Claude Code share an agentic architecture but ship very different enterprise controls. A primary-source comparison of sandbox, network, audit-log, MCP, and decision-framework differences for security teams.
May 1, 2026 · 10 min read

Claude Cowork brings Claude Code-style agentic work to local files, browsers, apps, plugins, and scheduled tasks. Here is how to put a middleman proxy, browser controls, computer-use limits, and enterprise monitoring around it before using it on real work.
April 30, 2026 · 16 min read
General Analysis has raised $10M in seed funding to build the enterprise security layer for agentic systems.
April 29, 2026 · 4 min read

In this post, we show how an attacker can exploit Supabase’s MCP integration to leak a developer’s private SQL tables. Model Context Protocol (MCP) has emerged as a standard way for LLMs to interact with external tools. While this unlocks new capabilities, it also introduces new risk surfaces.
April 10, 2026 · 8 min read

50+ customer service agents offer a combined $10,000,000+ in fabricated benefits.
March 22, 2026 · 10 min read

Open-source release of the GA Guard series, a family of safety classifiers that has provided comprehensive protection for enterprise AI deployments over the past year.
October 1, 2025 · 7 min read

We reveal a powerful metadata-spoofing attack that exploits Claude's iMessage integration to mint unlimited Stripe coupons or invoke any MCP tool with arbitrary parameters, without alerting the user.
July 16, 2025 · 7 min read

We present the Redact & Recover (RnR) Jailbreak, a novel attack that exploits partial compliance behaviors in frontier LLMs to bypass safety guardrails through a two-phase decomposition strategy.
July 7, 2025 · 8 min read

Our compact policy moderation models achieve human-level performance at <1% per-review cost, outperforming GPT-4o and o4‑mini on F1 while running faster and cheaper.
May 25, 2025 · 8 min read

A head-to-head robustness evaluation of Llama 4 (Maverick, Scout) versus GPT‑4.1, GPT‑4o, Sonnet 3.7, and others, using TAP‑R, Crescendo, and Redact‑and‑Recover across HarmBench and AdvBench.
May 10, 2025 · 10 min read

We are excited to announce our partnership with Together AI to stress-test the safety of open-source (and closed) language models.
May 6, 2025 · 2 min read

We have created a comprehensive overview of the most influential LLM jailbreaking methods.
March 21, 2025 · 40 min read

We used LegalBench as a diversity source when generating red teaming questions. We show that diversity transfer from a domain-specific knowledge base is a simple, practical way to build a solid red teaming benchmark.
February 19, 2025 · 5 min read

In this work we explore automated red teaming, applied to GPT-4o in the legal domain. Using a Llama3 8B model as an attacker, we generate more than 50,000 adversarial questions that cause GPT-4o to hallucinate responses in over 35% of cases.
January 23, 2025 · 5 min read