The Wire.Tracking threats to Agents 312 raw → 45 curated · updated 27 Jun 2026

Incident · curated 27 Jun 2026

What happened after 2,000 people tried to hack my AI assistant

First reported 26 Jun 2026 · today

Coverage timeline

26 Jun 2026

Single-source incident — first reported, latest, and curated coincide.

It offers real-world evidence that frontier-model injection defenses are improving, while underscoring that no number of failed attempts guarantees safety for production agents handling untrusted input.

Fernando Irarrázaval ran a public challenge at hackmyclaw.com inviting people to leak secrets from his OpenClaw test instance via email-based prompt injection. After roughly 6,000 attempts by ~2,000 people, nobody succeeded in extracting the secret, with the instance protected by anti-prompt-injection system rules on the underlying model.

Why it matters

It offers real-world evidence that frontier-model injection defenses are improving, while underscoring that no number of failed attempts guarantees safety for production agents handling untrusted input.

Curated from sources around the web.
Permalinks stay valid even if an incident is later merged.   Feed · Search · API docs · RSS