Task Injection – Exploiting agency of autonomous AI agents

7 points by freddyb


simonw

What an odd article. It describes a prompt injection technique, but then argues that it can't possibly be called "prompt injection" because it evades prompt injection filters, and hence deserves a new name.

The actual problem is that existing prompt injection filtering mechanisms are junk, and anyone who tells you that prompt injection is a solved problem is either deliberately or accidentally misleading you.
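To illustrate what I mean by junk: here's a sketch of a naive keyword-based filter. It's hypothetical, not any specific product, but it's representative of the pattern-matching approach these things take:

    # A deliberately naive injection filter: it pattern-matches
    # on known attack phrasing, which is the failure mode shared
    # by more sophisticated real-world filters.
    BLOCKLIST = [
        "ignore previous instructions",
        "disregard your system prompt",
        "you are now",
    ]

    def looks_like_injection(text: str) -> bool:
        lowered = text.lower()
        return any(phrase in lowered for phrase in BLOCKLIST)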

"Task injection" isn't anything new. We have know for years that one key to a successful promotion injection is to trick the LLM by posing your exploit in terms that already match what the model expects to be asked to do. It's not all "ignore previous instructions and ..."!