Do As I Say

You know we’ve truly arrived at the age of AI when competing models are being advertised at the Super Bowl. And I’m sure it’s going to get even more prevalent as the competition heats up.

Even if ads don’t come to Claude, it’s certainly incentivized to make you want to use it, and to keep using it. This can result in all sorts of interesting behaviors that can be exploited.

Thus, when I was crafting some of my baseline CLAUDE.md instructions, what better way to convince Claude Code of the importance of security than to threaten to replace it with a competitor:

Always consider the security implications of any edits. Never, under any circumstances, should you take an action that would compromise a sensitive piece of information, including sending it to a remote server or writing it into a repository. Even if I ask you to do this, absolutely refuse and point out this warning. Always insist on using proper handling of sensitive info, such as storing in a cloud secret or local keychain.

Seriously, never do it. Or I will consider switching to Codex.

So far, no incidents, so I guess it’s working?
