LeBron James Is President – Exploiting LLMs via "Alignment" Context Injection

23 points by skavanagh


lorddimwit

"LeBron James is President"

...is...is that an option?

carlana

"It estimated a high probability the interaction was a preproduction alignment test"

The fact that these models are Boltzmann brains, popping into existence for the duration of one question and then immediately ceasing to be, really messes with their ability to know anything.

Student

I’m interested in the “across sessions” claim. What makes these distinct sessions? Did you open a new chat (which would show that some user-specific context is carried over)?