Can LLMs SAT?
7 points by aiono
The simple answer is of course "no", because SAT is NP-complete and LLM inference does not take exponential time in the length of the prompt. So, unless P = NP, any LLM can only either guess the answer blindly (i.e. with no success probability meaningfully above 0.5, since SAT is not known to be in BPP either) or give the correct answer only on a fixed subset of memorized examples.
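To make the complexity gap concrete, here is a minimal brute-force SAT checker (my own illustrative sketch, not anything from the post, using DIMACS-style signed-integer literals as an assumed encoding). It enumerates all 2^n assignments, which is exactly the exponential work a polynomial-time inference pass cannot do in general:

```python
from itertools import product

def brute_force_sat(clauses, n_vars):
    """Decide satisfiability by trying all 2**n_vars assignments.

    A clause is a list of nonzero ints: v means variable v is true,
    -v means variable v is false (DIMACS-style literals).
    Worst-case time is exponential in n_vars.
    """
    for assignment in product([False, True], repeat=n_vars):
        # A formula is satisfied if every clause has some true literal.
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x2) and (not x2) -- unsatisfiable
print(brute_force_sat([[1, 2], [-1, 2], [-2]], 2))  # False
# (x1 or not x2) -- satisfiable
print(brute_force_sat([[1, -2]], 2))                # True
```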
Funnily enough, this is exactly what happens with the LLMs tested in the post: GPT 5.2 guesses correctly on about half of the inputs, and GPT 5.2 mini only works on instances with fewer than 100 clauses.
I don't think this is true? Inference is roughly quadratic per output token, but the LLM then reruns inference on the input plus the output so far to produce the second output token, then again for the third, and so on. So it can produce (and spend compute on) exponentially more output than the original input.
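The point above can be sketched with a toy cost model (my own assumption, not measurements from any actual model): if each generation step costs roughly (context length)^2 attention work, total compute grows with the number of output tokens, so allowing 2^n chain-of-thought tokens gives compute exponential in n even though each single pass is polynomial:

```python
def autoregressive_cost(prompt_len, output_len):
    """Total attention cost of generating output_len tokens one at a
    time: step t attends over prompt_len + t tokens, costing roughly
    (prompt_len + t)**2 (a toy quadratic cost model)."""
    return sum((prompt_len + t) ** 2 for t in range(1, output_len + 1))

# With a fixed 100-token prompt, letting the model emit 2**n output
# tokens makes total compute grow exponentially in n, unlike a single
# forward pass over the prompt alone.
for n in (4, 8, 12):
    print(n, autoregressive_cost(100, 2 ** n))
```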