The Future of Everything is Lies, I Guess
78 points by orib
Music synthesis is quite good now; Spotify has a whole problem with “AI musicians”.
Spotify has a "solution", not a problem. A solution to their pesky problem of needing to pay human artists for content.
I love Kyle's writing style. Great post.
People keep asking LLMs to explain their own behavior. “Why did you delete that file,” you might ask Claude. Or, “ChatGPT, tell me about your programming.” This is silly. LLMs have no special metacognitive capacity.
This has always bothered me about how I've observed some people interact with these tools. The whole "As a Senior Software Engineer, do X..." doesn't make any sense to me either, and seems like entirely wishful thinking. Maybe I'm wrong, but I can't imagine where this kind of thing would show up anywhere in training data, and how it would map to actual, useful results. Beyond the technical specifics of that issue, it just doesn't make any sense. Why would I want a coding model to be anything less than an experienced, thoughtful, deliberate engineer?
There are two effects that make the roleplaying instructions useful, I think. The first is that the training data likely included documents that mention the skillsets of their authors in proximity to the work those authors produced, so including the "As a Senior Software Engineer" tokens improves the relevance of those documents (and of other documents that resemble them, even ones that never explicitly name their author's skillset) compared to unrelated documents. So there's a narrowing of the training distribution being replicated. The second is that the RLHF process is usually built around instructions, so when including those tokens it helps to phrase them as an instruction; the model has also been trained on lots of roleplaying, so it probably has a representation of the concept somewhere in its weights.
Of course, this is a very post hoc explanation - I haven't done the science to confirm it but it's how I'd expect things to work. But it runs right into the exact problem of why reasoning is bullshit as pointed out in the post - reproducing elements of the training data doesn't tell you anything about the model's internal state at a specific point in time. That's information the model is guaranteed not to have. Anything it outputs is going to look like a plausible response to the questions you asked, and if it happens to resemble some process that actually occurred in the model it's a coincidence (one that's so unlikely I'm not willing to believe it has ever actually happened without proof).
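To make the framing concrete, here's roughly how that persona priming lands in a typical chat-style request. The message schema follows the common OpenAI-style role/content format, and the persona wording is just an illustration, not a tested prompt:

```python
# Roleplay framing expressed as a chat request payload.
# The role/content dict schema mirrors the common OpenAI-style chat
# format; the persona text itself is illustrative only.
messages = [
    {
        "role": "system",
        "content": (
            "You are a senior software engineer who writes careful, "
            "well-tested, idiomatic code."
        ),
    },
    {
        "role": "user",
        "content": "Refactor this function to remove the duplicated error handling.",
    },
]

# The persona tokens narrow the sampling distribution toward documents
# written by (or about) experienced engineers; phrasing the request as
# an instruction matches what RLHF tuned the model to follow.
for message in messages:
    print(message["role"], "->", message["content"][:40])
```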
"As a Senior Software Engineer, do X..."
Those models mimic our writing. If you prime one with biology and ask for a least-squares fit in Python, there's a good chance you get something closely resembling an average biologist's Python code. Same with physics. But if you prime it with "senior software development", it will mimic regular (much cleaner) codebases.
In the end, it just outputs whatever is plausible in that context. And since people who don't make a living from coding itself tend to be terrible at writing legible source code...
This is also why those models are unable to make novel cross-domain connections by themselves, despite already having the necessary knowledge embedded.
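For concreteness, here's the kind of "clean" least-squares fit that a senior-engineer framing tends to elicit: explicit design matrix, library solver, no copy-pasted loops. This is a minimal numpy sketch of my own; the data and names are illustrative, not from any model output:

```python
import numpy as np

# Synthetic noisy linear data: y = 2x + 1 plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, size=x.shape)

# Build the design matrix [x, 1] and solve the least-squares
# problem with numpy's solver rather than hand-rolled loops.
A = np.vstack([x, np.ones_like(x)]).T
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The point of the comment stands either way: the same request primed with "biology lab notebook" tokens would plausibly come back as a tangle of globals and copy-pasted plotting code.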
LLMs have no special metacognitive capacity.
<joke> If you were dealing with a system that had unlimited cognitive capacity but no metacognition, you could get all the answers for metacognition-requiring problems by mutual recursion instead. Just ask Microslop Copilot what it is that Anslopic Claude would say about what it just chose to do.
People do that because it makes a difference. At $WORK, when I was evaluating a code LLM, it consistently produced better results when asked to roleplay as a specific well-known and respected engineer.
Despite being silly from that perspective, it does often work pretty well, presumably because the most likely generated story of why the file was deleted is often closely related in the distribution to the generated tool call that deleted it.
Interestingly, LLMs can be given metacognitive capacity. There's no reason you couldn't give an LLM tools for introspecting its own internals. To be useful, we'd need to write the story of what those internals mean, which is a subject of ongoing research.
Why would I want a coding model to be anything less
I think that's a big thing, there doesn't really seem to be anything specific to coding in the current models.
Not quite true; there are post-trained coding models like Cursor's Composer and OpenAI's Codex models.
It remains unclear whether continuing to throw vast quantities of silicon and ever-bigger corpuses at the current generation of models will lead to human-equivalent capabilities. Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory. Mysteries!
Exactly! Unless AGI actually arrives, it is better to focus on being the best human possible, in character, knowledge, and skills alike. Nobody knows, so focusing on those fundamentals is the best and wisest move.
So was the past, and so is the present?
Humans are gullible, naive, lazy idiots and bullshitters too. The fact that LLMs can't really be trusted is a problem humans are prepared to deal with, as we've been dealing with it among ourselves for thousands of years. The whole scientific method was devised to overcome the fallibility of individual (even if otherwise smart) humans. That doesn't change the fact that we're still useful and can make progress.
The fact that LLMs can't really be trusted is a problem humans are prepared to deal with, as we've been dealing with it among ourselves for thousands of years.
Successfully?
Regardless, I don't think our capacity for bullshit is some kind of binary. Speaking for myself, as a social creature I'm susceptible to bullshit at a high enough density. If all of my friends are telling me I should take sodium bromide, I might just poison myself. It's bold to assume that a fundamental change in the information we consume will only have a positive effect.
Though perhaps since we're on Lobsters your opinion comes from use of models for coding, which is a whole different beast in my opinion. I won't say it trivializes the problems that OP refers to, but certainly it is less tolerant of bullshit (thereby making it easier to find a "right" answer).
Viewing that page from the UK results in "Unavailable Due to the UK Online Safety Act". One terrible development stops me from reading about another terrible development!
Surely this should be tagged vibecoding? It’s about LLMs in particular and the impacts their use have on the world.