Artificial adventures

45 points by jamii


typesanitizer

On the topic of pi being less buggy than other harnesses: it's because a smaller team works on it, and the maintainers try to maintain some kind of quality bar, review code, and think about which features should go in, instead of just chucking the whole kitchen sink into the harness.

https://mariozechner.at/posts/2026-03-25-thoughts-on-slowing-the-fuck-down/

Sanity

If a bot writes the code for me I still need to do the work of building mental models and I'm no longer getting it for 'free' from writing code. I'd need a separate practice, something like review++, to keep on top of it. Just reading code doesn't work that well, in the same way that reviewing your highlighted notes is not actually prepping you for an exam.

This is a very good point (cf. https://lobste.rs/s/ac0akx/programming_as_theory_building_1985 ).

It also ties in with something I read (I can't remember where) about how you get easy wins when you first use an LLM in a project where your brain has a good theory of the system. If you then let it loose for a while, you start to get disconnected and turn into one of those non-coding project managers who can't specify things well, and so the frustrations increase.

reivilibre

Typing prompts is annoying in an interface where basic text editing doesn't work (eg clicking to move the cursor)

In pi, press ctrl+G to open your prompt in $EDITOR. In theory you should be able to find an editor, even a TUI one, that supports click-to-move and matches your needs.

Otherwise, good blog post that I think I generally agree with.

duncan_bayne

I will shamelessly steal, er, adopt "a fever dream with unit tests" in my own speech and writing from now on.

pushcx

This is really similar to my own experience. I'd add that I've also had a lot of success using Claude Code to debug Linux desktop issues; after 25 years my dotfiles have layers of cruft that are tedious to debug through. Conveniently, I've used yadm to share dotfiles between machines without secrets, so sandboxing is trivial.

Having the LLM review code changes sounds like a practice worth adopting. On top of the value jamii describes, there's a red queen's race: someone is going to be running an LLM to check on your commit anyways, whether that's an open source repo or against prod. I've gotten 4 valid vulnerability reports in Lobsters in the last 2 weeks from people using LLM-based scanners (all fixed). I can only recall 2 others in the prior 9 years.

swannodette

It's a bit strange to claim that "I haven't seen anything I would call a hallucination from the frontier models" when the post lists a number of things I would consider hallucinations.