What I learned building an opinionated and minimal coding agent

12 points by jefftriplett


mtlynch

pi runs in full YOLO mode and assumes you know what you're doing. It has unrestricted access to your filesystem and can execute any command without permission checks or safety rails... Since we cannot solve this trifecta of capabilities (read data, execute code, network access), pi just gives in.

This was surprising, as better security is one of the biggest reasons I see to roll your own agent.

To me, the lowest-hanging fruit in improving LLM agents is implementing real access-control limits for the agent, rather than the scheme pretty much every agent uses today, which amounts to: "I'll give you root, but please pinky promise me you won't do anything naughty."

I waste a lot of time babysitting Cline to make sure it doesn't read my .env.prod files and upload them to Anthropic or whatever cloud LLM vendor.

I don't see why it would be particularly hard to run an agent as a limited user account in a chrooted filesystem that doesn't have access to anything sensitive. I get why tools like Cursor and Cline don't want to do it: it's simpler for them to push risk onto the consumer than to implement protections that work across platforms. But if you're rolling your own agent for your specific environment, these protections seem pretty doable.
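Even short of a full chroot, an in-process version of this idea is cheap: route every file-access tool through a path allowlist plus a deny list for secrets. A minimal sketch (the roots, patterns, and function name here are hypothetical, not from any actual agent):

```python
from pathlib import Path

# Hypothetical policy: the only tree the agent may touch, plus filenames
# that are off-limits anywhere (the .env.prod problem described above).
ALLOWED_ROOTS = [Path("/home/agent/project")]
DENY_NAMES = {".env", ".env.prod", "id_rsa"}

def is_path_allowed(raw: str) -> bool:
    """Normalize the path (collapsing ../ tricks and symlinks where they
    exist), then require it to sit under an allowed root and contain no
    denied component."""
    p = Path(raw).resolve()
    if any(part in DENY_NAMES for part in p.parts):
        return False
    return any(p.is_relative_to(root) for root in ALLOWED_ROOTS)
```

The agent's read/write/exec tools would call this guard before doing anything, refusing the model's request rather than trusting it to behave.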

Nothing against this article, as I think it's well-written and helpful, but I was just surprised at this choice.

atharva

There's really only four APIs you need to speak to talk to pretty much any LLM provider

While this is true, I've observed very different behaviours across providers. Some pass the reasoning traces back, some don't; others support interleaved thinking, or return tool-call tokens inside the reasoning trace. Unifying all of this in a gateway has been a massive headache. I will probably dig into the code later to see how this is handled!