AI sandbox that runs on your homelab
5 points by deevus
Pixels provisions Incus containers onto your TrueNAS server using its websocket API. I took inspiration from https://sprites.dev/. I have been doing a bit of vibe coding recently, but I didn't want to pay for a sandbox product. It's supposed to be simple to start up a container and get into a console. The most popular agent CLIs that I could think of are already installed if you provision with devtools = true.
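For a feel of what driving that API involves, here is a hypothetical Python sketch of building method-call frames for the TrueNAS middleware websocket API. The frame shape and the method names (`auth.login`, `virt.instance.create`) are assumptions based on TrueNAS SCALE's JSON-RPC-over-websocket middleware, not code taken from Pixels; check your release's API docs before relying on them.

```python
# Hypothetical sketch: building method-call frames for the TrueNAS
# middleware websocket API. Frame shape and method names ("auth.login",
# "virt.instance.create") are assumptions, not code from Pixels.
import itertools
import json

_ids = itertools.count(1)

def call_frame(method: str, *params) -> str:
    """Serialize one JSON method-call frame to send over the websocket."""
    return json.dumps({
        "id": str(next(_ids)),  # correlates the eventual result message
        "msg": "method",
        "method": method,
        "params": list(params),
    })

# Frames you would send in order after the initial connect handshake
# (placeholder credentials and instance spec):
login = call_frame("auth.login", "admin", "secret")
create = call_frame(
    "virt.instance.create",
    {"name": "sandbox-1", "image": "ubuntu/24.04"},
)
```

Sending these over an actual websocket (and waiting on the matching result messages by `id`) is left out; the point is just the request shape.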
I used Claude Code extensively but this is NOT vibe coded! I review every line of code. I might have missed some corkers when developing this at midnight.
It uses a bit of trickery to support checkpoints of the Incus containers. The main reason for this was so that you can spin up a base container, install everything you want, and then create a new container from that.
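My reading of that flow (not the project's actual code): snapshot a prepared base container, then copy the snapshot into a fresh instance. A sketch as incus CLI invocations, where the container and snapshot names are illustrative; the subcommands are the standard incus client's:

```python
# Illustrative sketch of the base-then-clone checkpoint flow as incus CLI
# invocations. Names ("base", "devtools", "sandbox-1") are made up; run the
# returned commands with subprocess.run on a host with incus installed.
def checkpoint_plan(base: str, snapshot: str, clone: str) -> list[list[str]]:
    """Commands to checkpoint a prepared base container and clone from it."""
    return [
        ["incus", "snapshot", "create", base, snapshot],  # checkpoint the base
        ["incus", "copy", f"{base}/{snapshot}", clone],   # new container from it
        ["incus", "start", clone],
    ]

plan = checkpoint_plan("base", "devtools", "sandbox-1")
```

Returning the argument vectors instead of shelling out keeps the sketch testable without an incus host.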
I did try to make the sandbox secure. I think I can do a better job there. Opus is actually pretty good at escaping a sandbox if you ask it to. Read the details in SECURITY.md.
In future I would like to provision the base container as part of the setup, to help speed up subsequent creates.
I'm not totally sold on the name. It was just the first thing I thought of. I don't know if this is even something others would want to use, but it scratches an itch for me.
A bit of backstory:
Over the last couple of months I have been working on a Terraform provider for TrueNAS SCALE [0], and it occurred to me that all this work could be extracted into a client library [1]. I wanted to play around with vaxis [2], so I created a basic TUI that you can use to monitor your TrueNAS server [3].
I used Claude Code extensively
That is likely still considered vibecoded, my friend... Absolutists among us (including myself) would consider any amount of "assistance" to be vibecoding, as you offload the design and research effort.
Ok. You’re entitled to your opinion. I’m doing the engineering, system design, architecture. Claude Code is just writing the code.
The remark isn't that you did nothing, but that you didn't write the code. So I agree it's fair to call it vibecoded.
The term "vibe coding" was useful when it meant "I prompted this together without even looking at the code" - it was a way to signal that something was a low quality, unreviewed prototype that shouldn't be trusted.
If you apply it to anything that an LLM has touched, it loses that usefulness. You may as well say something is "IDE-coded" because the author worked on it in an IDE (with autocomplete, refactoring features, and so on).
I think it's lost that usefulness already. Now we need a new term for low quality unreviewed AI code. Vibeslop?
it was a way to signal that something was a low quality [...] Now we need a new term for low quality unreviewed AI code
I think people who are not pro-LLM don't make much of a distinction here. Vibecoded is indeed being used more broadly than originally defined. Slop is the worst of it. Whether it was reviewed or not doesn't matter, it's about the result's quality.
You may as well say something is "IDE-coded"
This doesn't seem remotely close to me. The IDE doesn't write all the code for you based on fancy statistics.
I would not call myself pro-LLM (maybe by Lobsters' standards I am, but at work I'm by far the most anti-LLM employee). I've dabbled but don't really use them (definitely a big part of this is ethical concern). But I do make this distinction. I trust handwritten code most, LLM-assisted code less, and entirely LLM-authored code not at all.
Whether or not to trust someone's handwritten code has always been a nuanced choice, sometimes based on stars, sometimes on project reputation, sometimes based on a glance at the code itself, sometimes based on an impression of the author and their capabilities. Most often, it is based on nothing at all ("I want to accomplish this task, so what am I going to do, NOT curl | sh from a random GitHub repository I found?"). An LLM being involved in authoring the project is only one signal among many. Do I trust the author to handle it properly?
Note: I did not suggest a tag change for this submission.
When I do suggest a tag change to "vibecoding", I try to think "would the person who has blocked the 'vibecoding' tag be interested in reading this?"
In this case, I am pretty sure they would not.
I think this somewhat uncritically conflates two behaviors.
IDE autocompletes (specifically those which do not generate code, but do generate syntactic elements, e.g., closing braces) and refactorings affect the syntax of a program. You don't defer your semantic choices in this case.
Code generators (regardless of how that code is generated; code synthesis, boilerplate generators, etc. are relevant here too) affect the semantics of a program. When you use these, you defer responsibility/quality/correctness/etc. to the generator you use. Yes, you can review it, but that is nowhere near the level of care or procedural consideration it takes to write it.
Whether you are writing the code or reviewing it, you will suffer decision fatigue. When you're writing the code, this might make you take a break or choose easier, less optimal implementations; fatigued, you won't write effective or necessarily correct code. But ultimately you are still responsible for actually producing the code, so you will manage your decision fatigue such that you can still write reasonable code. When you're reviewing code, fatigue instead pushes you into avoidant or complacent behavior that leads you to review less carefully. This is a well-understood psychological phenomenon (habituation of acceptance, normalization of deviance, etc.).

The reason I consider "code generation + review" as vibecoding is simple: you defer responsibility for the semantics of your program to the LLM, and you will, necessarily, become complacent in accepting its output. The degree to which the LLM is able to transform your prompt containing your natural-language semantics into code is somewhat irrelevant, because there is always more detail in code than you can encode in a prompt (and if there weren't, writing the prompt would be just as costly as writing the code).
I find it really sad that all of the effort I put into this is distilled down to “oh you vibe coded it”.
I’m fine with putting my hand up and saying I used an agent. But to say that this is low effort or spam? I’m not sure how to take it. Like I’m not offended but I’m trying to work out what the new normal of software engineering is.
I tagged it myself as vibecoding both because it’s a tool useful for vibe coders and I also know what this community is like.
If this is sounding like a rant, it’s not. I just wish we could talk about what I built, and not how everyone feels about how I built it.
FWIW, I'm not suggesting it's low-effort or spam at all, and these flags seem... misplaced. I have reservations about the use of LLMs, but I'm not going to dismiss a project wholesale for it. We have projects at work where we use LLMs for generating certain things where we don't mind deferring the responsibility for semantics. Of course there's a whole mess of ethical reasons why I don't involve myself in those projects, but I don't deny their efficacy. I'm sure that this project meaningfully accomplishes what it sets out to do, otherwise you wouldn't use it.
On the other hand, because I have reservations about the use of LLMs, I don't have much room to have an opinion on the project. It seems infrastructurally sound and interesting, but I have no means or desire to use it. I only commented originally to explain the norms around the term "vibecoding", both how I use it personally and how I've seen it used on Lobsters. I didn't expect that to become the main theme of the thread, nor was that my intent...