Human proof for FOSS contributions

16 points by rdg

Aks

As much as I dislike these tools, recording people contributing is kind of dystopian. I would never bother contributing even if I was interested, and such a thing would make me uncomfortable using the project: Who knows what else you are recording? (I would trust you didn't, but the nagging feeling would be there.)

roryokane
Who knows what else you are recording?

It sounds like the proposal is that a contributor would set up and run asciinema themself, and would attach the file to an email themself. So the contributor would know that they are sending only what asciinema records. And asciinema has existed for more than a decade and is open source, so I doubt that it secretly records more than what it claims to.

That said, I also wouldn’t contribute to any project that required an asciinema recording:
- I edit code using mainly GUI text editors and GUI version control clients.
- I write many of my contributions across multiple sessions. It would be a hassle to remember to start recording whenever I feel like taking another crack at a problem, and to organize multiple recordings for submission.
- I wouldn’t want to have to think about making sure not to switch to directories or open files that contain personal information, or about reviewing the recording later to remove personal details from it. Many other projects don’t have a recording requirement, so they would be less stressful to work on.

addison

So I'm considering it a candidate to provide a proof that a patch was written by a human.

I would hope the consideration is where this stops. There are many ways of writing code by oneself which this would be utterly incompatible with. And that's even assuming that recording like this would be an effective filter, or that it'd be accepted by anyone (e.g., I do not contribute, but I would absolutely not if this was a requirement).

I share your desire to weed this out, but this is not the way.

nicoco

Besides the good points raised in other comments, I would like to add that I don't think this is a good "community building" [1] move. I firmly believe that putting trust in other human beings is a positive signal to send, dare I even say a politically dissident, almost revolutionary, move, in a world where we are constantly taught otherwise. In my experience, human beings usually try to live up to what you put in them: trust them and they will act honestly and responsibly; ask them for proof that they're not liars or cheaters, and they will try to game whatever "security measure" you put in their way.

One more thing to consider is that the energy you put into any security measure such as this one is energy not spent on doing more interesting things. Maybe having an LLM-written contribution by a dishonest contributor slip through the cracks occasionally, and working later to revert it, is less time-consuming than reviewing videos of contributors' coding sessions.

This may sound incredibly naive; I am fine being labelled that way, the world I'm interested in building involves trust between humans.

[1] Disclaimer: I may be out of my depth here, the only "community" I am part of is niche and small and I -unfortunately- still do the largest part of the coding in it…

sloane

i agree with sibling comments that this would constitute an unreasonably high bar as a systematic requirement for contribution… that being said, i don’t necessarily think it seems completely untoward as a potential means for establishing trust.

if the person submitting the contribution is simply given the option to include a recording that they created, as an optional means of establishing credibility more quickly, that seems like a good pattern to support. i think it should be expected that submitters would always create these artifacts themselves: using some sort of instrumented system that records on the behalf of the maintainer lends itself too easily to the sort of snooping that u/Aks alludes to.

i don’t think this is remotely satisfying as an absolute or holistic account of human provenance, but i welcome this idea as part of a potential “diversity of tactics” that could be employed for distinguishing between pure-LLM, LLM-inflected, and pure-human works.

sloane
i also think focusing too much on using asciinema to cover terminal usage is limiting:
- we could imagine editor level extensions, eg zed-record-workspace or something.
- we could further imagine some kind of thin waist for the generic representation of “an editing session”
  - timestamped streams of events from the editor, command line, source control, etc
  - this probably wouldn’t need to perfectly reproduce the change itself, e.g. replaying the session need not reproduce the patch.
  - instead, it would be a way to replay a higher level view of the changes, possibly in some generic viewer OR directly in your own editor (via extension or “replay server protocol” or whatever this would be called).
seeing more of the process gives the maintainer a better chance to build a theory of mind for the submitter.
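
a minimal sketch of what such a “thin waist” might look like, assuming each tool emits its own timestamped stream and a viewer only needs them merged into one timeline. every name here (`SessionEvent`, `merge_streams`, the `source` and `kind` values, the file path) is hypothetical and not taken from any existing editor extension or protocol:

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical event sources; the names are illustrative only.
Source = Literal["editor", "terminal", "vcs"]


@dataclass(frozen=True)
class SessionEvent:
    t: float          # seconds since the session started
    source: Source    # which tool produced the event
    kind: str         # e.g. "open", "edit", "run", "commit"
    detail: str = ""  # free-form summary, deliberately not a full diff


def merge_streams(*streams: list[SessionEvent]) -> list[SessionEvent]:
    """Interleave per-tool event streams into one timeline ordered by time."""
    return sorted((e for s in streams for e in s), key=lambda e: e.t)


# Made-up example data for three tools.
editor = [SessionEvent(3.0, "editor", "open", "src/main.c"),
          SessionEvent(42.5, "editor", "edit", "src/main.c")]
terminal = [SessionEvent(50.1, "terminal", "run", "make")]
vcs = [SessionEvent(61.0, "vcs", "commit", "fix crash")]

timeline = merge_streams(editor, terminal, vcs)
for e in timeline:
    print(f"{e.t:7.1f}s  {e.source:<8}  {e.kind:<6}  {e.detail}")
```

the events carry a summary rather than the change itself, so replaying them shows the shape of the work without having to reproduce the patch.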

this isn’t a perfect signal, and like the author mentions, it may eventually be more readily replicated by LLMs, but i think it has potential as an analog to artists “showing their process” to dispel claims that their work is the product of generative AI.

perhaps it is only a temporary window, but i think something like this could give us a better window into the problem of code provenance and LLM detection.

wanted to get these thoughts out before i slept… i’m very tired so please be nice if i said anything particularly silly…
- bitshift
  
  seeing more of the process gives the maintainer a better chance to build a theory of mind for the submitter [...] it may eventually be more readily replicated by LLMs
  
  Publishing terminal/editor recordings would also provide richer training data for LLMs. This could make LLMs better at mimicking humans, which is the more obvious concern. More subtle is that by learning to mimic the human process of programming, LLMs might become stronger at programming in general.
  
  I don't know the Dillo project's goals with respect to LLMs, but I can imagine goals to which that would be counterproductive in the long term.
  - sloane
    
    ah yes, turning this from an illegible “i will organically record my workflow” task into a very legible series of text records would definitely be a way to quickly undermine this signal… the illegibility probably helps explain why the idea is (or seems) promising: because when done as described in the OP, it doesn’t line neatly up with any activity that is well represented in the corpus
- creesch
  
  Have a good night! The only "silly" thing I'd argue is that these are purely technical solutions. Since we are talking about verifying humans I personally wouldn't want to contribute if someone I don't know is about to judge my messy process
creesch

I fully understand the desire to know if an LLM is involved and maybe more importantly to what degree. I'd expect a potential contributor to understand the code they submit, have it verified themselves and be able to have a conversation about it.

But, as others already said, this certainly isn't the way to do that verification. For starters, I am not a cli native. I am comfortable using a terminal but for programming a full IDE with GUI is where I do my work. I also expect that my workflow would be tedious to rewatch as I like to get a feeling for code by adjusting it, running it, throwing in log/print lines, redo things, etc and repeat until I have a fairly solid understanding of the code and a satisfying result. You might argue that this would show that I am human, but I also wouldn't feel comfortable showing that process to someone I don't know who will be judging it.
- bitshift
  
  You might argue that this would show that I am human, but I also wouldn't feel comfortable showing that process to someone I don't know who will be judging it.
  
  I think there is something very human about not wanting to be watched and judged.
  - creesch
    
    Yup, unfortunately it also means no contribution from the human. Which sort of defeats the purpose ;)
  - toastal
    
    I don’t know how folks tolerate it at their job to have an always-on keylogger & screenshare required. Even if you get dedicated hardware, it’s just so… yes, dystopian. I don’t even like how often cameras-on is required for meetings (even weirder is these web apps that don’t work if you don’t share a camera & I need to fire up OBS with a black screen). …& this is my feeling despite seeing how sadly important it can be for like students to make sure they are paying attention as it’s so easy to get distracted on a laptop.
- Gaelan
  
  I'd expect a potential contributor to understand the code they submit, have it verified themselves and be able to have a conversation about it.
  
  I think this is the key trick - ask some questions about "why did you do it this way", see if the responses pass the smell test. You're probably asking those questions anyway!
  
  Admittedly the LLM contributor is probably just pasting those questions into an LLM, but I'd hope it's able to distinguish that from a human genuinely considering the question and recalling their implementation decisions.
- rdg
  
  I am comfortable using a terminal but for programming a full IDE with GUI is where I do my work.
  
  The reason I mentioned asciinema is because we have very strong restrictions on what we require from users in order to hack Dillo. We make sure that you can both run and build Dillo on almost any machine. You only need about 150 megabytes of memory to build it, and about 10 minutes on my oldest single core CPU.
  
  If we ask for a video recording, all those under a metered network connection will be left out. Similarly, if you don't have access to an electricity grid, having a video compressor running in the background would be a waste of precious power.
  
  If you prefer a video, that's completely fine, but you would need to find storage for it because it won't fit as an email attachment.
  
  I also expect that my workflow would be tedious to rewatch as I like to get a feeling for code by adjusting it, running it, throwing in log/print lines, redo things
  
  Yes, this is what we all do, and it is perfectly fine to make mistakes; nobody is going to judge your patch based on how many mistakes you made. In fact I would argue that seeing which parts of the code you struggle with is good information for making them easier for future contributions.
  
  You might argue that this would show that I am human, but I also wouldn't feel comfortable showing that process to someone I don't know who will be judging it.
  
  This is a very good point. In fact, when I was writing the article and recording it myself, I also experienced that slightly uncomfortable feeling of being watched. In my case, I thought it was mostly due to publishing the recording publicly, and that sending it only to a reviewer would reduce that uncomfortable feeling.
  
  I think it is not acceptable to make people uncomfortable, and perhaps it is a good idea to ditch the proposal completely. The main objective of the post was to gather feedback; nobody wants to feel they live in 1984.
iandavis

I think this is missing the point that programming is not about typing but about problem solving which involves mostly reading, pondering and experimenting.
harrigan

This mirrors some approaches to assessing coding assignments: requiring students to push to timestamped repos throughout the semester, or having graders inspect individual git commits. The idea is the same: process artifacts are harder to fake than final artifacts. But all we're doing is buying time. The asymmetry might hold today but it doesn't hold by construction.
kokada
I am not sure I like this idea, but this is clearly just a blog post, not a formal requirement. What I imagine this idea would evolve into is:
- This would probably be a one-time-only request for a new contributor trying to push their first PR. If you're already a trusted contributor you probably wouldn't need to do this again.
- "I edit code using mainly GUI text editors and GUI version control clients" - Yes, I think in your case you could provide video instead of asciinema.
- "My workflow is complex and wouldn't work bla bla bla" - I don't think they need to see everything you're doing. Agentic coding pretty much always involves replacing huge amounts of text using something like patch or sed, which is a completely unnatural way for a human to develop (yes, humans can use sed to replace multiple things inside different files, but I am almost sure no one is writing a patch file just to edit files; you're probably doing it via your text editor).
- "I could make the AI do the edits too" - Yes, you can probably make an agent do this with enough effort, but at that point most people would be discouraged from doing it. I think it is the same thing as a lock inside your home: it is not going to stop a very determined person from bypassing the lock, but it makes things hard enough that most people will be discouraged from stealing something from your home.
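
A rough sketch of how a reviewer might spot-check the patch/sed point above without watching the whole recording, assuming the recording is an asciicast v2 file (one JSON header line followed by `[time, type, data]` event lines) and that input events were captured at recording time, which is not the default. The watchlist and the function names are made up for illustration, and the backspace handling is deliberately naive (no arrow keys, tab completion or shell history):

```python
import json

# Assumption: an illustrative watchlist, not a vetted detection rule.
BULK_EDIT_COMMANDS = {"patch", "sed"}


def typed_commands(cast_text: str) -> list[str]:
    """Rebuild typed command lines from the input ("i") events of an asciicast v2 file."""
    lines = cast_text.strip().splitlines()
    header = json.loads(lines[0])
    if header.get("version") != 2:
        raise ValueError("expected an asciicast v2 recording")
    commands, current = [], ""
    for line in lines[1:]:
        _time, event_type, data = json.loads(line)
        if event_type != "i":
            continue  # skip output and any other event types
        for ch in data:
            if ch in "\r\n":            # Enter ends the current command line
                if current.strip():
                    commands.append(current.strip())
                current = ""
            elif ch in "\x7f\b":        # DEL or backspace removes one character
                current = current[:-1]
            else:
                current += ch
    return commands


def bulk_edit_commands(commands: list[str]) -> list[str]:
    """Return the commands whose first word is on the watchlist."""
    return [c for c in commands if c.split()[0] in BULK_EDIT_COMMANDS]


# Made-up recording: three typed commands and one line of output.
sample = "\n".join([
    json.dumps({"version": 2, "width": 80, "height": 24}),
    json.dumps([0.5, "i", "make\r"]),
    json.dumps([0.6, "o", "building...\r\n"]),
    json.dumps([4.0, "i", "sed -i s/foo/bar/ src/*.c\r"]),
    json.dumps([9.0, "i", "git diffs\x7f\r"]),
])

commands = typed_commands(sample)
flagged = bulk_edit_commands(commands)
print(flagged)
```

A hit here is only a prompt to look closer at that part of the recording, since humans run sed too.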
abeyer

It's going to take human time from maintainers to review the recordings at any reasonable fidelity... at which point I'd rather do something that's actually contributing some value. Perhaps require a live synchronous code review for the first (or first N) contributions from a new person, and take a bit of time to onboard them, including but not limited to whatever your policies for llm use are.