Gnutella: A Protocol Outliving the World That Created It

99 points by rickcarlino

technomancy

The funniest thing about Gnutella is that it has nothing to do with the GNU project, they just thought GNU was neat so they put it in their name.

rickcarlino

A fun read for anyone who missed it: https://www.gnu.org/philosophy/gnutella.html

muvlon

The same is true for gnuplot!

rbuchberger

Gnutella was good at providing file downloads that matched search queries, and that is what history remembers it for. Loads and loads of easy downloads. Usually MP3s.

Ahh, yeah. That and the other thing. Thanks for the nostalgia trip, it's a shame the web has turned into the monster that it is.

Slight digression, but when I went to college in the mid 2000s my dorm had a flat network and iTunes was both extremely popular and shared all your music with anyone on the network. I filled up my brand new, 64-bit(!) HP laptop from Circuit City with Evanescence and Green Day in like a week. Eat my shorts, Columbia House CD Club.

quasi_qua_quasi

Memories of my dorm having a fileshare network that the IT department unofficially knew all about (because they weren't idiots) but officially knew nothing about (because they were cool). Until a few years after I graduated when someone mentioned something in front of the wrong person, and then they officially knew about it and had to shut it down.
- arjun-menon
  
  Was it DC++?
  - quasi_qua_quasi
    
    Nah, don't remember which one it was.
- sloane
  
  Dtella, by any chance? i have fond memories of a brief stint on the release team of our university’s (rogue, IT-disapproved) Dtella network, probably the first opportunity i had to work somewhat closely with other tech people.
  
  edit: ohh i just realized you are talking about the network itself, not “a flat network” (of eg file sharing nodes) built atop the university network. still, university was a fun time for this sort of thing~
  - dzwdz
    
    university was a fun time for this sort of thing
    
    My university has blocked off connections between the different dorm buildings this year, and I just couldn't talk the admins into not breaking my shit :( This was the year I was going to finally set up a proper DC hub for my friends too...
    
    I fear this might become more and more common as centralized services displace things people used to run themselves. Most people nowadays have pretty much no use for accessing other people's computers on LAN, so there isn't much incentive to keep these networks "open".
    
    jfloren
    
    That's a shame, back in the mid-2000s RIT had a massive DC hub on our residential network (which was also just generally excellent, you got a real routable IP in your dorm room, no NAT) and the IT department looked the other way because it kept us all from saturating the external links with bittorrent traffic.
  - eyesinthefire
    
    soulseek is still fairly alive and well!
    
  - dvogel
    
    Thanks for writing this. I often wonder how much of Gnutella's core tech could be re-used for newer small-web, Gemini-esque content discovery.
    
    Many have wrongly asserted that Gnutella "failed", but that's not a fair representation of what happened.
    
    ...
    
    Gnutella stood the test of time and solved problems for a software user that no longer exists. It's still there today, chugging along at reduced capacity.
    
    "It failed" is an incomplete idea. You cannot succeed or fail generically. You can only succeed or fail relative to some goal. There's two big reasons that software user no longer exists, both of which can be framed as failures:
    
    Gnutella (along with lots of other software) failed to sufficiently protect its users' privacy. This subjected the users to many risks, both real and perceived.
    
    While it was a very good media file distribution system, it failed to deliver the quality-price balance that satisfied most users. It succeeded in delivering the carte blanche access users wanted. It failed to provide any assurance of genuine content. Spotify delivered both of those. In that sense I would say the Gnutella user does still exist, they are just using something else.
    
    rickcarlino
    
    I often wonder how much of Gnutella's core tech could be re-used for newer small-web, Gemini-esque content discovery.
    
    If you ever want to collaborate on something, feel free to drop me a line. I actually had a similar idea:
    
    Create an archive of *.gmi files that are searchable via Gnutella extensions.
    
    A "pointer" system so that people can sign gemtext documents and publish them. This is not a new idea, but I think the Gemtext + Gnutella distribution part is new. https://github.com/RickCarlino/gnutella-bun-client/blob/main/docs/GPS.md (NOTE: This article is the result of a back-and-forth brainstorming session with GPT-5.4. It's just something I stashed away for later and did not really intend to share yet, reader beware)
    
    I would love to hear what ideas you might have in the gemini + gnutella space. I am pretty easy to find (LinkedIn, Reddit, Fediverse, etc...) and have contact info on my blog.
    
    bitshift
    
    Aside from the specific tech choices such as using gemtext, this sounds similar to what Hyphanet (née Freenet) did. Roughly speaking, each site is a bundle of static files, and each bundle is versioned and signed. So when you click on a link, the URL contains the public key, and it uses that to do a P2P download of the latest version of the linked-to page.
    
    I think there's room for another system like this, but I'm curious what would set it apart—maybe with the Gemini angle it would be more minimalist?
    
    rickcarlino
    
    I believe a lot of the authors of early Gnutella were inspired by the original Freenet project. I have found references and callouts to it in older RFCs for extensions to the protocol (though I can't remember which off the top of my head).
    
    I like the concept of a content delivery mechanism on top of Gnutella because:
    
    The network already exists. No need to invent a wheel or convince a critical mass of adopters.
    
    The protocol is very simple and could support a client ecosystem that is actually diverse. This is my big gripe with many P2P projects. They build a spec that is so exhaustive it can only support one reference implementation. This was not the case for Gnutella and I was able to build my own client from scratch.
    
    That being said, I must admit I assumed "old school Freenet" died in the 2000s. I will be re-visiting the project this weekend, thanks!
    
    xjix
    
    There's a gemini extension called gempub that could be useful for something like this.
    
    I have sketched some ideas for a similar kind of network. PlanetP is another p2p search network that I think has interesting properties. I should look at gnutella again tho, it's been a long time.
    
    rickcarlino
    
    Do you have a link? I found a research paper from 2002 – is this the one? https://scholarship.libraries.rutgers.edu/esploro/outputs/technicalDocumentation/PlanetP-Using-Gossiping-to-Build-Content/991031549992904646/filesAndLinks?index=0
    
    nelson
    
    A big part of why Gnutella took off when it did was thanks to Gene Kan and Spencer Kimball, both members of Berkeley's XCF.
    
    Spencer went on to do a lot of great engineering work at Google and now is the CEO of Cockroach Labs, the database company.
    
    Gene had an early success selling a search company to Sun. Unfortunately he died tragically and far too young in 2002.
    
    apromixately
    
    OnionShare is pretty fun, too! https://onionshare.org/ You can be part of the DaRkWeB!
    
    rickcarlino
    
    Does this provide a search overlay or is it only for the transfer part? Looking at the docs, it looks more like a direct connect file transfer tool.
    
    apromixately
    
    It's really only solving the part where you try to connect to another host directly, but it does so while obfuscating your identity!
    
    apromixately
    
    AFAIR from a university exercise long ago there are no good solutions for making P2P networks both reasonably fair and open. I'm sure you could use some kind of cryptocurrency scheme to pay and be paid for bytes. It ruins the simplicity pretty hard and the community feeling as well.
    
    mordae
    
    From my experience, it is very hard to get to a position where you can actually contribute meaningfully. A lifetime ago I left an anime movie on indefinite seeding on a VPS in screen for a month on a private tracker. In the following decade I did not manage to gather as much outgoing traffic as I did that one time. I think ratio is overrated. Some people can run a seedbox. Some cannot.
    
    My high school seniors, years before that, bragged about their DC++ collections and how much music they'd put out there.
    
    k749gtnc9l3w
    
    I think one part that is missing about outliving-conditions, is that there was no real incentive to spam P2P search (and later P2P search spam became a thing)
    
    rickcarlino
    
    Gnutella had a higher level of trust than later P2P systems, which made it easy to spam, though the incentive is probably lower now that there are likely only a few thousand users. This is probably the same reason there's little spam on IRC nowadays.
    
    It's interesting how many parts of the protocol rely on trusting the client:
    
    GUIDs are supposed to be random but are controlled by the user. Every user could set their GUID to 0000. A modern attempt at Gnutella would probably implement a complicated key exchange system and use Ed25519 keys as identity.
    
    Advertised file counts, bandwidth, etc. are all essentially based on trusting the user to tell the truth. A more complicated protocol might have tried to actually verify these claims.
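To make the two points above concrete, here is a minimal sketch of a Gnutella pong message built with Python's `struct` module. The field layout (23-byte header, 14-byte pong payload, little-endian except for the IP address) follows my reading of the 0.4 protocol description and should be treated as an assumption, as should the helper names `build_pong` and `parse_pong`, which are invented for illustration. Nothing in the wire format stops a client from sending an all-zero GUID or an absurd file count.

```python
import socket
import struct

# Assumed Gnutella 0.4 layout: GUID, payload type, TTL, hops, payload length.
HEADER = struct.Struct("<16sBBBI")
# Assumed pong payload: port, IPv4 address (network order), files, kilobytes.
PONG = struct.Struct("<H4sII")
PONG_TYPE = 0x01


def build_pong(guid: bytes, ip: str, port: int, files: int, kilobytes: int) -> bytes:
    """Serialize a pong. Every value is whatever the sender claims."""
    payload = PONG.pack(port, socket.inet_aton(ip), files, kilobytes)
    header = HEADER.pack(guid, PONG_TYPE, 7, 0, len(payload))
    return header + payload


def parse_pong(message: bytes) -> dict:
    """Decode a pong. The receiver has no way to check the claims."""
    guid, kind, ttl, hops, length = HEADER.unpack_from(message, 0)
    port, raw_ip, files, kilobytes = PONG.unpack_from(message, HEADER.size)
    return {
        "guid": guid,
        "type": kind,
        "ip": socket.inet_ntoa(raw_ip),
        "port": port,
        "files": files,
        "kilobytes": kilobytes,
    }


# A client that ignores the "random GUID" convention and inflates its stats.
lazy_guid = bytes(16)
message = build_pong(lazy_guid, "10.0.0.5", 6346, 4_000_000_000, 4_000_000_000)
claims = parse_pong(message)
print(claims["guid"].hex(), claims["files"])
```

The receiving side simply unpacks the numbers. Any verification (signed identities, probing actual bandwidth) would have to be layered on top, which is the added complexity the comment is describing.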
    
    If the protocol added a bunch of key signing mechanisms or reputation management, it might have become too complicated to implement. This might have been a reason for its success. You can actually build a Gnutella client. A lot of modern P2P projects miss this, I think. Secure Scuttlebutt, which I love, comes to mind. They try to account for diverse failure and abuse cases and build something that is near perfect but end up creating an ecosystem that only has one functioning client (built by the spec author and no one else).
    
    The same example applies to gemini:// (federated protocol rather than P2P). The spec has a bunch of problems and plot holes, but ultimately people actually built clients for the spec and there is a decent diversity in that ecosystem, despite the problems.
    
    mvg
    
    They try to account for diverse failure and abuse cases and build something that is near perfect but end up creating an ecosystem that only has one functioning client (built by the spec author and no one else).
    
    Oh yes. One - for me - notable quirk was the content-hash was done over JSON, requiring each client to serialize JSON numbers (and everything else) exactly like that one specific NodeJS version.
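A small illustration of why hashing serialized JSON ties every client to one serializer. This is only a sketch of the general problem, not the actual hashing rule of any particular protocol; `content_hash` is a hypothetical helper, and the "Node-style" string is written out by hand to match what `JSON.stringify` produces for the number 1.0.

```python
import hashlib
import json


def content_hash(serialized: str) -> str:
    """Hash the exact bytes of a serialized message."""
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


message = {"score": 1.0}

# Python's json module writes the float 1.0 as "1.0".
python_style = json.dumps(message, separators=(",", ":"))

# JavaScript's JSON.stringify writes the same number as "1".
node_style = '{"score":1}'

# Both strings decode to equal values...
same_value = json.loads(python_style) == json.loads(node_style)

# ...but the hashes differ because the bytes differ.
same_hash = content_hash(python_style) == content_hash(node_style)

print(python_style, node_style, same_value, same_hash)
```

Whitespace, key order, and string escaping cause the same kind of mismatch, so a second implementation has to reproduce the reference serializer byte for byte unless the spec defines a canonical form.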
    
    0x2ba22e11
    
    Advertised file counts, bandwidth, etc. are all essentially based on trusting the user to tell the truth. A more complicated protocol might have tried to actually verify these claims.
    
    Offhand my understanding is that BitTorrent has mechanisms to mitigate lying about bandwidth? I seem to recall reading somewhere many years ago that you can get away with overstating your bandwidth by a factor of up to 2x in a BitTorrent swarm but above that the other peers will eventually stop believing you about it.
    
    dzwdz
    
    I don't think BitTorrent cares about your bandwidth in the first place (at least v1). The closest it gets is in the uploaded key sent to the tracker (which counts the number of bytes uploaded in this session), which the tracker could use to estimate your bandwidth, but I don't know if any trackers actually bother with this. opentracker seems to ignore it completely.
    
    see: BEP3, wiki
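For reference, a rough sketch of what a BEP 3 style HTTP announce looks like from the client side. The query keys (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) are the ones BEP 3 lists; the tracker URL, the `announce_url` helper, and the sample values are made up for illustration. The point is that `uploaded` is just a number the client fills in.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def announce_url(tracker: str, info_hash: bytes, peer_id: bytes,
                 port: int, uploaded: int, downloaded: int, left: int) -> str:
    """Build an announce request. All counters are self-reported."""
    query = urlencode({
        "info_hash": info_hash,    # 20-byte SHA-1 of the info dict
        "peer_id": peer_id,        # 20-byte id chosen by the client
        "port": port,
        "uploaded": uploaded,      # bytes uploaded so far, per the client
        "downloaded": downloaded,  # bytes downloaded so far, per the client
        "left": left,              # bytes still needed
    })
    return f"{tracker}?{query}"


# Hypothetical tracker and torrent; the byte strings are placeholders.
url = announce_url(
    "http://tracker.example/announce",
    info_hash=bytes(range(20)),
    peer_id=b"-XX0001-abcdefghijkl",
    port=6881,
    uploaded=123_456_789,
    downloaded=1_000,
    left=0,
)

reported = parse_qs(urlsplit(url).query)
print(reported["uploaded"][0])
```

A tracker that wanted to estimate bandwidth would have to compare `uploaded` across successive announces, and it would still be working from numbers the client chose to send.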
    
    0x2ba22e11
    
    Hm. I thought they used it to try to prioritise whom to send rare chunks to first, on the assumption that you'll share them to the rest of the swarm.
    
    boramalper
    
    I think private trackers track it to calculate upload/download ratios.