What job interviews taught me about Kubernetes
37 points by rau
On the European side, I can tell you why.
Every CTO believes that if it is on K8s, we can switch managed K8s providers in a few weeks, tops.
Doesn't mean they are right, but they believe it.
They also believe it makes PR environments easier.
But mostly, it's about switching providers. We expect a ban on using any provider linked to the US in the next few years, particularly around GDPR, financial systems and more.
So we hedged that risk
I don't know, just seems like more evidence that the tech industry regardless of the size of the company has completely lost the plot. We've been on a steady path of ever increasing uniformity and complexity in the stack, yet the end result is that it's getting harder and harder to identify products and services that don't make you grind your teeth.
I think the problem is that the low level stuff is so buggy and complex that we basically have to build something like Kubernetes to have any chance. If you want to stop the stack from getting taller, then we have to make the lower levels better. We need much better operating system primitives--consider, for example, that containers are a hodgepodge of random kernel isolation primitives with no coherent design and thus a bunch of holes. While we've made container isolation much better by now, it was through a whack-a-mole approach rather than designing for security and sanity upfront. Until the kernel works around its massive backlog of tech-debt, or until someone else builds a kernel that is worth moving to (probably something with bulletproof legacy linux compatibility) we're going to keep stacking things on top of it.
Yeah I think this is exactly it -- Linux became a mess and is hard to use [1], so we built another layer of abstraction on top of those shaky foundations, which looks like Docker / K8s.
But you're just piling crap on top of crap at that point -- you still have to debug BOTH layers in the end!
This was the rationale behind unikernels (e.g. MirageOS) ... although I'm not sure they solve the problem even in principle ... operating systems are a hard problem :-)
[1] And pure "Unix" doesn't have the features that clouds need to run. Another example of this is Oxide computer and their Helios fork of Solaris/Illumos
But you're just piling crap on top of crap at that point -- you still have to debug BOTH layers in the end!
I don't know about you, but I rarely have to touch Linux these days. I have to debug Kubernetes, which, as frustrating as that may be, is still several orders of magnitude easier than debugging Linux (for one, the tools, configuration files, and interfaces for Kubernetes are far more standardized and uniform, and thus more intuitive, than those of Linux).
Any sufficiently complicated deployment tool contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Kubernetes
The "half" is right.
Just happens to be the relevant half.
I could talk at excruciating length on how we set up a $1B SaaS e-commerce company in 2009, or how the online backend of extremely large AAA multiplayer games functioned, and they were definitely closer to kubernetes than a single machine deployment... but, they were faster, and opinionated in exactly the way the organisation needed, not in a way that cut against the requirements of our products.
The "bugginess" in kubernetes is in the oodles of interface layers we add on top of it to make it work as we like, not necessarily in the core system itself.
I could talk at excruciating length on how we set up a $1B SaaS e-commerce company in 2009, or how the online backend of extremely large AAA multiplayer games functioned, and they were definitely closer to kubernetes than a single machine deployment... but, they were faster, and opinionated in exactly the way the organisation needed, not in a way that cut against the requirements of our products.
My strong feeling is that if your org has the discipline to build this, then you have the discipline to build and operate a high-quality Kubernetes environment, and the Kubernetes environment would let you skip over a whole bunch of stuff that you would have to build yourself and you could focus on higher-level abstractions.
My experience of kubernetes has been: if you're making stateless services, there's nothing better, but as soon as you have some state, or you need a special load balancing system, or you want something to be deployed in a specific way: then you're sort of fighting a very heavy system that will swallow as much time as you can throw at it.
I know it's a darling, and that a lot of effort has gone into it, and I should be thankful that the system exists and would save me a lot of time up front. But over the life of a project, I have seen the aggregate time be a net negative. I need more people to manage kubernetes than I needed for running our slimmed down orchestrator for the games I made.
The pain is that everyone learns kubernetes, but our custom orchestrator is something you have to be onboarded with, despite being faster, less buggy and purpose-built. So it's better to just lean into Kubernetes even though it makes trade-offs that are antithetical to what we need, and we build overlays like Agones to make it less painful.
The problem with the opinionated custom orchestrator is that everyone has different opinions and most people are not in a position to build their own orchestrator.
The problem with this is that most places half-ass K8s: there is a whole devops team managing it, but software engineers are still required to write, deploy, debug, and deal with all the K8s stuff anyway. IMO, a good devops team gives software folks an experience like Heroku's, but internally. I should be able to define the resources I need for my service, merge to main, and the deploy happens. I should not need to go mess around in a shitty gitops/devops dashboard to figure out what's wrong. Heroku was peak DevEx and since then we've lost that with K8s.
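To make the "define the resources I need, merge to main" idea concrete, here is a minimal sketch in Python of what such an internal layer might look like. Everything about it is an assumption for illustration: `ServiceSpec` and `to_deployment` are hypothetical names, the defaults are arbitrary, and the image name is made up. Only the field names of the generated apps/v1 Deployment manifest are real Kubernetes.

```python
from dataclasses import dataclass


@dataclass
class ServiceSpec:
    """Hypothetical developer-facing description of a service."""
    name: str
    image: str
    replicas: int = 1        # arbitrary default, an assumption
    cpu: str = "100m"        # CPU request
    memory: str = "128Mi"    # memory request


def to_deployment(spec: ServiceSpec) -> dict:
    """Expand the short spec into a Kubernetes apps/v1 Deployment manifest."""
    labels = {"app": spec.name}
    container = {
        "name": spec.name,
        "image": spec.image,
        "resources": {"requests": {"cpu": spec.cpu, "memory": spec.memory}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": spec.name, "labels": labels},
        "spec": {
            "replicas": spec.replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


# Usage: the developer writes one line, the platform produces the manifest.
manifest = to_deployment(
    ServiceSpec("billing", "registry.example.com/billing:1.0", replicas=2)
)
```

The catch is that every knob a team eventually needs (probes, volumes, affinity) has to be added to the short spec too, so a facade like this tends to grow toward the thing it wraps.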
Except for seeing that your pods end up running on nodes, what’s the big difference between using Heroku and using Kubernetes? Obviously Heroku gives you a more integrated experience with databases and deploying on git push, but I don’t like building a custom facade on top of Kubernetes. You end up passing through all the parameters anyway.
Adoption of technologies in industry is always driven by the principle of "hire off the rack, give 'em the sack". This is just one of the latest examples.
Don’t do much devops myself, systemd and the occasional podman container keep our systems running just fine for now. I work in IoT/AgTech.
There is an “argument” being made in this article that I often hear non tech management engage in, it goes something like “they do LoRa too, right? So, we’re good? Can we ship tomorrow?”
There is a belief that non-uniformity IS the (only real) impediment to success. That any two systems that both speak Fiber, or Modbus, or “have an API” are immediately capable of being coupled together for a glorious “1 + 1 is greater than the sum” experience. I have the darndest time explaining to them that just because two pieces of software have agreed to a certain low level standard of interop, there’s still real and significant work to be done in determining how the data, though easily parsed, is to be interpreted and put to useful work.
Even when two people speak the same language, there is still much work to be done. Using a uniform language doesn’t take away the fact that a subset of your team made decisions, based on what was known to them at the time, in how they used the common tool. In my fledgling days Fortran was the “lingua franca” in sci/engineering circles. It’s not like that kept me from being completely baffled by some of the things my colleagues did, or from rewriting them.
I have no beef with the value of k8s or its emergence as a “standard” for the moment. My beef is only with any assertion that this makes a class of programming problems go away. The Law of Conservation of Ugly still holds.
That's when you want the system to hold the knowledge, not people.
This puts in words something I’ve thought but hadn’t crystallized so eloquently.
This formality is achievable only as the volatility of a process decreases: a person can do something, the process is documented, the process is scripted, and then the process is automated. You can get most or all of these steps for free with common workflows in a popular tool or ecosystem.
Distribution is another reason. I work on Canton nodes. Helm charts are how we are given the upstream Canton software and associated apps. It doesn't really matter if we think Kubernetes is good for what we do (I don't) because that's just how the software is distributed and supported. Going outside of that would make more work than just dealing with Kubernetes.
is it just me or does this sound super written by ai?
I agree with the sentiment tho. I’ve been migrating a bunch of my homelab/self-hosted stuff from bespoke systemd configs and a bunch of shell commands I can never remember and “damn it what markdown file did I put that one setup process in?” and it’s sooooo refreshing. I’m not using a real “cd” system yet but it feels great knowing that my little “apply” shell script and a bunch of yaml files would get me 90% of the way back up if disaster were to strike.
systemd is simple conceptually, complicated in reproducibility. kubernetes is flipped. you pay more conceptually, but reproducibility and the understanding that comes along with that naturally are much stronger. at least that’s my take. I’m very much in the middle of learning kubernetes but I’ve been having fun with it the last while.
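For anyone wondering what a little “apply” script plus a directory of yaml files might look like, here is a rough sketch in Python rather than shell. It is not the commenter's actual script: the function names and the flat directory layout are assumptions, and the only real interface used is `kubectl apply -f <file>`. It defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path


def manifest_files(directory: str) -> list[Path]:
    """Return the .yaml/.yml files directly inside `directory`, sorted by name."""
    root = Path(directory)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


def apply_commands(directory: str) -> list[list[str]]:
    """Build one `kubectl apply -f <file>` command per manifest file."""
    return [["kubectl", "apply", "-f", str(p)] for p in manifest_files(directory)]


def apply_all(directory: str, dry_run: bool = True) -> None:
    """Print each command; actually run it only when dry_run is False."""
    for cmd in apply_commands(directory):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first manifest that fails to apply.
            subprocess.run(cmd, check=True)
```

Calling `apply_all("manifests")` prints what would be applied, and `apply_all("manifests", dry_run=False)` runs it against whatever cluster the current kubectl context points at.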
I agreed with this 10 years ago, but the integration of various namespacing options and dynamic users makes systemd today feel like "just another beast".
This vertical integration with first class definitions seems like the wrong move to me.
is it just me or does this sound super written by ai?
It really does not matter that much. It can be written either well or poorly.
It matters to me. When I read content here, I'm reading the content not just for the knowledge, but to connect authentically with people, even if abstractly.
If someone posts "an (arguably) elegant fibonacci ratchet in Smalltalk: a := b + (b := a)"... I know that AI can generate this if properly prompted, but if I know that someone figured it out on their own, that means something very different to me.
I don't necessarily mind EDM/synth music that is generated by AI. But when I find out the Irish Pub "Feck Trump" song is AI generated, the song loses a bunch. The song is still amusing/appreciated, but I don't get to dream that there's a real group of people somewhere in a bar joining with me in the chorus.
Human speech is not a context free grammar. I think the argument that "the content is orthogonal from the creator" is flawed.
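As an aside, the Smalltalk one-liner quoted above can be tried out in Python 3.8+, where the assignment expression `(b := a)` plays the same role. This is only a sketch: the starting values `a = 0, b = 1` and the function name `ratchet` are assumptions, not something from the comment. It relies on Python evaluating the left operand `b` before the assignment expression on the right.

```python
def ratchet(steps: int) -> list[int]:
    """Run the `a = b + (b := a)` step `steps` times and record each new a."""
    a, b = 0, 1  # assumed starting values
    seen = []
    for _ in range(steps):
        # The old b is read first, then b takes the old a,
        # so a becomes old_a + old_b while b trails one step behind.
        a = b + (b := a)
        seen.append(a)
    return seen


print(ratchet(6))  # prints [1, 1, 2, 3, 5, 8]
```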
When I began working in the field I joined a K8s platform team, but at that time K8s wasn't as ubiquitous as it is now. The things the article mentions are clearly true, and they're advantages, much more so than scalability. When I started you also had Docker Swarm, and Nomad was an interesting alternative. Now K8s has been standardized, and that has some benefits, the same way the POSIX standard made development easier across heterogeneous projects. To be honest, sometimes, even if you don't need the benefits of the common platform, it's just easier to write deployments against what everybody knows. I think Swarm is good enough for many projects, but no cloud is offering that as a service AFAIK. Even in my home lab I thought about running single-node K8s clusters, but I'm still not comfortable with the overhead, especially on my RISC-V machines.
There are very real problems Kubernetes solves and at a certain scale it starts making sense, but I wonder how many of these interviews/companies don't have at least a single person whose job it is to keep these clusters alive and running.
It might not be their main job, it might not be mentioned in their job title, they might not even be the development/operational team. But in reality, there will always be a person. And really, if you're just 1-20 people, I'm not sure there is the bandwidth for that person.
I'm not a fan of Kubernetes, but I'll accept it. I'm not a fan of yaml, but I'll accept it. I won't accept the assumption that these things "just work". Nothing does (the VM/systemd route doesn't either), but I feel like it is around Kubernetes most of all that there is this idea of "it's easy to run" (it's easier with managed, but still not "easy to run" Heroku style).
While that sounds reasonable I don't think I can come up with an instance where this was a problem. I don't know a place or time where the problem was the infrastructure tooling. I've seen kubernetes, I've seen serverless, I've been at places from Debian to RedHat and FreeBSD, Docker Swarm and Nomad. I've been at a big bank that way back built and used its own init system.
The only thing I have ever seen really was people misunderstanding what kubernetes is and does, but even that was probably because it was introduced as that magical silver bullet by Google.
People either know how to do Sysadmin/DevOps Engineering/SRE or they don't.
I also very much do not think that hiring got easier with Kubernetes or is harder without. I'm currently in the process of hiring and it seems like candidates are just the same as 6-7 years ago and 15 years ago, which were the last times I was actively looking for someone in the field.
Off topic, but the main thing that changed is they now mention stuff like Claude in their "Skills" section, which makes me shudder.
Also, it feels like with every tool, the people most hyped about it usually turn out to know the least about it, but that's probably just psychology. As with many things, one is excited at first and it takes time to see the downsides. Often enough that also goes for people, companies, etc.
So while I agree with the article in that it makes sense, either it's my bubble or it would seem strange to care about it, because there is little consequence.
And as with, e.g., programming languages, the job market is probably one of the few places where that whole fantasy about markets works. If a lot of people use a technology, a lot of job offers and a lot of employees will exist. If you use something less common, then you have fewer competitors on both the employer and the employee side. Not talking about extreme cases, but the chance of there being only one job or one employee is kind of slim. Even things like Cobol seem to work out in the end.
And usually people like to learn new skills for their resume anyways.
I see an analogy with how companies have been looking at front-end development for the past decade. SPA frameworks didn't get popular because they are always the right fit. But they do provide uniformity, ease of use (and of training people), etc.
With kubernetes you do see the same thing, as the article also observed. What I am also seeing, similar to the early days of React (and still to some degree), is kubernetes being used in ways that make it underperform or become a headache in other ways. But I suppose it is a fair tradeoff for most companies.
Like the article says, for us it's usually the only option to get managed servers from local hosting providers. There are a lot of projects that don't have continuous development and also don't have 24/7 (or even 9-5) dedicated support staff. If you want to outsource that, you need to have some sort of standardized process that is easy to explain to someone outside of the company. The alternative is to accept multi-day outages when someone is on vacation or too busy to jump in and fix things.
I find that comparing k8s to a disjointed compute layer is not as relevant as comparing it to something like terraform and a simpler container runtime like ecs. When people say "you don't need k8s" that is typically what they are talking about.
Early in my career, I (not a DevOps engineer) got comfortable enough with Kubernetes to be dangerous with it. I worked at a small ML startup (pre-LLM), with all of us wearing many hats. We didn't have anyone remotely close to a real devops engineer, so doing devops was everyone's part time job. We used managed Kubernetes (GKE), and that turned out to be a very fast way to burn through a ton of cloud spend (especially with GPU nodes) while feeling like we were just one scaling config away from getting it right.
Could we have burned through that money by just spinning up VMs or doing serverless? Probably. But Kubernetes made me feel like I was pursuing the best possible thing while, in reality, getting increasingly lost.
I then ran into people at other companies who specifically avoided k8s. At first it was weird to me, because Kubernetes was supposed to be the future, but then I realised maybe I was the one who had been misled. Now, reading this, I'm realising that I may have been forced into that future before others, even if perhaps they were right to avoid it then.
There is a way to get Kubernetes right, but doing it as a job on top of another job has scarred me.