pyinfra — agentless infrastructure automation, in plain Python
64 points by gregnavis
pyinfra is basically what ansible should have always been -- write your automation directly in python, rather than a janky mix of templated yaml with control flow structures bolted on. It's a breath of fresh air after dealing with ansible for so long (and I say this as someone that didn't have any particular dislike for ansible).
It’s not exactly Ansible in Python, though. It’s more like a Python-to-shell interpreter, which has its own issues.
I believe I’d like a hybrid of the two that also used Python at the destination, for a bit more sophistication and, for example, less of a quoting mess when updating files. The limitations of sed for regex come to mind as well.
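To make the quoting and sed point concrete, here is a minimal sketch of what a file edit could look like if Python ran at the destination. The `replace_line` helper and the sshd-style sample text are made up for illustration; this is not how pyinfra or Ansible do it.

```python
import re


def replace_line(text: str, pattern: str, replacement: str) -> tuple[str, bool]:
    """Replace every line matching `pattern` with `replacement`.

    Returns the new text and whether anything changed, so a caller can
    skip the write when the file is already in the desired state.
    """
    regex = re.compile(pattern)
    lines = text.splitlines()
    new_lines = [replacement if regex.search(line) else line for line in lines]
    # Preserve the trailing newline if the input had one.
    new_text = "\n".join(new_lines) + ("\n" if text.endswith("\n") else "")
    return new_text, new_text != text


# Hypothetical file content, for illustration only.
config = "Port 22\n#PermitRootLogin yes\n"

# A negative lookahead keeps the edit idempotent, and there is no shell
# quoting layer between the pattern and the regex engine.
PATTERN = r"^#?\s*PermitRootLogin\b(?!\s+no\b)"
updated, changed = replace_line(config, PATTERN, "PermitRootLogin no")
```

Running the same call on `updated` reports no change, which is the property you want from a config edit.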
I really like pyinfra and I wish it had more traction.
So far, all the companies I worked with use Ansible (with or without Terraform), and not a single one was ready to rewrite all of their automation with something else that their employees have no experience with.
pyinfra also requires the SysOps to know Python. While in my mind it should be mandatory for a SysOps to know some scripting language (especially with Ansible, Python can be used to write modules and reduce the amount of YAML mess), it's not a very common point of view, at least in France.
I've done a lot of Ansible and in my opinion it's a scripting language in disguise anyway.
(Possibly not a very hot take.)
It is, and a very bad one, with a terrible developer experience and almost non-existent testing capabilities.
As is Gitlab CI, Github Actions, etc... YAML is not "Yet Another Markup Language" but "Yet Another Programming Language Pretending to be a Markup Language"
Just push that script right into a multiline string 👀
Such a common pattern to see an escape hatch, but then it becomes the default because you can't do what you need to. GHA's run: block comes to mind. Now you need to open an editor, pull in the injected dependencies, write your script, then copy/paste that back into the YAML.
You know you can put your script in a .sh file and have run: bash myscript.sh right?
Bonus, now you can run shellcheck on it.
You can but now you don't have access to the github contexts and templating. Or you're stuck injecting a load of envvars and tempfiles into your process.
- run: bash myscript.sh
  env:
    MYVAR: ${{ inputs.foobar }}
You should not inject ${{ ... }} directly in run scripts, always use environment variables.
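A rough illustration of why, sketched in Python rather than in a real workflow. It assumes a POSIX `sh` is on the PATH; the `untrusted` value is a made-up input, and string concatenation stands in for the template expansion.

```python
import os
import subprocess

# Made-up attacker-controlled input.
untrusted = 'hello"; echo INJECTED; echo "'

# Unsafe: the value is pasted into the script text before the shell parses
# it, which is roughly what expanding a template inside a run: block does.
templated_script = 'echo "' + untrusted + '"'
unsafe = subprocess.run(
    ["sh", "-c", templated_script], capture_output=True, text=True
)

# Safer: the script text is fixed and the value arrives as an environment
# variable, so the shell only ever sees it as data.
fixed_script = 'echo "$MYVAR"'
safe = subprocess.run(
    ["sh", "-c", fixed_script],
    capture_output=True,
    text=True,
    env={**os.environ, "MYVAR": untrusted},
)
```

In the first case the shell runs the smuggled `echo INJECTED` as its own command; in the second the whole value is printed back as a single string.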
I do it by writing my ansible module as a shell script directly. Just making it
- name: Run MySpecialModule
  vaelatern.lobsters.myscript:
    somearg: True
    otherarg: "Look ma, no scripting in YAML!"
The more readily broken the Ansible is, the more it's a scripting language. If you use it as an idempotent setup tool, then it's a lot less janky and a lot more reliable. But most people end up falling into the scripting capabilities instead, and it's terrible at that.
But they make it work anyway for some definition of work, which does not include working next week.
Even as an idempotent setup tool, it's bad at it.
Because it's agent-less, and state-less:
This way lies chef.
Eventually you need to decide when you want to start over from a clean state server.
Alternative is keeping your fleet up to reasonable date and phasing out the tasks that remove things you don't need.
The state for removal has to live somewhere, either a database or your readable task files.
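A small sketch of that point, with hypothetical names throughout (`plan`, the package sets): to remove something safely, the tool needs a record of what it managed before, not just what is wanted now.

```python
def plan(
    previous: set[str], desired: set[str], installed: set[str]
) -> dict[str, set[str]]:
    """Work out what to install and what to remove.

    `previous` is the recorded desired state from the last run. Without it,
    a tool cannot tell packages it once managed from packages that someone
    installed by hand.
    """
    return {
        "install": desired - installed,
        # Only remove things that were managed before and are no longer wanted.
        "remove": (previous - desired) & installed,
    }


previous = {"nginx", "certbot"}          # what the last run asked for
desired = {"nginx", "caddy"}             # what this run asks for
installed = {"nginx", "certbot", "vim"}  # what is actually on the host

result = plan(previous, desired, installed)
stateless = plan(set(), desired, installed)  # same run with no recorded state
```

With the recorded state, `certbot` is scheduled for removal and the hand-installed `vim` is left alone. Without it, nothing is ever removed, which is how cruft accumulates unless the removal lives on as an explicit task.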
My current plan (for my homelab, and for a server that we manage with some people) is to move more stuff from Ansible into Docker (or Podman) containers. Hopefully the Ansible playbook and roles can be reduced a lot, to make them more manageable. And since containers can be rebuilt entirely, there should be less cruft accumulating on the servers.
What are your opinions on this? Is that a good idea?
if you're comfortable with systemd-ish stuff - podman quadlets are great! and you can manage them exactly the same as regular units
It's the route I've gone myself. Containers make system administration a lot easier, and you need to convince a lot fewer people to have good practices. I run Nomad everywhere to manage my containers since that gives me easy visibility and control and consistency on the jobs (k8s/k3s has too much operational effort per unit time). Podman quadlets are cool on systemd, I'm not always on systemd systems. Nomad lets me start on a single system and expand if it becomes useful to do so, without changing too much.
I've been trying pyinfra over the last few days, in a homelab deployment tool I'm building (hope to post about it soon, once it's ready for public consumption). Compared to Ansible, what I've liked the most so far is not the python syntax (which is nice), but the speed (Ansible always felt unbearably slow).
This is an interesting space, I'm working on a deployment tool as well that works at my dayjob to deploy stuff there. I think I want it to replace my usage of ansible and salt in most places.
Curious if you've used SSH ControlMaster/ControlPersist (aka multiplexing) and/or pipelining with Ansible? I'm not saying Ansible isn't slow, but is it so slow that speed is a dominant annoyance?
I generally set up ControlMaster in an ssh config file that can be included in each DevOps user's personal ssh config.
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Multiplexing
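For anyone who would rather pass the options on the command line than keep them in a config file, a minimal sketch follows. The helper name, the socket path, and the 60s persistence are assumptions for illustration; `ControlMaster`, `ControlPath`, and `ControlPersist` are the standard OpenSSH options.

```python
def multiplexed_ssh_command(
    host: str, remote_command: str, persist: str = "60s"
) -> list[str]:
    """Build an ssh argv that reuses one master connection per host."""
    return [
        "ssh",
        "-o", "ControlMaster=auto",
        # Socket path is an assumption; %r, %h and %p expand to the remote
        # user, host and port, so each destination gets its own socket.
        "-o", "ControlPath=~/.ssh/cm-%r@%h:%p",
        "-o", f"ControlPersist={persist}",
        host,
        remote_command,
    ]


cmd = multiplexed_ssh_command("web1.example.com", "uptime")
```

The list can be handed to `subprocess.run` as-is; the first call pays for the handshake and later calls within the persistence window reuse the open connection.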
See also:
I’ve used Ansible with the Mitogen accelerator which improves it a lot, but Ansible is still far slower than it should be.
I was using ansible for my homelab. It was getting really frustrating. It was bad... and everything is like a hack... The yaml config is horrible.... And the speed... it was just sad. Also, why do I need python3 on my server to run a bunch of shell commands? (I know, I know, ansible creates a python bundle, uploads it over ssh into a temporary directory, and executes it on the server with python. But seriously, why?!?)
I discovered pyinfra thanks to Google AI Mode (not an ad for my employer, I speak for myself). Finally, it was a breath of fresh air! I've only been using it for almost a month and I haven't explored everything, so take everything I say with a grain of salt.
python or python-apt on your orchestrated machine.
-y it will ask you "you want to run this?"
if 'web_server' in hosts.groups is not great, but I feel that operation(..., filter_group='web_server') might be worse. I don't know what would be better here...
pyproject.toml with pyinfra specific entry-points. It's a nightmare to develop home grown connectors, even with uv... Who came up with this?! Make custom connectors a regular python file in your project, dammit!
custom connectors must be pyproject.toml with pyinfra specific entry-points.
FWIW, you're just describing packaging.
Also, why do I need python3 on my server to run a bunch of shell commands
I suspect you meant that as a euphemism but if all you are running is shell commands, you do not; that's what raw: is for https://docs.ansible.com/projects/ansible/14/collections/ansible/builtin/raw_module.html You still get the server grouping, server and group variables, and jinja2 parts of ansible but don't require anything other than a way to connect and /bin/sh on the target machine
I like how we came full circle with infrastructure as code: we moved from scripts to YAML and back to (more sophisticated) scripts. I think there is a sweet spot for every approach, and pyinfra really looks nice from an Ansible user's perspective.
The main thing that led me to adopt Ansible was its dry-run and diff modes, so I could be confident it would not do anything unexpected. It seems that the pyinfra CLI lacks those options but I can’t find any reference documentation with an alphabetical list of all the options so I might have missed something.
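For readers who haven't used those modes, here is a minimal sketch of the idea using only the standard library. It shows the concept (compute the change, show it, write nothing), not how Ansible or pyinfra implement it; the helper name and file contents are made up.

```python
import difflib


def dry_run_diff(current: str, desired: str, path: str) -> str:
    """Return a unified diff of the change that would be made.

    Nothing is written; an empty string means the file is already
    in the desired state.
    """
    return "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            desired.splitlines(keepends=True),
            fromfile=f"{path} (current)",
            tofile=f"{path} (desired)",
        )
    )


# Hypothetical file contents, for illustration only.
current = "worker_processes 1;\n"
desired = "worker_processes 4;\n"

diff = dry_run_diff(current, desired, "/etc/nginx/nginx.conf")
print(diff or "no changes")
```

Reviewing that output before applying anything is what gives the confidence described above.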
For those interested, here's a similar 14 yo project of mine: https://github.com/sebastien/cuisine/tree/main -- no agent, just SSH, Pythonic API on top of the core admin functions, but does not support dry mode.
We use Ansible to provision resources in OpenStack and then use pyinfra for the rest and this has worked pretty well for us the last few years. I guess the biggest drawback is the smaller community, so you end up writing your own solutions for things. For example, we use keyring + privy to keep shared secrets necessary for deployment on disk and we use a few lines of code to convert the OpenStack compute inventory to hosts data.
When puppet was released, some of us already had hand-rolled automated deployments using shell scripts and SSH. The bulk, think 99%+, of those adopting puppet came from error-prone, non-idempotent deployment processes, more often than not involving error-prone manual steps which broke mission critical statements.
I've used terraform, puppet, ansible, saltstack, pulumi, CloudFormation, fabric, etc. and I am yet to be convinced why I would use these instead of SSH and shell scripts. One reason that I often hear is that people are used to and familiar with xyz system, but this is built on the premise that a given team of engineers is dumb and can't learn anything if needed. I don't consider this a very strong argument to be honest.
Is all this trouble just so we can maneuver around learning shell scripts? The cost is way higher than having your team learn shell scripting. You still have to learn the specifics of the abstraction of whichever of these tools you pick.
Bash is my favorite programming language.
I'm absolutely exhausted of having to teach all the historical reasons that make writing correct shell scripts so tedious and difficult. IMHO, there's still a sweet spot where shell scripts are the best way to do things, but these tend to be very simple scripts, and certainly not for configuration management.
Honestly, even if you dislike Ansible or others (and there are plenty of reasons to do so!), there are so many languages that are much more effective than bash at configuration management.
I've been writing Python scripts for those purposes for a while, using only the standard library with just a few select abstractions (basically, a subprocess wrapper) and things are so much more readable. Plus on the systems I was working, Python was already there.
I think it's worth learning the intricacies of bash, just because it's still everywhere and you'll have to deal with them at some point. But (I'll stress again that bash is my favorite language), other than in a REPL or somewhere where you're forced to write bash, for anything longer than 20 lines it's likely the wrong choice. (And it might be the right choice in the other cases!)
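For the curious, the kind of subprocess wrapper mentioned above can be tiny. This is a sketch of the general idea with a made-up `run` helper, not the exact code described.

```python
import subprocess
import sys


def run(*args: str, check: bool = True) -> str:
    """Run a command without a shell and return its stdout.

    Arguments go to the program as a list, so no quoting is needed, and a
    non-zero exit raises CalledProcessError instead of being ignored.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=check)
    return result.stdout


# The current interpreter is used as the command to keep the demo portable.
output = run(sys.executable, "-c", "print('a b; c')")

try:
    run(sys.executable, "-c", "import sys; sys.exit(3)")
    failed_with = None
except subprocess.CalledProcessError as exc:
    failed_with = exc.returncode
```

Failing loudly by default is the main thing you get over a plain script, where you have to remember `set -e` and its corner cases.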
Honestly, even if you dislike Ansible or others (and there are plenty of reasons to do so!), there are so many languages that are much more effective than bash at configuration management.
I don't dislike them, I just never saw a single instance where they would do something for me that a shell script couldn't easily do too. Hence it is useless/pointless for me to use them. A common problem with those is corner cases that they don't support, which require plugins or even shelling out to your shell scripts, bringing you back to square one.
I've been writing Python scripts for those purposes for a while, using only the standard library with just a few select abstractions (basically, a subprocess wrapper) and things are so much more readable. Plus on the systems I was working, Python was already there.
My experience is the opposite. Python is my go-to language. I have written deployment scripts in Python which I later converted to shell scripts for succinctness and readability reasons.
I suspect the problem most people face is trying to use shell scripts as a replacement for other languages. Shell script is optimized to call external tools and pass its output around. It's not too ergonomic, to say the least, when it comes to non trivial data structures or even numerical data.
Likely we're writing different scripts. I said "anything longer than 20 lines", because if you're calling a few commands, then bash is likely optimal. (E.g. I would still use it for some CI pipeline steps.) But I find myself manipulating files, creating temporary directories, making some HTTP requests, and needing some error handling.
(But really, there's not much configuration management you can do with so little code. Some, yes.)
Before Ansible we had a deployment system that used rdist, which did the job but its dry-run mode was nearly useless. I had done some experiments using git for deployment because I wanted dry runs with diffs, but that experiment didn’t get very far. Then Ansible came along and solved the problem for me.
And Ansible has reasonably reliable error handling (tho its error reporting could be better). Error handling is usually the first thing that makes me rewrite a shell script in another language, because the shell gets verbose when you need reliability so it loses its main advantage.