The first year of free-threaded Python
32 points by ngoldbaum
On one hand I love performance increases, but on the other hand, if you want fast code it’s better to use Python as glue to call out to faster libraries via FFI, which can already exploit multiple cores without the GIL getting in the way. When using NumPy or OpenCV, for example, most functions are not written in Python and are already multi-threaded or multi-process.
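To make that concrete, here is a minimal sketch (with arbitrary sizes) showing the effect the parent describes: NumPy's matrix multiply runs in compiled BLAS code with the GIL released, so plain threads can already overlap the heavy work even on a standard GIL build of CPython:

    import threading
    import numpy as np

    a = np.random.rand(2000, 2000)
    b = np.random.rand(2000, 2000)

    def work():
        # The multiplication runs in compiled BLAS code with the GIL
        # released, so several of these can execute in parallel even
        # under the stock GIL.
        np.matmul(a, b)

    threads = [threading.Thread(target=work) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()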
Since Python by most benchmarks is somewhere around two orders of magnitude slower than comparable compiled languages, even a 20x speedup on a 20-core machine is futile compared to rewriting the code in a single-threaded compiled language. And that’s before considering that you can often get the same speedup in a compiled language with as little as one line of code, using OpenMP in C++, or the equivalent in Julia, for example.
In general, if you want Python and you want performance, it’s better to use Python for all the light work and call into faster compiled code for anything performance-critical. This gets you the best of both worlds in terms of speed and productivity, while still leveraging the ecosystem and the batteries-included standard library.
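As a minimal sketch of that glue pattern, using only the standard library's ctypes and assuming a Linux system where the C math library is available as libm.so.6 (the file name is platform-specific):

    import ctypes

    # Load the C math library and declare the signature of the
    # function we want; ctypes defaults everything to int otherwise.
    libm = ctypes.CDLL("libm.so.6")
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double

    # The actual computation happens in compiled C, not in Python.
    print(libm.cos(0.0))  # 1.0

In practice you'd reserve this for a genuinely hot inner loop; for anything numeric, an existing library like NumPy usually covers it without hand-rolled FFI.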
Python isn’t an ideal production language for many reasons (and I would discourage its use for any new production application), but the changes are still valuable for places where it is already entrenched. In many cases there are large, working codebases in Python for which the business case for a rewrite is weak: all the developers on staff already understand Python, and the risks of converting to something else are high. We probably wouldn’t write Instagram in Python today, for example, but the likelihood that we would rewrite it from the ground up in another language is very low for many reasons. So we have instead invested in improving CPython itself.
On small to medium codebases it may make sense to rewrite the project instead, but in a case like this it’s vastly cheaper to try to improve CPython.
Agreed, and to add to this, it’s often impossible or impractical to find an optimized FFI library that solves your problem. For example, at the last company I worked at, we picked Python on the assumption that it would let us build a prototype easily and we could then throw NumPy or whatever at it to make it fast. But the application itself was a large object graph, and the performance penalty was spread across the entire thing; there was no single hot component to replace for a significant gain, so we would have had to rewrite the whole thing in a faster language.
Performance was just one of many taxes we paid to use Python. Here were some others off the top of my head:
Lots of bugs that would have been trivially prevented by a type checker. We tried Mypy, but at the time it was painful to use for a lot of reasons (a sketch of the kind of bug it catches follows this list).
Prototyping was slower without type checking: you had to run your code to see whether it worked. I would frequently prototype in Go for expedience and then rewrite in Python.
Reproducible package management was horrible. We churned through several package managers, each promising to fix the problems of the last while introducing new catastrophic problems of its own, such as pipenv’s 30-minute constraint-checking times on a relatively small project.
Reproducing our development environment with containers was painful: using Docker for Mac to share a host volume with a container would eat every ounce of CPU on your machine just to marshal filesystem events. This was only a Python issue insofar as Python doesn’t let you quickly compile the source into an executable artifact that you can chuck into an already-running container, as you can with Go or similar (moreover, Python leans harder on containers for development, because it depends on the environment to a much greater degree than Go does).
Building Lambda artifacts out of Python was painful: our small codebase depended on Pandas and a couple of other large libraries, which meant that even compressed, the artifact was nearly half a gigabyte and exceeded the maximum size AWS Lambda allowed at the time. Go throws away unused code, and a comparable Go application built artifacts that were a couple of orders of magnitude smaller (and built them much more quickly).
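To illustrate the type-checker point from the first item, a small hypothetical example of the class of bug Mypy reports before the code ever runs:

    from typing import Optional

    def find_user(user_id: int) -> Optional[str]:
        users = {1: "alice", 2: "bob"}
        return users.get(user_id)  # None when the id is missing

    name = find_user(3)
    # mypy: error: Item "None" of "Optional[str]" has no attribute "upper"
    # Plain CPython only fails here at runtime, and only on the
    # executions where the missing-id branch is actually taken.
    print(name.upper())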
In general, the larger artifacts, slower builds, and slower runtimes lengthened every iteration loop and added friction to nearly every part of development and operations. For example, unit tests took several minutes to run, compared to several seconds for our entire test suite in Go (this is a performance issue, to be clear), which meant people ran tests less often.
By contrast, our naive rewrite from Python into Go improved performance by two to three orders of magnitude and reduced artifact sizes by a factor of 60. Endpoints that used to time out at 60s completed in 1s.
I’m sure lots of Python issues have improved in the last ~5 years, but I would still not start a new project with Python unless I absolutely needed some niche ecosystem where Python was dominant (and even then I would think about how to minimize my Python dependency from the start) because my general experience is that it’s extremely hard to predict the showstoppers you will run into, whereas other languages seem to have far fewer showstoppers in the first place.
Python isn’t an ideal production language for many reasons (and I would discourage its use for any new production application)
Bold. What class of applications are you thinking of and what would you use instead?
even a 20x speed up on a 20 core machine is futile compared to re-writing it in a single-threaded compiled language.
Only if your time is free. For any non-trivial system, rewriting is a matter of months and, consequently, tens or hundreds of thousands of dollars in development time.
Updating to a new Python version, on the other hand, takes weeks at worst, and likely much less when nice people like OP have done a lot of the legwork to make things stable.
Unfortunately, that ‘faster compiled code’ at times holds locks, so you can’t fork() the process reliably anymore.
That used to be a nice way to use more cores, e.g. for matplotlib plots.
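A minimal POSIX-only sketch of the hazard, with a Python thread standing in for a native library's internal worker: if any thread holds a lock at the moment of fork(), the child inherits the locked lock with no thread left to release it.

    import os
    import threading
    import time

    lock = threading.Lock()

    def holder():
        with lock:
            time.sleep(5)  # lock is held while the parent forks

    threading.Thread(target=holder, daemon=True).start()
    time.sleep(0.5)  # ensure the worker owns the lock before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied, so nothing can
        # ever release the lock; a plain acquire() would hang forever.
        print("child acquired lock:", lock.acquire(timeout=2))  # False
        os._exit(0)
    else:
        os.waitpid(pid, 0)

With a native library the lock lives in C rather than in threading, so you can't even see it from Python; the child just deadlocks somewhere inside the next library call.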