Lobsters Interview with ngoldbaum

57 points by veqq


@ngoldbaum is doing awesome work slaying Python's GIL and moving the entire ecosystem to the free-threading build. We also talked about burnout, Mercurial and Jujutsu. He likes PRs which remove 1k loc.


What are your thoughts on your own work? You get to do open source stuff and wanted to talk about this!

I discovered an unusual career niche and it's nice to show people it's possible. I started my career writing software to analyze simulations of galaxies, which transitioned into working on software that makes grad students' lives easier. So I work on complicated Python projects which help others work on projects more directly important to society.

tl;dr: So I went from astrophysics simulations to getting a commit bit on NumPy and PyO3.

So you have a website with some articles, you've given a few talks, you're on lobsters, IRC and so on with your real name. You want to show people what's possible... What inspires you to engage with the public?

It's my privilege as a white dude with a US passport that I feel like I can have an online presence and participate in online communities under my real name without people beating down my door and telling me how much of a piece of shit I am. So far that hasn't happened. If it ever happens, I might change my stance.

In academia, I was into open science. In astrophysics people would often keep their secret sauce, publishing papers about simulations whose code and data was private. A large fraction of the literature is worthless because you have to take it on faith they didn't have some bug. I wanted to help push everything to a culture of reproducible workflows with shared data and analysis scripts anyone could run. My PhD simulation data are still available on a server at the University of Illinois.

I find it personally satisfying to help people asking technical questions, if I know the answer. It's my karmic way of paying back.

Your website's about page is reverse chronological (interesting choice), but it ends at "got my PhD". What happened before? How did you decide to do astrophysics? Were you programming before?

I took a basic programming class in 9th or 10th grade in high school. I had already messed around with PC gaming. My dad was a PC technician, so I would watch him build computers and debug bullshit. At 12, I had Windows ME, which exposed me to dealing with computers breaking.

I got a degree in Physics at CU Boulder and just went straight to grad school, hanging out in the physics library, which informed my approach where you haven't actually learned until you've explained it to someone (in a few ways!)

A common pathology of hard science PhDs is this: it's nice to solve a solvable problem like a bug report, sitting down for a couple of hours to work on a self-contained thing. That really satisfies the kind of person who does a PhD, who is probably good at solving homework problems. Research, on the other hand, is open-ended and discouraging.

There's a similar dynamic in Common Lisp, where we have 40-50 years' worth of libraries which may have last been maintained 30 years ago or had a commit 12 years ago, but which generally run fine. But maybe so-and-so who updated it for their 1994 thesis didn't care about the bug you will encounter... Is this similar?

It's usually worse than that. Sometimes you just have a paper which describes an algorithm which you get to implement. If lucky, your advisor wrote some relevant code. And it'll always be buggy spaghetti Fortran or C++.

But hopefully you can use a community tool others work on; many analysis tasks are quite similar so ideally we can maintain tricky code in common routines together with nice APIs which people want to use. So it's better to work on community packages, encourage people to use them and get dopamine hits with our friendly communities!

So, Python! I've long had the impression of things in Python not working together, so you have to write a lot of glue between two libraries. A big promise of Julia was that all libraries would work together because it had standard math formats. How does Python handle that today?

Via standardization! Projects which adhere to the Array API Standard work together, so if someone passes a Torch tensor into scikit-learn it just works! scikit-learn's algorithms are written to call the Torch API in the right way to make it run directly on the GPU. But there are limitations; it only works if you can encode the algorithm as a series of NumPy operations or in Python, but not if you go into a Cython extension...
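The array-agnostic style the Array API Standard enables can be sketched with plain NumPy. This is a sketch, not scikit-learn's actual code: the `__array_namespace__` accessor is part of the standard and is present on NumPy >= 2.0 arrays, with a fallback here for older NumPy versions.

```python
import numpy as np

def softmax(x):
    # Fetch the array's own namespace instead of hard-coding NumPy, so a
    # conforming Torch or CuPy array would run this same function unchanged.
    # Older NumPy arrays lack __array_namespace__, so fall back to np.
    xp = getattr(x, "__array_namespace__", lambda: np)()
    shifted = x - xp.max(x)   # subtract the max for numerical stability
    e = xp.exp(shifted)
    return e / xp.sum(e)

print(softmax(np.array([1.0, 2.0, 3.0])))
```

Because only standard operations appear, the same code path runs on a GPU when handed a GPU-backed array from a conforming library.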

I'd love to update the Buffer Protocol API. It's a detail in the Python C API which allows zero-copy sharing of a void* pointer to an arbitrary buffer containing bytes. People like zero-copy data sharing! But the way the Buffer Protocol was originally formulated, it only works with data types NumPy supported 20 years ago. We could make it so that when NumPy shares a buffer for e.g. a string dtype array, the buffer has extra format metadata saying "this is a NumPy string dtype buffer with version 1.1 and some basic metadata." This is useful for Cython, and there's a ton of Cython code in the wild, like Pandas and scikit-learn. Cython has neat syntax for passing NumPy arrays via the Buffer Protocol called typed memoryviews.

Right now it's impossible to share any type unsupported by the Buffer Protocol with this nice Cython syntax. Ideally I want to make it possible for anyone to define custom data types that can then be shared via the buffer protocol. That will take some work both in CPython and in the community to enable.
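The zero-copy sharing described above is visible from pure Python with the stdlib `array` module (a minimal illustration; NumPy arrays expose the same protocol, and the format-string limitation is exactly the one mentioned above):

```python
from array import array

# Any object implementing the buffer protocol can be wrapped zero-copy.
buf = array("i", range(12))
m = memoryview(buf)

print(m.format)   # 'i': struct-module format codes are all the protocol
                  # can describe, so there's no way to say "NumPy string
                  # dtype" or any other custom data type
print(m.nbytes)

m[0] = 99         # writes go straight through the shared buffer...
print(buf[0])     # ...so the underlying array sees 99 with no copy made
```

A custom dtype that isn't expressible as a struct format code simply can't be described here, which is the gap the proposed metadata extension would fill.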

What's package management like in Python now?

There's a lot of stuff happening. The Conda ecosystem is really nice. I wouldn't necessarily use the Anaconda Python distribution because of some legal issues, but the community-maintained conda-forge package ecosystem makes stuff instantly available and installable in a uniform way on the main OSs, which is great especially for native dependencies. The classic example is GDAL, a complicated geospatial data analysis library with all kinds of native dependencies, which you could never pip install, but it works through conda.

The Pixi package manager lets you have a lock file defining an environment with all your packages, and has similar goals to uv (uv is to pip as Pixi is to Conda). I'm excited because conda packages can build the package too, so Pixi could recursively build everything in an environment from source, enabling sanitizer testing for packages all over the ecosystem. Many packages are big Cython code bases; Cython is technically a sort-of memory-safe language, but there are styles of Cython where you're essentially writing C with a Pythonic syntax. I've personally shipped Cython code doing undefined behavior, generating garbage data in production and causing incorrect science results. I'm sure if you recursively built any big project's entire dependency stack, including Python, with sanitizer instrumentation, you'd see all kinds of use-after-frees, buffer overflows and other nastiness no one's noticing now. I want to make the process of running tests under sanitizers and other runtime validation tools super easy.

I also want to enable people to reach for PyO3 and Rust, which doesn't have a foothold in the scientific ecosystem; Cython took hold before Rust existed. Currently, no one wants to add the Rust dependency, but I'd like to add a Rust module to NumPy so a Rust compiler becomes a NumPy build dependency, making it more straightforward for any other project to add a Rust dependency.

I don't think C or C++ is a good choice for native, greenfield code in 2026. I'd also always choose Rust over Cython; writing in a safe language with good tooling, one that doesn't compile to generated C, is so much more pleasant. We should be handing people shovels, not grenades.

The main blocker to getting this going in the Python community is adding better support for Rust and PyO3 projects to meson and meson-python. PyO3 is tightly coupled to Python. Maturin manages this complexity, but meson-python will need to do something similar if it wants to directly support PyO3-based Rust projects. See this issue for more details.

What does work right now is putting Rust code in its own package that other projects can depend on. The Rust code is managed by maturin, so we nicely sidestep the issue with meson and meson-python. That said, I'd really like to be able to add Rust code directly to scientific projects and I'm hoping to find a way to get that meson-python feature request through.

What is your elevator pitch for free threading? Who is it for?

It's for people who understand Amdahl's law. You'll always have CPU-bound pure-Python code orchestrating low-level code that releases the GIL, so no matter how well your low-level code scales, pure-Python code holding the GIL will eventually show up as a scaling bottleneck for almost any Python workload if you want to exploit your 100-core Threadripper.
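Amdahl's law itself is a one-liner, and a quick sketch shows why even a small serial fraction (like GIL-bound orchestration code) caps scaling:

```python
def amdahl_speedup(parallel_fraction, n_cores):
    # Amdahl's law: S = 1 / ((1 - p) + p / n).
    # The serial fraction (1 - p) never shrinks, no matter how many cores.
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

# Even if 95% of the work parallelizes perfectly, 100 cores give only ~17x:
print(round(amdahl_speedup(0.95, 100), 1))  # 16.8
```

With 5% of the runtime stuck behind the GIL, 95 of those 100 cores are effectively wasted, which is the argument for removing the GIL from that last serial slice.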

Now that we've gone through the effort and converted all these projects, you can really use your threads! We just need people to use it now.

Scientific computing's not new; what workflows does the free-threaded build help or enable?

Anywhere you have CPU-bound code in Python. If someone has optimized something to death, I doubt the free-threaded build will lead to speedups, but it will really speed up development, e.g. throwing 12 cores on a laptop at a data reduction pipeline. Right now you can do that with multiprocessing, but it might copy a lot of data without you noticing. Pickle also introduces weird caveats for multiprocessing, e.g. with Jupyter notebooks. Free-threading will often be more efficient too.

By the way, we dearly need people to test and report multi-threading performance issues, for example whether a workload is faster with multiprocessing than multi-threading. We're not sure where the scaling issues are because the community hasn't started experimenting yet. Some workflows won't work yet; I wouldn't make my production workflow depend on it. But most workflows should work! I'd love it if people who normally use multiprocessing would give this a try. Like, if you have a process pool executor, what happens if you replace it with a thread pool executor? We'd really love to find and fix the cases where it's slower. Ideally multithreading should always beat or match multiprocessing.
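The process-pool-to-thread-pool swap suggested above is literally a one-class change with `concurrent.futures`; a minimal sketch (where `work` is a stand-in CPU-bound function, not from any real pipeline):

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    # stand-in for a CPU-bound task; on the free-threaded build these
    # calls can run on separate cores simultaneously, with no pickling
    # of inputs or outputs
    return sum(i * i for i in range(n))

inputs = [50_000] * 8

# The multiprocessing version is identical except for the executor class:
#   with ProcessPoolExecutor() as ex: ...
with ThreadPoolExecutor(max_workers=8) as ex:
    results = list(ex.map(work, inputs))

print(len(results))
```

On a GIL-enabled build the threaded version serializes on the interpreter lock for pure-Python work like this, which is exactly the comparison worth reporting.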

Is there any reason to prefer multiprocessing?

One big issue with the free-threaded build is there's only one stop-the-world garbage collector, so many threads can end up waiting on a GC pass, which we'll need to address. In some cases process pools may be better. I ran into a case like this, so it won't always be a complete slam dunk, depending on the region of scaling space you're in.

Maybe a future version of the free-threaded interpreter will have a concurrent garbage collector to avoid scaling issues coming from GC pauses. It's also possible to re-architect extensions so long-running native calculations occasionally check in with the interpreter to allow the GC to run, so these issues can be worked around.

How did free threading finally become a thing?

Sam Gross and the Python runtime team at Meta are very, very smart and found a way to do it without invasive changes all over enormous legacy code bases! There's still a CPython runtime you have to explicitly attach to and detach from, but now without taking a global lock! This builds on the enormous engineering effort to produce mimalloc, and on work from WebKit to develop parking-lot-style mutexes, which directly inspired the design of PyMutex. So it was the right time in terms of technology.

Because of the GIL, the community's multi-threaded C knowledge was low, and injecting some modern sensibility is badly needed. We're finding all kinds of thread safety issues that have been present forever, because no one has been using threads.

How long have they been working on this?

Meta's been working on Cinder, their CPython fork, for a long time. I believe Instagram is running its Django on Cinder, without a GIL. Sam Gross forked Python 3.9 and got it running without the GIL around then. Also, it's not all Sam: the Python runtime team at Meta has a lot of very talented folks working to improve CPython.

At some point, I guess someone decided it'd cost less engineering effort to use the upstream version of CPython, and everyone benefits, really. It's not always possible to align a mega corporation with community benefit, but the stars aligned. PEP 703 was approved in October 2023 and the steering council asked Meta to fund two full-time equivalents to support the ecosystem, which was subcontracted to us at Quansight.

In March 2024 we started on NumPy, Cython, setuptools etc., trying to get the absolute bottom of the stack working with the free-threaded interpreter. It was mostly me for NumPy, fixing global state, though ndarray is fundamentally thread-unsafe. No matter how fancy the lock, you'd have contention, and I don't want to deal with the fallout on something as stable as NumPy. It'd be better to make something with the same interface as ndarray but CPU-only and immutable. ndarray is deeply mutable but doesn't need to be.

After NumPy, I helped get PyO3 working, which is very sensitive to the ABI. The free-threaded interpreter has a completely different ABI, a different layout for PyObject structs... There needs to be Rust code explicitly listing out the ABI down to the byte level, so the pyo3-ffi crate basically rewrites the CPython headers, with extra complexity for older Python versions via conditional compilation. It was a lot.

Thankfully I had a lot of help from David Hewitt who mentored me a bit. It's so nice to have someone who really knows a codebase and can just tell you what you need to work on. I'll have to do this again for 3.15's new ABI and for the new planned stable ABI.

Historically, each release (3.3, 3.4) had its own ABI that you had to build a binary for. Around 3.2 they added a stable subset of the CPython ABI you could build extensions against to support arbitrary future versions. But the PyObject struct is public with two fields, the type and the ref count, so a multi-threaded program taking references to a Python object has to increment and decrement the ref count (a multi-threaded counter) on a shared object. Yikes.

But you can't change the layout because you're exposing the ref count which people use and rely on (in dubious ways). Technically, you're not even supposed to move PyObjects in memory, just use pointers - but people do it! For free-threading, one of the fields is a mutex, so you need a per-object lock - there's nowhere to put it in the original layout. On the free-threaded build, PyObject has a completely different layout. The struct isn't even the same size.

This means the reference count can be split into shared and local references. The cost is that many operations are no longer so simple: Py_INCREF is no longer just incrementing an unsigned integer. Take a look at its implementation!
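That the ref count is a real, observable counter can be seen from Python with `sys.getrefcount` (a small illustration; the exact values are implementation details and differ on the free-threaded build because of the split counts, so only the relative change is meaningful):

```python
import sys

x = [1, 2, 3]                  # a fresh, non-immortal object
before = sys.getrefcount(x)    # the getrefcount call itself adds a
                               # temporary reference to the count
y = x                          # taking another reference...
after = sys.getrefcount(x)     # ...bumps the count by exactly one

print(before, after)
```

Every one of those bumps is a Py_INCREF at the C level, which is why its cost on a shared object matters so much for multi-threaded code.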

In the future, they want a shared ABI so you can ship one binary for both interpreters (but hopefully we'll only have the free-threaded interpreter) which means making PyObjects opaque. Serhiy Storchaka, Hugo van Kemenade, Petr Viktorin are working on this.

What do you anticipate not supporting the free-threaded build?

There are some sketchily maintained or "done" libraries which many projects rely on. But with these radical changes to Python, the assumptions baked into them are no longer true. For example, we did a lot of work on cffi whose author considered it done. There's a long tail of such projects.

Thankfully for CFFI we were able to convince the original author and the current maintainer to accept our patches to make the C internals thread-safe.

But even in the worst case, it's only a few weeks of effort. We were thinking Greenlet would be difficult to support, because it patches the interpreter and makes a lot of assumptions. But Thomas Wouters gave it a look and it's supported experimentally now.

We've been focusing hard on the scientific and AI ecosystem where people are quite excited, but there's also the Python web developer ecosystem. The big reason I did PyO3 was to get cryptography working, which is a dependency of almost every Python web stack. More niche things will definitely lag.

I think the Python ecosystem is ripe for rewriting all these old C extensions. Polars is faster than Pandas and fixed a lot of mistakes in its API, leaving it in the dust.

You've been working on a lot of codebases in this project! How do you cope with that?

I had around 1500 contributions in 2025 according to the GitHub graph. Before Quansight, I had a lot of experience noticing and fixing bugs in random projects in the scientific Python ecosystem, mostly well-maintained packages with communities. But in this project, the packages with one or two maintainers we've been branching out to for free-threaded support are each unique, and we have to figure out how to work with them individually. We don't want to make it so no one can use these packages without explicit free-threaded support, but there are often thread safety issues with static global variables in old C extensions, so we fix that where we can. Lately I've been adding testing to pure-Python projects, but some aren't actively maintained and the patches just sit there awaiting approval. A teammate has a spreadsheet generating reminders to check on our contributions.

This is in no way a criticism of maintainers who aren't looking at my PRs as fast as I would like. It's a reflection of the open source ecosystem and lack of resources available for maintainers of critical but not-so-visible libraries.

How do you go about testing?

For free-threaded build support, we've used pytest-run-parallel, a pytest plugin which runs each test in a suite many times at once in a thread pool, helping to shake out problematic global state. It's not perfect: it won't validate thread safety for mutable data structures, but it will find implementations that rely on global state. If a test suite passes under pytest-run-parallel, it's probably safe to use the library in a multithreaded workflow, as long as you don't share mutable state between threads (e.g. read-only map-reduce operations).

Explicitly multi-threaded tests are better but trickier, because you have to understand what you're actually trying to test and think hard to test something interesting, which is harder than running something automatically over the codebase. And it would be surprising if any of these code bases used threads at all, so we have to understand the project and hope the documentation is good... Sometimes it's more straightforward: if it makes sense to share a mutable object like a counter between threads, you just make sure it accumulates correctly. Sometimes you might only read an object from multiple threads! If there's an obvious multi-threaded workflow, we just set that up in a test with a thread pool. The free-threaded guide documents testing patterns to set that up. We want to improve the documentation and tooling around this.
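A minimal version of the shared-counter pattern described above might look like this (a stdlib-only sketch; the Counter class is a made-up example, not code from any of the projects mentioned):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Counter:
    """A toy shared object; the lock makes increments atomic."""
    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

def test_counter_accumulates_across_threads():
    c = Counter()
    n_threads, n_incr = 8, 1000

    def hammer():
        for _ in range(n_incr):
            c.increment()

    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        futures = [ex.submit(hammer) for _ in range(n_threads)]
        for f in futures:
            f.result()  # re-raise any exception from a worker thread

    # if increments were lost to a race, the total would come up short
    assert c.value == n_threads * n_incr

test_counter_accumulates_across_threads()
print("ok")
```

Collecting `f.result()` matters: a bare `submit` swallows worker exceptions, so a racy crash inside `hammer` would otherwise pass silently.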

If a project only exposes pure functions and doesn't expose any types with mutable state, it's easy to say whether it's thread safe: if there's no global state in the implementation, by definition it's thread safe. There can be weird issues, like a thread mutating the argument to a function like np.array() while another thread is creating an array from the object, but I'm usually assuming no one's intentionally trying to break things like that. It's easy to get bogged down in patterns not indicative of real-world use. For instance, fuzz tests are less impactful because they mostly find things people would never do. In the distant future we should think about that and make things resilient against monkey business, but it's not possible right now. So yeah, there be dragons all around multi-threaded testing, especially where there's nontrivial work with threads, like wrapping a C library. Sometimes it's easy; FFmpeg is 100% thread safe! But if not, there are probably issues hiding in the GIL-enabled build already.

How much Rust is making its way into the Python ecosystem?

There's a move towards basing tooling and new packages on Rust and PyO3 instead of Cython, the C API, pybind11 or nanobind and C++: Polars, Astral's tooling like uv, Pixi, new type checkers etc. A lot of excitement now comes from implementing low-level stuff in Rust. There's a lot for the web. The cryptography project led the way, putting Rust in everyone's stacks, which made it easier for other projects because they could rely on Rust already being there when distributions build nontrivial Python stacks.

Business-y, data science tasks are more often C++. Scientific Python isn't ready for Rust yet - I want to fix the problems blocking more Rust adoption. Many communities are stable, people don't want to add new languages. Others see Rust as complicated and are comfortable with Cython's Pythonic syntax.

I have a good mental model of how to write Rust and, after a few years, can reason out solutions from compiler errors. It isn't easy. But if we start people now and shift in that direction, a lot of things will get a lot easier. Many problems with NumPy exist because changing the C code is so difficult; it'd be nice to work in a language where I can refactor things without feeling like I've broken something.

Do you think Python itself will be entirely written in Rust at any point?

David Hewitt gave a talk about adding Rust to the standard library. Emma Smith is also leading the effort to write a PEP to implement that proposal, currently targeting Python 3.16. A key issue is that Python helps bootstrap Linux distros, and e.g. Debian gives exotic architectures first-class support, which a Rust dependency could block or complicate. But Rust is being included in Linux now, and as it gets incorporated more we could talk about introducing Rust into the Python interpreter. Personally, I think it makes the most sense for CPython to give extensions a Rust API alongside the C API. That might look like upstreaming parts of PyO3 into CPython itself.

Interestingly, you have an old article arguing that Python, because of its libraries, outcompetes Rust at scientific tasks.

That was in 2019, when I didn't know Rust well yet.

You should still probably be doing machine learning in Python, but a lot of your dependencies will be written in Rust. If you're using Hugging Face's tokenizers, that's Rust. However, the core will likely still be C++, PyTorch or such. I don't think it's reasonable for every task to always be in Rust, but I think where it really shines is when you have Python that isn't fast enough. If a single function takes up all your runtime, PyO3 is really well-suited to replace the implementation. But if you want to deal with a complicated C++ library, PyO3 of course doesn't make sense. So many domains are based on a huge corpus of a flavor of C++ I like to call "cmake soup". Rust isn't going to unseat that anytime soon, but I want to enable that possible future without memory-unsafe languages.

How long do you think it'll take for free threading to be everywhere, to be the only build?

Maybe 3.16 or 3.17. It depends on community uptake, how people feel about it and how fast we address the issues the steering council identified. It's not good for CPython to have these ifdefs everywhere when onboarding new devs; two builds add a lot of complexity. Hopefully we can manage this in the next year and a half, address documentation issues and help packages support it. We're making steady progress here.

What would you like to do after free-threading?

I have a list of things in NumPy, like improving the next iteration of StringDType, NumPy's variable-width UTF-8 text data type. It'd be nice for it to support other encodings. It'd also be nice to have a BytesDType to go along with StringDType for cases where there is no valid encoding or the encoding is unknown. If NumPy has first-class support for arrays of text, it should also have it for arrays of bytes. Being able to specify an encoding would help memory-map enormous CSV data sets for performance optimization, addressing problems astropy faces. That's funded by a NASA grant.

What goes into a NumPy NEP?

It's looser than in Python with its steering council, bylaws and formalized rules. If Python's developer community is 100 people, NumPy's is perhaps 10, so there's significantly less social complexity.

When starting a big project, one taking more than e.g. 10 working sessions, it's a good idea to sit down and write out a plan, even sketch a prototype or get something mostly working, to inform a design document. I really like documentation-driven design, where I write the docs or example programs demonstrating what I want to be able to do. After implementing a prototype, you can then compare it with the examples and documentation to find bugs (whether in the implementation or the design).

A lot of my job is communicating, writing things on GitHub, so I try to write as clearly as possible. Drafting a bigger document is enjoyable, though time-consuming and labor-intensive, but worthwhile, especially for technically complex work. Assembling all the details together helps discover mistakes and also gives you something to refer to later.

For my first NumPy project (NEP 55), my main thrust was adding better support for variable-length string arrays to NumPy. NumPy supports fixed-width strings, that is, arrays of strings with a fixed number of characters. For example, scalar items in a "U4" string array are four-character unicode strings. Every entry in the array is a four-character string, and if NumPy is given a longer string to store in the array, it just truncates it. If an item has fewer than four characters, NumPy stores null bytes in the trailing characters, wasting memory.
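Both costs of the fixed-width representation are easy to see directly (a small illustration of the "U4" dtype described above):

```python
import numpy as np

# "U4" = every element is exactly four unicode characters
arr = np.array(["hi", "hello world"], dtype="U4")

print(arr[1])        # longer inputs are silently truncated: 'hell'
print(arr.itemsize)  # 4 chars * 4 bytes of UCS-4 each = 16 bytes per
                     # element, even for the two-character 'hi'
```

Variable-width StringDType avoids both problems: nothing is truncated, and short strings don't pay for the longest one in the array.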

NumPy arrays are strided data. That things are fixed width is deeply baked into the structure and assumptions of NumPy, so adding variable-length strings required some hacks. It took some engineering effort to make it possible to add new dtypes, or more advanced dtypes as user packages. This all builds on that. Sebastian Berg led the effort before me when he was at the Berkeley Institute for Data Science, but didn't manage to completely ship the new dtype implementation before I started on NumPy. He helped me tackle it. The NumPy community also helped a lot; I'd propose a plan and others would give in-depth improvements along with example code!

Is there any documentation that you are particularly proud of writing?

I am really proud of these:

What community development approaches do you like?

Anyone can do this; it just takes a few people to develop the habit. And if you stick a few people in a meeting for an hour, they'll have interesting discussions about something!

Version Control

You worked on a Mercurial client at the Recurse center?

In the Mercurial IRC channel, I'd seen interesting technical discussions and development, e.g. related to scaling, from Google and Meta engineers doing real things in real production code bases. I wanted a concrete project to learn Rust, spread out over multiple files. I wrote some blog posts.

Mercurial's legacy is in Jujutsu, especially influencing the UI with revsets. Many of the Mercurial IRC people are now in the JJ discord too!

What about Mercurial became untenable for Meta, Google that they switched?

All the tooling in the universe is built around Git; a lot of tooling assumes any code in a repository is in Git. The only fix is to treat version control like an abstract system with an adapter for each alternative, which no one wants to do. So bootstrapping off Git makes a lot of sense today. There were attempts to make that work, but e.g. Bitbucket eventually decided it wouldn't support hg repos anymore, deleting all the repos, issues, PRs, discussions etc. But Bitbucket was already terrible; for open source you had to use GitHub by 2015 at the latest, even for super-holdouts like CPython. Mercurial's death knell was Meta forking it (into Sapling). Before that, most dev effort in Mercurial was from Meta, but then they stopped contributing upstream...

Why did Meta move to Sapling?

I think Mercurial kept making Meta's goals harder than they needed to be. At first, Mercurial was quite lively, supported by Google Code etc., but as fewer people used Mercurial it stopped being worth it to upstream things. But I'm just speculating.

What do you want in a future forge?

To really replace Git, you need it to be usable for things besides text files. People want to version arbitrary binary files, natively, and semantic diffs as a first-class thing, not just binary blobs. But I don't know how to actualize this nebulous concept.

I think good jj-native hosting will push things off the edge from Git. Maybe East River. I'm skeptical of Tangled; was atProto really the best choice? We all know GitHub is shitting the bed, e.g. IP-blocking South America or being unreasonably slow on Safari due to a CSS bug. Or just generally being down a lot. It would take a lot of engineering to replicate GitHub Actions, but do you really need to?

What different version control workflows have you had?

It's easiest to talk about in terms of metaphor. There's this blog post from 10 years ago that sarcastically offers some "git koans". Things have improved a bit since then and there's work going on to improve Git's documentation and UI, but Git's UI is still arbitrarily complicated for no reason. git checkout is a homonym of three different things! Mercurial's (or jj's) UI is well composed; each command's a verb doing just one thing. Most commands accept -r or --rev, where a revset can be a commit hash, branch name, change ID or an expression in the revset DSL, letting you search the commit graph with predicates! hg rebase or jj rebase works the same way with a source and base concept, not present in Git.

In Mercurial (and Jujutsu), a commit can "evolve", starting as a draft and then getting published. You can rewrite draft commits, while published/public commits are immutable (without --force). While doing a code review (e.g. a PR), commits are in the draft phase and you can do whatever with them.

There are two approaches to version control. Some treat it as a backup, pressing "save" every now and then. These people won't understand crafting commit messages. But you can also treat a commit like documentation: after you hack your way to a fix, squash it into a single commit which you then split into a reasoned history. Doing this, you'll very quickly run into more complicated situations where Mercurial or jj come in handy. And jj rebase is of course already more pleasant if you keep a long-running branch which you need to rebase every now and then.

Burnout

I definitely don't work on code very much outside of working hours. I burned out pretty hard, quit my job in 2020, and was not working for a few years. I have tried to structure things so that work is timeboxed to 35 hours a week. At Quansight I have to put hours in a time sheet, which timeboxes me, so they can't legitimately ask me to be on call on the weekend or something. That's just not something Quansight does.

While I was at the Recurse Center I had to job hunt too, although I ended up finding a job without going through Recurse Center. I actually almost got a job on the test team for the Windows probes at CrowdStrike. So I was almost on the team that caused the CrowdStrike global outage.

What did you do with your time after quitting?

Not much. It was the pandemic. So the first bit of that was just kind of feeling like, yeah, I'm glad I don't have a job right now. And then after that I was like, okay, I'm going to take a break for six months. And then six months passed and I was like, yeah, I don't really want to do it anymore. And my mental health wasn't getting any better. It was getting worse, in fact. My wife encouraged me to seek therapy, and I started medication, which also helped a lot. I don't mind publicly saying that. My SciPy 2024 talk covers this experience and the period after it, where I sort of got my legs back working on NumPy stuff. But yeah, I really wish I had started antidepressants in grad school. It would have helped me a lot. For some reason I had a mental block, had told myself it would be cheating or that the medication is bad. There were lots of incorrect presuppositions that probably should have been disabused. But you know, I made my own way in the world and didn't have a lot of help from adults. At the end of that period, I reached out to Ralf Gommers and said I was interested in looking for work.

I was at Quansight from October 2019 to March 2020, so about six months. Then I started again in October 2022 and haven't stopped since, first at 20 hours a week, then after a year at 35 hours a week when I felt I could do more. I do more engineering management too. Ralf transitioned from head of Quansight Labs, an internal division inside Quansight, to CEO of Quansight PBC, which made me take on more responsibilities. He was leading the project we're working on to do the free-threaded support stuff.

What other interests do you have?

I quit in 2020 because I was thinking about work all the time. It didn't feel like I could be physically active, spend time with my wife and actually fulfill job requirements. I could pick two, not three. My mom died in 2018; she was very unhealthy and I could see myself going down that path. So I took up running, rock climbing and biking, and lost 50 pounds. I was looking for things that aren't programming, while getting paid to program too.

What would you suggest people do to better segment work from the rest of their life, or mentally handle it? And as a manager, do you have ideas for how you can help your team with this?

People are humans. If you have problems, it's okay. That's the Dutch management style, which I learned from Ralf: you just say what you mean and you accept people as humans. Sometimes people aren't able to do what we're asking of them, so I try to make space for them to contribute. Reasonable expectations are important. Maybe you can change their project to align better with their experience. But they're adults; I can't force them, only make them aware of expectations. If they won't do it, there are consequences.

A good thing about the free-threaded project is that everyone's very experienced. If I run into a hard technical problem, I can often just ask someone to arrive at an answer and fix it. In my past experience, the most I could do was file an issue and hope someone checked it out.

Working with really talented excellent people makes managing them and keeping track of what they're doing really easy because you don't have to micromanage, just look at results!

rebeca

In academia, I was into open science. In astrophysics people would often keep their secret sauce, publishing papers about simulations whose code and data was private.

This is so relatable. I remember having to email authors hoping to get data that wasn't fully disclosed. And for a simple grad school thesis, I was told I shouldn't "share the full code", urgh.

I like your approach @ngoldbaum :)