Code Review Can Be Better
99 points by matklad
99 points by matklad
Hey! I am the person who gave that talk eight years ago (gosh).
I think it’s important to the workflow (as you discovered) that the changes to CR comments are intermingled with changes to the code – when you make a commit that addresses a comment, you atomically XCR the comment back to the original person in that same commit. No friction from wrangling the CRs separately.
When I’ve thought about bringing CR-comments to the outside world, while still preserving legible history, I have imagined doing it as a “cleanup” pass: do the review, and once all comments have been addressed, rewrite each commit in the branch without the CR comments (removing any empty commits that result), and then go through and squash things until your history looks reasonable, then merge that. Before Jane Street I was a big fan of pushing fixup commits during review; if you address CR issues as fixup commits, then once those fixup commits have the comment bits stripped out, they should apply cleanly. In my head this would work, but I’ve never tried it. I guess you could also just add CR comments in the original commit that you think should be fixed up… but I dunno. Jane Street kinda treats “one branch” as the atomic unit of a change rather than “one commit” which makes a lot of this simpler.
You could keep the original branch around (pre-squash) with some separate archived branch name or something so that you can consult the comment history, but we basically just try to make sure that anything interesting/valuable in the CR history survives as a real comment in the code. We don’t treat all CR comments themselves as part of the historical record, the same way that we don’t e.g. record every in-person conversation about the code. Occasionally you’ll run across comments like this:
(* person1: why do we do this?
person2: a long and detailed explanation of why we do this
person1: but why don't we do x instead?
person2: x wouldn't work because... *)
Which you can tell was someone who just converted a CR into a regular comment instead of deleting it or rewriting it as a more normal comment. Dialogues can make good documentation!
The “brain” part of Jane Street’s review system (“remembering what I’ve already reviewed as a literal diff”) makes it easy to “fearlessly” squash and rearrange history and be confident that you haven’t changed anything about the final state of the code. It’s… technically easy to keep track of that… but actually adopting that requires some way to visualize ddiffs which makes it kinda high cost to try out. I’m not even sure if Jane Street’s open-source patdiff includes the ddiff stuff…
You could keep the original branch around
Yeah, that’s why I want to table this until git’s ChangeId story completes. It feels like keep a previous version of the same commit might be the trick!
Somehow forgot to mention: thanks a lot for that talk! And the Janet book! And the weird unicode braille distribution visualizer!
Why not ditch GitHub PRs entirely and just use git? This is how we (the build2
project) do it:
Things that require reviews (features, tricky fixes) are developed in branches. Trivial changes go directly to master
without reviews.
Once the branch is ready for review, the reviewer pulls it down and reviews the code directly in their favorite editor/diff tool leaving comments in code or fixing stuff directly (just as you have described). The result of the review is a commit in that branch.
The author addresses the issues identified during the review, if any, and adds another commit on top. This may or may not require another review. If it does, the process repeats with another review commit, etc.
This naturally extends to more complex features that are developed and reviewed in stages. We often have branches that are developed for months with tens or even hundreds of commits that implement and review various aspects of the feature.
Once the branch is ready, we merge it to master
and clean up the messy history by squashing relevant commits together. For more complex features we also keep the branch with the original history around for future reference.
The biggest inconvenience of this approach that I can think of is the need to periodically rebase this branch on new commits in master
.
This roughly what we wanted to do with git-review. Though, we had an extra constraint of “review is a HEAD commit” because we do care somewhat about git history, and worry that squashing/rebasing after review but before merge can introduce bugs (reverting review commit feels more reliable).
But I a, starting to think that maybe it’s a good idea to let go of the history constraint. Jane street thing also encourages messy history!
maybe it’s a good idea to let go of the history constraint
This gets at an idea that I’ve touched on before, which is that we shouldn’t conflate the unit of history with the unit of review. If I’m interpreting you right, when you say you want to “keep the history”, you mean the history of how the change evolved over time as part of the code review process. But that is at odds with the goal mentioned in the interdiff overview* about wanting git bisect
to work well, and the general notion of (finished, complete) changes being committed atomically to main
.
So yes, we should find a way to keep the history of review (in-repo if possible!), but we shouldn’t assume that that history of review needs to be comingled with the git log
of finished-and-reviewed changes landing on main
.
*which I had never seen before, even though I’m a huge fan of that workflow, so thank you for that link!
I’d be a bit sad about losing the review comments and process history in squashing the commits since I find it helpful when tracking down issues to be able to read as much of the discussion around a piece of code as possible. Are you saying that you squash and clean up after the merge to preserve the review history?
Thinking out loud here I guess I could just save a copy of the branch before squash+merge..
Yes, for more complex features we preserve the branch with the original history in all its messy glory. The history is only cleaned up/squashed in master
. You could also add an explicit reference to the branch in the description of the squashed commit.
How do you remove review comments from the code? Just a manual commit with negative diff after receiving LGTM, or something smarter?
Not sure I entirely understand the question. The reviewer normally only leaves review comments when something needs addressing which are then removed by the author in the commit that addresses them. I can see how there could be review comments that are really questions or requests for clarifications (so in a sense the reviewer and the author are having a chat in the code). My personal solution to this is quite unorthodox: I strongly prefer to do reviews with the author on a voice call (with my screen shared so they can see the part of the code I am looking at). This way I can ask any questions or get clarifications directly (and which we may decide to add as code comments for future reference).
Great article, I resonate with it quite a lot, which is why I’ve been slowly working on something similar called git-pr: https://pr.pico.sh
It’s the same idea, use comments in code, but leveraging range-diff (e.g. https://pr.pico.sh/rd/101) with patchsets generated from git-format-patch and an SSH app to push and pull patches.
I haven’t quite figured out comments outside of code because I do see a need for something closer to GitHub pr comments. Right now all we allow is an empty commit for users to submit git format-patch --cover-letter
to add the top-level “review” response. I keep going back to git-notes but it seems potentially annoying to use regularly
I immediately thought of git-pr as soon as I read this post. I think the main difference is that they wanted to keep all the review in a single commit instead of a back-and-forth between the reviewer and the author, and that causes some friction. But also they wanted it to be more “interdiff-like” meaning the original commits themselves end up getting modified, resulting in a lot of history rewriting.
IIRC the git-pr model is append-only, right? The main thing I’m getting from this is that doing better than github pull requests is not that hard, but supporting interdiff flow is quite hard.
I’m not exactly sure what you mean by append only in the context of patchsets. Basically in the git-pr model, reviewer and contributor can submit patchsets to the same PR and you can rearrange commits all you want. You also don’t have to pull down the reviewers patchset at all and we figure out how to range diff between patchsets.
I haven’t quite figured out comments outside of code because I do see a need for something closer to GitHub pr comments
Yeah, a particular killer features is to be able to comment from the browser as well. I do serious review locally, but if I can give a quick suggestion on my phone while commuting, that’s also valuable. I wonder if those could be stored as git notes?
Another thing I think I like about git hub PR model is that it is branch based, not patch based. I like to review code in the context it was written, not relative to the current main. It’s the job of not-rocket-science rule to then check that applying the changes from that branch to main make sense
EDIT: added git-pr to the list of links at the bottom, sorry, should have been there from the start!
Agreed. Sometimes I can approve and merge from the UI without pulling locally.
I also think massive PRs is another use case that some of these tools cannot accommodate successfully.
People laude gerrit so I’ve been drawing inspiration from that service when it comes to enterprise level code collab.
Thus, git-review was born. The idea is simple:
- Code review is a single commit which sits on top of the PR branch.
- That commit adds a bunch of code comments with specific markers.
- Review process involves both the author and the reviewer modifying this top commit (so, there’s a fair amount of git push –force-with-lease involved).
- The review concludes when all threads were marked as //? resolved and an explicit revert commit is added on top (such that review is preserved in the history).
And when multiple people review the same PR? Maybe at different points-in-time, between which the original author has pushed some intermediary commits? I’m sweating bullets just thinking about merging all of those conflicting commits together, especially if the process relies on --force
– !
In our context, we have one reviewer and one author, to avoid bystander effect, so multi-way conflicts are not an issue. If there were many reviewers, I wouldn’t expect that to be too bad: the key thing is that —force-with-lease
(not plain force!) gives you cas. The problem are “spurious” merge conflicts, which are a pain to resolve, at least without writing a custom merge tool or something.
What we did incorporate from git-review experiment is that now, for smaller changes, reviewer directly pushes to the PR branch, rather than round-tripping through suggestions. Works great!
Not trying to refute the article, genuinely curious what other real humans here think:
Where do you draw the line, because a thorough enough code review approaches “why didn’t you just do this yourself” territory. When I’m one of a couple seniors reviewing MRs from 10 other people daily… I mean, I wish I could 100% every review, but at some point, I have to trust that the author has done their due diligence, and keep my reviews focused on non-obvious issues, and applying some forward thinking to not be backed into future corners that I know about. It does miss sometimes, but that turns into a learning opportunity for the MR authors— they can’t exclusively learn from you fixing everything for them.
Anyway I do agree with everything here. There is a GitLab plugin for IntelliJ that does a pretty good job at emulating this, but I get it, reviews could be further codified into source, since it’s a critical piece of maintenance information. It’s probably different if you aren’t a corpo overworked code monkey too I guess -_-
Depends on context! Two useful opposite points:
Depending on the consequences of code changes for your project, you need different review strategies. But it helps to be explicit about which strategy you use.
Require that each change is essentially fully owned by two people, don’t make a distinction between the author and reviewer, require them both to load the change fully in their heads.
Maybe it’s just me, but no matter how hard I try to “load”, the result is never even close to when I implement the thing from scratch myself. Maybe the solution is for two people to separately implement the thing and then compare the implementations? Wonder if anyone does that and how it fares…
It’s hard! Some tips that work for me:
My top tip when reviewing what I haven’t wrote from scratch is to first sit down with a pen and paper and think through all the nuances and corner cases that I would need to take care of were I implementing this myself. The make sure they are taken care of during the review.
I once made a short presentation about a generic way to put the reviews in the git repository itself.
The solution space is large, I ended up with proposing:
reviews/<branch-name>-<reviewer>.org
but you are free to use any kind of file format you should prefer).Here is a tool doing this for you: https://git.esy.fun/yogsototh/gpm#headline-8 (I hope you will not be blocked by anubis :/)
If you are curious, the tool I wrote is clearly just a proof of concept, very minimal, but the principle are easy to grasp. The presentation is here: https://her.esy.fun/slides/git-project-manager.html
There could be a lot more ways to perform reviews using git-only resources. Another option that would still allow users to perform code change but reduce the risks of conflicts would be to introduce a minimal tool that prior to merge a branch foo
would create a new branch foo-review-<reviewer>
that you could pull compare and merge parts into foo
.
Code review sucks because you’re reviewing the code after the author’s intent in making the changes has already been lost
This assertion makes no sense to me. If the author’s intent isn’t clear from the changes (and the commit message, and any linked issues, and…), then the change probably doesn’t make sense and shouldn’t be submitted. This would be a code review win.
Say the change is “add an extra argument to all calls of this particular function”.
The code reviewer is responsible for working backwards from all the touched calls to the original intent. To spot a mistake in the process they’d basically forced to do all the work over again to be sure because add/change conflicts with call sites are de rigeur on Github
When I’m reviewing code, if the reason for a change isn’t either obvious or explained then I won’t accept it. There’s no way I can sensibly review such code.
It’s not my job (as a reviewer) to attempt to read the mind of the author or to figure out why they did what they did. That would be way too time consuming. If they can’t explain their own changes when asked then it’s likely not a change that needs to be made.
I think @conartist6 is right if we understand “author intent” in line with the famous ‘Programming as Theory Building paper’. Sure, here both the author and the reviewer (ideally) have some theory about the code already, but for a large PR (e.g. for a completely new, large feature) even the best code in itself may not be enough.
Edit: I didn’t scroll down to @jdp ’s comment, which says the same and more!
If that’s the commit message, it’s a bad one. I’m probably guilty of this type of commit message myself, more often than I’d like to admit, but I do try to make commit messages with Why, not just what, and often no what, because that’s in the diff anyway
I would not read a single line of code and immediately require a better commit message. Again, that’s a win of code review - this message making it into the commit history would be a mistake. And chances are there are problems with the change anyway if that’s the message that the author decided to associate with it.
It might sound punitive or nit-picky on the surface, but demanding that engineers are consistently good at this type of thing is part of what makes a team run well. Being good at this sort of thing isn’t difficult, but it does require consistent effort.
This is the point being made in Programming as Theory Building. It’s why for consequential changes I like to separate “design review” from “code review,” the design review is the step where we all agree on what we’re building and how it should work, the code review is the part where we just confirm that the code works the way we expected it to work and does what we said it would do.
I really like the Rust RFC process and have tried to introduce it (or something approximating it) in most places I’ve worked for the last 10 years: https://rust-lang.github.io/rfcs/
Not sure “design review” has the right flavour to me since I think the design phase should be more collaborative and not so much split into author and reviewer, but I think the core idea is great.
Isn’t that intent captured in the PR description?
Perhaps, but the difficulty is that the reviewer should be responsible for verifying that the committer actually did what they intended to do, and for that the PR description is no help. For example they may have intended to rename all call sites and say so in the PR description, but then missed a few
This is a real issue, and I have sometimes wished for review tooling that can assist with this sort of thing; for example, some way to verify that a diff is equivalent to some regex replacement, or that all references to a modified symbol were updated.
On the other hand, enforcing cross-file constraints (e.g. like call sites needing to be be renamed) is exactly the sort of thing that type systems are good at! The reviewer just has to check for any unintended changes. Failing that, unit tests can be good as well (in much the same way as when if one really cannot express an invariant via SQL constraints / foreign keys / etc, falling back to a daily scan for violations is usually an option).
On GitHub PRs, every new tool you introduce is forced through a few same-looking UX widgets: it’s a bot that comments, or a “check” integration - tests, linter, scans - that becomes a little green checkmark or a red X. Finding the real information from these things 1) means drilling down into a check and looking at raw terminal logs in a narrow web interface, or 2) navigating away via a hyperlink (probably posted by a bot in a comment) to a bespoke interface for the tool.
To be fair, every project is different, and designing a system that works for 100 million coders is going to push you towards this lowest-common-denominator approach necessarily, and I’m glad it’s not my job to figure out how to make it better.
This triggers two thoughts:
Sometimes we @-mention Claude in a GitHub comment and it’s horrible. The review is fine for an LLM, but I just don’t like that it’s a verbose comment. I want the UX to squirrel that stuff away.
I’ve never been a big fan of “chatops” automation either in slack or as github comments. :-/
Hey @matklad, just wanted to drop a note for future research on this topic is to look into Perforce’s “Swarm” Code Review tool. Most of the game industry is built around it, for better and worse. But notably it has essentially stacked PR’s, the comments are versioned with the review version, and all of that good stuff.
https://www.perforce.com/products/helix-swarm
You can self-host it and perforce for free.
Seriously these code review system has ve used is bugzilla!
Specifically the very very very customized webkit bugzilla.
Alas I believe that they have also switched to the dumpster fire that is GitHub.