Code Review for Claude Code
2 points by jbranchaud
Seems very cargo culty to have AI do an actual code review, as one of the most important points of code review is to propagate change and systems understanding among team members.
If all you want is Claude to run an eye over the changes, surely you can put a definition-of-done checklist in claude.md and fix the jank before you even commit?
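As a sketch, such a definition-of-done section in claude.md might look something like this (the specific items are hypothetical and project-specific; Claude Code reads this file as standing instructions for the project):

```markdown
## Definition of done

Before considering a change complete, check:
- [ ] New behavior is covered by a test
- [ ] Comments and docs near the change are still accurate
- [ ] No leftover debug prints or commented-out code
- [ ] Copy-pasted blocks were actually updated, not just duplicated
```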
Sure, but I would distinguish between having the AI "do" the code review, and having the AI "contribute to" the code review. If the AI is simply pointing out potential issues, it isn't taking away from the human review process and might be enhancing it.
Basically agreed.
I have become quite positive-minded about LLM pre-reviews as I call them, not as a substitute for a human review, but as a first pass to catch things that humans are bad at spotting (and to protect the precious attention budget of fellow team members, which is better spent on the juicy parts of the review after the obvious stuff has already been dealt with. I'll claim without evidence that leaving obvious mistakes in a PR, even just in the review history, is slightly distracting and can cause more interesting points to be missed by sapping focus).
These things can be good at spotting copy-paste mistakes, divergences from specs, and missed updates to sections of code/comments that are way outside the diff hunk window. (All of these are quite tedious for humans to notice.) I have a basic tool that I run on my machine for my own code, and I have also been running it in advance to review my colleagues' PRs. It catches a lot of good stuff; I feel it covers some of my blind spots and I cover some of its blind spots (as it also misses some stuff, of course). (Mind you, my tool doesn't cost $15 per review like Claude Code Review :), maybe I'm not prompting hard enough or whatever.)
In that case it doesn't even need to be pushed up as a PR until that pre-review is complete; it'd be better if it wasn't, actually. Keep the PR history clean and distraction-free.
I find it odd that the two trigger options you can configure are 1) on PR open or 2) after every push. Neither of those is what I'd want. I'd prefer to manually trigger the review with a PR comment like "@claude-code-review go for it". My PRs often aren't ready for this kind of review the second I create them. https://code.claude.com/docs/en/code-review
I've reviewed/commented on an incredible number of llvm-project patches (https://github.com/pulls?q=repo%3Allvm%2Fllvm-project+is%3Apr+-author%3AMaskRay+commenter%3AMaskRay , 2200+). I used to squeeze some of this in during the workday at my previous job, but these days it's strictly a nights-and-weekends effort.
I've now created ~/.claude/commands/review-patch.md to help me review patches. It's a powerful tool for accelerating the review workflow, though it's important that I make the technical judgments myself.
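A rough sketch of the shape of such a command file ($ARGUMENTS is Claude Code's placeholder for the text typed after the slash command; the prompt wording here is illustrative, not my actual file):

```markdown
Review the patch identified by $ARGUMENTS (a PR number or commit range).

Report, in order:
1. Commit message: does the description match the actual diff?
2. Tests: coverage, readability of checks, unexplained output changes.
3. Code: behavioral regressions, wasteful allocations, dead fields.

Finish with a "Blocking" list of issues that must be fixed before landing.
```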
Example output:
Commit message: Needs work.
The PR description is stale — it still references getCodeCompressionGroupId()/getDataCompressionGroupId() and the CRTP hook design from the first iteration, but the actual diff uses getSectionName() + glob-based matching with a single getCompressionGroupId().
The description should be rewritten to match the current implementation. Also, "Building on top of the --bp-compression-sort-section=<glob> flag from #168150" is confusing since commit e62a0e7 from #168150 is squashed into this very PR.
Tests: Minor issues.
- bp-section-orderer-cold.s: The verbose output check COMPRESSION: Compression groups: 3 and COMPRESSION: glob 0: etc. prints glob indices rather than the actual glob pattern strings. This is not user-friendly for debugging — a user seeing glob 0 has to mentally map it back to their command line. Consider printing the glob pattern.
- The existing bp-section-orderer.s test has data order changes (d4,d1,d3,d2 → d3,d2,d4,d1). This is expected from the behavioral change but should be called out in the commit message.
- compression-order-sections.s (MachO): The SPLIT check says glob 0: 6 sections and glob 1: 1 sections. Only 1 section matches __TEXT* — that's _main. But there should be more __TEXT content from data_* symbols that get their own subsections? Actually no, data_* are in __DATA. The __TEXT segment only has _main in __text. Looks correct, but the test could benefit from a brief comment explaining why the split is 6/1.
Code: Several issues.
1. Behavioral regression in --bp-compression-sort=data: Previously, forDataCompression=true ordered all non-SHF_EXECINSTR sections. Now it translates to .data* + .rodata* globs, which is strictly narrower — .bss*, .tdata*, .tbss*, and custom data sections are all excluded. The test output changes confirm this. This semantic change to a (now-deprecated) flag should either preserve the old behavior or be explicitly documented as intentional. For exact backward compat, the glob would need to match "any non-code section," but globs can't express negation, so perhaps the translation needs a different approach (e.g., a special internal marker for "all data").
2. std::string getSectionName() is wasteful for ELF: sec.name is already a StringRef. Returning std::string forces an allocation for every section on every glob match. The CRTP hook should return StringRef where possible. For MachO, the concatenation (sec.getSegName() + sec.getName()).str() requires a temporary — confirm this compiles (StringRef doesn't have operator+; you likely need (Twine(sec.getSegName()) + sec.getName()).str()).
3. Dead field bpCompressionSortSections: Both ELF/Config.h and MachO/Config.h store std::vector<std::string> bpCompressionSortSections but it's never read after the globs are parsed. Remove it — the GlobPattern vector is sufficient.
...
Summary: The core idea — replacing the hardcoded code/data split with glob-based grouping — is good and the right direction. However, the backward-incompatible semantic change to --bp-compression-sort=data is a blocking issue that needs to be addressed (either preserve old semantics or justify the change). The stale PR description, dead config field, and the std::string allocation per section are secondary but should be fixed before landing.
Blocking:
- --bp-compression-sort=data behavior change (narrower matching than before)
- Stale PR description that doesn't match the implementation
I use rr record and rr replay with gdb a lot when reviewing patches. Claude Code can use non-interactive gdb (gdb -batch). I hope Claude can give more insight if it runs some tests under gdb.