Behind the Scenes: Hardening Firefox with Claude Mythos Preview
33 points by freddyb
I wish they’d publish the prompts and other tooling used when doing this work. Maybe include the prompts in the bug reports or resolutions for reproduction. They called out non-Mythos models, so some of this work could be immediately useful to others.
There's a really low bar for most projects. You can start with a basic "review this project for security issues starting from (file) and listing all candidate paths", then follow each one up with "validate this report, produce a proof of concept". It's hard not to find something this way today, using Opus.
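Roughly, as a sketch of that two-pass workflow (assuming the Anthropic Python SDK; the model id, entry-point file, and prompt wording here are placeholders, not what Mozilla or Anthropic actually used):

    # Two-pass review: (1) enumerate candidate issues, (2) ask the model to
    # validate each one and produce a proof of concept.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-opus-4-5"       # placeholder model id; use whatever you have access to

    def ask(prompt: str) -> str:
        msg = client.messages.create(
            model=MODEL,
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    entry_point = "src/parser/input.c"   # hypothetical starting file
    source = open(entry_point).read()

    # Pass 1: broad review, listing candidate vulnerable paths.
    candidates = ask(
        f"Review this project for security issues starting from {entry_point} "
        f"and list all candidate paths:\n\n{source}"
    )

    # Pass 2: follow up on each candidate (naive paragraph split) and
    # demand a proof of concept, discarding false positives.
    for report in candidates.split("\n\n"):
        validation = ask(
            "Validate this report and produce a proof of concept. "
            f"Say clearly if it is a false positive:\n\n{report}\n\n{source}"
        )
        print(validation)

In practice you'd let the model read files itself (e.g. via an agent harness) rather than pasting one file inline, but the two-step "enumerate, then validate" loop is the core of it.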
I don’t disagree with this; it’s easy to run and find more. From an open source point of view, though, it would be insightful to see what was actually used to find the issues. We see the bugs and we see the fixes, so why can’t we also see the process used to uncover the bugs? Why not also publish the prompts as a sort of unit or BDD-ish test to avoid regressions?
Do you think the prompts are anything more than "look for security vulnerabilities in this codebase"?
Say what you want, but this is really impressive. They found 271 security issues with Mythos and 423 in total. 180 of those were high severity, and some of them were two decades old.
It's not completely clear how fair the Opus 4.6 / Mythos comparison was. There's an implied result that Mythos found "271 bugs" in code that had previously been scanned the same way by 4.6, but the article doesn't quite say that. Were there simultaneous changes to the research harness?
> one of several sec-high issues we fixed involving XSLT
Have to imagine this one is here because of the fuss about removing XSLT.
What I'm most curious about here is how many false positives were reported as well. Did the models report twice as many potential vulnerabilities, with these being the confirmed ones? Do the models reproduce an issue before reporting it? In the issues that aren't hidden, I see comments about attempts to reproduce, possibly from bots that were already in place.
I'm not familiar with Firefox's practices around this, either in general or now with AI involved, so it would be very interesting to get more detail here.
I’m also curious to know the rate of false positives. There’s a bit more detail in the earlier blog post "Hardening Firefox with Anthropic’s Red Team". It doesn’t contain specific measurements of false positives, but this quote suggests that the rate was low:
AI-assisted bug reports have a mixed track record […] Too many submissions have meant false positives […]. What we received from the Frontier Red Team at Anthropic was different.
[…] Critically, their bug reports included minimal test cases that allowed our security team to quickly verify and reproduce each issue.
That blog post also links to Anthropic’s write-up, "Partnering with Mozilla to improve Firefox’s security", which describes two ways in which Anthropic employees assisted their AI tool in reducing false positives:
When doing this kind of bug hunting in external software, we’re always conscious of the fact that we may have missed something critical about the codebase that would make the discovery a false positive. We try to do the due diligence of validating the bugs ourselves, but there’s always room for error. We are extremely appreciative of Mozilla for being so transparent about their triage process, and for helping us adjust our approach to ensure we only submitted test cases they cared about (even if not all of them ended up being relevant to security).