cursed_browser: A web browser with no rendering engine — the VLM reads the HTML and hallucinates the page

80 points by aoeu


nickmonad

This is great! Fun idea.

I was hoping for worse results in the examples though. All of those in the cursed browser are actually... not that bad? There's gotta be some real crazy results out there.

lcamtuf

The idea is funny, but also fails to deliver on the premise? "Every page load is a surprise. Every render is a work of art." - the demos on the page don't really make the case. The only surprising parts appear to be the images, which I assume aren't fetched, so the model just comes up with something plausible.

I think the models have gotten pretty good at putting text in images, so there's not a whole lot that can go sideways if you're basically just telling them "render several paragraphs of text", especially not for websites the LLM is already visually familiar with, like HN, Wikipedia, or the Acid test. They're probably recreating the general vibe from training data, without needing to interpret the CSS exactly right (if that CSS is fetched at all).

They're also not very good at prompt adherence if you tell them not to use tools, so even though the prompt says "don't use a HTML renderer", I wouldn't trust that instruction to be effective.