Using edge detection to preserve significant features while downsampling
23 points by Yogthos
Most pixelation libraries take the lazy route: they just downscale the image and then upscale it back with nearest-neighbor interpolation. It's fast, but the results are usually pretty messy because the grid cuts right through important features like faces or distinct edges. You lose a lot of the actual character of the image that way.
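To be concrete, the naive approach is basically this (a minimal browser sketch, not any particular library's code; `naivePixelate` and `blockSize` are just illustrative names):

```typescript
// Naive pixelation: downscale, then upscale with nearest-neighbor.
// The rigid block grid ignores where the image's features actually are.
function naivePixelate(src: HTMLImageElement, blockSize: number): HTMLCanvasElement {
  const w = src.width, h = src.height;
  const smallW = Math.max(1, Math.round(w / blockSize));
  const smallH = Math.max(1, Math.round(h / blockSize));

  // Downscale; the browser applies its default smoothing filter here.
  const small = document.createElement("canvas");
  small.width = smallW;
  small.height = smallH;
  small.getContext("2d")!.drawImage(src, 0, 0, smallW, smallH);

  // Upscale back with smoothing disabled, i.e. nearest-neighbor.
  const out = document.createElement("canvas");
  out.width = w;
  out.height = h;
  const ctx = out.getContext("2d")!;
  ctx.imageSmoothingEnabled = false;
  ctx.drawImage(small, 0, 0, smallW, smallH, 0, 0, w, h);
  return out;
}
```

Every block boundary falls at a fixed multiple of `blockSize` regardless of where the features are, which is exactly the problem.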
I tried something different: I treat pixelation as an optimization problem rather than a simple scaling operation. Instead of forcing a rigid grid onto the image, it adapts the grid to match the underlying structure of the picture. The end result is pixel art that actually preserves the semantic details of the original image while still nailing that low-res aesthetic.
The secret sauce here is an edge-aware algorithm. It starts by running Sobel operators to detect edges and compute the gradient magnitude. Once it has that data, it initializes a standard grid but then iteratively moves the grid corners around to snap them to the detected edges. It uses a search-based optimization that tests nearby positions for each corner and finds the spot that aligns best with the image boundaries.
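To give a rough idea, here's a CPU sketch of those two steps. This is my own reconstruction from the description above, not pixel-mosaic's actual code; `sobelMagnitude`, `snapCorner`, `radius`, and `drift` are all assumed names and parameters:

```typescript
type Corner = { x: number; y: number };

// Sobel gradient magnitude over a grayscale image (row-major, values 0..1).
function sobelMagnitude(gray: Float32Array, w: number, h: number): Float32Array {
  const mag = new Float32Array(w * h);
  // Clamp-to-edge sampling so the kernels work at the borders.
  const at = (x: number, y: number) =>
    gray[Math.min(h - 1, Math.max(0, y)) * w + Math.min(w - 1, Math.max(0, x))];
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      const gx = -at(x - 1, y - 1) + at(x + 1, y - 1)
               - 2 * at(x - 1, y) + 2 * at(x + 1, y)
               - at(x - 1, y + 1) + at(x + 1, y + 1);
      const gy = -at(x - 1, y - 1) - 2 * at(x, y - 1) - at(x + 1, y - 1)
               + at(x - 1, y + 1) + 2 * at(x, y + 1) + at(x + 1, y + 1);
      mag[y * w + x] = Math.hypot(gx, gy);
    }
  }
  return mag;
}

// Local search: move a grid corner to the strongest edge within `radius`
// pixels, with a small penalty on drift so corners stay near the grid.
function snapCorner(c: Corner, mag: Float32Array, w: number, h: number,
                    radius = 4, drift = 0.05): Corner {
  let best = { x: c.x, y: c.y };
  let bestScore = mag[Math.round(c.y) * w + Math.round(c.x)] ?? 0;
  for (let dy = -radius; dy <= radius; dy++) {
    for (let dx = -radius; dx <= radius; dx++) {
      const x = Math.round(c.x) + dx, y = Math.round(c.y) + dy;
      if (x < 0 || y < 0 || x >= w || y >= h) continue;
      const score = mag[y * w + x] - drift * Math.hypot(dx, dy);
      if (score > bestScore) { bestScore = score; best = { x, y }; }
    }
  }
  return best;
}
```

Run `snapCorner` over every corner (possibly for a few iterations) and the straight grid lines turn into a mesh of quads that hug the detected edges.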
It's definitely heavier than the naive approach, but there are optimizations to keep it usable. The edge detection runs on WebGL, so you get a massive speedup there, and it uses spatial hashing for O(1) lookups during the rendering phase. On modern hardware you're looking at maybe 100 to 500 ms per image, which is fast enough for most purposes. It's pretty cool seeing the grid warp to fit the content rather than just chopping it up blindly.
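The spatial-hashing part could be as simple as this sketch (again my own construction, not the library's implementation): bucket each warped cell by the coarse squares its bounding box overlaps, so the per-pixel "which cell am I in?" lookup only checks a handful of candidates instead of scanning every cell.

```typescript
// Axis-aligned bounding box of a warped grid cell.
type Cell = { id: number; minX: number; minY: number; maxX: number; maxY: number };

class SpatialHash {
  private buckets = new Map<string, Cell[]>();
  constructor(private bucketSize: number) {}

  private key(x: number, y: number): string {
    return `${Math.floor(x / this.bucketSize)},${Math.floor(y / this.bucketSize)}`;
  }

  // Register a cell in every bucket its bounding box touches.
  insert(cell: Cell): void {
    const x0 = Math.floor(cell.minX / this.bucketSize);
    const x1 = Math.floor(cell.maxX / this.bucketSize);
    const y0 = Math.floor(cell.minY / this.bucketSize);
    const y1 = Math.floor(cell.maxY / this.bucketSize);
    for (let by = y0; by <= y1; by++) {
      for (let bx = x0; bx <= x1; bx++) {
        const k = `${bx},${by}`;
        const list = this.buckets.get(k);
        if (list) list.push(cell); else this.buckets.set(k, [cell]);
      }
    }
  }

  // Constant-time candidate lookup; the renderer then only needs an exact
  // point-in-quad test against these few cells.
  query(x: number, y: number): Cell[] {
    return this.buckets.get(this.key(x, y)) ?? [];
  }
}
```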
https://github.com/yogthos/pixel-mosaic
I think this needs some more comparisons. At first glance, the "edge detection" image just looks like normal anti-aliased (low-pass filtered) downsampling to me, while the "naive approach" image is just aliased downsampling (point sampling). I think the resolutions might be different too?
To properly show that a complex algorithm like this offers an advantage, I think you should compare it with traditional, cheap image downsampling algorithms (the usual low-pass filtering variants, but also things like median sampling, just applying the original color palette to a downsampled image, etc.), all at the same resolution. I can think of many approaches that would take a hundredth of the time to run and probably produce interesting results, so you should spend some time checking against those to see if the approach really is providing a tangible improvement for the extra processing time! ^^
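For instance, median sampling is only a few lines (a rough sketch; `medianDownsample` is just an illustrative name, and it takes per-channel medians as an approximation of a true vector median):

```typescript
// Median downsampling: each output pixel is the per-channel median of its
// blockSize x blockSize region, which resists noise but keeps hard edges.
function medianDownsample(img: ImageData, blockSize: number): ImageData {
  const outW = Math.floor(img.width / blockSize);
  const outH = Math.floor(img.height / blockSize);
  const out = new ImageData(outW, outH);
  for (let by = 0; by < outH; by++) {
    for (let bx = 0; bx < outW; bx++) {
      const vals: number[][] = [[], [], [], []]; // R, G, B, A
      for (let y = by * blockSize; y < (by + 1) * blockSize; y++) {
        for (let x = bx * blockSize; x < (bx + 1) * blockSize; x++) {
          const i = (y * img.width + x) * 4;
          for (let c = 0; c < 4; c++) vals[c].push(img.data[i + c]);
        }
      }
      const o = (by * outW + bx) * 4;
      for (let c = 0; c < 4; c++) {
        const sorted = vals[c].sort((a, b) => a - b);
        out.data[o + c] = sorted[sorted.length >> 1];
      }
    }
  }
  return out;
}
```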
I grabbed all the images directly from the app, so all the generated images have the same resolution. With this kind of stuff, whether something looks good tends to be in the eye of the beholder; each algorithm produces a different feel for the image. It's not the fastest approach, but I do find it does a better job preserving edges than the other ones I played with. Honestly, it runs fast enough that I don't really see performance being a factor. It's really about the aesthetic.
Aw, I was hoping this would be a blog post.
This really could use a side-by-side comparison vs. the naive algorithm. Also, since you're saying it adapts the grid to the image - is there a way to actually see that? Like, the results do look a lot better for the images I've tested, but I have no clue how this actually works.
Haha, I was too lazy to do a proper blog post about it once I actually got it working. And you could definitely add an edge visualization step. Might be a fun thing to try.
Oh, I managed to get intermediate step visualization working on a branch here, if you're curious: https://github.com/yogthos/pixel-mosaic/tree/visualize-steps
Very interesting. In the dino + headphone example, you definitely can see anti-aliasing occurring, and it takes something away. The headphone band’s white part disappears. Adding sharpening makes it a bit better, but I prefer “naive” by a lot. I’ll try some other images and see what I get.
Regardless, thanks for sharing; really cool idea!
The softening is definitely a double-edged sword. You end up with smoother transitions, but you also lose some contrast in the process.
Interesting. I have been working on modernizing an old game (https://github.com/bondolo/tribaltrouble) and have been unhappy with the mipmap generation. It does really poorly with the palm tree leaves (https://github.com/bondolo/tribaltrouble/blob/master/assets/textures/models/trees.png) using bicubic scaling. Does your library handle the alpha channel? Any suggestions for the best approach to handling at least the first two mipmap levels using your library?
Yup, alpha is preserved in pixelation and should be properly averaged/blended in edge-aware mode. The image in your link works fine and transparency gets preserved. The easiest thing would be to keep the sprites in the same image and pixelize them together. Running the library on Node for batch jobs is tricky because WebGL is fiddly to get working headless, and using the CPU might be slow.
Very nice! The color palette isn't always great though: with a large area of mostly solid color, the palette is sometimes quite a bit off. Maybe there's a bug there?
Quite possibly. I didn't really focus on preserving color accuracy, so the sampling might be off there. It also depends on the contrast setting used.