your hex editor should color-code bytes
86 points by smlckz
Also - you might want to look at hiew (1991). It has colors and it even does edits.
Here is an introduction to it: https://lock.cmpxchg8b.com/hiew.html
Hex Fiend (an open source GUI for macOS) supports this, too. You can select a color theme in the View > Byte Theme menu if you’re on macOS 12 or later.
Hex Fiend supports defining custom color themes with a JSON5 config file. You can assign colors to the hardcoded categories “whitespace”, “printable”, “null”, “extended”, and “other”, as used in the default themes, or you can assign colors to arbitrary predicates that use NSPredicate format strings. I think you could reproduce alice pellerin’s 18 total groups with config like {light: {"null": 0x808080, "b >= 1 AND b < 16": 0xe95c91, …}, dark: {…}}.
I was the one who requested the feature in Hex Fiend, which I did in 2019 after I saw Hexyl. The maintainer implemented it in 2023.
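For anyone who wants to try the five-category idea outside Hex Fiend, here is a minimal Python sketch. The category names come from the comment above; the byte ranges behind each name and the ANSI colour codes are my own assumptions, not Hex Fiend's actual definitions.

```python
import string

# Assumption: "whitespace" means the ASCII whitespace bytes (space, \t, \n, \r, \v, \f).
WHITESPACE = set(string.whitespace.encode("ascii"))

# Assumption: arbitrary ANSI SGR foreground codes, one per category.
ANSI = {
    "null": "90",        # bright black (grey)
    "whitespace": "32",  # green
    "printable": "36",   # cyan
    "extended": "33",    # yellow
    "other": "35",       # magenta
}


def category(b: int) -> str:
    """Map one byte value (0-255) to a category name."""
    if b == 0x00:
        return "null"
    if b in WHITESPACE:
        return "whitespace"
    if 0x21 <= b <= 0x7E:
        return "printable"
    if b >= 0x80:
        return "extended"
    return "other"  # remaining control bytes and 0x7F


def colorize(data: bytes) -> str:
    """Render bytes as space-separated hex pairs wrapped in ANSI colour escapes."""
    return " ".join(f"\x1b[{ANSI[category(b)]}m{b:02x}\x1b[0m" for b in data)


if __name__ == "__main__":
    print(colorize(b"Hi\x00\n\x7f\xff"))
```

Finer-grained schemes like the 18 groups mentioned above would only need more branches in `category` and more entries in the colour table.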
I tried getting an AI to do it, and it didn't work. It should in theory, but in practice the number of predicates you can use in HexFiend seems quite limited. Exceeding it has weird effects: usually the "Byte Theme" menu becomes empty (likely because a Swift fatalError is triggered during parsing of one of the themes). :(
Also (less importantly) HexFiend might currently only support ~25 named colors, not hex colors (though this could be fixed relatively easily).
(likely because a Swift fatalError is triggered during parsing of one of the themes)
I bet they would welcome a report of your experience https://github.com/HexFiend/HexFiend/issues
Just wanted to give a shout out to OP's own hex editor project: https://github.com/simonomi/hexapoda
Check it out, it's awesome!
The most helpful thing I've found when trying to read unknown formats: Hide null bytes (00), it makes it much easier to spot the grouping of data. The editors I use most often (Hex Fiend and Hexed.it) have that feature, probably most do.
I haven't tried coloring, and the color scheme here does deemphasize nulls so might have the same effect, but I think I'd find it distracting.
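The null-hiding trick is easy to try in a few lines of Python. This is only a sketch of the idea; the function names and the 16-byte row width are my own choices, not how Hex Fiend or Hexed.it implement it.

```python
def hex_row(chunk: bytes, hide_nulls: bool = True) -> str:
    """Format one row of bytes, replacing 00 with blank cells when hide_nulls is set."""
    cells = ["  " if (hide_nulls and b == 0) else f"{b:02x}" for b in chunk]
    return " ".join(cells)


def dump(data: bytes, width: int = 16, hide_nulls: bool = True) -> str:
    """Return a hexdump with an 8-digit hex offset at the start of each row."""
    lines = []
    for off in range(0, len(data), width):
        row = hex_row(data[off:off + width], hide_nulls)
        lines.append(f"{off:08x}  {row}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample: a little-endian 32-bit length followed by a short tag.
    sample = b"\x10\x00\x00\x00RIFF\x00\x00\x00\x00\x01\x00\x02\x00"
    print(dump(sample))
```

Because blanked cells keep their width, the columns stay aligned and the non-zero bytes stand out as visible groups.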
I expected this to be really distracting, but it's actually quite neat. Huh.
I need to finally write my own hexdump(1) clone someday, or find a good one out there. The author mentions hexyl - does anyone here know some other good ones?
i really like imhex. i've used it for a lot of my reverse engineering/struct-unpacking-repacking work.
my favorite part of it is being able to define C/rust-like structs and then map them onto specific addresses to add nice and custom-defined patterning/coloring of bytes, but it might be a bit much for some cases.
I'll second ImHex, but also nominate Kaitai Struct.
Similar to ImHex, it's got a nice little declarative language for defining binary formats, a nice web IDE to let you poke at hex dumps and edit your format description interactively, and the ability to generate binary parsers (in a variety of languages) from your format description. One of the "parser generator" backends also can output it as GraphViz dot source for you to generate a diagram from.
Shameless plug for my own: huxdemp
hah, I immediately recognized that font :)
This looks really cool, it seems to have a lot of the features I want. The Lua plugin support sounded like a bit much, but I think the UXN demo sold me on it. I could also use it to implement octal support and such. Automatically using less(1) is also a nice feature - I probably wouldn't have to guess sensible values for -n anymore - but I wonder how it interacts with piped in input. It's a pretty niche usecase but afaik e.g. hx tries to get it right.
I haven't tried it yet, but I think the only issue I really have with it is that the -f syntax feels pretty verbose. I wonder how it could be improved.
The literate C style is also pretty interesting. Looking at RetroForth's source code, it tells me to go visit http://unu.retroforth.org, but that site is down :( oh well
Awesome, love that the font is entering the local hacker zeitgeist. :)
but I wonder how it interacts with piped in input
It actually always opens less, albeit with the -F / --quit-if-one-screen flag. Now that I think of it, this probably reduces the BSD/macOS compatibility.
feels pretty verbose. I wonder how it could be improved.
Agreed. I think at the time I planned to add a "compact" version of the flag where each column uses a single character, so -f offset,bytes,ascii becomes -F oba. Plugins would complicate this, though, so I put it off.
At least the general style of the flag has some precedent (e.g. in ps), so it's not horrible.
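For illustration only, the compact form could be expanded with something like the Python sketch below. The letter table covers just the three columns named above (o, b, a); the function name is hypothetical, and none of this reflects anything huxdemp actually ships.

```python
# Assumption: one letter per built-in column, as in the "-F oba" example above.
COLUMNS = {"o": "offset", "b": "bytes", "a": "ascii"}


def expand_compact(spec: str) -> list[str]:
    """Expand a compact column spec such as "oba" into full column names."""
    unknown = [c for c in spec if c not in COLUMNS]
    if unknown:
        raise ValueError(f"unknown column letter(s): {''.join(unknown)}")
    return [COLUMNS[c] for c in spec]


if __name__ == "__main__":
    # Prints the equivalent long-form value for -f.
    print(",".join(expand_compact("oba")))
```

The plugin problem shows up right away: plugin-defined columns would each need a letter too, and a fixed table like this has no obvious way to hand those out without collisions.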
that site is down :(
Huh, not sure why the page was removed! At least it was archived. That said, if I return to the project (probably never, since it's "done") I'll probably make it plain C -- looking back, I think I'm one of those folks who completely misunderstood what literate programming really is.
I just use xxd (well, tinyxxd), which, contrary to what the article suggests, does have colour support and has a lot of features for this kind of viewer, more than hexyl.
I found the pastel colors in the hex view to be somewhat unhelpful—not quite distracting, just unhelpful. Though I did really like the visual distinction between uncolored 00 bytes and colors for everything else. I also liked the bolder colors in the ASCII view.
I'd be curious to see other color palettes for the hex view. And this might be less practical for the terminal, but I'm also curious how it would look to use other forms of distinction, such as different fonts/weights (there's a lot more dimensions than just color!).
hexyl 0.17 even has --color-theme gradient, despite the article seeming to claim otherwise. It's nice, though it might be new (I'm unsure)
For absurd's habitat UI I started color coding UUIDs and it has been such a great experience: https://github.com/earendil-works/absurd/blob/main/habitat/screenshot.png
I’m very interested in other people’s experiences with trying to highlight 16/32/64 bit values. I’ve tried Morton encoding, hashing, and my poor attempts at perceptual hashing, all mapped to palettes or gradients from colorcet, but never got good results. The linked picture is just a simple color gradient except for 00/FF exceptions - the part I like is the Braille glyphs for non-printable bytes. I got the Braille glyphs trick from Asahi’s m1n1. :)
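A minimal Python sketch of the Braille trick, for the curious. The Unicode Braille Patterns block (U+2800 to U+28FF) has exactly 256 code points, so a byte can index straight into it. Whether m1n1 or the screenshot above use this exact mapping is an assumption on my part.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def glyph(b: int) -> str:
    """Return the byte as-is if printable ASCII, otherwise a Braille pattern."""
    if 0x20 <= b <= 0x7E:
        return chr(b)
    # Assumption: direct offset into the block, so the dot pattern mirrors the bits.
    return chr(BRAILLE_BASE + b)


def ascii_column(data: bytes) -> str:
    """Build the text column of a hexdump with Braille for non-printable bytes."""
    return "".join(glyph(b) for b in data)


if __name__ == "__main__":
    print(ascii_column(b"PK\x03\x04\x14\x00\xff\xfe"))
```

With this mapping 00 becomes U+2800, the blank Braille pattern, so null bytes render as empty cells while every other non-printable byte gets a distinct one-cell-wide glyph.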
A related open question for me is guessing a good base for display. I was able to find my heuristic attempt here that scored based on length and entropy (to reward digit runs).
I see the value of highlighting that one unique value but besides that I find this really distracting. But I also don’t use syntax highlighting and find most modern CLIs too colorful.
Would be more interesting if the hex editor just highlighted that one unique byte but nothing else, but of course that’s quite specific to this example.
I rarely have to hex edit anything but I've certainly used tools that don't color highlight the data. It does kind of make it a little more clear for me. Not sure if other tools do this but they probably should make it an option...
I couldn't see much benefit, but I was extremely distracted by the way so many of the numbers were underlined. For me, underline is a much stronger visual cue than the subtle colors, so it was hard to get past that. (Also, 2 of the colors they chose look identical, probably because I'm colorblind.)