How VictoriaLogs Stores Your Logs in a Columnar Layout
25 points by eatonphil
I would be curious to get a comparison with Loki. Loki performance is pretty bad if you are not able to reduce the number of messages with labels. It seems that VictoriaLogs is tokenizing the message, something Loki may not do.
Yeah, Loki does not tokenize it, and that leaks into their log query language, which requires “piping” the logs into a JSON decoder before accessing fields. That means you also pay the CPU cost of parsing JSON on every query.
I assume the tradeoff is that they can handle more ingest then? Or at least the same with a worse CPU?
I guess there's nothing stopping a system from adding a secondary offline indexing step, just knowing that recent logs are slow to query until they're processed.
Maybe marginally, but the relative cost on the query side is so much higher that it really doesn’t make sense to me. To filter the data they need to decompress it, then parse the whole structure (not guaranteed to be valid JSON, which adds extra complexity), then locate the field from the query, then finally apply the filter. JSON is not particularly amenable to parsing in a streaming fashion, and you might also have to copy a long string field if it has escapes in it. So yeah, parsing just-in-time is pretty terrible.
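To make the query-side difference concrete, here is a rough Python sketch of the two approaches. It is a toy under stated assumptions, not how Loki or VictoriaLogs are actually implemented: the record fields, the function names, and the use of `zlib` plus JSON arrays as a column encoding are all made up for illustration.

```python
import json
import zlib

# Hypothetical log records; the field names are invented for this sketch.
records = [
    {"level": "info", "service": "api", "msg": "request ok"},
    {"level": "error", "service": "api", "msg": 'upstream "db" timed out'},
    {"level": "info", "service": "worker", "msg": "job done"},
]

# Layout A: raw JSON lines in one compressed blob, parsed at query time.
raw_blob = zlib.compress("\n".join(json.dumps(r) for r in records).encode())


def filter_parse_at_query(blob, field, value):
    """Decompress everything, parse every line, then look up the field."""
    matches = []
    for line in zlib.decompress(blob).decode().splitlines():
        try:
            rec = json.loads(line)  # full parse, even for unrelated fields
        except json.JSONDecodeError:
            continue  # lines are not guaranteed to be valid JSON
        if isinstance(rec, dict) and rec.get(field) == value:
            matches.append(rec)
    return matches


# Layout B: fields are split once at ingest; each column is stored and
# compressed separately. A JSON array stands in for a real column encoding.
def build_columns(recs):
    names = sorted({key for rec in recs for key in rec})
    return {
        name: zlib.compress(json.dumps([rec.get(name) for rec in recs]).encode())
        for name in names
    }


columns = build_columns(records)


def filter_columnar(cols, field, value):
    """Decompress only the queried column and return matching row indexes."""
    column = json.loads(zlib.decompress(cols[field]))
    return [i for i, v in enumerate(column) if v == value]


print(filter_parse_at_query(raw_blob, "level", "error"))
print(filter_columnar(columns, "level", "error"))
```

In layout A every query touches the `msg` strings, escapes included, even when it only filters on `level`. In layout B the query reads one column and only fetches the other fields for rows that matched.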
I can't compare with Loki (never ran it). What I can share is my experience with VictoriaLogs. I'm running it on a 2014-era Mac Mini (4GiB RAM, no SSD, only HDD), along with a few other services. It's currently ingesting ~5 million log lines a day, but there were times it ingested ~200 million a day (there are 96.2 million entries in it currently). VictoriaLogs can handle that fine, and most of my queries (as long as I restrict them to at most about 24 hours) finish within 10 seconds, usually within ~3-4 seconds.
I think that considering the hardware it is running on, the performance is amazing.
Mind you, the logs I send into it are structured around the queries I typically make, so VL hopefully has an easy job.