TIL: Docker log rotation
60 points by polywolf
Regardless of the subject, it's always really great to see a write-up like this. Chef's kiss to the author, here.
This has always been a really weird decision by Docker. Their default JSON logging driver supports rotation, but it doesn't compress rotated files, and no rotation limits are set by default. I never understood why.
I suspect it was a backwards-compatibility decision. Compression and rotation were introduced much later, after the json-file driver was written.
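For those following along, rotation (and, on newer Docker versions, compression) can be switched on for json-file in /etc/docker/daemon.json. A minimal sketch, with example values:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3",
    "compress": "true"
  }
}

Note this only applies to containers created after the daemon is restarted; existing containers keep their old log config.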
I know people don't like journald, but you really should try it. It centralises log handling (OK, you can have that with syslog + logrotate as well) and makes sure you don't run out of disk space: by default it monitors disk usage and caps the journal so that free space remains available.
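For reference, those caps live in /etc/systemd/journald.conf; a minimal sketch with example values (by default journald already limits itself relative to the filesystem size):

[Journal]
# hard cap on total persistent journal size
SystemMaxUse=500M
# always leave at least this much free on the filesystem
SystemKeepFree=1G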
As a long time syslog + logrotate user that never looked into journald, what are the convenient parts of using journald?
Here's an excerpt I found from an article on the topic:
Think of journald as your mini-command-line-ELK that lives on virtually every Linux box. It provides lots of features, most importantly:
- Indexing. journald uses a binary storage for logs, where data is indexed. Lookups are much faster than with plain text files
- Structured logging. Though it’s possible with syslog, too, it’s enforced here. Combined with indexing, it means you can easily filter specific logs (e.g. with a set priority, in a set timeframe)
- Access control. By default, storage files are split by user, with different permissions to each. As a regular user, you won’t see everything root sees, but you’ll see your own logs
- Automatic log rotation. You can configure journald (see below) to keep logs only up to a space limit, or based on free space
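To make the filtering point concrete, a couple of example lookups (the unit name is just an illustration):

# errors and worse from one unit in the last hour
journalctl -u docker.service -p err --since "1 hour ago"
# the same entries as structured JSON
journalctl -u docker.service -p err --since "1 hour ago" -o json-pretty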
AIUI journald’s indexes are limited to log line metadata, basically just the time, not the log contents. It’s much slower to print or grep a journald log than a plain text or gzipped log file.
That's neat!
I'd like to share one approach I've taken to using. There's a tl;dr at the end, but I'd like to talk about the why first.
Back in 2014 I started working for a video games development studio and we were making, essentially, an "MMORPG" (though, the RPG was much more emphasised than the MMO aspects throughout development). This put us in a weird position due to a couple of constraints.
We had never developed any large-scale system like this before (even though I had run 1% of web traffic at one point in time; you'd be surprised to learn that you can get away with about 25 racks of machines and a CDN, even for heavy sites). This was going to be thousands of machines, not several hundred.
Game developers really only know (knew?) Windows, so the server itself was going to run on Windows.
Perhaps a better-qualified person would have found a way to do everything necessary within the Microsoft ecosystem; but I was not that better-qualified person, so I took to inventing the tools needed to make it work. One of the first issues was service management (especially of remote systems), so we developed a service manager with a websocket interface that allowed us to manage (start/stop/inspect) services remotely. The very next set of issues included: logging.
Logging to a file has traumatic tooling on Windows, and centralised logging was not very flexible for our QA environments or developer workstations. So, following the same approach as with our service manager, we developed a "log server" which listened on a named pipe on Windows and logged to a ring buffer. We could then access this over websockets and inspect what was going on for a particular machine. Later on we refined it so that it could forward "special" logs we deemed important enough to centralised storage (Elasticsearch back then).
This was actually pretty great: no more dealing with log rotation, no more dealing with awkward Windows tools, and it worked via a web browser, so I didn't need special ports to be open (that game dev studio blocked everything by default, something that seems to happen annoyingly often).
This is a pattern I have repeated in my current jobs. As much as I lambast journald for taking away control, it has a sort of ring-buffer mode: you can tell it not to log to files and to keep only some amount of memory active for logs. Of course that's not durable between reboots, but that's not the point; you can configure the same "ship the logs" setup on top of it too. What's missing is the web-browser component (though that's solved via Cockpit if you are OK with the overhead). I even do this for my Docker hosts.
Now I can do everything with just a chromebook (without the developer VM) if needed.
tl;dr: log to journald, set it to use memory, give it a reasonable buffer. If you want durable logs, forward the interesting stuff to Loki using Alloy or something.
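For the curious, a rough journald.conf sketch of that setup (sizes are example values):

[Journal]
# keep the journal in memory only (/run); nothing survives a reboot
Storage=volatile
# cap the in-memory journal, giving the ring-buffer behaviour
RuntimeMaxUse=64M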
That's "log-driver":"journald" in daemon.json for those following along with the article, but you can also pass --log-driver journald to your dockerd service command-line. (Or even to docker run per container.) I know NixOS makes journald the default driver, not sure if any other distros do.
On top of your reasons, I also like journald just because it centralizes logging on the machine. I guess people who aggregate across multiple machines may not care because they rely on tooling, but when looking at a single machine, I no longer have to scour /var/log for where my error is like in the bad old days.
You can actually get web access to journald logs using the separate service systemd-journal-gatewayd. It's an official part of systemd, and offers an API and a basic web UI. (NixOS: add services.journald.gateway.enable = true; and browse to its default port, 19531.)
If you're not using the journald log driver and aren't planning on shipping the logs elsewhere, "log-driver": "local" automatically rotates and compresses the logs [1].
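In daemon.json that looks like the sketch below; the log-opts are optional, since local ships with sane rotation defaults (20m and 5 files, compressed, IIRC), shown here just to make them visible:

{
  "log-driver": "local",
  "log-opts": {
    "max-size": "20m",
    "max-file": "5"
  }
}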
systemd-journal-gatewayd
In case it helps others: https://www.freedesktop.org/software/systemd/man/latest/systemd-journal-gatewayd.service.html
Well, I mean, I was looking for the source in order to find out what "an API" and "basic web UI" meant, but the man page mostly answered my question.
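For anyone else skipping the man page: roughly, it serves journal entries over HTTP, with the web UI at /browse. Example calls, as far as I can tell from the docs (19531 is the default port):

# plain-text entries
curl http://localhost:19531/entries
# JSON output, filtered by a journal field match
curl -H 'Accept: application/json' 'http://localhost:19531/entries?_SYSTEMD_UNIT=docker.service'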
TIL about that. Do you know why it expects to receive a single socket? It seems strange to me; maybe I've misunderstood something.
"Running out of disk space" seems like a perennial failure mode, despite everything. It makes sense (hard disks are there for a reason) but it's funny how there's always something
In the context of this blog post, that's another reason to love maximum instance lifetime, or the slightly more manual aws autoscaling start-instance-refresh.
It, of course, doesn't cure all your "out of disk space" woes, since there are plenty of other places for that landmine to hide.
Sometimes, if you can't edit daemon.json or want to ship this behaviour with the repository, it is also handy to set this in the compose file using YAML anchors:
x-logging: &default-logging
driver: "json-file"
options:
max-size: "10m"
max-file: "3"
services:
foobar:
image: ...
logging: *default-logging
As others have noted, using the journald setting for Docker logs is way more sane than JSON file logs. Plus it integrates 1000 times more neatly into your tooling than yet another log location.
Honestly, Docker could use better tooling to produce systemd services for containers or compose setups. I would vastly prefer using systemctl to control my containers and their dependencies if keeping a compose file and the services in sync weren't so annoying.
Podman Quadlets are a pretty fantastic way to run containers with systemd, though that doesn’t solve the problem of wanting a compose file to be the source of truth.
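For illustration, a minimal quadlet sketch (file name and image are just examples): drop this in /etc/containers/systemd/web.container, run systemctl daemon-reload, and you get a generated web.service whose logs land in journald like any other unit.

[Unit]
Description=Web container

[Container]
Image=docker.io/library/nginx:latest
PublishPort=8080:80

[Install]
WantedBy=multi-user.target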
Agree that there could be better tooling. Though I'm not sure what you mean by "keeping a compose and the services in sync", I just use docker compose directly in my service files. Maybe you're referring to quadlets (which do use entirely different syntax than docker compose)?
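In case it's useful to anyone, the rough shape of such a service file (paths and names are just examples); the compose file stays the single source of truth and the unit is only glue:

[Unit]
Description=foobar stack
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/foobar
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down

[Install]
WantedBy=multi-user.target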
I never understood why Docker doesn't simply log to /var/log and ship a proper logrotate file for it.
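For anyone who wants to approximate that themselves, a rough logrotate sketch against the default json-file paths (copytruncate because dockerd keeps the file handle open; note it can drop a few lines during the copy):

/var/lib/docker/containers/*/*.log {
    daily
    rotate 7
    compress
    missingok
    copytruncate
}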
I've been bitten by Docker's not-so-sane defaults before as well.