Apple File System Reference (2020)

21 points by snej


Apple File System is the default file format used on Apple platforms. Apple File System is the successor to HFS Plus, so some aspects of its design intentionally follow HFS Plus to enable data migration from HFS Plus to Apple File System. Other aspects of its design address limitations with HFS Plus and enable features like cloning files, snapshots, encryption, and sharing free space between volumes.

Most apps interact with the file system using high-level interfaces provided by Foundation, which means most developers donʼt need to read this document. This document is for developers of software that interacts with the file system directly, without using any frameworks or the operating system — for example, a disk recovery utility or an implementation of Apple File System on another platform. The on-disk data structures described in this document make up the file system; software that interacts with them defines corresponding in-memory data structures.
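To make that last sentence concrete, here is a minimal sketch of software defining an in-memory structure for an on-disk one. It assumes the 32-byte little-endian object header that the reference describes (an 8-byte checksum, then a 64-bit object identifier, a 64-bit transaction identifier, and 32-bit type and subtype fields). The Python names `ObjectHeader` and `parse_object_header` are made up for illustration, and the checksum is not verified.

```python
import struct
from dataclasses import dataclass

# Assumed layout, little-endian with no padding:
# 8-byte checksum, u64 object id, u64 transaction id, u32 type, u32 subtype.
OBJ_HEADER = struct.Struct("<8sQQII")


@dataclass
class ObjectHeader:
    """In-memory counterpart of the on-disk object header (illustrative names)."""
    checksum: bytes
    oid: int
    xid: int
    type: int
    subtype: int


def parse_object_header(block: bytes) -> ObjectHeader:
    """Decode the first 32 bytes of a block; does not validate the checksum."""
    if len(block) < OBJ_HEADER.size:
        raise ValueError("block is shorter than an object header")
    return ObjectHeader(*OBJ_HEADER.unpack_from(block, 0))
```

A real tool would read `block` from the raw device or a disk image and check the checksum before trusting any other field.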

snej

A filesystem is a specific type of database; or else filesystems and databases are different perspectives on a common concept we haven’t really clarified yet. This has fascinated me for a long time.

Reading through the APFS spec I see a lot of similarities to b-tree database engines I’ve read about, most strikingly LMDB with its use of copy-on-write b-trees, although I’m sure it also resembles engines I know less well, such as Postgres. (Apple’s previous filesystem, HFS, was also based on b-trees, which I believe was unusual when it was first designed c. 1985. And APFS must also have been heavily inspired by ZFS, which Apple was planning to migrate to in the 2000s.)
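For anyone who hasn't run into copy-on-write trees, here is a toy sketch of the idea. It is not APFS's or LMDB's actual code, and it uses a plain binary search tree instead of a b-tree to stay short. An update never modifies a node in place: it copies only the nodes on the path from the root to the change, and the new root shares every untouched subtree with the old one. Keeping the old root around is what makes snapshots cheap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node; frozen so it can never be updated in place."""
    key: str
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: str, value: str) -> Node:
    """Return a new root; only nodes on the search path are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    # Same key: replace the value, reuse both children unchanged.
    return Node(key, value, root.left, root.right)


def lookup(root: Optional[Node], key: str) -> Optional[str]:
    """Ordinary binary-search lookup from whichever root (version) is given."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


# Each root is a complete, consistent version of the tree.
v1 = insert(insert(insert(None, "m", "1"), "c", "2"), "x", "3")
v2 = insert(v1, "c", "changed")
```

Here `v1` still sees the old value for `"c"` while `v2` sees the new one, and the untouched `"x"` subtree is the very same object in both versions. A real engine does the same thing with disk blocks instead of Python objects, then commits by atomically switching which root is current.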

From this perspective it seems kind of perverse to me that so many databases are implemented as another b-tree layer stored within a plain old stream-of-bytes file in a filesystem! I know there are many practical reasons for this, but what if a filesystem could let userspace programs create entities that used its underlying b-trees to store key-value data? They’d be both files and databases. It seemed like BeFS was taking steps toward this, but I haven’t heard of any further advances in that direction.

(And then of course there’s the NewtonOS approach where there is no filesystem at all, just a database “soup”, but that was more of an oddball graph/object database.)