The hidden cost of PostgreSQL arrays

28 points by sjamaan


simonw

When creating a table, you might expect strict typing. That's true for everything — except the array dimensions. You might think integer[][] enforces a 2D matrix. Except it does not. The [] syntax is effectively syntactic sugar. PostgreSQL does not enforce the number of dimensions of sub-arrays at the schema level at all by default.

That surprised me. I wonder why PostgreSQL is designed that way - I normally expect it to be more strict than that.

doctor_eval

It depends on your use case, but arrays can be a very significant performance benefit, simply because a link table requires not just extra columns (the foreign key at a minimum, but arrays also have ordinality), but a pkey index and possibly an fkey index, depending on how you structure the table. If the linked table is sparse then you can have very significant read amplification relative to arrays.

I’ve used arrays extensively over the years and there are definite performance advantages, despite not being “relational” in a dogmatic sense. They aren’t for every use case and should be deployed with care, but link tables can suffer from the 1+N problem, and arrays can help you avoid it.

mqudsi

PostgreSQL 14 introduced LZ4 as an alternative

Postgres 16 introduced ZSTD, which is even better. (Though LZ4 might still be a better choice for smaller TOAST columns.)

Signez

Great overview of all the pitfalls in using arrays in Postgres (something that I found very tempting sometimes, but always bit me at some point).

I am surprised, though, by the mention of 2024 at the beginning of the section about vectors in a blog post that seems to have been published yesterday… is it a rerun? :)

buffalo7

Text flow on that page is totally broken on mobile, I presume because the code blocks don’t have a fixed width and scroll bars. Too bad.

BinaryIgor

Whether you use a distinct integer[] type or a JSON list [1, 2, 3], you are making the exact same architectural decision: you are prioritising locality over normalisation.

A great way to put it!