Production query plans without production data
69 points by sjamaan
This article is great.
I didn't know that PostgreSQL 18 added these features - the ability to export the statistics that drive the query planner and load them into a smaller test database, so that the planner makes the same decisions as in production - and they're clearly explained here.
This feels like a very PostgreSQL feature to me - powerful, cleanly designed and pragmatically addressing an issue that affects large scale production workloads.
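For context, the workflow is roughly this (a sketch, not a definitive recipe - it assumes PostgreSQL 18 on both sides, and `proddb`, `testdb` and the `orders` table are placeholder names):

```shell
# On production: dump only the optimizer statistics - no data, no schema.
# The output is plain SQL consisting of calls to
# pg_restore_relation_stats() / pg_restore_attribute_stats().
pg_dump --statistics-only --table=orders proddb > orders_stats.sql

# On the small test database (same schema already in place):
# load the statistics so the planner sees "production-sized" tables.
psql testdb -f orders_stats.sql
```

Since this is a CLI fragment that needs a live server, treat the exact flag combination as an assumption to verify against the pg_dump docs.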
Now we need someone with influence to prod the SQLite team to add similar functionality...
Good idea, worth a shot.
And ... it turns out that you can already do that with SQLite. Bloody hell. But thanks for posing the question, at least now we know!
"All of the data statistics used by the query planner in SQLite are available in the sqlite_stat1 table [which] is writable. You can inject whatever alternative statistics you like."
Hah, classic SQLite! Made for a good blog post.
OP here - this is amazing. Going to try it later this week. Btw, I just figured out that I can do more on PostgreSQL using the get_relation_info_hook hook - that allows overriding even that last remaining issue - so now I can experimentally flip the plan even with one row in the table.
Will release the extension to allow this as part of RegreSQL
Good article, but the LLM writing style is really off-putting
Anything specific? I'm genuinely asking as author (non native English speaker)
I stopped reading early on, after this:
PostgreSQL 18 changed that. Two new functions: pg_restore_relation_stats and pg_restore_attribute_stats write numbers directly into the catalog tables. Combined with pg_dump --statistics-only, you can treat optimizer statistics as a deployable artifact. Compact, portable, plain SQL.
This last sentence reads to me like the original point of the paragraph had been turned into advertising copy by a slop machine. I think it’s trying to say that pg_dump --statistics-only emits SQL that can be loaded by the _stats functions, but it surely isn’t portable (because it’s specific to PostgreSQL) and SQL is not known for being compact.
Which made me think the rest of the article would bury its point under pages of vague, slightly-wrong fluff, and I would never be sure if a sentence was trying to say something meaningful or if it was autogenerated padding. (Which is why LLMs make text harder to read, not easier.)
I’ve given the article a second chance and it’s much better than the introduction suggested.
There are things that don’t make much sense, eg towards the end “As we have seen earlier, it's not worth trying to inject relpages” when there wasn’t anything specifically about relpages earlier in the article. (It’s a field in the relation stats, and the article showed that the relation stats were not enough to get the right plan, but there was nothing about what not to try doing.) And I guess the “compact” in the intro refers to the size of the stats dumps relative to the table contents, but it isn’t clear if the examples are using fragments of real stats dumps (which would illustrate the size nicely) or if they were faked up by hand.
Your CI database has 1,000 rows. Production has 50 million. The planner makes completely different decisions for each.
The planner is still using the sequential plan. Only the estimated number of rows has changed. Why? If you remember from previous article, it's where column level statistics come into play.
These particular lines are what made me comment. The first one sounds quite sensational but otherwise doesn't really flow well, imo. Short sentences like this are a tell-tale sign of LLMs (or so I've read) and do appear throughout the article. If you've had an LLM reformulate your words, maybe you don't need to; if it's your own writing, you can just take it as me not enjoying your style. That being said, I'm also not a native speaker.
Just to be clear, I'm just referring to the style; the content is well put together and informative, thanks for writing the article
Thank you for the follow-up (really).
I agree this might be subjective, but when writing these "boring" deep dives it's tricky to make the content approachable. Plus, by now I am biased, as I'm learning how to "sell" those techniques during my talks. And my writing is more or less conversational.
When it comes to short sentences, I actually like them (in most cases those are my edits - LLMs do proofreading and suggest changes a lot, but they often go overboard).
But I agree, I do struggle from time to time with keeping the same narrative pace.
As a writer who completely eschews LLMs, I tend to run the old style(1) UNIX command (with --print-long N) over my texts to identify problematic sections. This will also tell you how easy the text is to read in terms of an academic school year etc (based on things like the syllable count in words and the word count in sentences).
Since I'm an emacs nerd I also use artbollocks-mode to find "weasel words" and passive voice.
Getting objective metrics on your writing can help you identify why and how your writing can be improved and you can see it improving as you polish it. This is more useful because it forces you to think how to improve, rather than just feeding it to an LLM and hoping its cliched suggestions are actually any good.
I'm not an authority here, so take my comment with a grain of salt. You stated you use "passive voice". This advice contradicts everything I learned about writing from "Publication Coach", e.g. https://www.publicationcoach.com/common-writing-errors/ (but there is more).
Edit: I think I misunderstood your statement. You said "find [...] passive voice" so it probably means you want to get rid of it, not to preserve.
My LLM-detector (brain) is usually pretty good, and I didn't find much in here that I thought was off-putting. Just another opinion for the author.
We literally got bit by this last week, but we're also still running Postgres 15, so it'll be a while before we can take advantage of this feature.