I Replaced Redis with PostgreSQL (And It's Faster)

35 points by pauloxnet

isra17

A few caveat:

LISTEN/NOTIFY doesn't work with db proxies such as pgbouncer which any non-trivial application will need at some point because postgres doesn't deal so well with high connection limit on its own.
CREATE INDEX CONCURRENTLY idx_cache_key ON cache(key) WHERE expires_at > NOW();: Postgres only allow immutable function / expression in index WHERE clause. This index isn't valid.
My experience of postgres + high update rate on indexed column usually lead to disaster where the stats get quickly outdated and the planner very confused and fallbacking on full table scan. Periodically cleaning up table with DELETE FROM ... query also prove to be more resource intensive (and slow) than it should be when there's a lot of data.

Postgres is great, but in my experience it works best when you keep it to high INSERT/SELECT, thing can get ugly quickly when you switch to high UPDATE/DELETE, which depending on the use case, redis can provide a much nicer painless experience.

zie

I mean I don't really disagree with your caveats, but "high" can mean lots of different things for different people.
Generally with PG, as you start to hit limits, you can just throw more resources at it. i.e. add more ram, cpu, faster disk, etc and delay actual scaling problems for a long time doing that.

To put some firm numbers that we have seen:

Our peak connection count was around 500 a few times. As long as you have plenty of ram, it's not really a problem. Though ideally you want to stay in the low 100's if you can, especially on resource constrained hardware, where "high" connection counts can get annoying.

Same with high U/D operations. 500 op/s might be ridiculously high for some people, but 5000 op/s might be normal for someone else. I've been into the low thousands of ops/sec with PG without trouble.

All that said we believe throwing more hardware every time we hit any slowness in PG is a great idea. Avoids having to work very hard at optimizations. Hardware is almost always cheaper than our human debug time.

YMMV, as your workloads are probably different than mine. That said, as long as one is not resource constrained, you should rarely expect PG to actually be the problem one is having.
jc00ke

I was pleasantly surprised that PgDog supports LISTEN/NOTIFY, at least experimentally ATM.
valenterry

Listen/notify only really makes sense with long-running processes anyways, and pgbouncer isn't necessary in many cases (e.g. basically all jvm applications).
- isra17
  
  basically all jvm applications
  
  I suppose you are talking about application side connection pooling? If so, how is that specific to JVM? Afaik, most language postgres driver comes with connection pooling by default. If you have multiple independant process connecting to postgres, you would still need pgbouncer even if each process uses their own connection pooling.
  - valenterry
    
    It's just that jvm applications are usually long running and not per-request, so application side pooling is the perfect fit.
    
    Also the jvm has hikariCP which goes far beyond what any builtin driver offers when it comes to connection pooling.
    
    If you have multiple independant process connecting to postgres, you would still need pgbouncer even if each process uses their own connection pooling.
    
    Yes, but there is no point in doing that in a long running application, except for lack of a good connection pooling library or missing language support.
- BinaryIgor
  
  Can you provide more specific example of where higher rate of updates/deletes has lead you queries to go through sub-optimal execution paths, like skipping indexes for example? I have being using Postgres for years and I've never bumped into a situation where it haven't used an index that was clearly beneficial.
  - fanf
    
    It can happen when the table statistics give the planner incorrect hints about cardinality or selectivity. There’s some discussion in the docs https://www.postgresql.org/docs/current/planner-stats.html
  - isra17
    
    If I remember well, I did get bit a few times when doing something like SELECT * FROM jobs WHERE processed_at IS NULL and then thinking I can just set an index on WHERE processed_at IS NULL and everything will be so fast. Except if this table fills up quick Postgres outdated stats means that it will try to do a full-table scan because it thinks the table is empty (which might be true most of the time, except when it's not, then :fire:) or end up filled with dead tuple and scanning the index end up skipping most of the entries and get very slow.
    
    Another even more common foot gun with the cleanup queries is that until you vacuum the dead tuple, you DELETE FROM big table WHERE date > :cutoff_date: with an index on date is going to get slower and slower until it starts to timeout because the index is just full of dead useless tuples.
    
    From my understanding that's a limitation of their update / delete design with index, trade some speed for writing at the cost of getting a bit more stalled data in indexes. I think MySQL has a different design that makes it a bit more efficient in this regard.
    
    Also, Postgres query planner is good most of the time, it will find a plan that is super efficient for 99% of the queries, but will end up with timeouts on this 1% for some reason. I find it annoying that Postgres developers opinion is that the planner is smarter than us and we should spend hours debugging what IO or stats is making Postgres prefer a bad plan over using the obvious index that will work for 100% of the case. At least we have pg_hint_plan to get around this...
    
    FWIW, I use Postgres a lot with a large table, some with very high update / row ratio and many of our selecting queries are quite complex. These are issues I doubt I would encounter on any of my personal project or a more traditional SaaS app.
    
    Don't get me wrong I love Postgres and use abuse it as much as I can, but it definitely have sharp corner and limits.
    
    BinaryIgor
    
    As pretty much all databases and approaches do! Anyways, thanks for the detailed answer :)
  - pickfire
    
    The combined operations is actually incorrect, postgres single transaction is only a single request, but redis + postgres if done using async and pipeline probably will be similar or faster.
  - dprkh
    
    Now replace PostgreSQL with SQLite, rewrite your back-end in Rust, and switch to a 5-dollar VPS. I swear, having only one instance of your application is very powerful.
    
    doctor_eval
    
    I assume this is not sarcastic, so … you can run pg on a $5 vps, you don’t need to rewrite in rust (!) to do it, and SQLite, while fast and reliable, has a whole steaming pile of its own problems. So this is not the best advice I’ve ever read on Lobsters.
    
    dprkh
    
    I think running PostgreSQL when you could be running SQLite is wasteful. I am also not aware of any embedded relational database that could rival SQLite.
    
    doctor_eval
    
    SQLite is fine, but it’s absolutely not a replacement for PG.
    
    dprkh
    
    Can you share some reasons why you would choose PG over SQLite?
    
    Here are reasons why I prefer SQLite:
    
    No network overhead
    
    Can run it in-memory during tests
    
    Easier to use (it's just a library)
    
    Easy to write extensions
    
    doctor_eval
    
    Out of the box, PG has a sane, extensible type system. Column types are checked. It doesn’t struggle when there are multiple writers. It’s safely accessible by multiple clients simultaneously. There are many high quality extensions for it that extend the type system (eg postgis). Testing against postgres is trivial. Network overhead via sockets is negligible (esp relative to IO overhead).
    
    I’ve used SQLite in a reasonably complex mobile app and it was fine, though I was surprised at the amount of ceremony I needed. But out of the box it’s not even SQL compliant, and there are heaps of gotchas and things you just “need to know”, like strict tables and various pragmas.
    
    So, I certainly did not find SQLite easier to use. Yes it’s just a library, but there’s a lot more to usability. I personally would only use SQLite in places (such as apps) where I have no other choice. I would not be worried about data integrity, but the ergonomics and missing functionality would be annoying.
    
    It’s not a religious thing. I’ve used both databases heavily. SQLite is fine. Both databases have their strengths and weaknesses. SQLite is very strong if you have no choice but to use an embedded database. But as someone who spends most of my days inside databases, given the choice, I’d take PG any day.
    
    number5
    
    But out of the box it’s not even SQL compliant,
    
    What do you mean by that?
    
    doctor_eval
    
    I was wrong about that. It’s the type enforcement I was referring to. TIL that, apparently, type enforcement isn’t required by the standard. But to get the expected behaviour from the database you need to use a proprietary sql extension (strict). Which really is my point.
    
    dprkh
    
    Type system is a good point. I really like the newtype pattern, and PostgreSQL allows you to implement it nicely with composite types and domains, which SQLite doesn't support. But after much consideration, I decided to not bother with that in the database, and enforce types in the application only, because, in Rust, you can do that in a more robust way, which cannot be easily replicated in PostgreSQL. Also, if you use SQLite, and enforce types in the application, conceptually, it is pretty much the same as enforcing them in the database, I think. For example, here is an email address newtype in Rust:
    
    #[derive(Clone, Debug, Display, Eq, PartialEq, Hash, Type, Deserialize, Serialize)] #[sqlx(try_from = "String")] #[serde(try_from = "String")] pub struct Email(String); #[derive(Clone, Copy, Debug, thiserror::Error)] pub enum EmailError { #[error("invalid format")] InvalidFormat, } impl TryFrom<String> for Email { type Error = EmailError; fn try_from(value: String) -> Result<Self, Self::Error> { if !EmailAddress::is_valid(&value) { return Err(EmailError::InvalidFormat); } let this = Self(value); Ok(this) } }
    
    I am using the email_address crate here which actually parses and validates the input string according to the email RFC, and doesn't just use some brittle regex, or whatever. This implementation is shared between serde, the library for JSON parsing, and sqlx, the library for database access (which also has to parse data from database format). This allows me to seamlessly re-use this newtype both when I need to accept a user's email in my back-end API, and when I need to later read that email from the database. I don't currently do this, but there are Rust WASM web frameworks like Yew or Dioxus, and by using them you could also share this newtype with your front-end, for validating forms, etc. I think this is really big because you share the exact same code everywhere you need to parse that data. This is one of the reasons why I mentioned Rust (not because of religion). I really believe that everyone should be writing at least their back-end in Rust, and at least not in things like Node.js or Python.
    
    I know that you cannot have more than one writer at the same time in SQLite, but that has not been a bottleneck for me. I also know that, for example, Telegram uses SQLite, and it appears that even at their scale it works out for them (they shard their servers that use SQLite).
    
    doctor_eval
    
    I'm happy if SQLite works for you, but the pattern you're using assumes you'll only ever manipulate the database within your application, and that's almost never true in my use cases. Exposing the type system to literally any client that speaks SQL - even bash - is a huge benefit in my work; I can log into the database from a shell and have the exact same view of my data as my application does.
    
    It's great that Telegram uses SQLite, but presumably they had enough funding to burn through the problems they must have encountered (seems reminiscent of Facebook's use of PHP). If it works for them, or it works for you, then that's awesome. But SQLite doesn't work for me as well as PG does. And that's why we have choices!
    
    But my god that is a huge amount of complex code just to implement an email address type.
    
    dprkh
    
    and that's almost never true in my use cases.
    
    That's totally ok. Obviously, there are use cases where SQLite would not be the best choice, I am merely arguing that, for an average app, a monolithic Rust + SQLite back-end would be superior to a Node.js + PostgreSQL one. Or at least I don't see any evidence that could contradict that.
    
    But my god that is a huge amount of complex code just to implement an email address type.
    
    It can appear complex if you are not familiar with Rust, but there really is nothing complex there. It just:
    
    Defines a newtype.
    
    Defines an error type.
    
    Implements a trait that provides a function to check newtype invariants.
    
    If you think that is too much boilerplate, you could make a custom helper macro, but for me that works fine.
    
    zie
    
    In PostgreSQL:
    
    CREATE EXTENSION citext; CREATE DOMAIN email AS citext;
    
    Then, when using it in a table:
    
    CREATE TABLE address ( id bigint generated by default as identity primary key, email citext UNIQUE, email_verified bool );
    
    This just enforces uniqueness and case insensitivity. Trying to do more validation is just asking for trouble, as the email address specs are broken, misunderstood and not worth trying to do. Instead it's better to just send an email and make sure the email address works. That's what you really care about anyway, does the address work. Not if it's valid or not. You could of course use almost any procedural language one could want and do all the verification, including with rust, but again it's a mostly pointless exercise. Just send an address with a link to verify if the address actually works and is deliverable.
    
    All that said, I agree with @doctor_eval's comment too.
    
    danielrheath
    
    Sometimes I want to open a console and run queries against the database; out of the box, sqlite will not do that if the webserver is running, since it's got the database open.
    
    dprkh
    
    It will, if you use the WAL mode, and you should.
    
    stephenr
    
    Reason #1 should be a warning about the costs of hosted services more than anything.
    
    I don't understand reason #2. You're using hosted services - worrying about backups or replication is what you're paying them to do for you. If you're going to manage it yourself anyway you can get a lot more resources for the money you're paying to AWS, by hosting it on regular VPS'.
    
    Reason #3 is solved by two factors:
    
    (a) HA redis (which, again, you're paying for someone to provide it isn't your problem) means in general it won't be down.
    
    (b) using a smarter approach to deletes: open transaction; delete in the DB; delete from redis; commit transaction. In the event the commit fails after the delete completed, this shouldn't be a problem - it's just a cache anyway.
    
    peter-leonov
    
    It hits the nerve of MyISAM vs InnoDB in MySQL. The former did not have transaction support and was significantly faster and easier to operate (at the expense of consistency). I'm guessing Redis vs Postgres must have the same dynamics: at first they're comparable but with load comes the price of complexity.