Your Serverless Quota Check Has a Race Condition. Postgres Already Has the Fix.
Two concurrent requests, one quota, and the bug that lets a paying user end up at 101 of 100.
Here's a bug that doesn't show up in testing, doesn't show up in code review, and shows up in production the first time a single user runs two requests at the same time: your usage quota leaks.
If you run a serverless API with credit or quota tiers, free tier gets N calls, paid tier gets more, you almost certainly have a "consume" path that looks like this:
- Read how much the user has used.
- Check it against their limit.
- If there's room, record the new usage.
Read, check, write. It looks atomic. It isn't. And in a serverless deployment it's especially not, for a reason that's easy to miss.
Why it breaks
Two requests arrive for the same user at nearly the same instant. In a serverless platform they land in two different execution environments: two different containers, two different processes, two different database connections. Each opens a transaction. Each reads "used = 99, limit = 100." Each concludes there's room for one more. Each writes. The user is now at 101, having paid for 100.
The instinct is to reach for a transaction and assume the database sorts it out. It doesn't (not by default). Postgres' default isolation level is READ COMMITTED, under which each transaction reads from its own snapshot. Both transactions genuinely see "used = 99" because neither has committed yet when the other reads. There is no row to lock, because the row that would conflict is the one you're about to insert. Classic read-then-write lost-update, hiding behind a BEGIN.
The serverless part makes it worse only in that it removes the accidental protection you might have had: on a single long-lived server you might have happened to serialize this in application memory with a mutex. Across N stateless containers there is no shared memory to put a mutex in. Whatever coordinates this has to live outside the processes.
The usual answers, and what they cost
Reach for Redis (or ZooKeeper, or etcd). Take a distributed lock keyed on the user, do your read-check-write, release. This works. It also means standing up and operating a second stateful system, paying for it, monitoring it, and inheriting its failure modes — lease expiry, fencing tokens, "what happens when the lock service is down." For a lot of systems that's a real piece of infrastructure added to protect three lines of logic.
Crank isolation up to SERIALIZABLE. Now Postgres will detect the conflict — and abort one of the transactions with a serialization failure, which your application has to catch and retry. Correct, but under real contention you get retry storms, and every caller now needs retry logic. You've traded a quota leak for a throughput cliff.
Accept the leak. Tempting, occasionally defensible, but it's literally giving away the product.
There's a fourth option that needs no new infrastructure, no retry logic, and no extra failure modes, because it uses the one piece of shared, durable, already-present coordination state every one of those containers is already talking to: the database itself.
Postgres advisory locks
Postgres has a lock facility that has nothing to do with rows or MVCC: advisory locks. They're application-defined locks identified by an integer key; it just guarantees that two sessions asking for the same key serialize. The lock table lives in the Postgres server's shared memory, so any connection, from any container or process, contends on the same lock. That's the whole trick: your database is already the shared coordination point. You don't need Redis to be a shared coordination point because you already have one.
The pattern, in full:
BEGIN;
-- Serialize all "consume" operations for THIS user, across every
-- connection and container, for the life of this transaction.
SELECT pg_advisory_xact_lock(hashtext($1)); -- $1 = user_id (text)
-- Now we are the only one in this critical section for this user.
SELECT coalesce(sum(amount), 0) AS used
FROM credit_ledger
WHERE user_id = $1;
-- (application checks: used + cost <= limit)
INSERT INTO credit_ledger (user_id, amount, created_at)
VALUES ($1, $2, now());
COMMIT; -- the advisory lock releases automatically here
Three details carry the weight:
xact: transaction-scoped.pg_advisory_xact_lockholds the lock until the transaction ends and releases it automatically on commit or rollback. You cannot leak the lock by forgetting to release it, and a crashed request can't strand it: the transaction tears down and the lock goes with it. (This also matters if you run a transaction-mode connection pooler like PgBouncer: transaction-scoped advisory locks are pooler-safe; the session-scoped variant,pg_advisory_lock, is not, because the pooler hands your connection to someone else after the transaction.)hashtext($1): keying on a string. Advisory locks take an integer key, but your user IDs are probably UUIDs or strings.hashtextmaps a string to a 32-bit integer so you can lock on it directly, with no separate string-to-int mapping table. The honest caveat:hashtextcan collide, so two different users could occasionally hash to the same key and serialize against each other unnecessarily. It's correctness-safe (you never get less mutual exclusion, only rarely a little more) and if false contention ever bothers you, use the two-argument formpg_advisory_xact_lock(namespace_int, hashtext($1))to namespace your locks and shrink the collision surface.Per-user keying — bounded contention. Because the key is derived from the user ID, only concurrent requests for the same user serialize. Two different users take two different keys and sail straight past each other with full concurrency. You are not putting a global mutex on your consume path; you're putting a per-user one, which is exactly the granularity the problem has.
When not to do this
Be honest about the edges. This serializes same-user consume operations at the database, so a single user firing many concurrent requests will queue (which is the point), but know it's there. Set a lock_timeout so a pathological wait fails fast instead of hanging. And it assumes a user's writes all land on the same Postgres instance; if you've sharded users across independent databases, "same user → same shard" has to hold (it usually does, because you'd shard by user); advisory locks don't reach across separate database servers.
But for the common shape (one primary Postgres, per-user quota, serverless compute that scales horizontally) this replaces a whole piece of lock infrastructure with one line inside a transaction you were already opening. No new system to run. No retry logic. No new way to page yourself at 2 a.m. The coordination primitive was sitting in your database the entire time.