This week we're talking about serverless Postgres! We're joined by Nikita Shamgunov, co-founder and CEO of Neon. With Neon, truly serverless PostgreSQL is finally here. Neon isn't Postgres compatible… it actually is Postgres! Neon is also open source under the Apache License 2.0.
We talk about what a cloud-native serverless Postgres looks like, why developers want Postgres, and why, of the top 5 databases, only Postgres is growing (according to the DB-Engines Ranking). We discuss how they separated storage and compute to offer autoscaling, branching, and bottomless storage, and we also talk about their focus on DX: where they're getting it right and where they need to improve. Neon is invite-only as of the recording and release of this episode, but near the end of the show Nikita shares a few ways to get an invite and early access.
Nikita Shamgunov: [43:55] Or you can say, "Oh, well, this row lives in Tokyo. So if you want to modify that row from New York, you pay the latency for modifying that row, and Tokyo does not pay the latency, because that row is closer to it." That's easier, but it still requires you to decide where this row lives.
Now, there is actually a very practical solution to this, and as a database person that pains me a little bit because of how simple it is… But I think from a practical standpoint it will actually satisfy a lot of users. And I think that's what we're going to start with, and that's what Fly is doing as well, a little bit. So you split your queries into reads and writes, you say your primary write replica lives in a region, you let the user choose that region, and you replicate from that region to as many regions as you need… You're actually unlikely to need more than five, but you can go all the way to 26, which is the number of regions in AWS, or you can go to 200, like Cloudflare… At some point it will get tricky to replicate to 200, so you will need to separate replication from the engine as well. But regardless, you can send reads to a local replica. You just need to understand that that replica will be behind the master copy by X amount of milliseconds.
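A minimal sketch of that read/write split, with made-up names (route_query, MAX_LAG_MS) and a toy write-detection check; this is just to make the routing rule concrete, not how Neon or Fly actually implement it:

```python
# Toy primary/replica routing: writes always go to the primary region;
# reads go to the nearest replica only while its replication lag stays
# under a threshold. The threshold value and the naive SQL-verb check
# are assumptions for illustration only.
MAX_LAG_MS = 400

WRITE_VERBS = ("insert", "update", "delete", "create", "alter", "drop")

def is_write(sql: str) -> bool:
    return sql.lstrip().lower().startswith(WRITE_VERBS)

def route_query(sql: str, local_replica_lag_ms: float) -> str:
    if is_write(sql):
        return "primary"          # writes always pay the cross-region latency
    if local_replica_lag_ms < MAX_LAG_MS:
        return "local-replica"    # fresh enough, answer locally
    return "primary"              # replica too far behind, fall back

print(route_query("SELECT * FROM users WHERE id = 1", 120))           # local-replica
print(route_query("UPDATE users SET name = 'a' WHERE id = 1", 120))   # primary
```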
I believe at some point Fly had a heuristic (I haven't checked recently): if you're less than 400 milliseconds behind, we'll send reads to your local replica. And this, in a way, dumb approach can surprisingly go very, very far. It will have side effects. Well, what's a side effect? Well, it's called read-after-write. So you write and you immediately read what you've just written, and that thing might not have arrived at the local copy yet. So you feel like you wrote the number one, and then you're reading that number, and it's still an old value, like zero, or something. That can be mitigated by messing with the proxy: the proxy can detect those read-after-write patterns, and in that particular situation send the read to the write replica as well.
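One way a proxy could detect the read-after-write pattern he describes is to remember, per session, when the last write happened and pin that session's reads to the primary until the replica has had time to catch up. The sketch below assumes that approach; the class name, the per-session granularity, and the 0.4-second window are all illustrative, not Neon's actual proxy logic:

```python
import time

# Per-session read-after-write guard: after a session writes, its reads are
# pinned to the primary for roughly one replication-lag window, so the
# session never sees a stale copy of its own write. Illustrative only.
class ReadAfterWriteGuard:
    def __init__(self, lag_window_s: float = 0.4):
        self.lag_window_s = lag_window_s
        self.last_write_at: dict[str, float] = {}

    def record_write(self, session_id: str) -> None:
        self.last_write_at[session_id] = time.monotonic()

    def route_read(self, session_id: str) -> str:
        wrote_at = self.last_write_at.get(session_id)
        if wrote_at is not None and time.monotonic() - wrote_at < self.lag_window_s:
            return "primary"        # the write may not have replicated yet
        return "local-replica"      # safe to read locally

guard = ReadAfterWriteGuard()
guard.record_write("session-42")
print(guard.route_read("session-42"))   # primary (just wrote)
time.sleep(0.5)
print(guard.route_read("session-42"))   # local-replica (window has passed)
```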
And the more I'm scouting the market and researching alternative solutions (Aurora shipped multi-master, multi-master works, Spanner shipped multi-master), and the more I talk to people out there, the more I'm realizing that that simple and understandable paradigm is oftentimes more powerful because of its simplicity, compared to all the paradigms of, like, "Okay, we're gonna run a distributed consensus with three to five locations in the world", where now either all of your latencies are very long, or you need to put some sort of machinery in place where you start fine-tuning, "Okay, well, this data lives here, and this data lives there." And if you let the system decide which data lives where, that introduces uncertainty, and it changes from being a simple solution, like an AK-47, into this sophisticated thing, where people just stop understanding when the latencies are short and when the latencies are long.
So what do we do internally at Neon? I think we're going to ship what I've just said, where we're just going to have multiple read replicas around the world, and our proxy will be routing traffic to the local replica, soon. In parallel, we're working with a famous database professor, Daniel Abadi, out of the University of Maryland; he's the creator of what is called the Calvin protocol, which is the foundation of FaunaDB. He is applying similar ideas to the Neon architecture. The difference is row-based versus page-based. Today, as we are halfway through that research project, it does require people to assign data to regions. And you can pose this as this thing called partitioning, so for every partition you need to say, "Well, this partition lives here." The moment you do that, a bunch of things fall apart a little bit, in that creating an index across all the regions becomes harder, and stuff like that.
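To make the "this partition lives here" step concrete, here is a toy mapping from partitions to home regions, where any write to a row has to travel to its partition's region. The table name, partition names, and region choices are all hypothetical; this only illustrates the manual assignment he is describing, not Neon's design:

```python
# Toy partition-to-region assignment: each partition of a hypothetical
# users table is pinned to a home region, and a write to a row is routed
# to the primary in that region. Purely illustrative.
PARTITION_HOME = {
    "users_apac": "ap-northeast-1",   # Tokyo
    "users_us":   "us-east-1",        # N. Virginia
    "users_eu":   "eu-west-1",        # Ireland
}

def partition_for(region_code: str) -> str:
    return {"jp": "users_apac", "us": "users_us", "eu": "users_eu"}[region_code]

def write_target(row_region_code: str) -> str:
    # The caller pays cross-region latency whenever its own region differs
    # from the partition's home region.
    return PARTITION_HOME[partition_for(row_region_code)]

print(write_target("jp"))   # ap-northeast-1: a New York client still writes to Tokyo
```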
[48:12] So while I find this fascinating, and I've spent like a decade thinking about it - you know, on and off - I think that simple, straightforward approach where you say, "Well, my primary is here, and I'm going to have up to 20 replicas in the world" can satisfy 99.9% of use cases.