SQLite's web renaissance
everyone's favorite little embedded db is having another moment
I won’t call SQLite’s current moment a comeback, because the most used database engine in the world doesn’t have anything to come back from. I’m going with “renaissance” because, despite its already massive adoption, there has been something of a rebirth of interest from one software sector that had previously relegated it to dev & test environments[1]: web apps.
We asked SQLite creator Richard Hipp about his database’s gap in penetration back when we first told the story of SQLite’s success in 2016:
Now for a website where you’ve got a lot of write concurrency, you need to move to a client-server database engine because you need that server process there to coordinate the concurrency… There’s just no way to do that in a serverless database like SQLite. So for so many things you don’t have that concurrency. You’ve just got a single actor or one or two actors accessing at a time; it’s not a factor, and SQLite works great in those situations. It’s where you get into big concurrency that it breaks down.
The SQLite team is cool with this and even have a saying about it: “We don’t compete against Oracle, we compete against fopen.”
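If you want to see the write-concurrency limit Richard describes for yourself, here is a minimal sketch using Python's built-in `sqlite3` module. The function name, table name, and temp-file path are all made up for illustration, and `timeout=0` is my choice so the second writer fails immediately instead of waiting its turn. One connection takes the write lock, and a second connection trying to do the same gets turned away until the first commits.

```python
import os
import sqlite3
import tempfile


def second_writer_demo():
    """Return (blocked, row_count) for two connections writing to one file.

    Hypothetical helper for illustration; not part of SQLite or the article.
    """
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None means autocommit, so transactions are explicit.
    first = sqlite3.connect(path, isolation_level=None)
    first.execute("CREATE TABLE hits (n INTEGER)")

    # timeout=0: give up immediately rather than retrying while locked.
    second = sqlite3.connect(path, timeout=0, isolation_level=None)

    # The first connection takes the database-wide write lock.
    first.execute("BEGIN IMMEDIATE")
    first.execute("INSERT INTO hits VALUES (1)")

    try:
        second.execute("BEGIN IMMEDIATE")
        second.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:
        # SQLite reports "database is locked": only one writer at a time.
        blocked = True

    # Once the first writer commits, the second can write normally.
    first.execute("COMMIT")
    second.execute("INSERT INTO hits VALUES (2)")
    row_count = second.execute("SELECT COUNT(*) FROM hits").fetchone()[0]

    first.close()
    second.close()
    return blocked, row_count


if __name__ == "__main__":
    print(second_writer_demo())
```

In a real app you would normally set a non-zero timeout so writers queue briefly instead of erroring out, which works fine for the "one or two actors" case from the quote.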
We followed up on this topic with Richard when we had him back on the pod last year:
SQLite was originally designed to be more of the database engine for the edge of the network versus the core of the network. It’s out on the peripheral devices, not in the core data center…
SQLite’s architecture hasn’t changed. It’s still “serverless” in the coolest, oldest-school sense of the word. What has changed, however, is the world around SQLite. Today, competing with “Oracle” (or more often PostgreSQL, MySQL, and friends) as a web app’s backend persistence layer sounds like a reasonable proposition. So, what changed?
Buzzword alert! New paradigm incoming! Wikipedia says:
Edge computing is a distributed computing paradigm that brings computation and data storage closer to the sources of data. This is expected to improve response times and save bandwidth.
For a minute edge devices were IoT toasters and iPhones. But then our buzzword overlords in big tech marketing departments realized that everyone already owns a toaster and a cellie, which makes them much more difficult to sell.
So, the term has been rebranded to refer to… CDN PoPs, I guess? Buzzwords aside, you know what I’m talking about:
- AWS Lambda
- Cloudflare Workers
- Netlify Functions
- Google Cloud Functions
- Et cetera
All of these services exist because[2] of a central truth rooted in physics: the closer your web app is to your end user, the better. CDNs made that easy for static responses by copying the files around the world. Edge computing is trying to make that easy for dynamic responses by distributing your application servers around the world.
But there’s one big catch with this strategy: most of the interesting things web apps do are powered by data that is stored in a database. Caching is hard[3]. You can distribute your compute all around the world, but if your database isn’t co-located with that compute… it’ll only get you so far.
It turns out making client-server databases cloud native (i.e. ready to be distributed all around the world) is also hard. There are people working on it, but there’s a lot of work left to do.
Fly.io acquired Litestream, which performs streaming replication of a SQLite database to various locations (other files, S3, etc) and now employs its creator, Ben Johnson. Here’s what Ben has to say about SQLite:
SQLite isn’t just on the same machine as your application, but actually built into your application process. When you put your data right next to your application, you can see per-query latency drop to 10-20 microseconds. That’s micro, with a μ. A 50-100x improvement over an intra-region Postgres query.
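To get a feel for what "built into your application process" means for latency, here is a rough sketch that times primary-key lookups with Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the function name, the `users` table, the in-memory database, and the row count are mine, not Ben's benchmark. The timings also include Python's own overhead and will vary by machine, so don't expect to reproduce his exact figures.

```python
import sqlite3
import time


def median_query_latency_us(runs: int = 1000, rows: int = 10_000) -> float:
    """Median latency, in microseconds, of an in-process primary-key lookup.

    Hypothetical helper for illustration; uses an in-memory database.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)",
        ((f"user{i}",) for i in range(rows)),
    )
    conn.commit()

    samples = []
    for i in range(runs):
        user_id = i % rows + 1  # ids start at 1
        start = time.perf_counter()
        conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
        samples.append((time.perf_counter() - start) * 1_000_000)

    conn.close()
    samples.sort()
    return samples[len(samples) // 2]


if __name__ == "__main__":
    print(f"median lookup: {median_query_latency_us():.1f} µs")
```

There is no network hop and no server process anywhere in that loop, which is the whole point: the query is a function call inside your app.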
Cloudflare is using SQLite to back D1, their database “designed for Cloudflare Workers.” Here’s what they have to say about it:
With D1, we want to take configuration off your hands, and take advantage of Cloudflare’s global network. D1 will create read-only clones of your data, close to where your users are, and constantly keep them up-to-date with changes.
A couple months back, someone Asked HN who is using SQLite as a primary database. Here’s what the top commenter reported:
I use SQLite in production for my SaaS. It’s really great — saves me money, required basically no setup/configuration/management, and has had no scaling issues whatsoever with a few million hits a month. SQLite is really blazing fast for typical SaaS workloads. And will be easy to scale by vertically scaling the vm it’s hosted on.
Litestream was the final piece missing from the puzzle. Now that they have continuous backups like you get from other databases, they’re “so on-board the SQLite train you guys.”
Are edge computing platforms and renewed interest from web developers kickstarting SQLite’s web renaissance? We’ll have to wait and see. But one thing’s for sure, it’ll be fun to watch!
[1] I'm sure there are corners of the software world where people have been using it in production web apps all along, but it's not commonly discussed. ↩
[2] I've been told Lambda and its ilk are also more convenient/productive than traditional server-side application development, but I remain unconvinced on that point. ↩
[3] OK, merely caching is easy. Cache invalidation is the hard part! ↩