Is your Postgres ready for production?
A solid pre-flight checklist by PG aficionado Craig Kerstiens. It covers all the essentials: backups, high-availability, logging, statement timeouts & connection pooling.
A solid pre-flight checklist by PG aficionado Craig Kerstiens. It covers all the essentials: backups, high-availability, logging, statement timeouts & connection pooling.
You know how collaborating with Google Sheets, Airtable & friends is super easy because anybody with a web browser and the correct permissions can get in on it? With Mathesar, you can enable that kind of access to any Postgres database!
Super powerful? I think so. Super risky?! Likely, especially if a lot of your data-related logic is in app code. Worth it? That’s up to you to decide…
Yup, that Neon. Congrats to Nikita and team for this big win.
Databases and web apps go together like peanut butter and jelly. In a word, they’re inseparable. And despite all the amazing innovations in NoSQL data stores, often a good old relational database is the most reliable tool for the job.
We want to make it completely seamless to develop applications that need databases on Replit. Starting today, you can create and instantly begin to use PostgreSQL databases from within the Replit workspace.
The fine print.
Under the hood, this product is powered by our friends at Neon who have created a lightning-fast serverless database. The database will go to sleep after 5 minutes of inactivity. Most clients should handle the reconnection seamlessly.
This week we’re talking about by Postgres with Craig Kerstiens, Chief Product Officer at Crunchy Data, and a well known ambassador for Postgres. Just Postgres. That’s what this week’s show is about.
This is an incredibly detailed and fun story producing a system architecture that includes Postgres, Elasticsearch & an OCR service running on a cluster of iPhones.
How much effort will one man put in to help himself and others find that meme!
Stephan Schmidt warns: “We have invited complexity through the door. But it will not leave as easily.”
One way to simplify your stack and reduce the moving parts, speed up development, lower the risk and deliver more features in your startup is “Use Postgres for everything”. Postgres can replace - up to millions of users - many backend technologies, Kafka, RabbitMQ, Mongo and Redis among them.
He then goes on to list nine things you can use Postgres for that replace more specific solutions: fulltext search, geospatial queries, etc.
This week we’re talking about serverless Postgres! We’re joined by Nikita Shamgunov, co-founder and CEO of Neon. With Neon, truly serverless PostgreSQL is finally here. Neon isn’t Postgres compatible…it actually is Postgres! Neon is also open source under the Apache License 2.0.
We talk about what a cloud native serverless Postgres looks like, why developers want Postgres and why of the top 5 databases only Postgres is growing (according to DB-Engines Ranking), we talk about how they separated storage and compute to offer autoscaling, branching, and bottomless storage, we also talk about their focus on DX — where they’re getting it right and where they need to improve. Neon is invite only as of the recording and release of this episode, but near the end of the show Nikita shares a few ways to get an invite and early access.
The picture below explains it better than I can with words. 👇
Craig Kerstiens:
Today I’m excited to introduce a new place for devs to polish their Postgres skills, a Postgres playground from Crunchy Data. What is the playground? Put simply it is:
- Postgres running in your local web browser
- With canned datasets you can load
- Guided tutorials to follow along to learn about the power of Postgres
Since it’s running in the browser you can only connect to it using the embedded psql
interface provided. So, it’s not actually useful for real workloads, but it is useful for learning and experimenting.
Reading other people’s code is a sure-fire way to improve as a developer. But what about as a database designer? The same process applies!
This post by Shekhar Gulati is him sharing what he learned by applying that principle to GitLab’s schema.
I learnt a lot from the GitLab schema. They don’t blindly apply the same practices to all the table designs. Each table makes the best decision based on its purpose, the kind of data it stores, and its rate of growth.
More posts like this, please!
Miguel Grinberg built a SQLite-backed web app for the blogging team at Twilio to track their effectiveness that started to respond slugishly when it grew much bigger than he expected (6.5 million individual daily traffic records, and with a user base that grew to over 200 employees).
He thought this might be a good time to switch to Postgres, but he wasn’t sure if the wins he expected would be realized:
Having publicly professed my dislike of performance benchmarks, I resisted the urge to look up any comparisons online, and instead embarked on a series of experiments to accurately measure the performance of these two databases for the specific use cases of this application.
In this in-depth article on Twilio’s blog (no doubt being tracked by Miguel’s web app as we speak) he goes into the details of his efforts with lots of interesting findings along the way (even a plot twist!). I’ll leave you with this statement from the end:
If you are going to take one thing away from this article, I hope it is that the only benchmarks that are valuable are those that run on your own platform, with your own stack, with your own data, and with your own software. And even then, you may need to add custom optimizations to get the best performance.
While scaling up machine learning at Instacart, Montana Low and Lev Kokotov discovered just how much you can do with the Postgres database. They are building on that work with PostgresML, an extension to the database that lets you train and deploy models to make online predictions using only SQL. This is super practical discussion that you don’t want to miss!
When we had Paul Copplestone from Supabase on The Changelog I asked him what a cloud native Postgres would look like. He replied,
Decoupled compute and storage. So the idea is that you should be able to attach the compute part of it to a storage, hopefully like an infinite storage; you know, anything that is infinitely scalable. If you can do this, and in particular, if the compute can start up very fast, maybe in, say, a hundred milliseconds via some sort of HTTP response, then that’s cloud native, yeah.
Enter Neon, which sells itself as:
The multi-cloud fully managed Postgres with a generous free tier. We separated storage and compute to offer autoscaling, branching, and bottomless storage.
Behind this new business from MemSQL co-founder Nikita Shamgunov is the linked (Apache-licensed) server written in Rust.
CloudNativePG is distributed under the Apache License 2.0, and is now owned and governed by a newly formed community of contributors to the project, built on solid principles and values inspired by the Cloud Native Computing Foundation (CNCF).
Among these are openness, fairness, inclusivity, technical excellence, “community over product/company,” built-in quality and built-in security. I’m part of the initial group of maintainers of the project, currently made up by the top six committers of the project within EDB.
If you’ve been following Ship It’s Kaizen episodes, you know we had troubles running Postgres in K8s and recently moved to a managed (sorta) database. I wonder if Gerhard will be tempted to try this operator out anyhow…
Train and deploy industry-standard or completely custom machine learning models, directly powered by your business data, into your production stack, with an open source Postgres extension.
Is there anything Postgres can’t do?! 😉
Rasmus Porsager created Postgres.js –the fastest full-featured PostgreSQL client for Node.js and Deno. Today he joins Jerod for a deep-dive on Postgres, why he created this open source library, and how you can use it to build pg-backed JavaScript applications.
Hot off the press from our friend Ben Johnson:
Postlite is a network proxy to allow access to remote SQLite databases over the Postgres wire protocol. This allows GUI tools to be used on remote SQLite databases which can make administration easier.
The proxy works by translating Postgres frontend wire messages into SQLite transactions and converting results back into Postgres response wire messages. Many Postgres clients also inspect the
pg_catalog
to determine system information so Postlite mirrors this catalog by using an attached in-memory database with virtual tables. The proxy also performs minor rewriting on these system queries to convert them to usable SQLite syntax.
It appears that Ben is taking code contributions this time around, but only for bug & documentation fixes.
Where by “auditing” they mean change tracking, not security testing:
Auditing is particularly useful for historical analysis. To demonstrate, imagine you have a
users
table that tracks when a user is online. You might add astatus
column which can have one of two values:online
andoffline
. How would you track how long a user is online for throughout an entire month? An auditing system would track every change with timestamps, and so you can measure the difference between each timestamp and sum them up for the entire month.
What really impresses me about the Supabase team is how much they’re leaning in to Postgres’ rich feature set to build high-quality-yet-low-level abstractions for their users. Paul Copplestone emphasized this when we had him on the show back in January.
Schema changes are usually critical operations to perform on a high volume database. One thing off, and you are looking at an outage. pg-osc
makes it easy and safe to run any ALTER
statement on a production database table with no locking.
One of the things we discussed with Paul Copplestone from Supabase was what, exactly, might a cloud native Postgres look like? Well, perhaps it will look like OrioleDB:
A new storage engine for PostgreSQL, bringing a modern approach to database capacity, capabilities and performance to the world’s most-loved database platform.
OrioleDB consists of an extension, building on the innovative table access method framework and other standard Postgres extension interfaces. By extending and enhancing the current table access methods, OrioleDB opens the door to a future of more powerful storage models that are optimized for cloud and modern hardware architectures.
If learning from incidents grabbed your attention, here’s a good one we can all learn from:
It was 5:17pm today, just as I was wrapping up work for the day, and my manager pinged me with the following chat:
<manager>: Hi Jeremy - we have a <other team> ticket - escalated to <leader>, <leader>, etc. <principal> is on trying to advise as well. Are you available this evening if needed for diagnostics? <coworker> is on the call now
No obligation; just checking in to see what my availability is. Quickly thinking it over – I didn’t have any plans tonight, nothing in particular on my agenda. Why not? If I can help then someone else on the team won’t have to, and I don’t have anything better to do tonight.
The game is afoot. It might be a long night…
PgBouncer is a popular connection pooler for Postgres. PgCat is that same functionality, only in Rust. This is alpha software and there is no worse place to run alpha software (IMHO) than in places that have direct connections to your database. Still, this is worth linking to because it’s yet another instance of open source infrastructure code getting Rusty.
This post by Shayon Mukherjee is in response to the recently logged post by Mike Coutermarsh in praise of MySQL for infra folks:
This is not a MySQL vs PostgreSQL post. This is just a small summary of what I have come to appreciate about PostgreSQL as an Infrastructure Engineer.
Always good to keep in mind whenever comparing technologies: trade-offs abound and certain people in certain situations will value certain things about different technologies differently. That’s for certain. 😉
From Mike Coutermarsh:
For years I’ve been noticing this pattern of infrastructure engineers I really respect preferring MySQL and product engineers preferring Postgres. It took quite a while for me to understand it. Especially coming from my background as a product engineer. Infrastructure engineers generally…
Read on to hear why (from Mike’s perspective).
This post covers 5 cases where the author uses the JSON data type for column definitions:
Craig Kerstiens has also weighed in on this topic for us.