Databases

Databases, structured data, data stores, etc.
123 Stories
SQLite unixsheikh.com

SQLite the only database you will ever need in most cases

Anybody who has followed our news and/or podcast feeds for a bit already knows of SQLite, has likely used it, and continues to use it. But for those just joining us, here’s yet another dev with effusive praise for the little database engine that could:

SQLite is one of those projects that I wish I had known about long before I did. I had heard about it, but for many years I never thought about taking a serious look at it because I was under the false impression that it was a tiny database only useful for personal address books and small embedded devices.

I highly recommend you take a look at SQLite!

Tim github.com

Transform Gist into your personal key/value data store

Sometimes all a project needs is the ability to read/write small amounts of JSON data and have it saved in some persistent storage. Imagine a simple data-model which receives infrequent updates and could be represented as a JSON object. It doesn’t demand a full-blown database, but it would be neat to have a way to interact with this data and have it persist across sessions.

This is where gist-database comes in handy: by leveraging the power of the Gist API, you can easily create a key/value data-store for your project.

This is a perfect solution for low write / high read scenarios when serving static site content with Next.js and using Incremental Static Regeneration to keep your cached content fresh.
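
To make the idea concrete, here is a minimal Python sketch of a key/value store backed by a single gist file. It is not gist-database's own API. It only assumes GitHub's documented `GET` and `PATCH /gists/{gist_id}` endpoints; the `GistKV` class, the `store.json` filename, and the injectable `http` parameter are hypothetical names chosen for illustration.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/gists/"


def _http(method, url, token, body=None):
    """Send one authenticated JSON request to the GitHub API."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


class GistKV:
    """Hypothetical key/value wrapper: one gist file holds one JSON object."""

    def __init__(self, gist_id, token, filename="store.json", http=_http):
        self.url = API_ROOT + gist_id
        self.token = token
        self.filename = filename  # assumed name of the file inside the gist
        self._http = http         # injectable so the class can be tested offline

    def _load(self):
        gist = self._http("GET", self.url, self.token)
        return json.loads(gist["files"][self.filename]["content"])

    def get(self, key, default=None):
        return self._load().get(key, default)

    def set(self, key, value):
        # Read-modify-write: acceptable for infrequent writes, not for
        # concurrent writers, since the last write wins.
        data = self._load()
        data[key] = value
        payload = {"files": {self.filename: {"content": json.dumps(data)}}}
        self._http("PATCH", self.url, self.token, payload)
```

Every read is a network round trip, which is why this kind of store suits the low write / high read case only when something else (such as a static site cache) sits in front of it.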

Databases brandur.org

An alternative to `deleted_at` for soft record deletion

You’ve probably done the whole deleted_at thing to mark records as deleted without actually deleting them from the database. That strategy is fraught for a few reasons.

In this post, Brandur Leach proposes an alternative strategy with the following endorsement:

Speaking from 30,000 feet, programming is all about tradeoffs. However, this is one of those rare places where as far as I can tell the cost/benefit skew is so disproportionate that the common platitude falls flat.

Anytime your fellow developer can make that statement in earnest, whatever they’re talking about is worth a try…

Saul Pwanson visidata.org

Connect VisiData to SQL Databases with vdsql

vdsql is a new plugin that allows VisiData to connect to databases and query them directly, using the database’s own query engine. It uses Ibis to generate SQL for many popular backends, including Postgres, DuckDB, Clickhouse, and more.

vdsql v0.2, released this past week, is already quite useful, and development continues to improve both vdsql and VisiData for bigger data!

GitLab shekhargulati.com

My notes on GitLab’s Postgres schema design

Reading other people’s code is a sure-fire way to improve as a developer. But what about as a database designer? The same process applies!

In this post, Shekhar Gulati shares what he learned by applying that principle to GitLab’s schema.

I learnt a lot from the GitLab schema. They don’t blindly apply the same practices to all the table designs. Each table makes the best decision based on its purpose, the kind of data it stores, and its rate of growth.

More posts like this, please!

Databases brandur.org

Soft deletion probably isn't worth it

This article confirms my biases because I’ve always despised every soft delete implementation I’ve come up with. Most of them have looked something like what the author describes:

the technique has some major downsides. The first is that soft deletion logic bleeds out into all parts of your code. All our selects look something like this:

SELECT *
FROM customer
WHERE id = @id
    AND deleted_at IS NULL;

And forgetting that extra predicate on deleted_at can have dangerous consequences as it accidentally returns data that’s no longer meant to be seen.

ORMs help with this, but not enough. You set it as a default scope and then there’s that one time where you also want the deleted records so you come up with a custom query or dig into your ORM and try to find how to bypass the rule. Yuck!
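
The forgotten-predicate problem is easy to reproduce. The sketch below uses Python's built-in `sqlite3` module and an in-memory `customer` table modeled on the query above. The sample rows and the two function names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO customer (id, name, deleted_at) VALUES (?, ?, ?)",
    [
        (1, "Ada", None),                      # live record
        (2, "Grace", "2022-07-01T12:00:00Z"),  # soft-deleted record
    ],
)


def find_customer(conn, customer_id):
    """Lookup with the soft-delete predicate: deleted rows are hidden."""
    return conn.execute(
        "SELECT id, name FROM customer WHERE id = ? AND deleted_at IS NULL",
        (customer_id,),
    ).fetchone()


def find_customer_buggy(conn, customer_id):
    """Same lookup with the predicate forgotten: deleted rows leak through."""
    return conn.execute(
        "SELECT id, name FROM customer WHERE id = ?",
        (customer_id,),
    ).fetchone()
```

Both functions agree on customer 1, but only the buggy one returns the soft-deleted customer 2, and nothing in the schema stops it from doing so.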

He goes on to describe other problems as well. Maybe it’s all a big case of YAGNI?

Once again, soft deletion is theoretically a hedge against accidental data loss. As a last argument against it, I’d ask you to consider, realistically, whether undeletion is something that’s ever actually done.

When I worked at Heroku, we used soft deletion.

When I worked at Stripe, we used soft deletion.

At my job right now, we use soft deletion.

As far as I’m aware, never once, in ten plus years, did anyone at any of these places ever actually use soft deletion to undelete something.

Tooling prql-lang.org

PRQL is a modern language for transforming data

The P in PRQL (pronounced “Prequel”) stands for Pipelined, which I’m convinced is a great way of writing and reasoning about queries:

A PRQL query is a linear pipeline of transformations

Each line of the query is a transformation of the previous line’s result. This makes it easy to read, and simple to write.

It compiles to SQL, which means it’s already compatible with most databases. There are currently bindings for Python and JS, plus Rust, which is what the compiler itself is written in.

Try it out in their web-based playground. (Thanks, Wasm!)

Databases github.com

Dragonfly – a cost-effective, high-performing, and easy-to-use Redis replacement

Dragonfly is a modern in-memory datastore, fully compatible with Redis and Memcached APIs. Dragonfly implements novel algorithms and data structures on top of a multi-threaded, shared-nothing architecture. As a result, Dragonfly reaches x25 performance compared to Redis and supports millions of QPS on a single instance.

Databases github.com

Grist is a lot like Airtable, but open source and more customizable

In their own words:

Grist is a modern relational spreadsheet. It combines the flexibility of a spreadsheet with the robustness of a database to organize your data and make you more productive.

Since so many people make the Airtable comparison that I did in the headline, the team behind Grist has written up a comparison of the two offerings.

CSS leemeichin.com

Yes, I can connect to a DB in CSS

Just wow. This is an impressively hacky hack. You’re probably wondering how? As many such things do, it all starts with a new (Chrome-only) API:

A new set of APIs affectionately known as Houdini give your browser the power to control CSS via its own Object Model in Javascript. In English, this means that you can make custom CSS styles, add custom properties, and so on…

And it ends with something that looks like this:

main {
  /* ... */
  --sql-query: SELECT name FROM test;
}

Databases github.com

PRQL – a modern language for transforming data

PRQL (pronounced “Prequel”) aims to be “a simpler and more powerful SQL”

Like SQL, it’s readable, explicit and declarative. Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. It can be used with any database that uses SQL, since it transpiles to SQL.

To give an idea of PRQL’s design, they provide this SQL statement as an example:

SELECT TOP 20
    title,
    country,
    AVG(salary) AS average_salary,
    SUM(salary) AS sum_salary,
    AVG(salary + payroll_tax) AS average_gross_salary,
    SUM(salary + payroll_tax) AS sum_gross_salary,
    AVG(salary + payroll_tax + benefits_cost) AS average_gross_cost,
    SUM(salary + payroll_tax + benefits_cost) AS sum_gross_cost,
    COUNT(*) as count
FROM employees
WHERE salary + payroll_tax + benefits_cost > 0 AND country = 'USA'
GROUP BY title, country
HAVING COUNT(*) > 200
ORDER BY sum_gross_cost

And then translate it to PRQL, which looks like:

from employees
filter country = "USA"                           # Each line transforms the previous result.
let gross_salary = salary + payroll_tax          # This _adds_ a column / variable.
let gross_cost   = gross_salary + benefits_cost  # Variables can use other variables.
filter gross_cost > 0
aggregate by:[title, country] [                  # `by` are the columns to group by.
    average salary,                              # These are the calcs to run on the groups.
    sum     salary,
    average gross_salary,
    sum     gross_salary,
    average gross_cost,
    sum     gross_cost,
    count,
]
sort sum_gross_cost                              # Uses the auto-generated column name.
filter count > 200
take 20
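
For readers who think in general-purpose code, the same pipeline can be sketched in plain Python, where each statement transforms the result of the previous one. This is only an illustration of the pipelined style, not how PRQL is implemented; the `run_pipeline` function and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def run_pipeline(employees, min_count=200, limit=20):
    """Mirror the PRQL steps one at a time on a list of dicts."""
    rows = [e for e in employees if e["country"] == "USA"]                        # filter
    rows = [{**e, "gross_salary": e["salary"] + e["payroll_tax"]} for e in rows]  # let
    rows = [{**e, "gross_cost": e["gross_salary"] + e["benefits_cost"]} for e in rows]
    rows = [e for e in rows if e["gross_cost"] > 0]                               # filter

    groups = defaultdict(list)                                                    # aggregate by
    for e in rows:
        groups[(e["title"], e["country"])].append(e)

    summary = []
    for (title, country), members in groups.items():
        row = {"title": title, "country": country, "count": len(members)}
        for col in ("salary", "gross_salary", "gross_cost"):
            values = [m[col] for m in members]
            row[f"average_{col}"] = mean(values)
            row[f"sum_{col}"] = sum(values)
        summary.append(row)

    summary.sort(key=lambda r: r["sum_gross_cost"])                               # sort
    summary = [r for r in summary if r["count"] > min_count]                      # filter
    return summary[:limit]                                                        # take


# Tiny made-up data set; min_count is lowered so that one group survives.
sample = [
    {"title": "Engineer", "country": "USA", "salary": 100, "payroll_tax": 10, "benefits_cost": 5},
    {"title": "Engineer", "country": "USA", "salary": 200, "payroll_tax": 20, "benefits_cost": 5},
    {"title": "Engineer", "country": "DNK", "salary": 300, "payroll_tax": 30, "benefits_cost": 5},
    {"title": "Designer", "country": "USA", "salary": 150, "payroll_tax": 15, "benefits_cost": 5},
]
result = run_pipeline(sample, min_count=1)
```

Note how the derived `gross_salary` and `gross_cost` columns are computed once and reused, which is the repetition the SQL version cannot avoid.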

Konrad kolaente.dev

Simple, zero-fuss docker database backups

Back in the olden days, I would just put a mysqldump > dump.sql in a crontab and call it a day. When I started to host more and more stuff with docker, I first just migrated that approach to docker and put it all in a container. That still required me to mess around with config files. Once I started to host postgres containers it all got even more complicated. Thus, I needed a new solution.

I built this tool to make backups easy: Simply point it to a host running docker containers and it will automatically inspect and find all mysql/mariadb and postgres containers and do backups of them on a schedule. No configuration required, it “just works”.

Databases github.com

A collaborative IDE for your databases, right in your browser

Slashbase is an open-source collaborative IDE for your databases in your browser. Connect to your database, browse data, run a bunch of SQL commands or share SQL queries with your team, right from your browser!

It’s written in Golang and Nextjs React Framework (SPA) and runs as a single Linux binary with PostgreSQL. Documentation is currently WIP.

It’s early days and security will be a major concern to get right, but this has a lot of potential to unlock some cool use cases.

Simon Eskildsen sirupsen.com

Careful trading complexity for 'improvements'

Simon Eskildsen (of napkin math) shares a word of warning about one possible decision-making trap:

Whenever you find yourself arguing for improving infrastructure by yanking up complexity, you need to be very careful.

He applies this thinking to a common technical proposal of switching from a general-purpose RDBMS to a specialty database to account for growth and scale.

I’m a proponent of mastering and abusing existing tools, rather than chasing greener pastures. The more facility you gain with first-principle reasoning and napkin math, the closer I’d wager you’ll inch towards this conclusion as well. A new system theoretically having better guarantees is not enough of an argument. Adding a new system to your stack is a huge deal and difficult to undo.

Databases rachelbythebay.com

A terrible schema from a clueless programmer

Rachel by the Bay:

There’s a post going around tonight about how someone forgot to put an index on some database thing and wound up doing full table scans (or something like that). The rub is that instead of just being slow, it also cost a fair amount of money because this crazy vendor system charged by the row or somesuch. So, by scanning the whole table, they touched all of those rows, and oh hey, massive amounts of money just set ablaze!

The usual venues are discussing it, and I get the impression some people have the wrong approach to this. I want to describe a truly bad database schema I encountered, and then tell you a little about what it did to the system performance.

A fun story with an excellent twist at the end.
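
The missing-index scenario from the quote can be observed on a small scale with SQLite's `EXPLAIN QUERY PLAN`. The sketch below is only an illustration: the `events` table, the index name, and the `query_plan` helper are made up, and it says nothing about the vendor system or the schema in Rachel's story.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The plan description is the last column of each returned row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 100, "x") for i in range(10_000)],
)

lookup = "SELECT * FROM events WHERE user_id = ?"

# Without an index, the only way to find matching rows is to read them all.
plan_before = query_plan(conn, lookup, (42,))

conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")

# With the index, the planner can jump straight to the matching rows.
plan_after = query_plan(conn, lookup, (42,))
```

Printing `plan_before` and `plan_after` shows the plan switching from a scan of `events` to a search that uses `idx_events_user_id`. On a service that bills per row touched, that is the difference between paying for every row and paying for only the matches.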

Databases pradeepchhetri.xyz

ClickHouse vs TimescaleDB

Two up-and-coming database options compared:

Recently, TimescaleDB published a blog comparing ClickHouse & TimescaleDB using timescale/tsbs, a timeseries benchmarking framework. I have some experience with PostgreSQL and ClickHouse but never got the chance to play with TimescaleDB. Some of the claims about TimescaleDB made in their post are very bold, which made me even more curious. I thought it’d be a great opportunity to try it out and see if those claims are really true.

Databases simplethread.com

Relational databases aren’t dinosaurs, they’re sharks

I’ve heard way fewer people throwing SQL under the bus than I did back in the high-hype NoSQL days, but this article by Justin Etheredge does a good job of laying out some of the advantages and disadvantages of the RDBMS side of the fence:

The next time you hear someone describe relational databases as yesterday’s technology, or the next time you see someone assume a relational database can’t handle the needs of their unproven MVP, stop and ask them how they are going to account for the tradeoffs they’re making. Make sure they understand they aren’t skipping a dead dinosaur, they’re taking a pass on the thousands of human-years of effort that have made relational databases the sharks of the data industry.
