
Databases

Databases, structured data, data stores, etc.
71 Stories

The Changelog #429

Community perspectives on Elastic vs AWS

This week on The Changelog we’re talking about the recent falling out between Elastic and AWS around the relicensing of Elasticsearch and Kibana. Like many in the community, we have been watching this very closely.

Here’s the TL;DR for context. On January 21st, Elastic published a blog post sharing their concerns about Amazon/AWS misleading and confusing the community, saying “They have been doing things that we think are just NOT OK since 2015 and it has only gotten worse.” This led them to relicense Elasticsearch and Kibana under a dual license: a proprietary license and the Server Side Public License (SSPL). AWS responded two days later stating that they are “stepping up for a truly open source Elasticsearch,” and shared their plans to create and maintain forks of Elasticsearch and Kibana based on the latest ALv2-licensed codebases.

There’s a ton of detail and nuance beneath the surface, so we invited a handful of folks on the show to share their perspective. On today’s show you’ll hear from: Adam Jacob (co-founder and board member of Chef), Heather Meeker (open-source lawyer and the author of the SSPL), Manish Jain (founder and CTO at Dgraph Labs), Paul Dix (co-founder and CTO at InfluxDB), VM (Vicky) Brasseur (open source & free software business strategist), and Markus Stenqvist (everyday web dev from Sweden).

Brad Fitzpatrick tailscale.com

An unlikely database migration

So the Tailscale team were using a single text file as a database (as you do) and it worked great… until it didn’t.

Even with fast NVMe drives and splitting the database into two halves (important data vs. ephemeral data that we could lose on a tmpfs), things got slower and slower. We knew the day would come. The file reached a peak size of 150MB and we were writing it as quickly as the disk I/O would let us. Ain’t that just peachy?

So, migrate to MySQL or PostgreSQL, right? Maybe SQLite?

Nope, Crawshaw had other ideas.

I won’t ruin the surprise and tell you what they went with, but I will say it’s a widely deployed system amongst cloud natives…

Chua Bok Woon github.com

sq is a code-generated, type safe query builder and struct mapper for Go

From reading through the README, this seems like a nice balance between a full-blown ORM and hand-rolling all your own SQL. For example, this point from the “The mapper function is the SELECT clause” section:

In sq whatever you SELECT is automatically mapped. This means you just have to write your query, execute it and if there were no errors, the data is already in your Go variables. No iterating rows, no specifying column scan order, no error checking three times. Write your query, run it, you’re done.

Databases github.com

Graviton is like ZFS for key-value stores

Graviton Database is simple, fast, versioned, authenticated, embeddable key-value store database in pure Go… Every write is tracked, versioned and authenticated with cryptographic proofs. Additionally it is possible to take snapshots of database. Also it is possible to use simple copy,rsync commands for database backup even during live updates without any possibilities of database corruption.

Still in Alpha, but a lot of work has been done and there are features a-plenty.
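To make the "versioned" part concrete, here is a minimal sketch of a key-value store where every write produces a new, readable version. This is not Graviton's API or storage format (Graviton is written in Go and adds cryptographic proofs); the `VersionedStore` class and its methods are hypothetical names used only for illustration, and copying the whole dict on each write is a simplification.

```python
class VersionedStore:
    """Toy versioned key-value store: each write creates a new snapshot."""

    def __init__(self):
        # Version 0 is the empty store.
        self._versions = [{}]

    def put(self, key, value) -> int:
        # Copy the latest snapshot, apply the write, and keep both around.
        snapshot = dict(self._versions[-1])
        snapshot[key] = value
        self._versions.append(snapshot)
        return len(self._versions) - 1  # the new version number

    def get(self, key, version=None):
        # Read from the latest version unless an older one is requested.
        snapshot = self._versions[-1] if version is None else self._versions[version]
        return snapshot.get(key)


store = VersionedStore()
v1 = store.put("balance", 100)
v2 = store.put("balance", 250)

latest = store.get("balance")          # reads version v2
historic = store.get("balance", v1)    # reads the older snapshot
before_any_write = store.get("balance", 0)
```

Old versions stay readable after later writes, which is the property that makes snapshots and point-in-time reads possible.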

Practical AI #94

Operationalizing ML/AI with MemSQL

A lot of effort is put into the training of AI models, but, for those of us that actually want to run AI models in production, performance and scaling quickly become blockers. Nikita from MemSQL joins us to talk about how people are integrating ML/AI inference at scale into existing SQL-based workflows. He also touches on how model features and raw files can be managed and integrated with distributed databases.

Go github.com

A lightweight, high-speed immutable database for systems and applications

With immudb you can track changes in sensitive data in your transactional databases and then record those changes permanently in a tamperproof immudb database. This allows you to keep an indelible history of sensitive data, for example debit/credit card transactions.
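One way to picture a "tamperproof" history is a hash chain, where each entry's hash covers the entry before it. The sketch below is an assumption-laden illustration, not immudb's API or its actual data structure; the function names and the sample card records are made up for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    # Hash the previous entry's hash together with this record's contents.
    payload = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append(log: list, record: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    # Recompute every hash from the start; any edited record breaks the chain.
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"card": "1234", "amount": -20})
append(log, {"card": "1234", "amount": 50})
ok_before = verify(log)

log[0]["record"]["amount"] = -2000  # someone quietly rewrites history
ok_after = verify(log)
```

Changing an old record invalidates its stored hash, so the tampering is detectable when the chain is verified.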

There are so many options for storing data these days. If you haven’t heard Go Time’s excellent episode on databases yet, Jaana does a great job of explaining some of the trade-offs.

Go Time #132

The trouble with databases

Databases are tricky, especially at scale. In this episode Mat, Jaana, and Jon discuss different types of databases, the pros and cons of each, along with the many ways developers can have issues with databases. They also explore questions like, “Why are serial IDs problematic?” and “What alternatives are there if we aren’t using serial IDs?” while at it.
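As a rough illustration of the serial ID question: the episode summary doesn't say which alternatives are discussed, so random UUIDs are used here as one commonly cited option. The two "nodes" are just local counters standing in for independent database instances.

```python
import itertools
import uuid

# Serial IDs: each node hands out 1, 2, 3, ... on its own,
# so two uncoordinated nodes produce the same IDs.
node_a = itertools.count(1)
node_b = itertools.count(1)
serial_a = [next(node_a) for _ in range(3)]
serial_b = [next(node_b) for _ in range(3)]
serial_collisions = set(serial_a) & set(serial_b)

# Random UUIDs: each node generates IDs without talking to the other,
# and a collision is astronomically unlikely.
uuid_a = [uuid.uuid4() for _ in range(3)]
uuid_b = [uuid.uuid4() for _ in range(3)]
uuid_collisions = set(uuid_a) & set(uuid_b)

print("serial collisions:", sorted(serial_collisions))
print("uuid collisions:", len(uuid_collisions))
```

The trade-off is that random UUIDs are larger than integers and don't sort by creation time, which can matter for index locality.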

Jaana Dogan Medium

Things I wished more developers knew about databases

Jaana Dogan started with a draft and this tweet and ended up laying down some serious knowledge on databases.

A large majority of computer systems have some state and are likely to depend on a storage system. My knowledge on databases accumulated over time, but along the way our design mistakes caused data loss and outages. In data-heavy systems, databases are at the core of system design goals and tradeoffs. Even though it is impossible to ignore how databases work, the problems that application developers foresee and experience will often be just the tip of the iceberg.

CockroachDB openmymind.net

Migrating from Postgres to CockroachDB

This is a nice lessons learned post from one engineering team making a database switch.

Overall, I’m happy with how the effort turned out and with CockroachDB in general. Because it uses PostgreSQL’s wire protocol, existing PostgreSQL drivers should work as-is. But we did run into some challenges that are worth pointing out. Here’s a list of things you might want to consider…

I like the update at the end, which emphasizes the importance of tests for making a switch of this magnitude:

The system that was migrated has solid tests and good coverage. While a lot of the differences we ran into are obvious (like lack of range types and triggers), others were more subtle (especially the odd on conflict behavior). Test coverage made a pretty significant impact in the speed of the migration and our confidence in pushing live.

Rust github.com

The Rust SQL Toolkit 🧰

SQLx is a modern SQL client built from the ground up for Rust, in Rust.

  • Truly Asynchronous. Built from the ground-up using async-std using async streams for maximum concurrency.

  • Type-safe SQL (if you want it) without DSLs. Use the query!() macro to check your SQL and bind parameters at compile time. (You can still use dynamic SQL queries if you like.)

  • Pure Rust. The Postgres and MySQL/MariaDB drivers are written in pure Rust using zero unsafe code.

Databases github.com

A NewSQL relational database designed to process time-series data, faster

Our approach comes from low-latency trading; QuestDB’s stack is engineered from scratch, zero-GC Java and dependency-free.

QuestDB ingests data via HTTP, PostgreSQL wire protocol, Influx line protocol or directly from Java. Reading data is done using SQL via HTTP, PostgreSQL wire protocol or via Java API. The whole database and console fit in a 3.5Mb package.

According to the great knowledge base in the sky, NewSQL is, “a class of relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing workloads while maintaining the ACID guarantees of a traditional database system.”

Go Time #108

Graph databases

Mat, Johnny, and Jaana are joined by Francesc Campoy to talk about Graph databases. We ask all the important questions — What are graph databases (and why do we need them)? What advantages do they have over relational databases? Are graph databases better at answering questions you didn’t anticipate? How is data structured? How do queries work? What problems are they good at solving? What problems are they not suitable for? And…since we had Francesc on the hot seat, we asked him about Just for Func and when it’s coming back.

Databases cs.cmu.edu

The next 50 years of databases

One question I ask a lot of folks I interview is what $PROJECT_X looks like three to five (sometimes 10) years from now. Very few people answer that question without some hemming and hawing.

Enter Andy Pavlo, Associate Professor of Databaseology at Carnegie Mellon, throwing his hat in the ring on the future of databases 50 years (!) from now:

The role of humans as database administrators will cease to exist. These future systems will be too complex for a human to reason about. DBMSs will finally be completely autonomous and self-healing. Again, the tighter coupling between programming frameworks and DBMSs will allow the system to make better decisions on how to organize data, provision resources, and optimize execution than human-generated planning.

That is just one of roughly eight things Andy predicts. Fun to think about, if nothing else.

Derek Sivers sivers.org

PostgreSQL self-contained stored procedures example

According to Derek’s now page, he has ended his 7-year sabbatical and is taking Seth Godin’s advice to publish something every day. What Derek shared here is part of that commitment…

This week, I wrote a shopping cart to sell my books directly from my own site. So I took a couple extra hours today to put my code into public view, so anyone can play around with it.

It’s a working self-contained shopping cart / store. It’s a very concrete example of using stored procedures to keep all the data logic together in one place. You can use it from JavaScript, Python, Ruby, or any language you want, since all the functionality is in the database itself. It works.

Security osquery.io

Query your OS like a database

osquery exposes an operating system as a high-performance relational database. This allows you to write SQL queries to explore operating system data. With osquery, SQL tables represent abstract concepts such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events or file hashes.

osquery> SELECT name, path, pid FROM processes WHERE on_disk = 0;
name = Drop_Agent
path = /Users/jim/bin/dropage
pid = 561

Julia Evans jvns.ca

SQL queries don't start with SELECT

Yesterday I was working on an explanation of window functions, and I found myself googling “can you filter based on the result of a window function”. As in – can you filter the result of a window function in a WHERE or HAVING or something?

Eventually I concluded “window functions must run after WHERE and GROUP BY happen, so you can’t do it”. But this led me to a bigger question – what order do SQL queries actually run in?

Kind of a snappy headline because Julia is talking about order in terms of execution and most of the time we’re thinking about order in terms of authoring. But still, TIL!
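Julia's conclusion can be tried out with a small sketch: a window function is rejected in WHERE, but you can filter on it by wrapping the query in a subquery. This uses Python's built-in sqlite3 module and assumes the bundled SQLite is 3.25 or newer (when window functions were added); the `sales` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 20), ("south", 5)],
)

# WHERE runs before window functions are evaluated, so this is rejected.
try:
    conn.execute(
        "SELECT region, amount FROM sales "
        "WHERE RANK() OVER (PARTITION BY region ORDER BY amount DESC) = 1"
    ).fetchall()
    where_allowed = True
except sqlite3.Error:
    where_allowed = False

# Compute the window function in a subquery, then filter in the outer query.
top_per_region = conn.execute(
    """
    SELECT region, amount
    FROM (
        SELECT region, amount,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
        FROM sales
    )
    WHERE rnk = 1
    ORDER BY region
    """
).fetchall()
```

The outer WHERE works because, by the time it runs, the inner query has already produced `rnk` as an ordinary column.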

Liran Tal Snyk

Sequelize ORM found vulnerable to SQL injection

SQL injection is a serious vulnerability, effectively allowing an attacker to run roughshod over your entire database. If you’re using Sequelize, drop everything (pun unintended) and get patched up.

As a testament for Sequelize’s commitment to security and protecting their users as fast as possible, they promptly responded and released fixes in the 3.x and 5.x branches of the library, remediating the vulnerability and providing users with an upgrade path for SQL injection prevention.
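For anyone who hasn't seen SQL injection up close, here is a generic sketch using Python's built-in sqlite3 module. It does not reproduce the Sequelize vulnerability or use Sequelize's API; the `users` table and the malicious input are made up to show the general difference between splicing input into SQL and binding it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

# Attacker-controlled input that closes the string literal and adds a condition.
malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the WHERE clause
# turns into: name = 'nobody' OR '1'='1', which matches every row.
unsafe_sql = f"SELECT name FROM users WHERE name = '{malicious}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and treated purely as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print("unsafe query returned", len(unsafe_rows), "rows")
print("parameterized query returned", len(safe_rows), "rows")
```

With parameter binding, the database looks for a user literally named `nobody' OR '1'='1` and finds none.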

Databases abe-winter.github.io

ORMs are backwards

I think all ORM users have a journey from ‘there should be a way to’ to ‘this is saving me so much work’ to ‘I have to reach into the vending machine to get my change out’.

I see the value in ORMs, but I also see where Abe is coming from in this article. I think the sweet spot for an ORM is when you’re just getting started making apps and you want to minimize how many technologies you need to learn to get there. I certainly learned SQL over a slow, productive period while utilizing its features from the warm embrace of Active Record.

Stick around to the end of the article where he reveals the anti-ORM he’s working on to solve some of these problems.
