Rails and SQL's Relationship Status: "It's Complicated"

Aspirations

Reactive Record started out from a question I posed to myself: at what point does the buck stop? In a CRUD-type application, and most things are CRUD-type, I wondered where the application dwells. If that sounds oblique, by this I mean where is the place that I can look and see a coherent overview of the parts of my system? In conventional Rails, I think we don’t have a good answer to that. When doing things “by the book,” the notion of “my application” is a diffuse and miasmatic thing. It is often spread among dozens of files. The primary structuring mechanic seems to be accretion rather than declaration, and we’re left without a clear place that we can point to and say “that’s the application.”

I had this feeling bouncing around in my head about the “diffuseness” of Rails, but I didn’t have a clear outlet and I couldn’t put my finger on what exactly that diffuseness was. Luckily, around this time, I heard a few things which crystallized some of the notions that had been floating half-formed in my head. Among these were Gary Bernhardt’s “Where Correctness Is Enforced” screencast and John Carmack’s QuakeCon keynote, the latter of which contains this pithy-brilliant quote:

“Everything that is syntactically legal that the compiler will accept will eventually wind up in your codebase.”

Gary’s screencast impressed upon me that the same is true for the database. Unless you’re in the habit of using hard database-level constraints, you will eventually end up with junk in the database. Junk in the database is bad in its own right, but the real problem with that junk is that it leaches into your application. It’s like a crack in the foundation of a house: you keep having to patch the plaster in the ceiling because the walls are shifting. If you’re worried about data integrity, about things like null lurking in otherwise innocuous code, you start to get defensive. Or at least I do. When I see a chain of calls like chain.of.method.calls, my eye twitches just a bit. A nil could be lurking anywhere in that fluent interface, and nil:NilClass doesn’t have the methods of, method, or calls! I’m sure that there are solid object-oriented reasons why the above code is bad, other than the nils, but much of my unease boils down to the fact that I can’t trust my own data.
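To make the twitch concrete, here is a minimal sketch of how one unconstrained row spoils a call chain. The Company and Employee structs are hypothetical stand-ins for records loaded from a table that has no NOT NULL constraint on the association.

```ruby
# Hypothetical stand-ins for rows loaded from an unconstrained table.
Company  = Struct.new(:name)
Employee = Struct.new(:company)

complete = Employee.new(Company.new("Example Co"))
orphaned = Employee.new(nil) # the "junk" row: company was never set

complete_name = complete.company.name # => "Example Co"

failure = nil
begin
  orphaned.company.name
rescue NoMethodError => e
  failure = e.class # nil has no #name, so the chain blows up here
end

# The defensive version: safe navigation repeated at every link.
defensive_name = orphaned&.company&.name # => nil
```

With a NOT NULL constraint on the column, the orphaned row could never have been saved, and the defensive version would be unnecessary.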

What those two inspirations made clear to me was that I should learn to lean heavily on database constraints. What could be DRYer than sticking to business logic in your application and using the database to handle boring data integrity problems? More generally, I’m learning to look for what I call declarative layers in applications. Before I go into what exactly I mean by declarative layers, let me talk about my understanding of DRY.

Essence of DRY

Although the way that it is quoted often makes it seem as though it is a justification in and of itself, it is important to remember that “Don’t Repeat Yourself” has an actual, final cause. The aim of DRY is not to shuffle code around or to serve as an excuse to demonstrate your cleverness; rather, it is “compiled wisdom” counseling avoidance of duplicated code. And duplicated code is itself not self-evidently bad, but clearly becomes bad in the face of change – which is to say in any real codebase. But often we encounter “be DRY” as a principle that’s divorced from the justifications for it.

When we examine the justifications for being DRY though, we start to see some other practices that would achieve the same ends. And these other practices are DRY as a side-effect of how they work and not just circularly DRY for DRY’s sake.

Be declarative

Declarative is better than implicit or procedural. I submit that this:

person = {first_name: "Chris", last_name: "Wilson", mood: "affable"}

is better than:

person = {}
person.merge! first_name: "Chris"
person.merge! last_name: "Wilson"
person.merge! mood: "affable"

The details may stray into personal preference, but I hold that there’s a real, objective sense in which the first is better than the second. Declarative code is DRY because we’re explicit about the structure we’re building and, unless we’re modifying it later, we’re doing it just once. Remember that the “R” stands for “repeat,” as in “don’t.” The declarative version also never exposes a half-built, mutated value, and the benefits of avoiding global and mutable state are many.

Be derivable

Does being declarative leave us with any problems? Sure, if all of our data is in just one place and it comprises inert structure, then we have a problem when we want to use it. We have a very real logistical problem; how do we get our data from point A to point B, where we want to use it?

This is where derivation comes in. No, I’m not talking about taking the derivative of a function or a complex Wall Street financial gizmo. Nope, I’m just talking about the simplest sense of “derive” which is “to obtain something from (a specified source).” Imagine that we have some code that we’d like to write but the issue is that there is some perfectly good data, declaratively written, that implies the same code that we’re about to write. Wouldn’t it be nice if we could get our data to write our code? Or maybe, could we derive the code we want from the data we have?
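As a rough sketch of what “getting our data to write our code” might look like, the snippet below derives validation checks from a declarative description of some columns. Everything here is hypothetical and for illustration only: the COLUMNS hash, derive_checks, and validate are made-up names, not part of Rails or any library.

```ruby
# Hypothetical column metadata, declared once as plain data.
COLUMNS = {
  first_name: { null: false, limit: 50 },
  email:      { null: false, limit: 255 },
  nickname:   { null: true,  limit: 20 }
}.freeze

# Derive one checking lambda per column from the declaration above.
def derive_checks(columns)
  columns.to_h do |name, spec|
    check = lambda do |value|
      errors = []
      errors << "#{name} can't be blank" if value.nil? && !spec[:null]
      errors << "#{name} is too long (maximum #{spec[:limit]})" if value && value.length > spec[:limit]
      errors
    end
    [name, check]
  end
end

CHECKS = derive_checks(COLUMNS)

# Run every derived check against a record (a plain hash in this sketch).
def validate(record)
  CHECKS.flat_map { |name, check| check.call(record[name]) }
end

good = { first_name: "Chris", email: "chris@example.com" }
bad  = { first_name: nil, email: "x" * 300 }

good_errors = validate(good) # => []
bad_errors  = validate(bad)  # => ["first_name can't be blank", "email is too long (maximum 255)"]
```

The point of the design is that adding a column to COLUMNS adds its checks for free; no validation code is written by hand.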

It sounds esoteric, but this comes up all the time! The central idea I was trying to get at in Reactive Record was just this sort of problem (with a big nod to Gary Bernhardt’s talk). In the database, I have some DDL that declares a constraint upon some of the data in my application:

CREATE TABLE employees (
    ...
    email      VARCHAR(255) NOT NULL UNIQUE,
    ...
    CONSTRAINT company_email CHECK (email LIKE '%@example.com')
);

Which would be equivalent to something like this in Rails:

class Employee < ActiveRecord::Base
  ...
  validate { errors.add(:email, "must be from example.com") unless email =~ /@example\.com\z/ }
  ...
end

And that’s just what Reactive Record generates. I wanted to be able to use the declarative constraints from the database in the application.
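To give a flavor of the derivation step, here is a minimal sketch that turns the text of a LIKE-based CHECK constraint into a Ruby predicate. This is not Reactive Record’s actual implementation; like_to_regexp, derive_like_check, and CONSTRAINT_SQL are hypothetical names, and the sketch assumes a single LIKE clause with no ESCAPE and case-sensitive matching.

```ruby
# Translate a SQL LIKE pattern into an anchored Ruby Regexp.
# Assumes no ESCAPE clause: % matches any run of characters, _ exactly one.
def like_to_regexp(pattern)
  body = pattern.each_char.map do |char|
    case char
    when "%" then ".*"
    when "_" then "."
    else Regexp.escape(char)
    end
  end.join
  Regexp.new("\\A#{body}\\z", Regexp::MULTILINE)
end

# Hypothetical constraint text, as it might be read back from the schema.
CONSTRAINT_SQL = "CHECK (email LIKE '%@example.com')"

# Returns [column, predicate], or nil if the text is not a simple LIKE check.
def derive_like_check(constraint)
  match = constraint.match(/CHECK \((\w+) LIKE '([^']*)'\)/)
  return nil unless match

  column, pattern = match.captures
  regexp = like_to_regexp(pattern)
  [column.to_sym, ->(value) { regexp.match?(value.to_s) }]
end

column, valid_email = derive_like_check(CONSTRAINT_SQL)

column                                         # => :email
valid_email.call("chris@example.com")          # => true
valid_email.call("chris@example.com.evil.org") # => false
valid_email.call("chris@examplexcom")          # => false
```

Because the pattern is escaped and anchored, the derived predicate rejects the last two addresses, which a hand-written, unanchored regex with a bare dot would let through.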

An organizing principle

What I’m really arguing for is a different organizing principle for the applications that we build. Remember when I mentioned a declarative layer? A declarative layer is some place in an application that coherently describes the application and the relationship between its various components.

An application should have a place where we can look and grasp the system as a whole. This is essential for comprehensibility. There should be a place where everything hangs together. It’s good to be modular of course, but there should also be a root of the tree of subcomponents.

SQL and its associated DDL constraints have the qualities that make for an excellent declarative layer. Used well, they encourage us to think hard about what data our application is manipulating and in what ways it is connected to other data. We should think about the types, relations, and the data involved in our apps and not just the business logic. If we do that well, we’ll also end up with a veritable fount of knowledge about the application embodied in the database schema.