Predrag Gruevski and Chris Krycho joined the show to talk about SemVer. We explore the challenges and the advantages of semantic versioning (aka SemVer), the need for improving the tooling around SemVer, where semantic versioning really shines and where it’s needed, Types and SemVer, whether or not there’s a better way, and why it’s not as simple as just opting out.
Chris Krycho: I think the answer is yes. And one of the reasons that I’m more bullish on sticking with SemVer and putting tooling around it is because I did survey the rest of the world as it were when it comes to versioning, and there are a lot of approaches that just say “Ah, these problems with SemVer are fundamental. Scrap it.” One of them, SoloVer, is just have one version number - it’s 1, 2, 3, etc, just go up. And that has a certain appeal to it, but the actual fundamental issue there hasn’t changed. All it does is take the burden off of the maintainer of a library, and put it on all of the users. It says “Okay, now you’re responsible anytime any one of your dependencies changes, including transitively, anywhere in your dependency tree. So go read the release notes”, which tend to encode things like breaking changes in the release notes. Because again, communication problem, right? We want to know “What did this do?” Also, as an aside, all of those proposals include things like “Well, you can also stick like pre-release numbers on the end.” And I’m like “Hold on, hold on… It seems kind of like we’re backing our way back toward this whole SemVer thing now, aren’t we? Shouldn’t your pre-release just be another number?”
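To make the "burden shifts to the users" point concrete, here is a minimal Python sketch. The function names are hypothetical, and it assumes plain MAJOR.MINOR.PATCH strings (no pre-release tags, and it ignores SemVer's special rules for 0.x versions). It only illustrates how much a consumer can infer from the version number alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemVer:
    major: int
    minor: int
    patch: int


def parse_semver(text: str) -> SemVer:
    """Parse a plain MAJOR.MINOR.PATCH string (assumption: no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return SemVer(major, minor, patch)


def semver_needs_review(old: str, new: str) -> bool:
    """Under SemVer, only a major bump announces a breaking change."""
    return parse_semver(new).major > parse_semver(old).major


def solover_needs_review(old: int, new: int) -> bool:
    """A single counter cannot say what kind of change happened, so every bump needs a look."""
    return new > old
```

With SemVer, `semver_needs_review("1.4.2", "1.5.0")` is `False` and only the jump to `"2.0.0"` is flagged. With a single counter, every upgrade sends the user to the release notes.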
I think there is a sense in which there is a maybe fundamental local maximum. Maybe it’s local, but the hill’s so big that we’re not going to find a different path. I could be wrong about that. But when I go looking around, the things that seem like they might change the calculus here don’t so much eliminate the value of SemVer as they do build on it. So a good example here is what the Unison programming language does. Pretty small language, but it is aimed at industry. It’s not pure research. And they do something that’s really wacky, in the best way. You don’t store your code as plain text. Instead, they take advantage of the fact that they’re a pure functional programming language, with really well specified semantics, and they say “Okay, we can take your code, normalize it, hash it, and store the compiled output of it with a pointer to it”, which means a whole bunch of interesting things… But for the purposes of versioning, it means that when I make a breaking change, the original version is still there, because that hashed, compiled version of it got committed to a database instead of to plain text. And that database version is what anybody who depends on it sees.
[00:34:06.12] So when I add a new parameter to my function, the consumers are still pointing to the old function, which means they can pull this update and say “Okay, I can progressively switch over to the new function signature, but I can do that at will, and the two can live next to each other”, and because it is a pure functional programming language with no side effects that aren’t managed off in the runtime, etc, etc. You know, leave all that aside. Suffice it to say because of that choice, they can just “ship a breaking change” without ever breaking anyone.
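The idea of storing code by hash, so that old and new definitions live side by side, can be sketched in a few lines of Python. This is a toy illustration only, not how Unison is actually implemented: `CodeStore`, `publish`, and `lookup` are hypothetical names, "normalization" is reduced to collapsing whitespace, and it stores source text rather than compiled output.

```python
import hashlib


class CodeStore:
    """Toy content-addressed store: each definition is keyed by a hash of its text."""

    def __init__(self):
        self.definitions = {}  # hash -> normalized source
        self.names = {}        # name -> hash of the most recently published definition

    def publish(self, name, source):
        # Stand-in for real normalization (assumption): collapse whitespace only.
        normalized = " ".join(source.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        self.definitions[digest] = normalized  # earlier entries are never overwritten
        self.names[name] = digest
        return digest

    def lookup(self, digest):
        return self.definitions[digest]


store = CodeStore()
old_hash = store.publish("greet", "def greet(name): return 'hi ' + name")
new_hash = store.publish(
    "greet", "def greet(name, punctuation): return 'hi ' + name + punctuation"
)
```

A consumer that pinned `old_hash` can still call `store.lookup(old_hash)` and get the one-parameter definition, even though the name `greet` now points at `new_hash`. That is the sense in which a "breaking change" ships without breaking anyone.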
The reason you still want SemVer here though is because SemVer is a communication tool. And so SemVer lets you say, “Okay, there are these new features in the library. Here’s a bug fix. You’re going to want this one.” And even though that means you need to actually go update which compiled version of this function you’re pointing to, you’re getting data from that, and when you go to publish your library, you want to be able to use that information. Even knowing that it’s not going to break your users in the same way, it does let you then say “Oh, I didn’t actually mean to make a breaking change here. I wanted this to be compatible and to just keep working forward.”
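On the publishing side, the "I didn't actually mean to make a breaking change" check might look like the following Python sketch. Everything here is hypothetical: it assumes some tool has already classified each change in a release as a patch, minor, or major change, and it only shows how that classification maps to a version bump and to a warning.

```python
from enum import IntEnum


class Change(IntEnum):
    PATCH = 0  # bug fix
    MINOR = 1  # new, backwards-compatible feature
    MAJOR = 2  # breaking change


def next_version(current, changes):
    """Bump a (major, minor, patch) tuple according to the largest change in the release."""
    major, minor, patch = current
    biggest = max(changes)
    if biggest == Change.MAJOR:
        return (major + 1, 0, 0)
    if biggest == Change.MINOR:
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)


def unintended_break(intended, changes):
    """True when the detected changes are larger than what the author meant to ship."""
    return max(changes) > intended
```

For example, `next_version((1, 4, 2), [Change.PATCH, Change.MINOR])` gives `(1, 5, 0)`, and `unintended_break(Change.MINOR, [Change.PATCH, Change.MAJOR])` is `True`, which is the signal to either fix the change or publish a major version instead.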
So things like that, I think, are pointers in the right direction. There’s also a couple of papers out there from folks at the Nova University of Lisbon, who are asking “What happens if you bake versions as types into Java?” Java because it’s the kind of default language to do this kind of research on. Their proposal is very interesting from a type theoretic and versioning perspective, and would never get adopted in industry in a million years, because it’s just way too much boilerplate… But it does the same thing we’re talking about; it bakes this notion of backwards compatibility in, in a way that I think if you were going to actually ship something like that in an industrial programming language, you would actually want SemVer as basically how you do it. And the type system that they slap on top of Java effectively encodes SemVer with keywords. It’s upgrades, and replaces, and things like that.
So I think there’s work to be done here, but I don’t think it’s going to be in the near term, for one. So we’re going to need the tooling. And even if and when we see something like that type system on top of Java, or what Unison is doing, becoming more widespread, I think those kinds of things lower the risks in really interesting and important ways… But they would still really benefit from the kinds of tooling that we’re talking about. They also though highlight, I think, one of the things that’s easy to miss in these kinds of discussions, which is a lot of times people like me, who are type theory nerds, etc. like to go looking for that kind of a solution to a problem. It has two limitations. One is that’s never going to work for Ruby. I say “never”, but you could imagine a world in which type adoption for Ruby is at 100%, but that world seems very unlikely to me… Not least because a lot of people who love Ruby love it because it’s dynamically typed.
And second, doing all of those things purely at that, like “Let’s bake it into the type system” level has costs, because it turns out that itself then becomes a thing that you need to think about in terms of the versioning of your language. Because one of the things that shows up is that the more foundational your tool is, the more this matters - like, if you’re an app, and you just have consumers, it’s not that big of a deal. Your versioning is basically purely marketing. If you’re a library, you have a bunch of apps that use you and maybe some other libraries. If you’re a framework that everybody else builds on, how well you do this now affects everybody else in the entire ecosystem. If you’re a programming language, you’re kind of doing it at the maximum level, and you still have to communicate those versioning constraints to other people. And the more complicated your type system is, the harder it is to actually understand what the implications are for versioning.
[00:38:09.28] So the tendency that people like me have, to say “Ah, bake it into the types, and it’ll be rigorous and checked forever” can actually undermine your net goals here, because now you’ve made it harder to think about this fundamental communication problem.