Go Time – Episode #328

OpenAPI & API design

with Jamie Tanna


We’re talking OpenAPI this week! Kris & Johnny are joined by Jamie Tanna, one of the maintainers of oapi-codegen, to discuss OpenAPI, API design philosophies, versioning, and open source maintenance and sustainability. In addition to the usual laughs and unpopular opinions, this week’s episode includes a Changelog++ section that you don’t want to miss.

Featuring

Sponsors

Coder.com – Instantly launch fully configured cloud development environments (CDE) and make your first commit in minutes. No need to traverse README files or await onboarding queues. Learn more at Coder.com

Notes & Links


Chapters

1 00:00 It's Go Time! 00:40
2 00:40 Sponsor: Coder.com 03:38
3 04:18 Welcoming Jamie Tanna 👀 01:40
4 05:58 OpenAPI 02:31
5 08:29 oapi-codegen 02:30
6 10:59 Code malleability 04:58
7 15:58 Why use this? 04:01
8 19:58 Read-me driven development 03:08
9 23:06 Managing versions 03:24
10 26:30 Versioning your APIs 12:14
11 38:45 Open rewrite 11:00
12 49:45 Maintaining 06:27
13 56:12 Email bankruptcy 10:30
14 1:06:43 Unpopular Opinions! 00:40
15 1:07:23 Jamie's unpop 02:31
16 1:09:54 Jamie's 2nd unpop 02:30
17 1:12:23 Outro (join ++!) 01:42

Transcript



Play the audio to listen along while you enjoy the transcript. 🎧

Welcome, welcome, welcome to yet another episode of Go Time. This week we are talking about OpenAPI and OAPI CodeGen. And joining me as co-host this week is Johnny Boursiquot. How are you doing today, Johnny?

I’m not on Linux… [laughter]

Yeah, no, that’s true. You are not on Linux. And our guest today, who is joining us from the land of Linux on the desktop, is Jamie Tanna. How are you doing today, Jamie?

Hey. I’m not doing as well as if I was probably joining on a Mac or Windows.

Fair. I would say Mac. Windows has been having a bad couple of weeks, so… Maybe not. [laughs] Yeah, so for those of you who don’t know, Jamie is a senior software engineer at Elastic. He’s also an open sourcer, blogger and does a whole bunch of other stuff, too. So yeah, thank you for joining us, Jamie.

I must say, it was interesting listening to the theme music. I think it’s the first time I’ve ever listened to it at 1x speed… Because I usually listen to my podcasts at 1.7x… So I was listening to it like “Wow, it sounds totally different.”

“Why is this so slow…? [unintelligible 00:05:50.13] so slow on this podcast.” [laughter]

Yes. Okay, so this week we’re talking about, as I said, OpenAPI, and a nice little library written in Go called OAPI CodeGen. So to start us off, Jamie, why don’t you tell our audience, for those who don’t know, what is OpenAPI, and then what is OAPI CodeGen?

Sure. So OpenAPI is an open standard for describing and documenting your API. So one of the things that it’s generally good to do if you’re building APIs or web services is writing documentation. A lot of the time that ends up just being like a big [unintelligible 00:06:34.06] where you have some code snippets, and it’s completely unstructured, and depending on who does it over time, it may get maintained as things change. So OpenAPI is a JSON or YAML-based standard for documenting how your APIs work.

It gives you a way of defining all the inputs, outputs, defining models, doing a load of very complex stuff that we’ll talk about later… But it provides a really nice way of being able to have a consistent language to describe your API, that is more importantly machine-parsable and machine-generatable, so you can then use that in a number of other means.
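
For listeners who haven’t seen one, a minimal OpenAPI document can be surprisingly small. The endpoint and schema below are invented purely for illustration, and the YAML is wrapped in a Go raw string only so the example stays in Go; in a real project it would live in its own api.yaml file.

```go
package api

// exampleSpec is a hypothetical, minimal OpenAPI 3.0 document describing a
// single GET /ping endpoint and the Pong model it returns. Everything here
// is made up for illustration.
const exampleSpec = `
openapi: "3.0.0"
info:
  title: Ping service
  version: "1.0.0"
paths:
  /ping:
    get:
      operationId: getPing
      responses:
        "200":
          description: The service is alive
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Pong"
components:
  schemas:
    Pong:
      type: object
      required: [message]
      properties:
        message:
          type: string
`
```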

So would I use something like this to maybe specify, sort of have a general purpose spec, and then generate SDKs for various languages, or something like that?

Exactly, yeah. So it can be used at a number of different points in the life cycle. So as you mentioned, if you, the API producer, have written an API, you can define your spec, and then either you can create first-class SDKs for other people to use, or for instance as a consumer of an API, if there is an OpenAPI spec, you can then take that and you can generate your own code off the back of that. It can also be used for a lot of other purposes, and one thing I didn’t mention is it’s primarily used for HTTP APIs, RESTful APIs. It’s generally not a good fit for things like GraphQL, because GraphQL has its own complex schemas, and it’s best using [unintelligible 00:08:13.04] for that. And also if you’re using things like gRPC, that’s a completely different set of interfaces, and it makes more sense to use protobuf and assorted technologies instead of trying to use OpenAPI for that.

[00:08:29.06] Cool. So that’s OpenAPI. What is the OAPI CodeGen library?

So OAPI CodeGen is a Go command line tool and library for generating Go code from your OpenAPI. So it started – I think it was five-ish years ago, by my co-maintainer, Marcin, who was working at a company called DeepMap, which has been acquired by NVIDIA… And one of the things he was looking at was, at the time, OpenAPI 3 was fairly recently out.

Previously, there was a thing called Swagger, which has been retroactively renamed OpenAPI 2.0. And so there were a load of tools that were built around Swagger, such as Swagger CodeGen… And the OpenAPI 3 and above ecosystem wasn’t really very mature yet; there weren’t a lot of good tools out there. And looking at the state of OpenAPI Generator, which was one of the other ones, it didn’t really produce very idiomatic Go code. The tool itself was built with Java, and so one of the problems was if you wanted to generate code, you had to install Java as well, which was fine on local machines, but in CI one of the great things about Go is it’s lightweight. You just need to install the Go toolchain and you’re ready to go. If you’re then also installing Java and various other things, it just adds on the burden of what you need to run your builds.

So Marcin was looking at this, looking at the non-idiomatic Go code that there was, and was like “You know what, I can do that.” As many engineers think, “You know what? I can do better than that.”

In a weekend, most preferably…

[laughs] Exactly. And so OAPI CodeGen was born, and it provided a way of generating more idiomatic Go code from your OpenAPI spec. So provided you have a fully-written OpenAPI spec, it would then generate code. So it would give you your client boilerplate, it would give you any server-side boilerplate for a number of different routers and servers, and it would also generate all your models. So you don’t have to do a load of pretty boring work that can be easily automated by a machine. It generates all that code for you, so you can then focus on actually integrating with it, instead of “Okay, well, this JSON field needs to be called this thing. And this thing is actually a more complex struct, so here are all the things I need to write for it.”
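
To make that concrete, here is a rough, hand-written sketch of the kind of code a generator in this space produces for the hypothetical /ping spec above: a model struct with the JSON tags filled in, and a client with one method per operation. The names and signatures are illustrative, not oapi-codegen’s actual output.

```go
package api

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
)

// Pong mirrors the Pong schema from the spec; generators emit these model
// structs so you never hand-write the JSON field tags yourself.
type Pong struct {
	Message string `json:"message"`
}

// Client is a minimal stand-in for a generated client: one method per
// operation defined in the spec.
type Client struct {
	BaseURL string
	HTTP    *http.Client
}

// GetPing corresponds to the getPing operation (GET /ping).
func (c *Client) GetPing(ctx context.Context) (*Pong, error) {
	httpClient := c.HTTP
	if httpClient == nil {
		httpClient = http.DefaultClient
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.BaseURL+"/ping", nil)
	if err != nil {
		return nil, err
	}
	resp, err := httpClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var pong Pong
	if err := json.NewDecoder(resp.Body).Decode(&pong); err != nil {
		return nil, err
	}
	return &pong, nil
}
```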

As with tools that generate code, is it recommended to sort of touch or not touch the code, the Go code being generated through the tool? How malleable is that code?

That’s a good question. So we recommend you don’t touch the generated code at all. So what we do is we generate a single file of Go code, but we provide a number of interfaces for consumers to implement. So for instance, if you’re generating a server based on, say, the new enhanced routing in Go 1.22, you get a load of generated boilerplate saying this method on this interface corresponds to this HTTP method and this route. And then you simply implement a subset of the code, which gets wrapped by the boilerplate.

[00:11:53.01] So it will do things like pull out required HTTP headers, it will parse your query string, and it will pass a load of parameters into your methods. And then you need to do a subset of the things. So you then don’t need to modify any of the generated code.
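
As a rough picture of the workflow being described - generated file untouched, your code implements an interface - here is a hand-sketched stand-in for what the generated server side might look like when targeting Go 1.22’s enhanced routing. Again, these names are illustrative rather than oapi-codegen’s exact output.

```go
package api

import (
	"encoding/json"
	"net/http"
)

// ServerInterface is the kind of interface a generator emits: one method per
// operation in the spec. Operations that take path or query parameters get
// those passed in as extra, already-parsed arguments in the generated variants.
type ServerInterface interface {
	GetPing(w http.ResponseWriter, r *http.Request)
}

// RegisterHandlers is the wiring that lives in the generated file, using
// Go 1.22's "METHOD /path" ServeMux patterns. You never edit this by hand.
func RegisterHandlers(mux *http.ServeMux, si ServerInterface) {
	mux.HandleFunc("GET /ping", si.GetPing)
}

// pingServer is the only part you write yourself: an implementation of the
// generated interface.
type pingServer struct{}

func (pingServer) GetPing(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(map[string]string{"message": "pong"})
}
```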

One thing we do find is not everyone’s happy with the code that we’ve written, and sometimes you want to tweak certain things. So there’s a number of configuration options so you can tune the way that things like initialisms work, if you have internal initialisms for things like internal product names and stuff like that. And you can also actually override all of the templates in the codebase with your own custom templates. So if you wanted to completely change the way that, as I mentioned, that server interface gets generated, you can control that through your own handwritten templates.
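
A typical setup is driven from a go:generate directive plus a small config file. The module path, flag and config keys below are written from memory of the v2 README and may not match the current release exactly, so treat this purely as a sketch and check the project’s documentation.

```go
package api

// Run `go generate ./...` to (re)generate the code from api.yaml. The exact
// flag and config-key names here are assumptions based on memory of the docs.
//
//go:generate go run github.com/oapi-codegen/oapi-codegen/v2/cmd/oapi-codegen -config cfg.yaml api.yaml

// cfg.yaml might look roughly like this (key names unverified):
//
//   package: api
//   output: api.gen.go
//   generate:
//     models: true
//     std-http-server: true
```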

What do you find to be the most common set of complaints about the project?

So one of the more difficult parts of OpenAPI to replicate in Go [unintelligible 00:12:54.07] An incredibly powerful part of OpenAPI is some of the more complex schemas that you can write. So OpenAPI – especially in the most recent version, 3.1 – is based on JSON Schema. JSON Schema is a standard for defining schemas of other objects, like JSON and YAML documents, and one of the things you can do is say things like “Here is a list of objects. Some of these objects may match this schema. Some of them may match this other thing.” They all may have, say, an ID field and a discriminator field. Then some of them may contain three other objects. Some of them may contain different things. So you can express a load of really powerful things. But one of the problems is then you have to try and actually map that onto Go’s type system, the sort of things that you can do.

So one of my favorite APIs that I’ve worked with is endoflife.date. It’s a great place where you can go to find out, like, is the version of Go that I’m using end of life? And when does support end? Stuff like that. But one of the really horrible things they have for legacy reasons is the fact that a piece of software can either have a date that it’s unsupported by, or end-of-lifed from, or it could just be flagged as end of life or unsupported. And so in the response that you get back from the API, you either get a boolean or a string that is an ISO 8601 date.

And so literally in the last couple of weeks I’ve been doing some more integrations with it using OAPI CodeGen, and I really hate it, because you then have to work out, “Is this a bool or is this a string? And if it is a string, how do I parse that? And if it is a bool - okay, how do I respond to that?”
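
For a sense of the shape of the problem, a minimal hand-written way to cope with a field that is sometimes a boolean and sometimes an ISO 8601 date string might look like the type below. The names are invented, and this is not what oapi-codegen generates; it just shows why every consumer ends up branching on which form they got.

```go
package eol

import (
	"encoding/json"
	"fmt"
	"time"
)

// BoolOrDate models a JSON value that is either a plain boolean or an
// ISO 8601 (YYYY-MM-DD) date string, as in the responses described above.
type BoolOrDate struct {
	Bool *bool      // set when the value was a boolean
	Date *time.Time // set when the value was a date string
}

func (b *BoolOrDate) UnmarshalJSON(data []byte) error {
	// Try the boolean form first.
	var asBool bool
	if err := json.Unmarshal(data, &asBool); err == nil {
		b.Bool = &asBool
		return nil
	}
	// Fall back to a date string.
	var asString string
	if err := json.Unmarshal(data, &asString); err != nil {
		return fmt.Errorf("value is neither bool nor string: %w", err)
	}
	t, err := time.Parse("2006-01-02", asString)
	if err != nil {
		return fmt.Errorf("value is not an ISO 8601 date: %w", err)
	}
	b.Date = &t
	return nil
}
```

Callers then have to check which pointer is non-nil, which is exactly the awkwardness being described.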

And so I’ve worked with a number of APIs, and unfortunately our users have worked with a number of APIs, where there are these sorts of fields - either because the developers of the API maybe haven’t used as strongly typed a language, so they can change things around, or because they want to do more complex things, where sometimes this field is here on, like, admin users, but for regular users those things aren’t there. So by adding those really complex ways of describing things, which are really powerful, it also means that for us as a project, we have to try and maintain that, and also fit it within Go’s type system.

So we are constantly finding new edge cases and new things that we never considered people are doing. Like, they will have, say, an object which will be one of three types. So it will have like an anyOf, and then within there, in some cases there’s an enum whose values differ between each one. And it’s like “How do you even generate those sorts of things?” Yeah, so that’s probably the biggest one, is trying to deal with complex types and schemas.

Alright, so there’s lots of – I mean, as you mentioned, there’s the old Java way of generating your code for you…

What does OAPI CodeGen give you over the other things you could do, or like handwriting an implementation of an OpenAPI specification?

[00:16:15.21] So as I mentioned, one of the things that we do is if you have a spec, we will then generate code for you. So one of the things it kind of forces you to do is write that spec up front. And there are a number of different tools in the ecosystem that do that. One of the things we focus on is you have to design your API up front, by actually writing that YAML spec. And as part of that, that means that we actually have a slightly easier job, so we don’t need to try and work out from your actual codebase what are all the types that should exist. We can just look at this single YAML file - or multiple files - and then say “These are all the types that could exist, and these are all the things that you need.”

So for me, one of the most painful things I find when I’m ever working with anyone’s APIs is “What are all the types that I need to define, to do like the minimum amount of my job?” And sometimes you can use things like Matt Holt’s JSON-to-Go generator, where you chuck in a blob of JSON and it will transpose that to a Go struct, and it will kind of do a pretty good job. But a lot of the times it just takes time, and it’s not particularly useful time for engineers to be doing. For me, the biggest selling point I have for my own usage is just generating models and types that you want, on both your request and response. But as a library, we also have a number of other things.

As I mentioned, you can generate full clients… So they’re not as nice an SDK as some of the commercial tools in the ecosystem, that generate some really nice, usable things… But I would say the client works, and it is fairly nice to work with. I wouldn’t say it’s the best, because it’s definitely not… But it is fairly nice to use.

And then the other one that I find has really helped when I’ve been building Go APIs is the ability to generate server boilerplate. As I mentioned, being able to wire in your router to “Okay, this function is actually to be called for this endpoint, on this method”, and wiring, and all of that. It’s not something you have to do that much, but it’s one of those things that is like “Why not just automate it if you can?” And so if you’re doing the work to define up front, “These are the inputs that I require in this given HTTP call, and this is the URL”, why not generate the stuff to do that?

And then from there, having a slightly lighter-weight function that you need to implement is quite nice. So say you have a method which requires a header for like a correlation ID, or tracing ID, and also a path parameter. Instead of you having to write the code that goes into the request and grabs out the header, and also grabs out the path parameter, you can now get passed those on the function call. So it just removes a little bit of code, which is not that much in the grand scheme of things, but it’s a little bit nicer. And then your actual function implementation can be a little bit more lightweight. You don’t have to think as much about the HTTP; you can focus a little bit more on the actual logic, which then makes unit testing those methods a lot nicer… Because again, you can pass in just the things that you need, rather than trying to construct a fully valid HTTP request.
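
Sketched with made-up names, the difference is roughly this: instead of a handler that digs the tracing header and path parameter out of *http.Request itself, the generated wrapper does the extraction and your function receives plain values, which also makes the unit test trivial.

```go
package api

import "fmt"

// WidgetResponse is an invented response type for the example.
type WidgetResponse struct {
	ID      string
	TraceID string
}

// GetWidget is the kind of signature a generated wrapper can call into: the
// wrapper has already pulled the tracing header and the path parameter out of
// the request, so this function only holds the logic.
func GetWidget(traceID, widgetID string) (WidgetResponse, error) {
	if widgetID == "" {
		return WidgetResponse{}, fmt.Errorf("widget id is required")
	}
	return WidgetResponse{ID: widgetID, TraceID: traceID}, nil
}
```

A unit test can now call GetWidget("trace-123", "w-1") directly, with no fake *http.Request in sight.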

[00:19:58.21] Hm… I’m trying to slot this into my workflow. I’m trying to think of where does this make sense. So something you mentioned earlier made me think of sort of readme-driven development. It’s almost like you’re basically specifying what your API is going to be before you generate any code for it, or anything like that. I suppose if you are tasked with consuming, say, some third-party API, they have some documentation… You could write an OpenAPI spec from which you then generate Go code that you use to interact with that third party. And not necessarily relying on the third party to provide an OpenAPI spec. Chances are they won’t. Chances are they will provide a client instead, or they’ll have some sort of SDK. But you yourself as a developer consuming this API, you could have your own OpenAPI spec to generate code - basically, the boilerplate client that you can then use… In the case where the third party doesn’t provide one, you could generate your own to interact with that third party. So from a typical workflow standpoint, how are people using this tool?

So as I mentioned, I’m at Elastic, and we have a number of teams on our new serverless platform using OAPI CodeGen to reduce the boilerplate for writing services. So for teams who want to do the upfront work, to think “This is how our API should look”, they will then do that and generate boilerplate and everything, and then just wire in the bits.

We also have cases where you need to do cross-service communication. If the team behind the other API has just written their specification, you can then take that and generate an automated client, and that then removes a lot of the work that you need to do. So now really the main thing is: is the specification that the other service has written in sync with the version you’ve got? And that’s really the big thing. But because the other team have done that work to make the spec nice and usable, that works really nicely.

As you say, not everyone does that. And so there are tools on the market that you can use to sniff your network and watch what sort of requests you’re making… And then through that, it will actually generate you a spec. I find that if I’m in a case where I’m working with an API that doesn’t have an OpenAPI spec, I won’t actually go and write one for them. I will just write as lightweight a client as I can, and I won’t go through that additional step of OpenAPI, because I feel like in that case that’s kind of a waste of my time. Even though I could get a load of boilerplate built out of it, I find that it works nicely when you own the API, and you have more control over the contract… But if the external service can change however they want, it makes it a little bit less easy.

How do you manage – if you are on a team that basically helps fellow teams by producing a spec that those teams can generate clients from, how do you as the owning team manage versions and changes?

So that’s a really difficult one. It’s not a solved problem. So it depends on, as a team, how do you want to do versioning? I generally prefer doing server-driven media type content negotiation, so using the [unintelligible 00:23:40.17] Accept header, where you can actually version the representation that you are asking for. I absolutely love that as a model. It’s really hard to get right. And a few years ago, I spent a lot of time digging into it… I’ve recently gone back to one of my projects that heavily uses it, and I’m like “How does this work…?” So now I kind of recommend using like a version in the URL, so like /v1/ping, and things like that. So that at least means that if you are actively making a breaking change, you are thinking “I’m going to move it into v2.” But that’s not always the case. There are often accidental issues that come in.
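
A quick sketch of the URL-versioning style described here, using nothing but net/http and Go 1.22 route patterns; the endpoints are invented.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()

	// Breaking changes get a new prefix; /v1/ keeps serving the old contract.
	mux.HandleFunc("GET /v1/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "pong")
	})
	mux.HandleFunc("GET /v2/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, `{"message":"pong"}`)
	})

	// The media type approach mentioned above would instead branch on an
	// Accept header such as application/vnd.example.v2+json inside a single
	// handler - more flexible, but harder to keep in your head.
	log.Fatal(http.ListenAndServe(":8080", mux))
}
```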

[00:24:21.17] So one of the things I like doing in my projects is, where we have an API with OpenAPI, it then generates a client that gets used for end-to-end tests, or integration tests, whatever you want to call them. And those generated files are always committed to the codebase. And so in the diff, when it goes to code review - huh, why has that field changed type? Or why am I suddenly having to regenerate a load of things? Why do I have compilation errors? That’s usually like the first step of “Have we just broken something for someone?”, which I find is a good first step.
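
One way (not necessarily the one used here) to back up the habit of committing generated files is a check that regenerates and fails on any diff. It is more commonly a CI shell step, but sketched as a Go test it might look like this:

```go
package api_test

import (
	"os/exec"
	"testing"
)

// TestGeneratedCodeIsCurrent re-runs code generation and fails if the result
// differs from what is committed, so a spec change that alters generated
// types shows up as a red build and a reviewable diff.
func TestGeneratedCodeIsCurrent(t *testing.T) {
	if out, err := exec.Command("go", "generate", "./...").CombinedOutput(); err != nil {
		t.Fatalf("go generate failed: %v\n%s", err, out)
	}
	if out, err := exec.Command("git", "diff", "--exit-code").CombinedOutput(); err != nil {
		t.Fatalf("generated code is out of date:\n%s", out)
	}
}
```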

Dave Shanley has written a number of great OpenAPI libraries and tools, and one of those gives you insights into “Has anything changed between these two versions of spec? And, indeed, has anything broken as part of that?” So wiring in tools to make that easier is always good.

As I mentioned, humans will hopefully think about “Oh, I am removing a header”, or something like that that would be a bit breaking… But we don’t always catch everything, because we are fallible. So adding some tooling in there to help it be a little bit harder to break is always good.

And then in the case that you do have to break things, like you have to let people know, and you have to do like phased deprecations, and make sure that you’re not giving people a week to migrate between versions… Because that’s then going to like break production, which isn’t the nicest thing to do…

I mean, unless you’re a company that shall not be named, you can just push it out and see what happens… [laughter] Maybe bring the entire modern society to a halt.

The world… Distributed QA.

Yeah, yeah. Yeah, have the customer QA it. A tangential thought here, but that is perhaps still relevant… How do you all version your APIs? …in the path, right? Say you go with v1, and then you’re making a breaking change… You’re changing one thing. To me, going from a v1 to a v2, that sounds like a very big step up. Ideally, I’d like a v1.1. Or if it’s a non-breaking change, maybe a v1.0.1. Do you all put all them dots in the path? Do you follow a different system? I’ve always wondered. How do you version in the path incrementally?

So at least the way I’ve done it in the past, you will only have the major version in the path. And a consumer shouldn’t really care if it’s v3.1 versus 3.10, because hopefully you should be incrementally adding things. So like if the consumer is expecting like a load of new fields from version 3.10, and you’re on version 2.5, or 3.5, and those fields aren’t there, then there’s a problem. But generally, it should only be additive changes, that can just be sent to the user and they don’t need to worry too much.

[00:27:50.01] My experience… [laughs] My experience has been it’ll be like the same v1 for like years. And then the accumulation of these incremental changes makes the original v1 unrecognizable. You can’t in good conscience say “Oh yeah, this is the v1 that we’ve been making incremental changes to for the last five years.” It’s still the v1 that you implemented – like, there inevitably will be breaking changes, right? So to me, the notion – versioning is hard, right? I’m not saying otherwise. But to me, the notion that you can keep something at v1 and expect the original implementation, and your consumers, to not change… But the moment you now need to communicate “Hey, there’s a breaking change”, you’re forced to either communicate the breaking change and still keep the v1, or decide to go to a v2, in which case you’re like now making a conscious, like “Oh, this is a big jump. Going from v1 to v2, this is no small task”, and then come to find out just one field changed, right? So it’s a hard balance to maintain. I don’t have a great answer for it. Over the years I’ve seen it go both ways, and no one way has felt great.

I think – similar to like how I think about Go modules and stuff, I think people should push out major versions when they feel they need to make a breaking change. And then there’s, say, the Go GitHub SDK, which was at version 63 the last time I saw it - they are very happy consistently pushing out breaking changes. But also, they don’t support any of the old ones, to my knowledge.

So one of the things they say is “You know what? We are going to be very happy pushing out these changes, but also we don’t care about the old versions.” When you’ve got a production API that is used internally, you have a bit more control. But as soon as it’s public, it’s so hard to change that.

Brandur Leach of brandur.org wrote a really interesting blog post on Stripe’s engineering blog, almost seven years to the day ago, about how they support all of their versions since like 2011. That’s a long time to support every bad decision you’ve made.

[laughs] It’s expensive, too. That’s gotta be expensive.

They’ve built some really interesting engineering around it, and you can do it, but you need to make that decision of “How long do we actually want to support this?” If, say, you want to stay on version one as long as possible - okay, but you’re not going to be able to change things that you wish you could have. You’re not going to be able to make improvements, or change subtle things that could get away with being in like a patch release, but probably shouldn’t be… There’s an XKCD about spacebar heating being key to someone’s workflow… [laughter] Yeah, I’ll dig it up. It’s a good one. And it’s the fact that - like, every bug you fix is probably part of someone’s workflow. So technically, any change you make could be breaking to anyone. And then if you go down that rabbit hole, you’re like “Well, everything’s a major [unintelligible 00:31:15.05] and then who cares…?

Yeah… Me - I’m just like, I don’t like version numbers. You know me. Get that out of the URL. That doesn’t belong there.

You like version letters. We are rocking with vc, vd… [laughter]

I mean, I feel like a lot of the struggles with version numbers come from like maybe we haven’t spent enough time thinking about what we’re doing, and we’re kind of putting things out too fast… And maybe we need like an instability period followed by more stableness to make sure we’ve thought through our API well enough… So I feel like a lot of like the – you know, we mentioned Stripe’s API going back to 2011. A lot of the stuff that we use every day - HTML, CSS, JavaScript - that stuff goes back way further than that… I think it’s compatible all the way to like HTML two or three… It’s a super-old ‘90s technology, and a lot of that stuff still works. They’ve deprecated some things, but there’s not – I mean, there’s no longer any HTML version numbers; that doesn’t exist anymore.

[00:32:24.04] I know people have like HTML 5 stuck in their head, but we just kind of gave up on version numbers, because it just wasn’t working for – it didn’t make any sense for the type of ecosystem that they exist in. And I think that’s how a lot of APIs are as well, where it’s just like, I don’t know if like this is really communicating what we think it’s communicating at the end of the day… And I think perhaps we should shift as an industry. And I think things like Open API could help with this as well, or really any type of schema definition thing… But shift to thinking about having forward/backward-compatible APIs in general, instead of thinking that “Oh, I can put this in a little box, and then I can like go build a new thing.” If you’re going to build a new thing, just call it something else.

Usually, if I run into something where it’s like “Oh, this is a breaking change”, it’s probably a breaking name, too; you probably need to go find a different name, or call it a different thing, or implement it in terms of what is already there, or something like that. But I think version numbers – I’ve become over time less and less enamored with version numbers, even in things like Go.

I was talking to one of the people from the Go Tools team about this at GopherCon… Just like, are modules the right space to be putting version numbers? If you have a bunch of packages and one of them has a breaking change, but none of the other ones do, is that a good version bump? It’s like the same thing you were talking about, Johnny, which is just like if I have one thing that changes that’s breaking in my API, do I need to version the whole thing? Like, what’s the granularity of my versioning? And I also think that that’s just like not a good use of our industry energy, to try and figure out how to make this thing that we made up that we think we need to have work, and maybe we should find something that works better… Like, I get why version numbers are intuitive, I just don’t like them. They just never feel like they work.

But what’s the alternative, though? I mean, versioning - all it is is a snapshot in time. You’re basically saying “On this day, at this time, this is the behavior that was exposed, this is what you can expect, this is what you can do.” And at some other point in the future there will be a new snapshot, and at some point in the past there was a different snapshot.

Now, you as the person responsible for consuming this thing, you kind of need to note that. Because whatever you’re building on top of – be it the HTML specification, or the TLS version, or whatever it is, whatever piece of internet infrastructure versioning spec you want to abide by, or be it your other team’s API endpoint that you need to integrate with to get a lot of business work done - you need to know “At one point in time this is what was supported.” And I don’t have a problem with that. The problem I may have is if I’m now forced to go into this sort of exercise of trying to figure out what’s broken, what’s not broken, if I’m tasked with keeping up… Especially for something where maybe some functionality has been deprecated or no longer works, and I have to maintain my client in consuming this API, because the business wants to take advantage of this new thing that the producer of the API has made available in this later version… So now I’m like balancing all of these things. I need to consume the new capabilities, to accomplish the new business goal, but now that also means I have to go back and address all of the breaking changes that you’ve made as the producer of the API, all of the changes you’ve made since v1. So now I have to take advantage of v2, so now I have to walk back every single version, see what you changed… And then hopefully – in my experience, it’s never easy; especially the more versions there are between where you’re starting and where you need to end up. The more versions have come out for the given thing, the more of a headache it is.

[00:36:31.00] So I don’t know if there is a better alternative to versioning, and it’s hard to not keep up with version updates, because you’re only delaying the pain when you finally need to update. I don’t know if tools like OpenAPI help at all in this domain. I just don’t know if there’s a better way.

I feel like there’s two answers to that. I think one is – like, versions, I think they are a snapshot, but they’re like a snapshot with a really weird granularity, where I think the granularity is like never quite what we want it to be. Either it’s like way too big, so it encompasses a bunch of stuff that like “Maybe some of this stuff changed, or maybe it didn’t”, or it’s like way too small, and you just have a proliferation of version numbers and you’re trying to figure out what to do with all of them.

And regardless of what you do, I feel like having automatic upgrade paths is one of the things that everything should have… So in Go, being able to rewrite version one of the module in terms of version two of the module means that you can now just upgrade seamlessly through things… And I think you could do something similar with APIs - I mean, if you as the API designer sit down and actually think through what you’re doing, and what these changes are, and what the breaking changes are. I think that’s my major gripe with people putting versions into their APIs: it’s a way to be lazy that just makes it more difficult for your consumers to actually consume your API at the end of the day… Because when you just want to be like “Oh, it’s version two”, it allows you to not have to think about how you get from version one to version two. It’s like, “I don’t know, have fun, bro. Figure it out.” Whereas if you don’t have that ability to just kind of be like “Now we’re just going to do it”, or if you have to give people something, like maybe a shim layer that, once again, implements version one in terms of version two, then it also becomes much less of a burden for you to maintain that old version, and you can have it around for longer, and then eventually just deprecate it and be like “Okay, it’s been six years. If you’re still on this, sorry. We’re turning this off”, or “You’re not getting this functionality anymore”, or “You have to go implement whatever yourself.”
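
A toy version of that shim idea: keep one real implementation for v2 and let the v1 handler translate in and out of it, so the old contract stays served without a second codebase. Types and field names are invented for the example.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// UserV2 is the representation that is actually maintained.
type UserV2 struct {
	GivenName  string `json:"given_name"`
	FamilyName string `json:"family_name"`
}

// UserV1 exposed a single "name" field; the shim rebuilds it from v2.
type UserV1 struct {
	Name string `json:"name"`
}

func getUserV2() UserV2 {
	return UserV2{GivenName: "Ada", FamilyName: "Lovelace"}
}

func handleUserV2(w http.ResponseWriter, r *http.Request) {
	json.NewEncoder(w).Encode(getUserV2())
}

// handleUserV1 implements the old contract purely in terms of the new one,
// so deprecating it later is a one-line removal.
func handleUserV1(w http.ResponseWriter, r *http.Request) {
	u := getUserV2()
	json.NewEncoder(w).Encode(UserV1{Name: u.GivenName + " " + u.FamilyName})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("GET /v1/user", handleUserV1)
	mux.HandleFunc("GET /v2/user", handleUserV2)
	log.Fatal(http.ListenAndServe(":8080", mux))
}
```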

Have either of you seen a project called OpenRewrite?

I have not.

It’s probably sacrilege… It’s a Java project, and one of the things they’re doing - and they’re currently branching out to a couple of other ecosystems; I think the JS and C# ecosystems - is it provides a way to create recipes for “How do I manipulate my code?” So speaking of upgrades in particular, in the Java ecosystem there’s Spring Boot, which is a very large web framework, and performing an upgrade between the major versions recently involved a lot of work. So it’s things like you want to upgrade from Spring Boot 2 to Spring Boot 3. To do that, I need to upgrade the version of Java I’m using, from say Java 8 to Java 17. Okay, to do that, I have to do this list of things. And actually, as part of that, there’s also this list of things. And it provided a great way of doing that: you just point at a pre-written recipe that ran through - I think it was like six or seven hundred different steps - and it transformed your project from Spring Boot 2 to 3 seamlessly, because someone else had done a lot of work to do that. And I think, for instance, as library or tool authors, if we had a way of making it easy for people to migrate between versions, it would be great, because - yeah, a lot of the work is already done, because someone has thought upfront, “Okay, I’m making this change, and here is also a set of automation which will just transform your project from 2 to 3”, as an example. And so something like that could be quite useful, but one of the other problems is people don’t care, or don’t make the time.

[00:40:35.27] I think it was earlier this year, late last year, we made a change in OAPI CodeGen to introduce our first breaking change that bumped our module paths to v2. And as part of that, it was getting rid of a load of dependencies that we didn’t need. But to do that, we needed to break a few things. And off the back of that, when I’ve seen different companies using OAPI CodeGen in the wild, I’ve raised PRs to bump them from one to two. Tell me how many of those have been merged? [laughter] Yeah, there’s things like that; like, I’m actively going out and doing the work to upgrade it for people, and they’re not picking up on it. So…

Yeah, and I think that’s probably one of the other reasons why I don’t like versions, is because it allows us as an industry, as a collective to keep leaning on this old way and old method of thinking about things; I think it’s just not helpful for us. I think it’s not helpful for the producers of APIs or the consumers of APIs, because it allows people to just be so much more lazy than I think they might be if they had to actually - you know, not necessarily spend more time thinking about it, but if they had to approach things a different way. I think the other part of the problem is that there’s just not enough tooling to encourage people to think about things in the right way.

I think, if you look at the way GraphQL works, or the way gRPC works, a lot of it’s about defining the data and handing people the data. I’ve seen a lot of Open API specs too where it’s like “Here’s information about what the HTTP request is, and what the data looks like”, but it doesn’t tell you anything about like how the API works, or like how they think you should use it. It’s just like “Here’s the stuff, assuming you already understand what this API is”, which is very different than – I mean, I keep using HTML as an example, but I think it’s a really good example. It’s very different than the HTML specification, which is also an API, which I think people might not realize; it’s like, HTML is both the markup language and the DOM API that comes with it, that powers all the JavaScript stuff that we have… But there’s a lot of like non-normative, as I call it, sections, that just describe like the how and why of things, so that you can understand the API and the system a little bit more. And I just don’t really see that out in the wild a lot, and I think that’s because people don’t think that they need to put that information out there… But when you want to implement or use an API, it’s much easier to do it when you have the information, so you kind of know where to start, or know what this thing was meant for… And I think it also helps get rid of some of those – kind of, as you mentioned, off label spacebar heating my computer things that people will do with APIs or whatever, and it’s just kind of like “Oh, here’s the thing that was laying around.” It’s like actually, I don’t really want you to use that thing like that. There’s this much better way to do it over here, but how would you ever find that if no one ever wrote about this other API, or this other call you could do over here?

[00:43:30.09] You need – again, I keep getting stuck on the temporal nature of specifications, APIs, what have you. In the case of HTML there’s some nuance there, because browsers or clients can choose to implement certain parts of the spec, or not. For the longest time - I think maybe still to this day - you’ll have Firefox that chooses to implement certain parts of HTML in one way, and Chrome and Chromium use a different way, and Safari does it a little differently… Who knows. So you have these variants. But for an API – so I’ll pluck an example out of sort of real life. I do cloud consulting, so a lot of times I have to integrate with third party APIs. If I didn’t have a version to pin to, I could be liable for writing code that works today. All of a sudden, the provider of this third party API, without letting me know through some sort of versioning, they could change the API, and my code that I give to my client works today, and then all of a sudden next week it stops working. Now, I as a consultant am liable, because I gave [unintelligible 00:44:39.27] that doesn’t work anymore. It worked for a week and it didn’t work the next week, right? So I have to rely – legally speaking, I have to rely and say “Hey, you want me to use this third party thing, and this is the version that they have. My code works with this version.” And if they break things, you have the choice of bringing me back to fix it, or you know that you will have to get somebody else to fix it and update it. So I have to use these things as part of my tool set, and I think a lot of people, whether you are a consultant or not, you kind of have to follow the same approach as well, right?

I mean, there’s nothing saying that something you depended on in their API - that you saw as, once again, a usable feature - isn’t something they see as a bug, and they fix it in a new version, and now your code is also broken. That’s the thing with versions: they don’t actually give you a concrete promise, they give you an illusion of a promise. The only way that you can make sure that the code you wrote today is going to work tomorrow is if the people producing the thing behind the API - or the browser, in the case of HTML, or whatever - are actually adhering to the promises made in whatever specification they provided you.

The version number just is like “Oh, well, this feels like a stronger promise”, but it’s not, at the end of the day, and there’s some cases where you do need to go back and make breaking changes for like security reasons, or whatever. If you were using something that’s a giant security vulnerability for them, they will absolutely patch that up and be like “Sorry, you’re all broken. You’ve got to go fix it. And we’re not bumping a version number for it, because this is a security vulnerability, so we can’t, say, leak our customer data just so you, Johnny, don’t get dinged by the company that hired you to build the thing.” So I think version numbers aren’t that stable of a thing, at the end of the day. Certainly not as much as we think they are. I think we have to extract promises out of companies. I think it’s much better to be like “Oh, here’s the specification that this company agreed to, and this is what we build it against.” And if your API no longer meets the specification, then that’s not your fault as a consultant, Johnny. You built it to the specification that you were given. It’s on that company. And I don’t think you need a version number to get that specification. I think a version number is wholly insufficient to get that specification. I think you want that written specification of “These are the APIs. This is how they work. This is the behavior. This is everything you’re expecting”, so that there is no ambiguity about how something should work, or why it should work, or who’s to blame if something breaks down the line.

But you have to lock that down temporally, though. Whether you use a version number or whether you use a date… AWS is notorious for using dates as versions for things, for example. Whatever it is, whatever label you want to put on it, you have to say “Hey, as of this label, this is how these things work”, right?

Yeah, I mean, I think that that’s perfectly fine, saying “The specification as of date X.” You can get snapshots of the HTML specification, for instance, that are tied to a specific date. So you can go implement something in your browser and you’re like “Well, this was what it was at this specific date.” But also, it really depends on the type of contract that you’re doing. HTML works in the way it does because, you know, you have these browser vendors, and there’s kind of like nobody here that can be like “No, you must do it this way.” So it’s all a shared agreement. Which is not dissimilar from how APIs you consume from third parties work. It’s just different, because there’s only one implementation instead of multiple implementations. So now you have to extract that same sort of promise in a slightly different way, but I don’t necessarily think we need the version numbers to give us that. Or even dates, really. It doesn’t matter.

[00:48:16.12] At the end of the day, what you want is to sign something, or have an agreement, that says “This is what I expect from you. And if you want to change it, then we need to have a conversation”, or someone needs to have a conversation. Or in the case of a consultant or a contractor to a company, if the contract changes on their end, you can bring me back, but you can’t come after me and say I didn’t do the right thing. I did the right thing as of this. So if you want to judge my work, you’ve got to judge it based on this specific thing. Which, once again - whatever pointer you use to reference that thing, whether it’s a version number or a date or a hash or what have you, as long as the thing itself is stable and stuck in time, that’s what you care about. That’s the agreement you want to be focused on.

And I think at the end of the day, version numbers, like… I don’t know, even with all these version-numbered APIs, can you actually – with Open API you can. But with other things, can you go download the actual contract that’s being agreed to? If they go back and they change all of their documentation and just completely [unintelligible 00:49:14.28] an entire API endpoint, how are you going to be like “No, it said this before.” I mean, I guess you could go use the Wayback machine if they’re big enough, but that just seems like a very precarious situation to put yourself in, instead of having the “No, here’s a contract that I had of some sort.”

And I think things like Open API can be those contracts. I think that’s why we have these at the end of the day, is because they function in that way. I think we just kind of muck them up a bit with version numbers.

Well, until you give me a better alternative, I have to rely on version numbers. [laughs] And speaking of agreements, who has agreed to maintain this thing? Is this thing like financially supported, or how is this project kept alive?

Beautiful segue. So up until very recently, it was two very busy, very tired maintainers. Now it’s two very busy, very tired maintainers with a tiny bit of financial backing. So yeah, earlier this year, Marcin, my co-maintainer, put out a post, because we’re both, as I say, very busy. Open source is one of those things that is an evenings and weekends thing for both of us, and trying to slot it in between – I think I technically have like two dozen open source projects, of which a few are pretty busy. I have a load of life stuff that I want to do, I have a load of personal projects I want to do, I have a very active blog… So a couple of years ago, I got involved with OAPI CodeGen when I started a job and I was starting to work in Go; I started on the project. And then, similar to the XZ Utils backdoor, but not quite, I gained the trust of the maintainer, I started getting more rights to the project, and I was a good maintainer, and we got to the point where we were kind of getting through issues, but as work got busy, we just couldn’t do as much. It was just one of those things. So months would go by without a release. The issues would pile up, the PRs would pile up… So it’s something we’d been talking about for about a year before we finally launched it: we would like to get paid for the work. And I’m using “we” as the royal we. So I would like to get paid for my work on OAPI CodeGen, because it takes a lot of time. And I think people are getting a lot more empathy for open source maintainers, but however much time they think it takes to maintain projects - double or triple that, and then add a little bit more… Because sometimes even just trying to look through the most recent issues on the project can take several hours. And I potentially don’t have several hours to go through them.

[00:52:08.00] And there’s code contributions, which are people who are trying to fix issues that have either been long-standing, or recent. Sometimes there’s changes to make. We’ve recently made some big changes to our documentation - to actually have some - and that’s quite considerable. So as part of that, as people raise contributions, I want to make sure the documentation is in a good place for those new things. So that requires some coaching of contributors.

So one of the things we’re starting to do is GitHub Sponsors. So there’s currently a single tier, where companies or individuals can spend $150 a month, which pays for one hour of my time. And that is starting to work towards making the project more sustainable. So far we have two paying sponsors, which we’re very appreciative of. And then I mentioned I work at Elastic. Elastic have given me four hours a month to work on the project, which is really awesome, really appreciated.

It’s not nothing, but it sounds very small. Four hours a month… Again, it’s not nothing, but…

Yeah. And it’s one of those things… Like, looking at the issues that are raised by individuals at companies, and looking at email addresses of people who are contributing… Like, we’ve got some pretty large companies using us, and none of them are contributing financially. They’re contributing code - that’s good. But also, it takes my time to review the code. Every time I merge something, I’m now on the hook for that, in perpetuity, until I push a breaking change or two and get rid of stuff. So it’s one of those things - it takes a lot of time to maintain, and so I’m trying to balance that. And every time I look at the issue tracker, there’s at least another four, five, six issues and PRs to look through. We still have issues from pretty much the first year the project was around that still haven’t been solved, or that I need to triage and work out - have we fixed those in like the four-and-a-bit years since they were first raised? Because some of them we have, some of them we haven’t… And it’s like trying to conceptually work out where all the gaps in the project are, while I also want to do some interesting stuff.

Recently there was the enhanced routing in the standard library; I’d been very eager to work on that. I was very excited. And I think I’d started it, and then we got a PR from someone. I was like “Thank you very much.” It’s good, because we now have the capability, but I was looking forward to that.

Stole your joy… [laughs]

Yeah. And it’s trying to balance “What do I want to do?” and “What do I have to do?” So trying to get people to pay for like free labor is good, because I’m someone with ADHD, so I have fun trying to balance just generally, and trying to work on prioritization and stuff like that.

Fun, huh? You call that fun, huh? [laughs]

Yeah. It is a lot of fun. And especially – in fact, how many issues have we got right now? So we’re up to 439 issues and 134 pull requests. If I spent half an hour on each of those - and it’s actually probably going to take more than half an hour for some of them; I don’t even want to do the maths - that’s a long time. And some of them will be duplicates, some of them will be fine… But yeah, there’s a lot of time, and there’s a lot of big companies using it, who it would be nice to have try and help make this more maintainable. But at the same time, I know that chances are if we stopped maintaining it, or if there was a big lapse, people would just go elsewhere, or people may fork it. But that’s the good thing about open source. But also, there’s a lot of stuff to maintain in there. I’ve been working on it for two years. I still don’t know all the intricacies. Marcin, my co-maintainer, has worked on it since the beginning. I’m sure he doesn’t understand some of the stuff at times. Yeah, it’s a difficult one.

[00:56:12.03] What’s the equivalent, in the open source world, of declaring email bankruptcy? [laughter] Just delete all the issues, delete all the PRs…

So you say that… There have been a couple of projects that have done that in the last few years. I can’t remember offhand… But it’s really frustrating when people do that, because there’s things like you raise an issue, and it’ll be like “Oh, this is a duplicate. Fine.” But if the issue tracker is completely empty, and you’re like “Oh, this project’s been around for years. They’ve done really well. They’ve fixed everything. They’ve implemented everything they need.” And it’s like, no, there’s actually like 900 things that they’ve just closed…

Closed… [laughs]

Yeah. And I want to get on top of it, and I want to at least have a way of working out “How many duplicates have we got? What are the key areas we need to dig into?” Over time, I don’t want Marcin and I to be the only people working on this, so how do we create a maintainer and contributor ladder to make it so other people can get involved? But I absolutely cannot spend time on that - trying to do governance and make the project more sustainable in that sense - because that takes away time from fixing bugs that have been open for years. And I’ve been working towards a release for at least a few months, because every time I sit down to do it, there’s just so many things. In fact, recently I hit a massive milestone of being an open source maintainer: I got a very angry, abusive email… So yay me!

Of course, of course. Yeah, rite of passage. [laughter] Somebody feeling entitled on the internet.

But yeah, I’ve found their LinkedIn, I’ve found their website and stuff… So that’s in the back pocket for maybe getting in touch with their company and being like “Did you know that they’re saying this?”

Right. You can actually pay to get this done.

Yeah, exactly.

Yeah. I feel like that’s one of the unsolved mysteries of our industry right now, is how do we extract ourselves out of this exploitative, I guess, relationship that people have with open source maintainers, where they kind of like – I mean, we’re missing a whole bunch of infrastructure. Surely, I mean, you’ve just talked about there’s 400 issues. How do you actually triage that? At a company, you’d hire staff, project managers and people like that to go through and keep everything sort of clean. I mean, we’ve all worked at companies that use Jira. Like, Jira is never as nice and pristine as we might like it to be. But like open source maintainers, you’re kind of on your own for that.

So I think – I mean, it’s fair, to some degree, to just be like “Nah, I’m going to close all of this, and if it’s important, then someone will reopen it. That’s the best I can do right now.” I think there’s lots of people working in other models. I think like Ben Johnson and his model where he’s just like – I guess it’s the same model as SQLite’s model, where they’re like “You can have this code, and you can file issues if you’d like, but we’re just writing the code; we’re just maintaining it, putting it out there, and that’s kind of it. Like, don’t give us any code contributions. Don’t like give us any of that stuff.” I think that’s like an interesting take on the model. But I think in general, we just need to do something, whether it’s actually financially support people, or more companies doing what Filippo was doing, where it’s just like – we have to find a way to pay people to be open source, but not attached-to-a-company open source. Not the whole “oh, you work at Google, or Apple, or Facebook, or whatever; you work in the open source part”, but then if you want to leave that company, all of a sudden it’s like “Well, no, sorry. I can’t work on this anymore. I’ve gotta get a job that works on closed source stuff.” So I think that’s something we need to figure out, and probably figure out pretty soon, because –

[01:00:10.17] Well, we have the model for this, right? And I think we’ve had models that I think have been successful, we just haven’t applied them, at least from my view, to smaller projects. Take the CNCF, for example, or the Apache Foundation, or any of these sort of larger entities, that do get corporate sponsors. They do get money in the door. It’s just that I think perhaps what hasn’t been tried yet is - or maybe it has and I’m just unaware of it - for a project like this one, OAPI CodeGen, to maybe become part of a larger entity, whose only job is to take in funding from various corporate benefactors, and provide project managers or product managers for different open source projects that meet a certain threshold of maybe popularity, or usage, or whatever it is… Expecting open source maintainers themselves to take on the burden of doing all this stuff, maybe setting up a foundation or a nonprofit, and trying to be the product manager or project manager and also try to do the code writing, and the maintaining, and this and that… It’s just a lot. So what if there was an entity that sat above, and maybe they had the same sort of rubric for taking on a project just like the CNCF does, that says “Hey, if you have at least this many, or this much usage, or whatever the criteria happens to be, then you can become part of this project. And every week, month, whatever it is, you get this stipend or this monetary financial support to keep this project going.” That way the maintainers themselves are not stressing over trying to find a way to support themselves; there’s already an entity that does that. So I’d be interested in hearing if something like that already exists, and if it doesn’t, I think there’s an opportunity there.

So I guess two quick things. The first one is - so you mentioned the Apache Software Foundation. Do you remember Log4Shell, the vulnerability with the Log4j library?

Yes. Yeah.

So they’re under the Apache Foundation. Apache gets a load of donations. The maintainers were not seeing any of that. So they were still effectively working for free, under this big foundation that was meant to be covering things. And…

It still didn’t happen.

Yeah. And so there’s some things where big foundations aren’t necessarily going to solve things. Recently in the Tidelift community Slack someone mentioned a new thing called Common House, which I’ll drop a link to in the show notes, which is a small foundation. So it’s not trying to be as big as the CNCF. It’s trying to be a lightweight governance and financial sustainability home for projects. And I quite like that as a start, and it’s the sort of thing that maybe we’ll look at doing in the future… But as you say, I think if you’re able to have like a community manager who is maybe fractional, maybe spends a day a week on your project, I think that would make a huge amount of difference. Because I’m trying to wear all the hats. I’m trying to fix bugs, merge code, do community management stuff, trying to think of governance, and sustainability, and all sorts of things that just, like, there isn’t enough time. There’s barely enough time in the day for things I want to do, let alone things that I’m being shouted at for doing.

[01:03:51.19] Yeah, I think there’s a place for foundations, but I think as soon as you get big companies, lots of money, you’re just going to be dealing with a lot of politics. And that is what most people don’t want to be dealing with. And I think there are models for supporting – like, GitHub Sponsors is a good model for how to kind of more directly support the maintainers.

I think the big problem we have right now is that, much like in software engineering companies as a whole, all of the stuff you mentioned, Jamie, is glue work - work that some people are extremely talented at, but they don’t feel valued doing it in open source communities. I mean, even people who just want to do doc updates don’t feel like they’re valued, don’t feel like it’s a worthy contribution, even though maintainers are saying “Yes, please, please give us doc updates. Please give us all of this stuff.” I think the greater ecosystem doesn’t assign enough value to the project maintenance work that needs to get done, so the people who want to do that work aren’t able to come in and do it. Also, companies don’t pay those roles as well as they pay software engineering roles, so those people don’t have as much time or as much incentive to go give away some of their effort for free.

So I think probably the primary problem is actually getting people who are skilled at this work to want to do it in the open source realm - which, I mean, if you’re just going to get a whole bunch of nasty emails, not get a whole lot of benefit, and not really be respected, whether you’re getting money or not, you’re not going to spend your time on it unless it’s a project that you’re really personally invested in. And I think because of the state of a lot of open source projects, those people don’t have projects that are accessible for them to get into, because we’re already stuck in this problem. If you’re not someone who writes code every day - you only write code sometimes - and the documentation for most projects isn’t great, and the community management isn’t great, you’re going to have a hard time getting into a project to actually start helping out or working on it.

So I think this is much less of a “We need some entity to come in and fix it”, and more “We need to do something as an industry, as a community, to start welcoming in the people who will help us fix these problems”, and actually giving them equal standing, in some way, with the people who write code… Because at the end of the day, it’s not all about code. I know we love to think it’s all about code. It’s not. There’s all this other stuff that we have to be doing, and we as an industry just need to start acknowledging and respecting that.

Is that an unpopular? That’s an unpopular opinion. But maybe –

I sure hope it’s not an unpopular opinion.

… but maybe we should make room for some unpopular opinion. I know Jamie brought a few.

Jamie did bring a few. So we should move to our wonderful unpopular opinion segment.

Alright, Jamie, it looks like you have a few unpopular opinions. So just go for it.

Yeah. So yesterday I was actually listening to Carlana’s last episode, on Go 1.23. And one of the comments was “Make sure you’ve got a list of them for next time”, for the regular panelists… So I made sure to put a few in the bank. So my first one is that TypeScript’s type system is much better than Go’s.

What?! Explain yourself, sir.

Also, it’s called TypeScript. I mean, is that a bad type system…?

No, no, I refuse… Explain yourself, sir.

So especially when you’re working with OpenAPI, the types that you can express in TypeScript are so much better. I’ve worked with a number of APIs where we were also doing backend/frontend TypeScript, and things like enums are so nice. Go doesn’t really have enums compared to other languages. For instance, if you have an object with string keys in TypeScript, you can get compile-time checking to make sure that you’ve actually indexed into something that exists in the object, and stuff like that. There are things like partial types, and unions, and sum types… There’s loads of goodness that really shows, especially when you’re working with something complex like complex OpenAPI schemas. But even in my regular work, every time I run into something that I could do in TypeScript, I’m like “Ah… Go needs to get better.”
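For readers who want to see what Jamie is describing, here is a minimal TypeScript sketch - not from the episode; the type and field names are made up for illustration - showing string-literal “enums”, compile-time-checked key access, Partial<T>, and a sum-style (discriminated) union:

```typescript
// Hypothetical example types, not taken from any real API in the episode.

// An "enum" as a union of string literals.
type PetKind = "cat" | "dog" | "hamster";

// An object keyed by PetKind; indexing with anything else is a compile error.
const legs: Record<PetKind, number> = { cat: 4, dog: 4, hamster: 4 };
const catLegs = legs["cat"];      // ok
// const fishLegs = legs["fish"]; // compile error: "fish" is not a PetKind

// Partial<T>: every field becomes optional, handy for PATCH-style updates.
interface Pet {
  kind: PetKind;
  name: string;
}
const update: Partial<Pet> = { name: "Biscuit" };

// A discriminated union (sum type): the compiler narrows on the "status" tag.
type ApiResult =
  | { status: "ok"; body: Pet }
  | { status: "error"; message: string };

function describe(r: ApiResult): string {
  return r.status === "ok" ? r.body.name : r.message;
}
```

In Go, the rough equivalents today are typed string constants for the enum case and hand-rolled interfaces or generics for the union case, which is more or less what comes up next in the conversation.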

I’m looking forward to the results.

Oh, boy… Let me tell you…

Some union types would be – I could see that. There have been times that I’m just like – like you brought up, “This is a boolean OR a string.” I see the appeal of those types specifically. Enums I sometimes get, and then sometimes I’m just like “Nah…” But definitely sum types/union types - I think that would be a nice addition to Go. And we’re almost there with the way generics work. It feels like they’re within reach, but they’re just not there yet.

I will withhold my thoughts on this one. Let’s move on to the next one, Jamie… I want to be nice to you all the way to the end of the episode. [laughter]

I mean, we could do a Plus Plus section of you just ripping into me. [laughter]

Yes, we’ll do some Plus Plus content. But first, first, [unintelligible 01:09:51.10] of this second unpop, because I don’t understand.

Okay, so this is my second and final one. Spoons are inadequate for ice cream.

What do you use, a fork?

How are they – what are you gonna use? A fork? A knife? Chopsticks? What other utensil – a spork?

Your fingers? Your hands? Your feet? What?

So for those that can see the video, I’ve got this little spatula… It’s spoon-shaped, but it’s a spatula.

You see, I knew you had a big mouth… [laughter] This man out here using a spatula to eat his ice cream.

So ice cream melts. If you’ve got a spoon and you’re trying to get the last bits, you’re either going to be like scratching the container, the bowl or whatever you’re doing, and then you’re damaging the bowl, or you’re just leaving a load of really good ice cream in the bowl. So if you use something that’s flexible, and that can get all in there, it’s way better.

Get a plastic fork, mate.

This is better.

Okay, a plastic spatula. Wow…

I’ve also started branching out into - not even like spoon-shaped spatulas… So I’ve got one that’s kind of like triangular-y, square-y, and that’s quite interesting.

Yeah, you’re like stabbing your mouth as you consume food, I guess, too. Yikes…

But then the ice cream cools down the pain, so…

This man likes pain with his ice cream. Got it, got it.

Okay. Well, we’ll see how popular or unpopular those are. But yeah, we should do a little Plus Plus. We don’t do enough Plus Plus on Go Time. We’ll just do a little Plus Plus. For those of you who aren’t subscribed to Changelog++ - I mean, you’re going to miss out on some great conversation we’re about to have. You should go sign up for that. You get to eliminate the ads, get some good extra content… It’s great. Get all of Changelog in one nice little feed. It’s beautiful. Go sign up for that. But if you’re not - sorry, this is where the episode ends for you.

Thank you, Jamie, for joining us, and talking about OpenAPI and other things…

Thanks for having me.

Yeah. And thank you, Johnny, for joining us. And if you’re a Plus Plus subscriber, keep listening.

So yes, welcome, listeners, to the secret part of the show… Because you’re in the cool club, where Johnny – Johnny’s gonna give us some of his insights…

Oh, not insights… I’m just gonna sort of lay into Jamie for a second. So – [laughter] So Jamie, you are one of those folks that I think as you get more comfortable with other language features…


Our transcripts are open source on GitHub. Improvements are welcome. 💚
