Ship It! ā€“ Episode #90

Kaizen! Embracing change 🌟

with Adam & Jerod

All Episodes

This is our 9th Kaizen with Adam & Jerod. We start today’s conversation with the most important thing: embracing change. For Gerhard, this means putting Ship It on hold after this episode. It also means making more time to experiment, maybe try a few of those small bets that we recently talked about with Daniel. Kaizen will continue, we are thinking, on the Changelog. Stick around to hear the rest.

Featuring

Sponsors

Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with extended episodes, make the ads disappear, and increment your audio quality with higher bitrate mp3s. Let’s do this!

Notes & Links

šŸ“ Edit Notes

All episode notes are in šŸ™ GitHub discussion changelog.com#440. Feel free to add your thoughts / questions!

Gerhard, Jerod & Adam

Chapters

1 00:00 Welcome 00:48
2 01:10 Breaking the big news 10:07
3 11:16 Experimentation 02:06
4 13:22 Staying plugged in 03:14
5 16:36 How did we get here? 02:23
6 19:15 Sponsor: Changelog++ 00:59
7 20:13 Dagger with Changelog 10:05
8 30:19 Codespaces 02:41
9 33:00 Who else does this? 06:47
10 39:47 Rotating all our secrets 10:44
11 50:30 Feature flags 06:24
12 56:55 What if this works? 09:35
13 1:06:30 Crunchy Data 03:31
14 1:10:01 Where is Kaizen going? 01:27
15 1:11:28 1Password 03:33
16 1:15:01 Wrap up 00:53
17 1:15:57 Outro 00:37

Transcript

šŸ“ Edit Transcript

Changelog

Play the audio to listen along while you enjoy the transcript. 🎧

Change is constant, and the one thing, the one lesson which really helped me was to not fight it, but embrace it. Some may think “Oh, this sounds very agile-ish, and I thought we are post-agile”, but this is one constant, right? Change will always happen. And if anyone has been paying attention to the world, things have changed so many times in the last couple of years. So that’s the one thing that will always be constant - change. So with that in mind, me embracing change and change being constant, I’ll be taking a break from Ship It after this episode.

That’s a gut punch…

It is a little bit… [laughter] But that’s why I want to make it sound as positive as it can be, because it is. So if you remember when we started, I was experimenting so much, and trying so many things, crazy ideas, like “Let’s use Kubernetes for Changelog.” Remember that one?

I do recall. I do.

For sure.

And then Jerod came and said “No, let’s use Fly”, and we tried that as well. So we were experimenting quite a lot before Ship It, or I was experimenting quite a lot before Ship It. And then Ship It was taking more and more of my time, to the point that I was rushing from one thing to another thing, to the next episode, the next episode… And I had less time to experiment. So I would like to do more of that.

More experimenting, less shipping of Ship It.

Less shipping of Ship It episodes, yes. That’s right. But definitely shipping. So things will still continue changing on the Changelog side; the improvements will not stop. And if anything, a couple of other areas are already picking up - like Dagger, for example, for me - which means I need more of my headspace, and more of my A game for that thing.

Embracing the change. So the big Why, if we say why in general - it’s because you were stretched too thin to do the experimentation that you love, and you need some headspace. Dagger taking off, taking over, and Ship It being very much your passion project, a side project for you… It had some financial stability, but was never going to be - or at least in its current form, not going to be - a full-time thing… And something had to give, because you were burning the candle at both ends, and we don’t want you to burn out. And so there you have it.

That’s right. I was checking myself, basically… And it’s really important to know when to stop and what to stop. And to know how to rearrange things. And everything is temporary. I think that’s something that is worth emphasizing. Nothing will last forever, not even us.

Right.

But hopefully, we’ve had some great time together. More amazing things will come, because this is not the end of it. It’s just a pause, and we don’t know how it will continue, in what shape or form… I don’t think that’s the approach - nothing wrong with the approach. But we can improve on it some more. Some video would be nice… There’s so many videos that we shot in the last two years since we had Ship It, but we published very few of those. Like working with various people, experimenting… But we never had time.

I remember episode 33, Merry Shipmas; recorded with the Upbound folks, recorded with the Dagger folks at the time, because I wasn’t part of Dagger back then… And the third thing was Parca. We were profiling our app, and everything was running in Kubernetes at the time, to understand where the CPU time is spent. And Parca improved so much since, but we haven’t installed it in the new world, which for us is fly.io. So that’s maybe one thing worth bringing back. I don’t know. We’ll see. But I know that we have many more ideas of things to improve. So small bets; more small bets. More trying things out and seeing what sticks, and embracing change.

So this is episode 90. So you made it to 90 episodes before this hiatus, this pause - so congrats on 90 episodes. Most podcasts do not make it that far, even. Unfortunately not 100, which would have been a coup de grâce; it would have been perfect.

However, if it had been 100, it would have felt more like the end. And this is not the end, right? So 90. Like, who stops at 90? Obviously, something else is going to come after 90. It’s not a natural place to stop. 100 would be like “That’s it. The book is done.”

Right. We would call it a grand finale, and you would sail off into the sunset. Well, for me, I am a little – of course, embrace the change. I’m a little bit sad. I know we have a lot of listeners who truly love this show. It’s a unique show in our catalog, in Changelog’s catalog. You talk about things that we don’t talk about elsewhere, in ways that we can’t talk about… And so, of course, we will miss it. For me, selfishly perhaps, my favorite episodes are divisible by 10. I like the Kaizens, maybe because I get to listen to myself… No, that’s just a joke. I just enjoy catching up with you, and…

[06:28] Not a joke. [laughs]

No, I do like it. I’m starting to like it.

You have a nice voice, Jerod. That’s what it is. Let’s be honest.

It’s not what I say, it’s how I say it. No, I’m really joking.

It’s how you hear it.

Yeah. [laughter] It’s not my voice that’s great, it’s the things I’m saying. That’s the best. Just kidding. But I love our Kaizens. If the interviews never came back, I could get over it. If the Kaizens never continued, I don’t think I could get over it. So we don’t know exactly what’s coming next, but I think Kaizen needs to continue to be a thing that exists in our world. And we don’t know what form that’s going to take; maybe it’ll be on the Changelog, maybe it’ll be on some show that doesn’t exist yet… Maybe it’ll just be a show called Kaizen. I don’t know. But we don’t want to lose you entirely, Gerhard. We want you to continue to experiment, and push forward our operations here, our platform, pushing us into new things so we can learn along the way, and sharing that - at least the navel-gazing part of Ship It. What do you think?

I love it.

If you remember, one of the ideas for the show titles before Ship It was Kaizen.

Right.

That’s how – it’s so embedded within me… I mean, I never see myself stop doing that. And the fact that we can talk about it - I think it’s great. The cadence makes sense. It fits with everything.

Right. And in fact, your idea to us, your pitch for this show was basically just the Kaizen stuff. And I said, “Nobody wants to listen to us talk about our platform every week. We need to mix in some interviews.” And so that became Ship It. It was the interview shows, and then I thought you picked a pretty good cadence, of every ten episodes, every two and a half months… Almost quarterly, but using the episode numbers brilliantly to map out a Kaizen episode that made sense. I think if we would have come out and done a weekly Kaizen with us three, I don’t think it’d be the show that it has been. And so I think that was a good collaboration by us, to realize that. But also, you were definitely on to something in terms of an enjoyable format that people do like to follow and say “These crazy guys just air their dirty infrastructure laundry, right here on the air, for us to learn from.” And I think that’s cool.

Yeah, I think so, too. And I really liked the new GitHub discussions… I mean, we had the one for Kaizen eight, now we have 440, which is the discussion for Kaizen nine, which is this episode… And it captures all the things. I think that works really, really well. You have the written format, you have it in GitHub, you have pull requests, issues, all things connected… I think it’s something worth celebrating. And while we don’t ship only once every two and a half months, because that would be crazy, we do talk about the highlights. And I think that is a nice forcing function to always keep moving forward. Always keep improving. It keeps reminding us of what we’ve accomplished.

Adam, do you wanna chime in here? You’ve been nodding along, but you haven’t said anything.

I think he’s too sad.

I am a little too sad, honestly. I was having trouble coming up with words, because you know, ending is always challenging. I guess pausing is a little easier. But it’s bittersweet for me, because there’s a lot to like about it, obviously, and there’s a lot that came from our deeper relationship, and everything… But I’m also about quitting when it makes sense. The Dip from Seth Godin was, by far, one of my favorite books in terms of self-development. And that book isn’t really about quitting necessarily (I guess it might be), it’s about knowing the right time to quit, I suppose; or even to pause something. And that’s a challenge, because too often we’ll push ourselves beyond our limits, and things break. Sometimes those things that break are really important to us, and that’s called regret. And so none of us want to live with regret. I don’t want you to live with regret. I want to do great things together, but not at the expense of the things that are important to you and to us. And I think from a listenership standpoint, I would love the listeners to come to this and say, “That’s really awesome, to know when to pause.”

[10:38] I mean, for a while there I had to pause Founders Talk, and other things that were way back in the day, to make sure that we could focus on the Changelog podcast. A couple years back Mireille and I paused Brain Science because it was just too fast of a clip for us; we were both really busy… We’re still in the midst of bringing that show back, and we have great ambition and great plans… But you have to look at what you’re capable of, and what you want to achieve, and kind of pair the two up, and say, “Is this sustainable?” And if it’s not, be wise and put your no down. Because too often do we say yes when we should just say no.

On the note of more video stuff though, and this experimentation, and this Kaizen, and some of it… It sounds like what we really wanted from this was the experimentation and the freedom, and then the cadence of the actual podcast… Which, I agree, a weekly podcast is incredibly hard to do. If you’re listening to this right now - anybody who’s shipping a show weekly, for years, they’re not quite superheroes, but they’re darn close, because it takes a lot to show up every single week, and do something that is worthwhile. And if you have a growing audience, like we’ve had… And this show has been part of that. That’s a big, big challenge.

However, even on today’s topic - DHH, and cloud, that conversation out there, this backlash against the cloud… Like, I would have loved if – that show was great, by the way. I loved that episode. But in terms of experimentation and videos on YouTube, I would love to see - because you don’t have to have a rhythm; you can just do it when you want… A deep-dive or a peek behind the veil of their non-cloud cloud; their own infra. Like, what does that mean, to stand up your own infrastructure? …and just have a 20-minute DHH screen-share with you, and you guys just hammer it out for 20 minutes. That’d be cool for me, every couple months. Nothing that’s weekly; just something that’s like “Show me behind the screen. Give me a peek at your infra. What are your choices, why’d you make them? How does it work?” etc. That’d be cool to me. And with no necessary cadence; just whenever it makes sense. And that kind of fits into your desire to explore. Because you’re an explorer, Gerhard, you know? You like to push the boundaries, you on the edge… But I think this show may have limited you from doing that, potentially.

Adam, you just said “behind the screen”. Was that a slip of the tongue, or are you workshopping a new title scheme? [laughter]

You know, always, Jerod. Always.

I like where this is going… [laughter] Behind the keyboard.

Have you done that on purpose, or…?

Not away from the keyboard; behind the keyboard, behind the screen, behind the camera.

There you go. So that’s the big news. That’s probably a surprise to most, if not all, in terms of Ship It subscribers. A lot of these people are like - they listen to Ship It every week, and they just heard this, and they’re like “Well, that sucks for me.” Touchpoints - like, we’re talking about potential experimentation; how can they stay plugged in with you, what you’re doing, and maybe with the future of the show… Obviously, don’t unsubscribe from your feed reader, unless you’re a super clean freak, because there might be new things getting published into the feed. Just go ahead and let it go inactive, and if we ever publish here again, you’ll just automatically get them. So I’ll say that much myself: subscribe to the Changelog; it probably would be a good idea. But I’ll just throw that in there as shameless self-promotion. But for you, Gerhard - for people who want to stay connected with you personally, beyond Ship It, where should they go?

[14:14] Yeah. So I’m still on Twitter. It’s still a thing. I’m on Changelog.social, even though I haven’t tweeted anything yet, if that’s the thing to do there…

I haven’t tooted, there we go. Sorry.

You toot there.

See? I’m not up to date on all these things, so I think that’s an area worth improving.

No one wants to be up to date with that word.

Yeah. I’m still very much on the Changelog Slack, on the Changelog GitHub… That’s where I intend to spend more time, since this whole Kaizen thing behind the scenes for Changelog is not going to stop. We’ll still be improving things, there’s pull requests, there’s issues, there’s all sorts of things happening there… Maybe even discussions. I mean, we had this second GitHub discussion, where everyone is welcome to participate, where we’re talking specifically about what we are going to improve about Changelog. So I’m not sure how Chris Eggert knew how to jump in and help out, and do that improvement, or Jarvis Yang, and there’s a couple of others. Or Noah… How Noah Betson knew how to do this, and a couple of others. But this is still going on. We are still on GitHub; we’re still doing things. We’re still on Slack, on the Changelog Slack. So we’re still there, it’s just the show, the cadence, the weekly cadence - we are pausing that until we figure out, or I figure out, what comes next… Which would be still with listeners, with people, as like – I really like Adam’s idea. It’s closer to what I had in mind a couple of years back. And I’m craving to experiment more, and only putting an episode out there, maybe in a different format, when it’s ready. It doesn’t mean once a year, but it means less than once a week. So between once a week and once a year - that’s somewhere the sweet spot, which I have yet to discover.

There you go. So not continuous delivery, but some sort of delivery…

Not of episodes - because there are so many other things, right? I mean, it has to be meaningful. I remember, for example, the Merry Shipmas, episode 33. That took a lot of early mornings, late nights and weekends. I have no idea how I could make time at that point for it. It was crazy. I no longer have that time now, which means that I no longer can do those things, which means that it’s all in the episodes and the few hours here and there, which is just not making me happy. Anyways… We are improving that.

Right.

It might make sense to say how we got here, which I think - if you listened to this show since the beginning, you know kind of how we got here… But how we got here originally was that you, Gerhard, were our SRE for hire, essentially. You helped us stand up our infrastructure way back in 2016, when –

Thatā€™s correct.

…when Jerod was exploring delivering and deploying an Elixir application to production. I’m paraphrasing the story, of course, but how we got here was by shipping, and we would talk about that once a year on the Changelog podcast. We liked doing that so much… We’re essentially just regressing back to the original blueprint, right?

Not once a year, though. More than once a year.

Well, maybe less than once a year, but back to the blueprint of: you’re still working with us on our infrastructure; that’s not changing. We’re gonna still keep improving that; that’s not changing. We’ll keep developing partnerships. One of the ones we’ve formed just recently was Typesense. Behind the scenes, Jerod and Jason Bosco are hammering out some cool stuff with Typesense for our search, and that’s so cool. But these things are gonna keep continuing; we’re gonna pause the podcast, essentially. The extra is changing, and we’re regressing back to normality, essentially. The opportunity to put your explorer hat back on, put a smile back on your face, and leverage your time so wisely.

[17:49] Exactly. That’s exactly right. And in a way, we are kind of going back to the beginning from the shipping side of things, because we have a huge improvement that went out in the last two and a half months… And there’s even more amazing stuff coming out in the next two and a half months, so on the next Kaizen, in that time period. And it means that I will have more time to do a better job of that; focus more, do more… And obviously, that means for me CI/CD as code. So we are going back to the initial idea of “Hey, how do we get Changelog out there? How do we use –” For example, back in the days it was Docker, for deploying on Docker Swarm, running on Linode, set up with Terraform. Or was it Ansible? I think it was Ansible.

It was Ansible and Concourse CI.

There we go. Concourse CI. Exactly. So in a way, we are back there, right? It’s the continuation of Concourse CI, it’s the continuation of that… There is a PaaS now, which is Fly… But again, it’s going to be a lot more. Integration with services… And I know that Jerod is missing certain things… And stuff is coming, but for that, we need more time.

So describe to us this big update, this big improvement that you did over the last two and a half months. I think we touched on it in Kaizen 8, but it wasn’t finished… Now, this was Dagger version 0.3, I believe… First of all, explain what the improvement is, and then you can get into what you had to do to pull this off, and where it’s going from there.

So Merry Shipmas - I keep coming back to that, episode 33 - we introduced Dagger in the context of Changelog. What that meant is that we were migrating from CircleCI to GitHub Actions. Rather than trading one YAML for another YAML, I thought “Wouldn’t it be nice if we had CI running locally first, and remotely next?” And remotely would be via a very thin interface. That interface was Dagger. You can run it locally, you run it in whatever CI you have, invoking the same command, and the same things will happen, because your CI now runs in containers. And I don’t mean CI the service; I mean the actual operations. That was November 2021.

Beginning of 2022 I joined Dagger. We did a lot of improvements, and at the end of last year, which was just a few months ago, we released SDKs, which means that you can write your CI/CD system, your pipelines, in code. Whether it’s Python, whether it’s Go, whether it’s Node.js - it’s no more YAML, no more weird things, no more weird configuration languages that some perceive as weird… It’s the code that you know and love. So what that means is that now you can write proper code that declares your pipeline, like all the things…

[21:56] And I say “declares” because it’s lots of function calls. Sort of like lazy chaining, which eventually gets translated into a DAG - hence Dagger, the name. And then everything gets materialized behind the scenes. Some things are cached, naturally, other things aren’t.
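The lazy-chaining-into-a-DAG idea Gerhard describes can be sketched in a few lines of plain Go. To be clear, this is not the Dagger SDK - the Pipeline/With/Materialize names are invented for this toy illustration - but it shows the shape: each call only records a step, nothing runs until materialization, and cached steps are skipped on later runs.

```go
package main

import "fmt"

// Pipeline records steps lazily (here a simple chain, the degenerate
// case of a DAG) and memoizes results so nothing runs twice.
type Pipeline struct {
	steps []string
	cache map[string]string
}

func New() *Pipeline { return &Pipeline{cache: map[string]string{}} }

// With is lazy: it only appends a node to the graph and returns the
// pipeline, so calls chain like build.Elixir().WithGit() etc.
func (p *Pipeline) With(step string) *Pipeline {
	p.steps = append(p.steps, step)
	return p
}

// Materialize walks the graph and "runs" each step, skipping cached ones.
func (p *Pipeline) Materialize() []string {
	var ran []string
	for _, s := range p.steps {
		if _, ok := p.cache[s]; ok {
			continue // already cached: not run again
		}
		p.cache[s] = "done"
		ran = append(ran, s)
	}
	return ran
}

func main() {
	p := New().With("elixir").With("apt-packages").With("git").With("imagemagick")
	fmt.Println(p.Materialize()) // all four steps run the first time
	p.With("nodejs")
	fmt.Println(p.Materialize()) // only the new step runs; the rest hit the cache
}
```

The second Materialize call is the point: re-running the pipeline after adding one step only executes that step, which is the behavior described for the real engine.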

So that means that right now we are in the phase where, from Dagger 0.1, which was using CUE, we now have Go in our codebase. And I want to know how you feel about that, Jerod. How do you feel about having your Elixir spoiled (hopefully not) by some Go code?

No, I feel good about it. I feel like a renaissance man. We have all these different things; we taste of the best Elixirs, and we can also just pull in some Go when we want to… I mean, that’s diversity, that’s inclusion… I’m happy about it.

That’s amazing. So no more YAML…

Also happy about that…

No more CUE… No more makefiles.

I was going to learn CUE. I don’t have to learn CUE now.

Exactly. You have to learn Go…

No more makefiles. Zero makefiles.

Now you got me.

Yeah. The top-level one went, and the others will disappear as well from the subdirectories when we finish the migration. So there’s no more top-level makefile.

Okay, so where do I go? I look for a .go file - it’s in there somewhere - to look at what’s going on.

So everything Dagger-related is in mage files.

Okay. And mage is Go’s version of make, or rake - like a task runner thing?

It’s just like to invoke things, just to have different entry points… So for example, right now we have three entry points. The first entry point is the Dagger version 0.1 legacy one, where we can run the old pipeline. 0.1 vs 0.3 - that was one PR. So we had PR 446, where we run the Dagger 0.1 pipeline, the CUE one, and 0.3 using the Go SDK. So the entry point is Dagger version 0.1 :shipit. And that wraps the old pipeline.

There’s also a new - again, this is like mage, so it exposes… I mean, you can think of those like subcommands. It all bundles up in a binary, and it has different subcommands. And if you don’t provide any command, it’ll show you “Hey, you can run these things.” That’s in essence what it is.

So we have image, which is a namespace, and runtime. So we can now build the runtime image using Dagger version 0.3. Not only build it, but also publish it to GHCR. And that is pull request 450. So now we are building and publishing the Changelog runtime image to GitHub Actions. Sorry - using GitHub Actions, or within GitHub Actions, using a very thin Dagger layer. And all it does is basically just go run. Go run, the main Go file, and the command is image:runtime, and off it goes to GHCR. So if you go to ghcr.io/thechangelog/changelog-runtime, you will see our image in all its beauty. What does that mean? It has a very nice description; we’re making use of certain labels that the open container spec has. So there’s a specific label to show the description in GHCR.

So GHCR - that’s GitHub’s deal, right? That’s their registry.

GitHub’s Container Registry. That’s it.

Okay. I haven’t used this before, so I’m a newb here. I’m used to Docker Hub. So this is like GitHub’s version.

Exactly.

Oh, I’m looking at this Changelog runtime, and it has an emoji next to it…

How beautiful is that? [laughter]

Gerhard got some emoji in there… So you’re already talking my language…

Elixir version 1.14.2, so you see the description… I mean, you can see the version that we use in the actual tag… And that’s what we’re using in production right now. That went out this weekend.

So we’re using that runtime image.

Okay. And this was built via Dagger, inside GitHub Actions?

That’s right. Yup.

And you can also run it locally, if you want.

When you run it locally, are you running it inside Dagger? What’s the terminology here?

[26:04] Okay, so you’re running it – so it runs Go on the outside, and it provisions a Dagger engine inside Docker… Because if you have Docker, it needs to provision like the brains, if you wish, of where things will run… So by default, if you have Docker, it knows how to provision itself. When the Dagger engine spins up, all the operations run inside the Dagger engine. The really cool thing is, if anything has been cached, it won’t run it again. So imagine our image, when you pull down our image… So when we build this runtime image, obviously we have to pull down the base one, which is based on the hexpm image, and that’s from Docker Hub, then it needs to install a bunch of dependencies… And by the way, all that stuff - I mean, if you look at… I have to show you the code. This is too cool, Jerod. Check this out. So if you go to pull request 450, and if you look at the mage files - image, image.go - look at lines 50 to 61.

‘build.Elixir().WithAptPackages().WithGit().WithImagemagick()’. So this is like a chain of function calls that you’ve named nicely…

That’s it. And you can mix and match them in whichever way you want. So when, for example, we convert the rest of our pipeline to Dagger 0.3, we’ll do build, we’ll take Elixir, with packages, and whatever else we want. And when we want to publish the image, we can chain, again, the function calls however we want. For example, we do not want WithNodeJS when we publish our image, but we do want WithNodeJS when we build or compile our assets. So this way, we can chain all the functions, get all the bits from the various containers, various layers, assemble it, and make sure that all dependencies will be the same. Because WithNodeJS knows exactly which Node.js version we use; and it doesn’t matter where you call it from. And because all the operations are cached, they won’t rerun. Some of these can take a really long time, by the way… Anyway, so I’m super-excited about this. And by the way, Noah, if you’re listening to this, I’m very curious to know how much easier it is to bump our dependencies with the new approach.

I was just going to ask that, because I’m looking at line 16; it says elixir version equals, and then it’s a string, 1.14.2.

Thatā€™s it.

Can I just change that string?

Thatā€™s it.

And thatā€™s it?!

That’s it. Change the string, commit and push, and the CI will take care of the rest.

Whooo-weee!! Now we’re talking.

Oh yeah, baby.

I’ve asked you for this for years. Like, can I go to one place in the code and just change the version, and it’ll be done?

That’s it. And there’s more and more stuff that we can add on top of that. For example, we can change the local files. You know, we still have, in contribute.md - if you go there… by the way, that was updated as well to tell you how you change things. So that was updated to reference the new files. Those steps, we can start removing them, because we can automate more and more of that stuff. So we can, for example, go and update the Elixir version in the readme, in the contribute.md, wherever we have it. It’s all code, at the end of the day. And it’s not scripting.
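The "single source of truth" idea above - the version pinned once in code, with the pipeline rewriting every doc that mentions it - might look roughly like this. The constant mirrors the one described in image.go, but the regex, the doc snippet, and the bumpDocs helper are all invented for this sketch, not the actual changelog.com pipeline.

```go
package main

import (
	"fmt"
	"regexp"
)

// Single source of truth, as described for image.go in the episode.
const elixirVersion = "1.14.2"

// versionRe matches any "Elixir x.y.z" mention in a document.
var versionRe = regexp.MustCompile(`Elixir [0-9]+\.[0-9]+\.[0-9]+`)

// bumpDocs rewrites every matched mention to the pinned version, so the
// readme and contribute.md can never drift from the code.
func bumpDocs(doc string) string {
	return versionRe.ReplaceAllString(doc, "Elixir "+elixirVersion)
}

func main() {
	readme := "Install Elixir 1.13.4 before running the test suite."
	fmt.Println(bumpDocs(readme)) // → Install Elixir 1.14.2 before running the test suite.
}
```

In the scenario Gerhard sketches, the pipeline would run this kind of rewrite over each doc after a version bump, then build and test before opening a PR.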

Meaning it’s only in the readme? Like, you could have it in the readme only?

Meaning that it will only be in image.go. That’s it. When you bump it in image.go, and the pipeline runs, it will update all the other places.

Oh, it’ll update the readme for you.

Exactly.

I was gonna say, it’d be crazy if you actually just had that version in the readme, and image.go read it in… Which you probably could do, because it’s Go code.

It could do that. Yeah, it could do that.

That doesnā€™t sound smart, but it just would be interesting.

[29:45] Yeah, no. You want it to be in code. You want it in code. And not to mention that when it’s in code, by the way, we can have – again, we still need to figure this part out, I suppose… But we could have things that automatically bump it. When a new version comes out, it bumps it in code, the pipeline bumps it everywhere… And because the pipeline runs, it checks if the new version works.

And then it opens up a PR and then we can just merge?

That’s it, Jerod. That’s it. That’s it.

See, it’s stuff like this that gets me really excited. [laughs]

You’re getting me.

Okay. So that’s cool. How does that play into the other thing which happened recently, thanks to Chris – and by the way, by the time this episode goes out, we will have shipped an episode of the Changelog with Brigit Murtaugh from the Dev Containers spec, from the VS Code team, talking all about this, in which Chris gets multiple shout-outs. So he’s probably getting sick of hearing us talk about him at this point. He opened up a pull request allowing us to run our codebase on Codespaces by adding a devcontainer.json. So thanks to him for that. He’s using a Docker Compose file and a little bit of JSON, and you can just say “Open in Codespaces”, and it’s super cool. How do these changes affect his work, if at all, or what’s the integration there? Because now we have a dev environment, and we have this image that you’re changing the way it works…

Yeah. It all builds on top of it. This is brilliant.

This is brilliant… [laughs]

It is. And it’s not me, it’s the combination of people that came together, right? I wasn’t expecting Chris to come along.

Nobody was.

That was great, it was amazing. So based on that - that was pull request 437 in our codebase - I did a follow-up, 449, which basically changes the reference in the Dev Container to our runtime image, which is now pulled from GHCR. And because we’re running GitHub Codespaces, that will be very fast. Much faster than if you’d pulled it from any other registry. So that was another reason to go to GHCR.

So that works currently?

That’s how it works currently. If you go and open the file - come on, let’s check it out.

Because I just did it last week in preparation for that conversation with Brigit, and one thing I noticed is pulling from Docker Hub - just the entire first-run Codespaces experience… I mean, it’s probably five to seven minutes, you know…

That has improved. The pull request that I mentioned, 449 - it no longer builds it; it references the already-built runtime image. If you check out the Dev Containers directory, if you look at the Docker Compose file, line five, now it has the image reference. So the runtime image is no longer built; the runtime image reference is pulled. So it shouldn’t take six, seven minutes anymore. It should be instant.
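The change described here - referencing the already-published image instead of building it on first start - might look roughly like this in a Compose file. The service name, file layout, and tag below are illustrative, not the actual changelog.com file:

```yaml
# Before: Codespaces built the runtime image from a Dockerfile on first
# start, which is what made the first-run experience take minutes.
# services:
#   app:
#     build: .

# After: pull the runtime image already published to GHCR instead.
services:
  app:
    image: ghcr.io/thechangelog/changelog-runtime:1.14.2  # tag is illustrative
```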

I'll try that again.

There you go. Let me know how it works. But if not, we'll work on it some more. And all this stuff, all these things - we can start templating. Once we get it in the pipeline, there will be a single place where we declare those versions. As soon as the image builds successfully - and because we go through the process in the pipeline - we can start modifying all these other places, then build the production image, try and deploy it, and if it works, we're done. Merge the PR… We're good.

Who else is doing it like this? How state of the art is this?

I don't know. I would say it's pretty cutting-edge… Because we are redefining CI/CD with Dagger. We really are. I mean, CI/CD as code - forget like any weird languages… And some of the stuff that we have coming - I can't talk about all the things… But I'm like six months ahead, and I'm so excited to be there.

For example, last Friday - it was just a few days ago - we shipped services support. It's an experimental feature. If you're listening to this, you're not supposed to use it, so please don't, because it may be broken in a number of ways we don't know… But Changelog will be the first one to use the services support in Dagger. What that means is that we will be spinning up a PostgreSQL container that we need for our tests inside Dagger, inside the Dagger engine, because it now has a runtime.

And what are the ramifications of that?

[33:57] Well, you spin up containers in code. Just as you write your code, you can say, "Spin me up a PostgreSQL container", and when it's spun up, connect it to this other container where the test will run. You can have the waiting - I mean, we used to do nc, netcat, for heaven's sake, to wait for the PostgreSQL container to be available. There's like shell scripting, there's like ugly YAML… All sorts of weird things.

Let's not knock on netcat, Gerhard. Come on. Sweet tool.

No, it's amazing. I love it. It is old-school. It's amazing. But what's not amazing is that you have to - you're forced to combine scripting and YAML.

To wait. Yeah, you're waiting for a service to be ready for you.

In a weird way. Exactly. Rather than doing it in code. Why wouldn't you do all these things in code? Because now we can start orchestrating containers. But orchestrating for the purpose of CI/CD. Let's be clear about that.
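
The old netcat-in-a-shell-loop dance Gerhard is describing amounts to polling a TCP port until the database accepts connections. A minimal sketch of that wait loop - the kind of thing Dagger's services support replaces with in-code orchestration; the function name is ours, for illustration:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP port accepts connections -- roughly what an
    `until nc -z host port; do sleep 1; done` shell loop was doing."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while the service is still down
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

With services as code, the wait and the wiring between containers live in the same program as the rest of the pipeline, instead of being split between YAML and shell.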

So we're going to be like a poster child for Dagger, aren't we? I mean, these people have to love us. We're using all the bleeding - I mean, by these people, I mean you people.

I love you. I'm Dagger.

I know you are. [laughter] That's cool, man. I love that we're a testbed for cool new things. And we're definitely right there on the edge… I wonder how much bleeding we're gonna do. Well, we are defining it. Well, we'll find out… And by the way, you have the right person to fix it, who does the work. [laughs] Isn't that the whole point?

Yes. Alright, cool. Exciting times. I've always wanted to have one string in my codebase in which I could update the version of Elixir.

It's there.

And then docs, too. That's so cool. Updating docs is a cool thing. Still, docs suck; especially a readme. Like, when you go to the readme, it's like - I've gone there recently with other things I'm working on… It's referencing the old release, for example, in the readme. It says so in the installation instructions, which you go to immediately, but it's referencing an old release. But if you go to releases, there's like two new ones, for example. So the documentation is out of date.

It could always be outdated.

Not anymore.

So is every - so because we do basically master-branch-based deploying, is every push to master a release, effectively?

Yeah. That hasn't changed in years. Since I've been around, that hasn't changed.

Right. What about on PRs and branches? How does that work?

We don't deploy. So we now run tests, by the way… We didn't use to run tests on pull requests. Oh, dang it, I don't know how I overlooked that thing…

We just close them all, yeah. [laughs]

Yeah, yeah, yeah. So that was actually one of the first things, pull request 436. So since pull request 436 - which, by the way, happened in the same Kaizen, since Kaizen 8 - we are now running tests for every pull request. And we do that by basically leveraging the built-in Docker engine in GitHub Actions… Which is a bit slow, and it doesn't have any caching… But it means that we are running all the pipelines, including building a runtime image - but not publishing it, because there aren't credentials to do that - with every pull request. So while we don't deploy on every pull request, we could…
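
A pull-request workflow of the kind described - running the pipeline with the runner's built-in Docker, building but never publishing the image since PRs carry no registry credentials - might be sketched like this; the file name, image tag, and commands are illustrative, not the repo's actual workflow:

```yaml
# .github/workflows/pr.yml (illustrative sketch)
name: PR checks
on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build runtime image (never pushed; PRs have no registry credentials)
        run: docker build -t changelog-runtime:pr .
      - name: Run the test suite inside the image
        run: docker run --rm changelog-runtime:pr mix test
```

Because the build step works without secrets, the same checks run safely on fork PRs; publishing to GHCR stays gated behind pushes to master.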

Which would give us deployment previews, effectively.

We absolutely could. That's it. That's it, yup. And the nice thing would be - I think I'm very keen to try and do that in Dagger. The reason why I'm keen to do that is because of the services support. I'm pretty sure when they were designed no one thought about this, but we can have longer-running environments. So basically, we have a CI that is like one action which won't stop until you're okay with it. So how do we figure out routing? I don't know. I'm really keen to explore that.

We could run a very lightweight version of the Changelog in the context of the CI/CD, in the context of the pull request. Because it doesn't have to serve a lot of traffic, it doesn't need to be anything big… The CI/CD is already there. You have a VM where you're running the actual code for your tests. So why wouldn't you run a longer-running process that exposes Changelog?

You're blowing my mind, Gerhard. I'm not even -

[38:00] That's a crazy idea, right? No one has thought about that before. [laughs]

Alright…

See, I told you - six months from now. It's the future.

Okay. Well, that's exciting.

So when a pull request opens, basically, the GitHub runner that runs all the various checks - one of them, we basically keep it running for longer; or we don't even use GitHub runners at that point. So one of the things which we run - we spin up a Changelog, a preview one - we still need to figure out the data part - that will be accessible publicly. We get a random URL that you can hit, and then you can connect to that instance. And that instance runs within one of the CI workers. When the pull request is merged - I mean, one of the checks… Again, I still need to figure out how to do this, but one of the checks, basically, will not finish until the pull request is merged. And that check in GitHub Actions - that's the one where you can access the Changelog, the preview version.

So literally, you're running a preview in CI/CD.

I'm going to need a new diagram…

Infrastructure.md is the place to go in our repo to see how everything wires together, and that's the one that I intend to update once we have this new stuff. So infrastructure.md is fairly accurate right now. I think the only thing missing is GHCR, and the reason why it's missing is because I'm migrating the rest of the stuff to GHCR. And once that completes, it would be weird to see both Docker Hub and GHCR. So we're in a transition period. Once the dust settles, the diagram will be up to date. But again, that's the only thing which is missing. Everything else is accurate. Fly, Honeycomb, Sentry… Everything.

Very cool. Very cool.

So what about you, Jerod? I know that you've had some improvements in mind. Some of them I think you've already done since Kaizen 8…

Yes…

Which ones do you want to talk about? There's many, I can tell you that.

So a lot of my time, Gerhard, as you know, has been spent on rotating all of our secrets, first of all.

Oh, my goodness me. There were so many. [laughter]

So LastPass, thanks for nothing… Well, thanks for a few good years; and then we've lost confidence. So we are 1Password users as a team now, which we talked about for a few Kaizens, and finally made that migration. And then we decided, because of the LastPass leak, and the fact that we're all on 1Password now, it's a great time to just go through and do a key rotation, right? Just rotate all of the things… Which was just a lot of things. Like, man, we've got a lot of secrets in there, lots of integrations… And mostly harmless. There's a few fallouts, as there tends to be, with just that many changes; things that went wrong because of that. The biggest one was our stats system went down for a few days, because AWS credentials existed in one place correctly, but the other place incorrectly, I think… And then secondly, Changelog Nightly actually stopped sending, because I didn't update the Campaign Monitor API key on Nightly, which is an old Digital Ocean box from way back; it still just runs dutifully, every night, on a Digital Ocean box…

So I updated our Campaign Monitor API key inside of our app, and in Campaign Monitor, but I didn't rotate it over on the other server. And so it failed to send. It was still generating the emails, just not sending them, which is key; it's a key part of it. So there were a few nights where Nightly didn't go out until I realized it, and I was like "Oh, that one makes total sense." You and I also teamed up on a few things…

Oh, yeah.

…which is always fun.

Issue 442, for anyone that wants to see all the things we had to go through. We had 79 tasks to complete. And some of the work went quick, but it was just like untangling all that… We cleaned up a lot of stuff, and again, it was almost like a spring clean; even though it was January, it was definitely like a spring clean for secrets.

[42:13] Yeah. You don't realize just how many service integrations you have until you go to rotate all your secrets. And then it's like "Holy cow. Slack. Campaign Monitor. GitHub. Fastly. AWS. GitHub."

Notion.

Mastodon.

Yeah. GitHub twice, by the way. You said GitHub twice, because GitHub is used twice; you have an API token [unintelligible 00:42:30.06]

Same thing with Slack. There's like two different Slack APIs that we use. One's for the invites, which is like this old legacy thing that was never an official API - how you actually generate an invite. And then everything else is like for logbot, which is our Slack bot that does a few things. Yeah, there's just so many of them. And then it's just like - it's just an arduous process. So this is why my personal private key is years old at this point, embarrassingly.

We have to rotate it again. You won't be able to SSH into things. Good thing is you don't need to SSH anymore. Isn't that a relief?

That is nice. We're getting better on that front.

Flyctl ssh console…

I do enjoy that, yes. So that was one big piece of work… The other thing - Adam, you mentioned it; it's in flight right now - we're swapping out Algolia for Typesense, which is a very cool open source, C++ based search engine. Jason Bosco - we had him on the Changelog last year. I really liked the guy, got really interested in the product. We were on the Algolia open source plan, and we still are, which sets us a limit… And we've hit that limit, and we've been putting new things into the Algolia index ever since, but it won't search them until we upgrade our plan… So we're happy to be replacing Algolia with Typesense. Of course, that's an open source thing, but we're working on a partnership with Jason and his team, so that we'll be using Typesense Cloud. All that's very close to at least being swap-out-ready, and then we're going to build from there and start to use some of the things that make Typesense interesting. So I've been coding that…

And then the third thing is trying to rejigger the way that our feeds are generated and cached and stored, in order to get to this clustered world of multiple nodes running the apps, without having to change the way we use Erlang's built-in caching system, because I've just had some issues with that… And I just started thinking, "Why are we caching stuff if we have a very fast application that can just run close to the user? Let's just figure out a way not to cache stuff as much." But we have these very expensive pages, specifically the feeds: Master feed, Changelog feed… I mean, the XML that gets generated is like 2.3 megabytes. It's not going to be fast on any system, unless it's literally pre-computed.

So I started thinking about different ways of pre-computing and storing files on S3, and fronting that… And there's just lots of concerns with publishing immediately; we like to publish fast. And we even had a problem - thanks to a listener who pointed it out - with our Overcast ping, because Overcast as a specific app allows you to ping it immediately on publish, and they'll just push notify, and people will get their things immediately… Some people really like that. I'm always surprised - there's some listeners who listen like right when it drops, and there's others who listen like six months later. And that's all well and good, but for the ones who want it now - it's cool, we added the Overcast ping. Well, there's an issue there, because Overcast pings, but we're caching our feeds for a few minutes, maybe just a minute. And so Overcast says there's a new episode, and so you click on it, and you go there, and there isn't a new episode. And then you refresh, it's not there, then you refresh, it's not there, then you refresh and it is there, and it was like 60 seconds… Because we're caching.

[46:14] So I just turned that thing off and thought, "Well, people can just wait for Overcast to crawl us again, for now, but I would love to solve that problem…" And so then I started thinking - you know, we already have a place where we store data, that's a single instance, but is a service, so to speak, and it's called Postgres. And instead of adding like a memcached, or Redis, or figuring out these caching issues inside of the Erlang system - which was not trivial, in my research - I was like "What if we just precompute and throw stuff into Postgres?" And I did a test run of that - the feeds; just the feeds. And just turn off all other caching, because I don't think we actually need any other caching. It's just like, I already had caching set up, so I cached a few popular pages… But what if I just did it on the feeds? And every time you publish, you just blow it away, rerun it, and put it in Postgres. And you just serve it as static content out of Postgres.

I did some initial testing on that locally, and it's like consistently 50-millisecond responses with like Apache Bench; it was not a problem. It's never super-fast, like what you get with Erlang, where it's like microseconds… Which - I always like to see those stats. But that's not what we need, right? Consistently 50 milliseconds is great.

Without any caching layer. I mean, you're basically just pulling it out of Postgres and serving it. Very few code changes… It just felt like, "Okay, this is kind of a silly idea, using Postgres as a cache effectively - but what if it just works, and it's simple, and we don't have to add any infrastructure?"
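
The "Postgres as a cache" idea is simple enough to sketch: one table keyed by feed, rewritten on publish, read back with a single primary-key SELECT per request. The real implementation is Elixir on Postgres; this sketch uses Python with SQLite standing in for Postgres, and every name in it is made up for illustration:

```python
import sqlite3

# SQLite stands in for Postgres here; the real app is Elixir + Postgres.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE feeds (slug TEXT PRIMARY KEY, xml TEXT NOT NULL)")

def render_feed(slug: str) -> str:
    # Placeholder for the expensive XML generation (~2.3 MB for the real master feed).
    return f"<rss><!-- feed: {slug} --></rss>"

def publish(slug: str) -> None:
    """Cache on write: whenever content changes, blow the row away and recompute."""
    db.execute(
        "INSERT INTO feeds (slug, xml) VALUES (?, ?) "
        "ON CONFLICT(slug) DO UPDATE SET xml = excluded.xml",
        (slug, render_feed(slug)),
    )

def serve(slug: str) -> str:
    """Cache on read: one primary-key SELECT per request; no render, no TTL."""
    return db.execute("SELECT xml FROM feeds WHERE slug = ?", (slug,)).fetchone()[0]
```

The design choice being tested is moving the expensive compute from read time (with a TTL and the 60-second staleness window) to write time, so every request serves precomputed bytes and publishes are visible immediately.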

So I want to test that sort of in production - I kind of want to roll it out and run it, and then easily roll it back if it's not going to actually work in production… But I don't really have the metrics, I don't have the observability. I have Fastly observability through Honeycomb, but I'm lacking the app responses [unintelligible 00:48:10.20] observability, which is really what we want. We don't want Fastly to be waiting on the app all of a sudden, and the app to be just bogged down on other requests. And so that's where I came back to you and said, "This is what I would like to see… Can we get Phoenix talking to Honeycomb in some sort of native fashion?" And then I found this OpenTelemetry thing, and I stopped right there. So I will let you respond after that long monologue.

No, no, I mean, that's exactly it. I mean, we knew we wanted to do that. It's like another experiment which I wanted to continue with… And I'm so keen to get back to it, to see how that integration could work. That was on my list for as long as I can remember, and I'm so excited to be finally doing it. We're finally in a good place to do that integration, and I'm fairly confident that we'll be able to talk about it at the next Kaizen.

Ha-ha! He said it.

[laughs] On the next Kaizen…

There you go. In the next Kaizen.

Okay, so we have it on record; there will be another Kaizen.

Oh, yes.

Not just a hope and a dream.

We just need to figure out where.

So if I understand this correctly, Jerod - you've done this work, but you haven't done it in production. So you need a way to test it in production, essentially, to see how it responds.

I spiked it out on a branch, and then it was just like "Okay, this is certainly feasible." And then I did some rudimentary benchmarking of that branch, just to make sure it's not crazy dumb… And then I'm like "Okay, this is feasible, and I know how to bring this into official code." I can definitely transition what I coded, or even just rewrite it in a way that's maintainable if we decide to do it. But I'd really like to know if it's gonna be really dumb, or just kind of dumb. I feel like it's just dumb enough that it just might work… And be so simple, and solve a problem in a way that's just awesomely dumb. But I don't want it to be so dumb that it's not gonna work… [laughs]

That's the real spirit of Ship It. We literally have to get it out to see if it works. Like, what happens.

And then I was like "Well, what I lack is metrics." So I can observe it for a few hours, get some confidence, leave it in - or be like "Holy cow. It worked great in dev, but it's not going to work with a real load."

I have a question for Adam… So Adam, I think this may be the moment to tell us again about the benefits of feature flags.

I almost mentioned it there. I was like "I don't want to have egg on my face by mentioning feature flags…" Because I know Jerod has sort of been resistant to some degree against it… But there may be a simpler way to do this. I think that's essentially what you want to do, though - you want to test this in production, on a limited set of users. So it could be scoped to admins only, for example.

No, because I want to load-test it. I want the full load; that's my issue.

But it could be like maybe 50% of the requests, and you can compare them. So 50% of the requests, 50/50…

A threshold.

…going to the old one, 50 to the new implementation, and see how they compare over the course of maybe a few days…

Yeah, we can do that.

So Adam, how do we get feature flags? What do you think?

Hm…

Where do you stand on that?

Well, if we're doing 50/50, can't we just do like an if statement, with like random divided by two? [laughter]

Sure. "If it's an even second, do this. And if it's an uneven second, do the other thing." [laughs]

If it's an imperial unit, or if it's the metric system… Is this the metric system, or which system are we going to use here?

Luckily, seconds only exist in one… [laughs]
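
If the plain if statement does win out, a deterministic split tends to beat random-divided-by-two or even/odd seconds, because a given requester then always lands on the same implementation, which keeps the two sides comparable in whatever metrics get collected. A hedged sketch - nothing like this exists in the Changelog codebase, and the function name is made up:

```python
import hashlib

def use_new_feeds_path(request_key: str, rollout_pct: int = 50) -> bool:
    """Map a stable request attribute (client IP, feed slug, etc.) into 0-99
    and compare against the rollout percentage. Unlike checking the clock,
    the same key always gets the same answer, and the rollout percentage can
    be ratcheted from 0 to 100 without anyone flapping between code paths."""
    digest = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < rollout_pct
```

This is the core trick most feature-flag services (LaunchDarkly, DevCycle, etc.) implement, minus the dashboard and the remote config.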

I know Adam's been keen on feature flags, and I feel like this is his big moment to introduce some sort of subsystem.

I think so too.

I mean, I don't feel like I have a system to pitch here… [laughter]

No, I remember the conversation, Jerod. That's why I keep going back to it. Because we didn't have a good answer for Adam, and we were both against it. So maybe now it's coming back, and maybe now it's a yes, because it was a definite no back then.

We were premature. When I tried to pitch -

Feature flags?

The insider story here, listeners, is there was - my initial pitch for us using feature flags fell on deaf ears, essentially, because we were premature. We just didn't have the need for it. We were trying to find a use for it, and if you follow Kaizen, and Ship It, and what we've done, then you know our application is pretty simple. We don't have a lot of developers developing on it, so there's not a real need for an immense feature flags feature and/or service to use. LaunchDarkly were our friends for a while there… I'd still say they're friendly, but they're not friends. We're not working with them directly anymore.

We do have a new sponsor coming on board, DevCycle, which is in the feature flag business, which - you know, if you wanted to use it for this one instance, I'm sure we could do something. So I mean, there is an opportunity there, but… That would be my pitch. I feel like if it's just this one-off though, then the if statement probably works.

Well, I'll let you know when I get that far. What we need first, I think, is the observability. Because either way, if we do it 50/50, we want to see both results.

Of course.

And so right now I can't see any results, besides sitting there and staring at the log files, and looking at the request responses… Which was a side effect, actually, of one of our recent changes - our log files just stopped logging. I got it fixed, but that was funny. So I'm like "Wait a second, there aren't any logs."

How can the Changelog not log?

Right?

That's just like against the laws of nature, essentially.

Well, I'm not gonna git blame that one on the air, because I don't want to embarrass Gerhard, but… I fixed it.

That's okay, I can't get embarrassed. [laughter] I can't, because I'm going to learn something new out of this.

There you go.

So tell me the commit where this was introduced, so that I can understand my mistake. Seriously.

[54:00] So the code that fixes it is in commit f19c9cf, where I basically changed the application file to turn the logger back on. So I think you were overly aggressive when you were - you were removing a few things… We removed PromEx, because we're not really using Grafana anymore… And you just deleted too much code. And the code that you deleted would, if we're not in IEx, turn on the default logger. But you deleted it, so there wasn't a default logger, and so it wouldn't log anything in prod at all…

…and you didn't notice.

Yeah, that's right.

And I didn't notice, and so I just thought, "Well, I'll just go see what's going on in production", and there were no logs there. So I actually just put that code back in, that you had deleted, is all.

Right. So hang on, let me try and understand this code… That's what's happening right now. I'm trying to understand some Elixir code live, as we are recording this… I'm looking at application.ex, line 32: 'unless Code.ensure_loaded?(IEx) && IEx.started?() do'. Which of those two lines disables logging? The 33 or the 35 one? The Oban.Telemetry.attach_default_logger one?

No, that's not the line. Look at endpoint.ex, line 60. Plug.Telemetry. That's the line where you basically removed the telemetry plug.

Okay, okay, okay. I see. So the telemetry plug logs.

I see. Okay.

The logger uses the telemetry plug to do its thing.
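
For reference, the line in question is the telemetry plug that stock Phoenix endpoints ship with - Phoenix's request logger subscribes to the events this plug emits, so deleting it silently disables request logging. The standard generated line looks like this (the file path is illustrative):

```elixir
# lib/changelog_web/endpoint.ex (as generated by Phoenix; illustrative)
plug Plug.Telemetry, event_prefix: [:phoenix, :endpoint]
```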

Right, right. If it had been called Plug.Log, I don't think I would have made that mistake.

Right. Yeah.

But yeah, cool. Okay. That's good to know.

So yeah, it was an easy mistake to make. And I know how it is when you're removing stuff. You're like "Oh, this we don't need. This we don't need." And I think it was just that one line…

That's it.

…just turned that off, and we didn't notice, because we weren't really looking at production. Now, had we been sending it over to Honeycomb and observing it, we probably would have seen the drop-off immediately, because telemetry would have been turned off there.

Yeah, that's right.

So I think the Honeycomb integration will use this OpenTelemetry plug as well, when we do it. So that was the line that did it; it wasn't the other one. There were a few other things that you also removed - I put them back in, but that was like Oban stuff. Not a big deal. It was just over-aggressive deletion, which is totally normal when we're like "Let's -"

Probably. I deleted too much.

Yeah. When you're in like "Let's delete stuff" mode… I know how it is, because it feels so good.

Okay, okay. Okay, okay.

So there you go.

Cool. That's good to know. So who reviewed my PR?

[laughs] Uh-oh…

Do you see where this is going? [laughter] Cool, great.

Well, it wasn't me… Clearly…

I know…

I merged it, but I didn't review it.

I think I waited for a while and said, "You know what - I'm just gonna push this through", because that's how we roll.

There you go.

No, that's fine. That's fine.

No - even if I reviewed it, I must have not reviewed it very well, so… You know…

That's okay. Yeah, it was an honest mistake.

Totally.

On both our parts.

On both our parts.

I want to chase that rabbit down… I've got a question for you. So once we put this experiment into production, Jerod, what's going to happen? Can you come back to the beginning - if we get this potentially smart Postgres feature out there… Let's say it's successful. What happens? What happens as a result of that being successful?

So what happens is every single request that goes to one of our feeds will be served live from Postgres, from what I call like a feeds cache inside our Postgres instance. So it's effectively - it's as if it was reading off disk, but we don't have a disk, because we're in Fly land… But it's just on disk inside of Postgres. And so it goes out of Postgres, goes out live, so every request is immediate… And then every time that we change something that's going to change the feeds, we blow that one away, and we rewrite it, and so we recompute the feed. It's basically a cache inside of Postgres, because that's already our single source of data. Whereas if we did it anywhere else, we'd have to have a shared data source etc.

[58:03] I think what's more important is that this enables us to run more than one instance of Changelog.

Exactly.

Right now, because of how caching is done, we can only have one instance of Changelog. And we have been on this journey for quite some time now. Right? If you remember, we had a persistent disk. So we did have a local disk. But when we had that, it meant that we could only have a single instance, because all our media assets were stored on that one disk. So we pushed the media assets to S3, and now we could have more than one. But then the next thing was like "Oh, dang it. The caching." So once we solve the caching, we can run more than one instance, we can spread them across the world, we can serve dynamic requests from where users are, rather than everything going through the CDN - and the CDN really only caches the static stuff. And even then, it has to time out. That's why we also have that delay, because the CDN also caches for about 60 seconds.

Right. Yeah, the other thing this lets us do is serve different feeds to different requesters. And so here's why this might be interesting… So Spotify specifically supports, allegedly - I haven't seen it working very much… They support chapters, if you put them as text in your show notes, using the YouTube-style timestamps thing. So I just put it in for everybody at this point. But it's silly to put it into the show notes for listeners who have regular podcast apps that support chapters the way that you should, not the Spotify way.

Well, we could just serve Spotify using this system. We could have two different versions of the feed, both put into Postgres, use the request header to identify Spotify, because it has a standard request header, and serve a slightly different feed to Spotify than we serve to everybody else, and give them those timestamps. So you get the chapters over there, but you don't clutter up your feeds for everybody else. And you can't do that very well with caching, because it's like "Well, we've got a cached version", right? And the requests never hit our server; they just hit Fastly. And maybe you can put that logic inside of Fastly, but now you have to point it to different places, and manage that whole deal…

And so this also enables that, where you can basically have N caches per request, and serve the right one dynamically, but still have it precomputed. So it's kind of the best of both worlds. By the way, to our listeners - I realize this is kind of a dumb way of doing it. If it's super-dumb, and you have reasons why, please tell me, because I'm about to roll it out… [laughs]
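
Per-requester variants like the Spotify case reduce to picking which precomputed row to serve based on a request header. A sketch - the User-Agent substring match and the variant names are assumptions for illustration, not anything shipped:

```python
def feed_variant(user_agent: str) -> str:
    """Pick which precomputed feed row to serve for this request. The
    'spotify' substring match is an assumption about their crawler's
    User-Agent; the variant names are made up."""
    if "spotify" in user_agent.lower():
        # Variant with YouTube-style chapter timestamps inlined in show notes
        return "master-with-inline-chapters"
    return "master"
```

Because both variants are precomputed at publish time, the per-request work stays a header check plus one SELECT - the dynamic routing the CDN made awkward.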

"I'm about to roll it out…!"

I don't think it is.

Why is it dumb? Why do you keep saying this? Why do you think it's dumb? What's the logic behind it being dumb?

Storing precomputed text inside of Postgres - it's somewhat large. I read some - like, how big is too big - and it's like 2.3 megabytes in a Postgres record. It seems like it's fine, actually, but once you start getting up to like 100 megabytes, now you're in trouble. We're not going to make it there with any of our documents. But maybe even at 2.3 megabytes, at scale it's just going to read too slow. I don't know, it seems like a very low-tech, kind of silly way of doing it… And so maybe it's just lack of confidence, is why I think it sounds dumb.

I think this is a step in the right direction, because Fly brings the app closer to the users.

Right.

And Fly really makes it less necessary to run a CDN, or maybe completely unnecessary, depending on the case. If we want to depend less on the CDN, which I think is a good idea, and if we distributed our apps around the world, that means that we can rely less on the CDN - which, by the way, had all sorts of issues which we are yet to solve - and serve directly from our app… So basically, we are reverting the decision of putting changelog.com behind the CDN. And we had to do that at the time, because we had a single instance, we had all sorts of issues related to that… But now, if we have multiple instances, one per continent - again, depending on where our users are - we no longer need to depend on the CDN as much as we did before.

[01:02:12.28] And by the way, Fly itself has a proxy - a global proxy - which means that depending on where you are, those edge instances will connect to the app instance which is closest to the edge. So then we are pulling more of that stuff into our app, which makes us able to code more things, as Jerod mentioned - pull more of that smarts into code, rather than into CDN configuration or other things… Which are very difficult to understand, very difficult to troubleshoot… I mean, we've had so many hair-pulling moments. That's why we have so little hair [unintelligible 01:02:46.00] sections, going like "Why the hell? How does this varnish even work? Because it doesn't make any sense."

Right. And we built our own little version control inside of Fastly, between Gerhard and I, by adding a comment noting whose name it is and when it was last edited - when we would love to just have our actual programming tooling.

It seems smart…

If it takes us to where we wanna go, I agree with you 100% that having our app be its own CDN, so to speak, closer to all the users - which is what Fastly is giving us - at the app level, then it can be dynamic in ways that are possible with Fastly, but are just cumbersome to this day.

Yeah. And I guess one more layer here is we haven't truly embodied the vision of Fly, which is our app close to our users, because of this cache issue. This is full circle; the whole reason for this cache experiment was to be able to bring to fruition that actual dream with no ops, or very, very little ops… But we haven't been able to do that because of this cache layer.

Well, our app does run close to our users in the greater Houston area… [laughter]

Yeah… It's actually in Virginia.

Oh, is it?

Yeah, yeah.

Well. It shows what I know.

Itā€™s the IAD data center. Yeah.

Yeah. Well, all that to say, getting in this direction is challenging. I think the logic of putting this in Postgres sounds fine. I mean, like you had said, it would only matter above a larger threshold… A couple megs is not that big of a deal. And if the app is close to the user, there's probably one or two primary Postgres instances taking writes, and then the rest are reads, right? That's how it would be set up, naturally, with Postgres on Fly…

Yeah, the writes would actually happen on publish. The writes happen on edit, not on first request, which is what happens now with typical caching: on the first request we calculate it once, then we don't calculate it again for 60 seconds, then we calculate it once more. This way the compute actually happens on write, which is what we wanted to move to.

The other option is to put this on a static file server like S3, and then manage and blow away different files. But then I started thinking: we actually like our URLs how they are, and so then our app would be reading from S3 and responding as a proxy… And it's like "Well, it's already a proxy to Postgres." I don't know. But yeah, we would cache on write versus on read, which gives us immediate changes. There's no 60-second delay, or five minutes, or whatever you set it to.
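The cache-on-write idea being described can be sketched in a few lines, assuming a feed is rendered once at publish time and stored, so reads never recompute; all names here are hypothetical stand-ins, and a real version would store in Postgres rather than a dict:

```python
# Sketch of cache-on-write: render the feed when an episode is
# published or edited, not when a listener first requests it.
# `render_feed`, `publish`, `serve`, and the dict store are all
# hypothetical stand-ins, not Changelog's actual code.

rendered_feeds: dict[str, str] = {}  # stands in for a Postgres table

def render_feed(episodes: list[str]) -> str:
    # Stand-in for real RSS/XML generation.
    return "\n".join(f"<item>{title}</item>" for title in episodes)

def publish(feed: str, episodes: list[str]) -> None:
    # The compute happens here, on write -- so changes go live
    # immediately, with no 60-second (or five-minute) TTL to wait out.
    rendered_feeds[feed] = render_feed(episodes)

def serve(feed: str) -> str:
    # Reads are a plain lookup; nothing is recomputed, nothing expires.
    return rendered_feeds[feed]

publish("master", ["Kaizen 9", "Small bets"])
print(serve("master"))  # both items, available the moment we publish
```

The trade-off versus read-side TTL caching is exactly the one discussed: slightly more work on every edit, in exchange for zero staleness on reads.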

And I'm in that camp. I mean, I listen to our show immediately, as soon as we ship The Changelog at least… I mean, just as a crazy person - whenever you ship something, you want to make sure it's in production, and the only way to do that is to test it. And the app I use is primarily Overcast. I don't think I have notifications on, because I just hate notifications generally. If I don't have to have notifications on for an application, they're off, for sure. But when I do go there, I usually test it on the master feed directly, because… I listen to Master, as you should. Hey, listener, if you're not listening on Master, you're wrong. Or Plus Plus; then you'd be even better…

Right.

[01:06:06.04] …because it's better… But I'm a Master feed subscriber in that regard, and I pull to refresh, and it does take a bit for the new episodes to get there, for me at least. So it's not like I ship it and 30 seconds or a minute later it's in Overcast. It takes longer than I've counted, let's just say. I haven't actually sat there and counted. It's like "Oh, it's not there. I'll come back later", and I come back and it's there.

The one thing about this which gets me really excited is that we will double down on PostgreSQL. So we talked about this for a while… Crunchy Data is what I'm thinking. But it's not the only way.

In what regard are you thinking Crunchy Data?

I'm thinking PostgreSQL as a service, one that scales really, really well, so the app is all Fly, and PostgreSQL is managed via Crunchy Data. We have a global presence, nicely replicated, all that nice stuff. And then we consume PostgreSQL as a service at a global scale. Our app runs at a global scale, on Fly, and the database does the same, but with someone else. Because the PostgreSQL in Fly is not a managed one. It's easy, convenient, we have a lot of advantages, and it's been holding up really well since we set it up. No issues. But we can - I mean, if the app is distributed, and if the app gets this level of attention, I think so should our database, because now these are the two important pieces. We scale the app, we should scale the database. If, for example, we have all these app instances connecting to the same PostgreSQL instance back in the US, that's not going to be any good. Right? Reading all those megabytes across continents… That's going to be slow.

Isn't that the point, though, of distributed read servers?

So we could add multiple PostgreSQL read replicas in Fly; we could do that. Maybe tune them… Maybe. I don't know. Maybe try and understand better what they do… But maybe, rather than doing that, we can grow our approach to databases, and go with someone that does this as a service. I know PlanetScale comes up as well… There's a couple we could use for PostgreSQL as a service.
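The read-replica pattern under discussion boils down to a small routing rule: reads go to the nearest replica, writes go back to the primary. A minimal sketch, with illustrative region names ("iad" matching the primary mentioned above) and a hypothetical routing function, not Fly's or Changelog's actual logic:

```python
# Sketch of read/write routing with one primary and regional read
# replicas. Region names and this function are illustrative only.

PRIMARY = "iad"
REPLICAS = {"iad", "fra", "syd"}  # hypothetical replica regions

def route(current_region: str, is_write: bool) -> str:
    """Pick which Postgres node a query should hit."""
    if is_write:
        # Writes always go to the primary, wherever the app instance runs.
        return PRIMARY
    # Reads prefer the local replica, avoiding cross-continent round trips.
    return current_region if current_region in REPLICAS else PRIMARY

print(route("fra", is_write=False))  # a Frankfurt read stays in Frankfurt
print(route("fra", is_write=True))   # a Frankfurt write goes to the primary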

But that's MySQL, PlanetScale.

There's one which I know is PostgreSQL. Maybe it's not PlanetScale… What was it…?

Supabase?

I think it's Supabase. I think it's Supabase. I think that's what I'm thinking. Yeah. See? Not enough time to experiment. [laughs]

There is a conversation, let's just say there's a conversation. So we may be meeting in the middle, let's just say. Don't wanna give too much away.

Exactly.

But dreams… We are dreaming together.

Exactly. And we need to experiment a lot. So that's the whole point, right? We need to try a couple of things out, see what makes sense… I know Jerod loves his PostgreSQL, the vanilla one, the open source one…

I do…

You know, as unaltered as they come.

So good…

We're actually coming out with a T-shirt, Gerhard. It says "Postgres-compatible is not Postgres." [laughter]

Really?! Okay, I wasn't aware of that… Okay.

No, not really.

Okay…

We want to.

Is that the Jerod tagline?

No, that's actually a Craig Kerstiens tagline.

I do like "Just Postgres" as a T-shirt.

"Just Postgres." Yeah.

Just Postgres.

We will be doubling down on that. That's what matters. And we'll be improving that part as well. All this is leading us in that direction, and that's really exciting.

That's why I wrote this right here… I was writing it right there.

There you go. On a napkin? It's a thing!

Okay! Now we have a plan.

That's how all dreams start, on a napkin.

Mm-hm. I've been doodling while we're having this call.

Put some B's and some dollars as well, while you're at it.

Yeah, put some dollars on there.

Step one, Postgres. Step two, question mark. Step three, profit.

[01:09:52.19] Or Postgres, change the s into $. That'd be good.

That's right, I'll do that.

That's our business plan. We're gonna turn Postgres into dollars.

Well, let's say somebody's listened this far, and they're thinking, "Man, this really sucks, okay?"

What sucks?

"I'm here at the end of this amazing episode–" Well, I'm gonna tell you what sucks. I'm gonna tell you. They're gonna be like "I liked this show. Come on, guys… What's going on here?" Can we dream a little bit about where this might go, the next version of Kaizen? Can we give them some prescription, versus just wait and see? Jerod, you mentioned subscribing to the Changelog, which I think is a great next step after this…

Well, I think it makes sense to do our next Kaizen on the Changelog if we don't have anywhere else to do it…

That's right. Yeah.

Which is probably likely, right? I mean, we could cross-post it to the Ship It feed, I guess…

Or episode 91 will be Kaizen, in two and a half months. [laughter]

Yeah. And so will 92.

That's also possible. And so will 92, yeah. Or we go straight to 100, and then people are like "What the hell? Where's all the rest?"

Right.

So it'll be 90, 100… We'll just be going 10 by 10. We were just talking about Fahrenheit and Celsius… [laughter]

That's more of a Celsius thing… 100 is hot. I would say we would publish our next Kaizen on the Changelog feed. Ain't that safe? That's probably the safest bet today.

I think so. It's what makes the most sense to me, too.

And stay tuned for more. We'll have more to say on that episode.

Well, I have one thing which I really have to say, and I have to mention this, because I've been trying to get to someone from 1Password since January 15th, when I sent my email, and I haven't heard back… So if someone knows someone within 1Password that can help with their service accounts… This is so that we can use secrets from 1Password without needing to run the Connect server. I mean, we will set up a Connect server if we need to, but hopefully we'll be able to access the secrets using this new beta feature - which, as far as I'm aware, is called Service Accounts - that allows us to use secrets programmatically in CI systems. Right now, we can't do that without the Connect server. And ideally, I would like to use the Go SDK - and you see where I'm going with this… To use it directly in code, so that our CI will never see the secrets. It's just code that connects to the 1Password instance and pulls the secrets just in time as it runs. So if anyone knows someone, I would very much like to talk to them, to try this beta feature and see how it works. Alternatively, how do you feel about a migration from 1Password? [laughs]
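The just-in-time idea described here - code pulling a secret at run time, so CI never stores it - can be sketched like this; `fetch_secret`, `deploy`, and the secret reference are hypothetical stand-ins for whatever the 1Password SDK or CLI would provide, not their actual API:

```python
import os

# Sketch of just-in-time secret access: the secret is fetched when the
# code runs, so it is never written into CI configuration. `fetch_secret`
# is a hypothetical stand-in for a 1Password SDK or CLI call.

def fetch_secret(reference: str) -> str:
    # A real implementation would call the secrets service here; this
    # stub reads the process environment purely to keep the sketch runnable.
    return os.environ.get(reference, "")

def deploy() -> str:
    token = fetch_secret("DEPLOY_TOKEN")  # pulled just in time, at run time
    if not token:
        raise RuntimeError("secret unavailable")
    return f"deploying with a token of length {len(token)}"

os.environ["DEPLOY_TOKEN"] = "s3cr3t"  # simulated secret, for the sketch only
print(deploy())
```

The point of the design is that the CI system only ever sees the reference, never the secret value itself.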

Ohā€¦

Negative.

Rotating secrets is my favorite thing to do… Yes, I mean - we want something that works, and works well, so…

We can set up a Connect server. I mean, it's so easy to set anything up on Fly these days, so maybe we'll just do that… Which will act as a gateway to 1Password.

[01:13:04.23] Well, we can make something happen with 1Password, there is some opportunity there. So…

Great. That's the one thing which was on my list.

Let me go to work, you know?

Excellent.

I'm a big fan of 1Password.

I like it too, very much.

And I root for them, in all ways. I've been using them for more than a decade. I mean, basically forever. They're embedded in my operations. And now with SSH integrations and stuff like that - I just love it, biometrically… And thank you for removing all of our SSH needs, Changelog.com infrastructure-wise, but I still have LAN infrastructure that I have to log into, and biometrically logging in via SSH is just - it's the way to go.

Yeah, for sure. Yeah. And I was reading this blog post on the 1Password blog about passwordless systems. I'm just going to double-check the title… So the blog post is "Passkeys in 1Password - the future of passwordless." And it was published on November 17th, 2022. So not that long ago. And it was mentioned a couple more times.

So I think that's a really cool idea… I really like where 1Password is, and where they're going… If we can only figure this thing out, it will be even more amazing for us. So no more secrets in GitHub. Yes, baby! That's what I want.

Alright. Well…

Should we call it a pod?

I think we should call it a pod. Someone needs to sing something, I feel like… It's my birthday tomorrow, so…

Jerod sings…!

Happy Trails to you…

See? Told ya.

That's all you're getting… Until we meet again.

He tried to sing Semisonic on the –

Closing Time?

…on the & friends episode we did. Yeah, you started singing Closing Time. I edited you right out of that, man. I didn't want you embarrassed… You did not do a good job. [laughs]

All I said was "You don't go home, but you can't stay here."

Well, that's what happened in the one that shipped.

Ah…!

Behind the scenes, it was worse. I'm just messing with you, Jerod. I'm just being silly.

I don't even believe you.

With all this time that I'm going to have from not shipping a Ship It episode every week - do you know what I'm going to do instead? I'm going to go Dan-Tan! [laughter] That's what's happening…

Oh, my gosh. Dan-Tan… Comes again!!

Every week, I'll go Dan-Tan. [laughs]

Dan-Tan…!

So that's what's up.

Oh, my gosh…

I love it.

I've got my kids saying Dan-Tan now.

There we go.

Never telling that story again.

Everyone is on it.

Everyone's saying it.

So that's my plan.

Alright…

Sounds good, Gerhard.

Alright.

Thank you.

It has been good. Thank you.

Always a pleasure. There will be a next one, two and a half months away. Right? Roughly. So I don't know exactly when, but two and a half months away. It will be warm and nice where you are, I'm sure.

I'm looking forward to that… Kaizen!

Same. Kaizen!


Our transcripts are open source on GitHub. Improvements are welcome. 💚
