Ops Icon

Ops

DevOps, infrastructure, etc.
171 Stories
All Topics

Ship It! Ship It! #88

Treat ideas like cattle, not pets

In our ops & infra world, we learn to optimise for redundancy, for mean time to recovery and for graceful degradation. We instinctively recognise single points of failure, and try to mitigate the risks associated with them.

For some years now, Daniel Vassallo has been doing the same, but in the context of life & work. Daniel talks about the role of randomness, about learning from small wins & about optimising for a lifestyle that matches your true preferences,. Apparently, ideas too should be treated like cattle, not pets.

Ops blog.railway.app

Why we ship the most code on Friday

Greg Schier from Railway:

The theory goes that if you ship code on Friday you’ll inevitably be working Saturday to fix what you broke. This makes logical sense; software is messy and mistakes are unavoidable.

But no, shipping code on Friday isn’t bad. At least, it doesn’t have to be.

The key to their success is confidence, which comes from having a solid deployment pipeline and shipping through it often. Gerhard Lazu hit that nail on the head in this short clip as well.

Ship It! Ship It! #87

Why we switched to serverless containers

Last September, at the 🇨🇭 Swiss Cloud Native Day, Florian Forster, co-founder & CEO of ZITADEL, talked about why they switched to serverless containers. ZITADEL has a really interesting workload that is both CPU intensive and latency sensitive. On top of this, their users are global, and traffic is bursty. Florian talks about how they evaluated AWS, GCP & Azure before they settled on the platform that met their requirements.

Ship It! Ship It! #86

Human scale deployments

Lars is big on Elixir. Think apps that scale really well, tend to be monolithic, and have one of the most mature deployment models: self-contained releases & built-in hot code reloading. In episode 7, Gerhard talked to Lars about “Why Kubernetes”. There is a follow-up YouTube stream that showed how to automate deploys for an Elixir app using K3s & ArgoCD.

More than a year later, how does Lars think about running applications in production? What does simple & straightforward mean to him? Gerhard’s favourite: what is “human scale deployments”?

Ship It! Ship It! #85

The hard parts of platform engineering

Marcos Nils has been into platform engineering for the best part of the last decade. He helped architect & build developer platforms using VMs & OpenStack, containers with Docker, and even Kubernetes. He did this at startups with 10 people, as well as large, publicly traded companies with 1000+ software engineers.

Today we talk with Marcos about the hard parts of platform engineering.

Ship It! Ship It! #84

Bare metal meets Talos Linux (the K8s OS)

Welcome to 2023! A new year is the perfect time to start with a fresh perspective. Given a few bare metal hosts with fast, local storage, how would you run your workloads on them? Would you cluster them for redundancy? What operating system would you choose?

Steve Francis, CEO at Sidero Labs and Andrew Rynhard, CTO at Sidero Labs join us today to talk about running Talos Linux on bare metal.

Ship It! Ship It! #83

🎄 Planning for failure to ship faster 🎁

Eight months ago, in 🎧 episode 49, Alex Sims (Solutions Architect & Senior Software Engineer at James & James) shared with us his ambition to help migrate a monolithic PHP app running on AWS EC2 to a more modern architecture. The idea was some serverless, some EKS, and many incremental improvements.

So how did all of this work out in practice? How did the improved system cope with the Black Friday peak, as well as all the following Christmas orders? Thank you Alex for sharing with us your Ship It! inspired Kaizen story. It’s a wonderful Christmas present! 🎄🎁

Ship It! Ship It! #82

Red Hat's approach to SRE

Narayanan Raghavan leads the global SRE organization that runs Red Hat managed cloud services including OpenShift Dedicated, Azure Red Hat Openshift, Red Hat OpenShift Service on AWS, and Red Hat OpenShift Data Science among others across the three major cloud providers: AWS, GCP & Azure. We start with a high-level discussion about DevOps, SRE & platform engineering, and then we dig into SRE specifics, including what it takes to safely roll out updates across many tens of thousands of OpenShift clusters.

Ship It! Ship It! #81

Let's deploy straight to production!

In today’s episode, we have the pleasure of two guests: Whitney Lee, Staff Technical Advocate at VMware, the one behind the ⚡️ Enlightning episodes, and Mauricio Salatino, which you already know from 🎧 shipit.show/41 on Continuous Delivery for Kubernetes.

The two of them gave the most amazing KubeCon NA Keynote last month: What a RUSH! Let’s Deploy Straight to Production!

So how do we create an Internal Development Platform that enables anyone on the team to deploy straight to production with the confidence that everything will just work?

Ship It! Ship It! #80

Kaizen! 24 improvements & a lot more

For our last 2022 Kaizen episode, we went all out:

  • 💪 @jerod outdid himself in the number of improvements shipped between Kaizens
  • 🕺 A few of our listeners contributed → prompted us to create a new contributing guide
  • 🗺 We now have a new infrastructure diagram

All of this, and a whole lot more, is captured as GitHub discussion 🐙 changelog.com#433. If you want to see everything that we improved, that is a great companion to this episode.

Ops matthewtejo.substack.com

Why Twitter didn’t go down (from a real Twitter SRE)

Matthew Tejo:

Twitter supposedly lost around 80% of its work force. What ever the real number is, there are whole teams with out engineers on it now. Yet, the website goes on and the tweets keep coming. This left a lot wondering what exactly was going on with all those engineers and made it seem like it was all just bloat. I’d like to explain my little corner of Twitter (though it wasn’t so little) and some of the work that went on that kept this thing running.

This is a detailed post about Twitter’s caching system that Matthew and others built while working there, and it’s brilliantly summed up by commenter Johnny Manu40:

When everything works fine, they wonder why they hired you. When everything stops working, they wonder why they hired you. I.T. in a nutshell.

Ship It! Ship It! #79

Developer Experience Infrastructure (DXI)

In your company, who designs the end-to-end developer experience? From design to implementation, what is the developer experience that you actually ship? Even though the average developer wastes almost half of their working hours because of bad DX, many of us don’t even know what that means, or how to improve it.

Kenneth Auchenberg is working at Stripe, building economic infrastructure for the internet. Gerhard found his perspective on Developer Experience Infrastructure (DXI) refreshingly simple, as well as very useful.

Ship It! Ship It! #78

The system that runs Norway's welfare payments 🇳🇴

In today’s episode we have the pleasure of Audun Fauchald Strand, Principal Software Engineer at NAV.no, Norway’s Labour & Welfare Administration. We will be talking about NAIS.io, the application platform that runs on-prem, as well as on the public cloud.

Imagine hundreds of developers shipping on an average day 300 changes into a system which processes $100,000,000 worth of transactions on a quiet week. If you think this is hard, consider the context: a government institution which must comply with all laws & regulations.

Ship It! Ship It! #77

Seven shipping principles

15 years ago, Gerhard discovered magic in the form of Ruby on Rails. It was intuitive and it just worked. That is the context in which Gerhard fell in love with infrastructure and operations.

Today, for special episode 77, we start at Seven Shipping Principles, and, in the true spirit of Ship It, we’ll see what happens next.

Our guest is David Heinemeier Hansson, creator of Ruby on Rails, co-founder of Basecamp & HEY, and a lot more - check out dhh.dk.

Go github.com

A configuration management system for Pets, not Cattle

This is for people who need to administer a handful of machines, all fairly different from each other and all Very Important. Those systems are not Cattle! They’re actually a bit more than Pets. They’re almost Family. For example: a laptop, workstation, and that personal tiny server in Sweden. They are all named after something dear.

Liz Rice and I spoke about pets & cattle recently on The Changelog. I asked her, “from your perspective, are the people still doing it the old-school pets way?”

This tool is a great example of Pets-style administration being alive and well.

Ship It! Ship It! #76

Container base images with glibc & musl

In today’s episode, we talk about distroless, ko, apko, melange, musl and glibc. The context is Wolfi OS, a community Linux OS designed for the container and cloud-native era. If you are looking for the lightest possible container base image with 0 CVEs and both glibc and musl support, Wolfi OS & the related chainguard-images are worth checking out.

Ariadne Conill is an Alpine Linux TSC member & Software Engineer at Chainguard.

Ship It! Ship It! #75

How vex.dev runs on AWS, Fly.io & GCP

Few genuinely need a multi-cloud setup. There is plenty of advice out there which mostly boils down to don’t do it, you will be worse off. Vex.dev is a startup that provides APIs for video and audio streaming. The hard part is real-time combined with massive scale - think hundreds of thousands of concurrent connections. They achieve this by using a combination of Fly.io, AWS and GCP. Jason Carter, founder of Vex Communications, is joining us today to talk about the multi-cloud setup that vex.dev runs.

Brandon Willett willett.io

How to build software like an SRE

This is awesome — Brandon, please write down more lessons learned, even if they’re just something for you to reference later on.

I’ve been doing this “reliability” stuff for a little while now (~5 years), at companies ranging from about 20 developers to over 2,000. I’ve always cared primarily about the software elements I describe as living “outside” the application – like, how does it get its configuration? What kinds of instances does it run on, and are those the best kinds to use? What steps does it take on its path from “code in a repository” to “running in production”? And I’ve always kept track of what I liked – which mechanisms allowed fast iteration and which caused frustration, which led to outages and which prevented them.

… So! With that out of the way – this is how I’d rebuild it all from scratch if I could.

Ship It! Ship It! #74

Vorsprung durch Technik

I don’t think that you can imagine just how excited Gerhard was to find out that Audi, his favourite car company, has a Kubernetes competence centre. We have Sebastian Kister joining us today to tell us why people, followed by tech make the process.

The right thing to focus on is the genuine smiles that people give in response to something we do or say. That is an important SLI & SLO for reducing friction between silos.

How does this impact the flow of artefacts into production systems that design & build cars?

Ship It! Ship It! #73

A modern bank infrastructure

Matias Pan is a Staff Software Engineer at Lemon Cash, a crypto startup based in Argentina. Lemon infrastructure runs digital wallets & physical cards, which technically makes them a bank. How does Matias & his team think about enabling developers get code from their workstations into production? Remember, we are talking about a bank - a bad deploy is a big deal. And when a bad database migration goes out, what happens then?

Ship It! Ship It! #72

Klustered & Rawkode Academy

One of our listeners, Andrew Welker, suggested that we talk about Klustered, so a few hours before David Flanagan was about to do his workshop at Container Days, we recorded this episode. We talked about all the weird and wonderful Kubernetes debugging sessions on Klustered, a YouTube playlist with 43 videos and counting.

We then talked about Rawkode Academy, and we finished with conferences. Good thing we did, because David almost forgot about KubeHuddle, the conference that he is co-organising next week. Gerhard is looking forward to talking at it! No, seriously, check it out at kubehuddle.com.

Ship It! Ship It! #71

Modern Software Engineering

Dave Farley, co-author of Continuous Delivery, is back to talk about his latest book, Modern Software Engineering, a Top 3 Software Engineering best seller on Amazon UK this September. Shipping good software starts with you giving yourself permission to do a good job. It continues with a healthy curiosity, admitting that you don’t know, and running many experiments, safely, without blowing everything up. And then there is scope creep…

Ship It! Ship It! #70

Kaizen! Four PRs, one big feature

In today’s Kaizen episode, we talk about shipping Adam’s Christmas present: chapter support for all Changelog episodes that we now publish. This feature was hard because there are many subtle differences in how the ID3 spec is implemented. Of course, once the PR shipped, there were other issues to solve, including an upgrade the world kind of scenario. Since Lars Wikman did all the heavy ID3 lifting, he joins us in this episode.

  0:00 / 0:00