We’re excited to have Tuhin join us on the show once again to talk about self-hosting open access models. Tuhin’s company Baseten specializes in model deployment and monitoring at any scale, and it was a privilege to talk with him about the trends he is seeing in both tooling and usage of open access models. We were able to touch on the common use cases for integrating self-hosted models and how the boom in generative AI has influenced that ecosystem.
What is the model lifecycle like for experimenting with and then deploying generative AI models? Although there are some similarities, this lifecycle differs somewhat from previous data science practices in that models are typically not trained from scratch (or even fine-tuned). Chris and Daniel give a high level overview in this effort and discuss model optimization and serving.
Gerhard joins us for the 11th Kaizen and this one might contain the most improvements ever. We’re on Fly Apps V2, we’ve moved from S3 to R2 & we have a status page now, just to name a few.
On Monday, Kelsey Hightower announced his retirement from Google. On Tuesday, he sat down with us to discuss why, how & what’s next.
Along the way, Kelsey teaches us how not to suck at work, analyzes his magical demos, fights off the haters (again) & opines on System Initiative, Dagger & 37Signals moving off the cloud.
This week we’re joined by Adam Jacob and we’re talking about his mission at System Initiative to rebuild DevOps. They are out of stealth mode and ready to show off their transformative new power tool that reimagines what’s possible from DevOps. It’s an intelligent automation platform that allows DevOps teams to build detailed interactive simulations of their infrastructure and use them to rapidly update their production environments.
This is our 9th Kaizen with Adam & Jerod. We start today’s conversation with the most important thing: embracing change. For Gerhard, this means putting Ship It on hold after this episode. It also means making more time to experiment, maybe try a few of those small bets that we recently talked about with Daniel. Kaizen will continue, we are thinking on the Changelog. Stick around to hear the rest.
Tim McNamara is known as New Zealand’s Rust guy. He is the author of Rust in Action, and also a Senior Software Engineer at AWS, where he helps other builders with all things Rust.
The main reason why Gerhard is intrigued by Rust is the incredible resource frugality. Fewer CPUs means less energy used, which is good for the planet, and good for the monthly bill. This becomes most noticeable at Amazon’s scale, when S3, Lambda, CloudFront and other services start adding Rust components.
We’ve been hearing about “serverless” CPUs for some time, but it’s taken a while to get to serverless GPUs. In this episode, Erik from Banana explains why its taken so long, and he helps us understand how these new workflows are unlocking state-of-the-art AI for application developers. Forget about servers, but don’t forget to listen to this one!
Worlds are colliding! This week we join forces with the hosts of the MLOps.Community podcast to discuss all things machine learning operations. We talk about how the recent explosion of foundation models and generative models is influencing the world of MLOps, and we discuss related tooling, workflows, perceptions, etc.
In our ops & infra world, we learn to optimise for redundancy, for mean time to recovery and for graceful degradation. We instinctively recognise single points of failure, and try to mitigate the risks associated with them.
For some years now, Daniel Vassallo has been doing the same, but in the context of life & work. Daniel talks about the role of randomness, about learning from small wins & about optimising for a lifestyle that matches your true preferences,. Apparently, ideas too should be treated like cattle, not pets.
Last September, at the 🇨🇭 Swiss Cloud Native Day, Florian Forster, co-founder & CEO of ZITADEL, talked about why they switched to serverless containers. ZITADEL has a really interesting workload that is both CPU intensive and latency sensitive. On top of this, their users are global, and traffic is bursty. Florian talks about how they evaluated AWS, GCP & Azure before they settled on the platform that met their requirements.
Lars is big on Elixir. Think apps that scale really well, tend to be monolithic, and have one of the most mature deployment models: self-contained releases & built-in hot code reloading. In episode 7, Gerhard talked to Lars about “Why Kubernetes”. There is a follow-up YouTube stream that showed how to automate deploys for an Elixir app using K3s & ArgoCD.
More than a year later, how does Lars think about running applications in production? What does simple & straightforward mean to him? Gerhard’s favourite: what is “human scale deployments”?
Marcos Nils has been into platform engineering for the best part of the last decade. He helped architect & build developer platforms using VMs & OpenStack, containers with Docker, and even Kubernetes. He did this at startups with 10 people, as well as large, publicly traded companies with 1000+ software engineers.
Today we talk with Marcos about the hard parts of platform engineering.
Welcome to 2023! A new year is the perfect time to start with a fresh perspective. Given a few bare metal hosts with fast, local storage, how would you run your workloads on them? Would you cluster them for redundancy? What operating system would you choose?
Steve Francis, CEO at Sidero Labs and Andrew Rynhard, CTO at Sidero Labs join us today to talk about running Talos Linux on bare metal.
Eight months ago, in 🎧 episode 49, Alex Sims (Solutions Architect & Senior Software Engineer at James & James) shared with us his ambition to help migrate a monolithic PHP app running on AWS EC2 to a more modern architecture. The idea was some serverless, some EKS, and many incremental improvements.
So how did all of this work out in practice? How did the improved system cope with the Black Friday peak, as well as all the following Christmas orders? Thank you Alex for sharing with us your Ship It! inspired Kaizen story. It’s a wonderful Christmas present! 🎄🎁
Narayanan Raghavan leads the global SRE organization that runs Red Hat managed cloud services including OpenShift Dedicated, Azure Red Hat Openshift, Red Hat OpenShift Service on AWS, and Red Hat OpenShift Data Science among others across the three major cloud providers: AWS, GCP & Azure. We start with a high-level discussion about DevOps, SRE & platform engineering, and then we dig into SRE specifics, including what it takes to safely roll out updates across many tens of thousands of OpenShift clusters.
In today’s episode, we have the pleasure of two guests: Whitney Lee, Staff Technical Advocate at VMware, the one behind the ⚡️ Enlightning episodes, and Mauricio Salatino, which you already know from 🎧 shipit.show/41 on Continuous Delivery for Kubernetes.
The two of them gave the most amazing KubeCon NA Keynote last month: What a RUSH! Let’s Deploy Straight to Production!
So how do we create an Internal Development Platform that enables anyone on the team to deploy straight to production with the confidence that everything will just work?
For our last 2022 Kaizen episode, we went all out:
- 💪 @jerod outdid himself in the number of improvements shipped between Kaizens
- 🕺 A few of our listeners contributed → prompted us to create a new contributing guide
- 🗺 We now have a new infrastructure diagram
All of this, and a whole lot more, is captured as GitHub discussion 🐙 changelog.com#433. If you want to see everything that we improved, that is a great companion to this episode.
In your company, who designs the end-to-end developer experience? From design to implementation, what is the developer experience that you actually ship? Even though the average developer wastes almost half of their working hours because of bad DX, many of us don’t even know what that means, or how to improve it.
Kenneth Auchenberg is working at Stripe, building economic infrastructure for the internet. Gerhard found his perspective on Developer Experience Infrastructure (DXI) refreshingly simple, as well as very useful.
In today’s episode we have the pleasure of Audun Fauchald Strand, Principal Software Engineer at NAV.no, Norway’s Labour & Welfare Administration. We will be talking about NAIS.io, the application platform that runs on-prem, as well as on the public cloud.
Imagine hundreds of developers shipping on an average day 300 changes into a system which processes $100,000,000 worth of transactions on a quiet week. If you think this is hard, consider the context: a government institution which must comply with all laws & regulations.
Maggie Johnson-Pint from Stanza sits down with Amal & Divya for a deep-dive in to the production side of the development world. If you’re at all curious (and/or intimidated) by terms like Site Reliability Engineering (SRE), Service Level Objective (SLO), OpenTelemetry, distributed tracing, and the like… this episode’s for you!
15 years ago, Gerhard discovered magic in the form of Ruby on Rails. It was intuitive and it just worked. That is the context in which Gerhard fell in love with infrastructure and operations.
Today, for special episode 77, we start at Seven Shipping Principles, and, in the true spirit of Ship It, we’ll see what happens next.
Our guest is David Heinemeier Hansson, creator of Ruby on Rails, co-founder of Basecamp & HEY, and a lot more - check out dhh.dk.
In today’s episode, we talk about distroless,
melange, musl and glibc. The context is Wolfi OS, a community Linux OS designed for the container and cloud-native era. If you are looking for the lightest possible container base image with 0 CVEs and both glibc and musl support, Wolfi OS & the related chainguard-images are worth checking out.
Ariadne Conill is an Alpine Linux TSC member & Software Engineer at Chainguard.
Few genuinely need a multi-cloud setup. There is plenty of advice out there which mostly boils down to don’t do it, you will be worse off. Vex.dev is a startup that provides APIs for video and audio streaming. The hard part is real-time combined with massive scale - think hundreds of thousands of concurrent connections. They achieve this by using a combination of Fly.io, AWS and GCP. Jason Carter, founder of Vex Communications, is joining us today to talk about the multi-cloud setup that vex.dev runs.
I don’t think that you can imagine just how excited Gerhard was to find out that Audi, his favourite car company, has a Kubernetes competence centre. We have Sebastian Kister joining us today to tell us why people, followed by tech make the process.
The right thing to focus on is the genuine smiles that people give in response to something we do or say. That is an important SLI & SLO for reducing friction between silos.
How does this impact the flow of artefacts into production systems that design & build cars?