Ship It! – Episode #54
Knative, Sigstore & swag (KubeCon EU 2022)
with Matt Moore, founder & CTO of Chainguard
This is the post-KubeCon CloudNativeCon EU 2022 week. Gerhard is talking to Matt Moore, founder & CTO of Chainguard about all things Knative and Sigstore.
The most important topic is swag, because no one has better stickers than Chainguard.
The other topic is the equivalent of Let’s Encrypt for securing software.
Well, first of all, hi, Matt. Thank you for joining us today.
Yeah, it’s great to be here. Thanks for having me on.
I have noticed from your Twitter profile that you are in the business of shipping chains. How’s the business coming along these days?
Well, we’re pivoting away from swag. I’m wearing all my Chainguard stuff, and we’ve got stickers and stuff too, including my face. Thanks to Scott for those.
Oh, that’s an amazing one. Okay, do you have one with Dan with lots of hair? Do you have such a sticker? That’s the one that I want.
Actually, we have one of just the hair somewhere. I think it may be in my closet, but yeah, there are pictures of it online, I think… But it’s basically a halo of hair that you can attach to other stickers to identify them. I think Scott even made two sizes of them, so you can have the multi-level halo effect.
[04:14] Paul Morian and Dan Pop had sort of a field day on Twitter, making him have this enormous lion’s mane of hair. But yeah, Scott is our swag master and has been refining his art, and it’s great. He did it for a few years for Knative-related swag, and has carried that over to Chainguard swag, and has just been killing it.
I have to talk to him about some tips seriously, because he’s, as you mentioned, really killing it. And I’m somewhat disappointed that you’re pivoting from the swag business, because I thought you were doing really, really well. And you’re doing so well that one of the main reasons for me coming in-person to this KubeCon is to get some of that swag. I’ve been waiting years for this, so please, please, have at least two of each for me, because I’m like, first thing when I arrive, I come to the Chainguard booth and I look for the swag. So by the way, where is the Chainguard booth? Can you tell me?
So we don’t yet have a booth this KubeCon, unfortunately, but we will be around, and I think we’ll have a few options swag-wise. I’m not sure we’re toting– so let’s see… Scott, Ville and I all checked a giant box of T-shirts and then we carried them around all of KubeCon. I don’t think we’re bringing huge amounts of shirts across the Atlantic, but I think we’re going to have a form folks can submit to get swag sent to them, which will make it easier for them to get it where they are as well. I think Scott has all manner of stickers that we will be handing out; there’s the octopus, and a bunch of costumes, probably my face, although there’s a few of us… Scott has one of his face and Ville’s face, too. I think there may be one specific to Valencia, although I’m not sure he’s turning it into a sticker. He’s definitely got the image. He’s always got something cooking, so I’m sure there’ll be some fun swag surprises from him.
So I’m really looking forward to putting that link to the form in the show notes. I think people will love it, especially since this is coming out the week after the KubeCon. I think they will really appreciate it. We already established that the thing that I’m most looking forward to this KubeCon, is the Chainguard swag. So that was already established. But I’m wondering, what are you most looking forward to at this KubeCon in EU?
That’s a great question. So I think there’s a few answers to it. I think, personally, I think one of the things I’m most excited about is this is the first KubeCon ever where Knative is going to be in the CNCF. And as you know, Knative is near and dear to me, and so there will be a whole Knative Con day zero, or whatever it’s called now; there’s two days. So I think that’s one piece of it. But I love that sort of hallway track, connecting with people… And there was KubeCon in LA, but when that happened, it was still virtually impossible for folks in the EU to travel because of the state the pandemic was in. And so there’s loads of people that we collaborated with in the context of Knative and other projects that I haven’t seen in years. And especially the folks that are EU based, you just see them less often, I think. So just reconnecting with the people, since I feel like open source is all about the people.
What is the first thing that you’re going to do when you arrive in Valencia? Have you thought about it?
That’s a good question. Well, I’ll probably check into my Airbnb, wherever I’m staying. I think I’m staying with Scott and Ville, so it should be a very interesting week. But after that, I’m excited to go back to Spain, because the food when we were in Barcelona was very good. And I still think about the paella and the pulpo and stuff like that we had that week, which were just great. I probably ate way more rice than I should have that week in paella form, but it was fantastic. So I’m looking forward to some of the delicious Spanish foods, but starting to connect with people after that…
[08:29] I do have to say that’s something which I’m most looking forward to. This is my first KubeCon EU in-person. I’ve never been to KubeCon EU in-person. My previous one was obviously virtual. And I think two KubeCon EUs ago was also virtual, but that was a bit even more different, because I think it was the first– the first year, everything was just being figured out. That was a huge adjustment, 2020. KubeCon North America 2019 - that was my first and last in-person KubeCon. So this is a big one for me.
Was that San Diego?
That was San Diego. Yes, it was. Yes. That was a good one. I really enjoyed myself. Yeah. So yeah, really looking forward to that. I know that there are a couple of Chainguard talks at this KubeCon, five plus one - five at KubeCon and one at the RejectConf. Have you seen any of the previews for those talks?
I don’t think I have, although it’s a lot of topics that are peripherally related, and I think most of them are very open source community focused and co-presented with other folks. But yeah, I think Adrian is doing Rejects, and I don’t think I’ll quite be there yet, because I think I’m landing late Sunday. If you look at wall-clock time, it seems like this incredibly long travel time, but with the time change and whatnot, it always just seems like a time distortion. So yeah, I think I get there late Sunday, so I’ll unfortunately miss the Rejects talk.
And then there’s talks from– I always call him Puerco, but Adolfo and Carlos and others have talks on things like SBOMs, and I love hearing Puerco talk about SBOMs, because he’s the go-to person on SBOMs. Certainly at Chainguard– a lot of folks come to him from the community and elsewhere to talk about them. So it’s always interesting to get the latest from him, because the SBOM space is also an evolving thing, as folks are starting to wrap their heads around it and form best practices around how you do X and how you do Y and how you actually consume those to do other useful things, which really informs… How you produce it and what information you put in there, I think is heavily influenced by what you want to do with it and the use cases there, right? Do you want to do vulnerability scans, or license compliance? Do you need things down to the file level, or do you want it at the package level, or the image level? I mean, there’s just so many things. Does it have to be all-encompassing, or can it reference other things? …and so on and so forth. So it’s always great to hear him talk about that stuff.
So if you had to choose one of the five talks to watch, which one would that be?
That’s tough. So I’m going to miss Puerco’s talk, because I’m flying out early Friday. I generally am naughty. I like the hallway track so much, I often skip most talks, because they’re recorded. So that’s sort of my non-answer. I don’t know that I’ll actually go. I may go and support them, but generally, I actually don’t go to the talks at KubeCon. I meet with people and talk to the people while I’m there.
I actually hear that a lot, and I think many people do that. And especially since we haven’t seen each other in-person for years in this case, I think the hallway track will be the most popular talk at this KubeCon. I think they may want to move it somehow, like somewhere where there’s lots of room, because I see a lot of people joining that talk. I’m really, really keen to see the talk on releasing Kubernetes less often and more securely, because I find the “less often” part almost a controversial point. I think that’s going to be very interesting. But what I would recommend, especially since you’re listening to this after KubeCon, is skimming all five talks when they come out, and then watching with intent the one that you like the most. That’s what I’m going to do.
Okay. So I would like to switch topics a little bit, very quickly, because you mentioned something which I was fascinated by. You mentioned the Knative revision count in the Chainguard staging environment reaching 456… And I’m wondering, why is that important?
That’s a good question. And I kind of jinxed it, because pretty much as soon as I posted that, someone– this is just our staging environment where we roll out changes every time we commit. As soon as I posted that, someone made a change to our Terraform, which ended up cycling that cluster and reset back to zero. Yeah, so one of the reasons that that was interesting to me was really about, to some extent, scale. So we create enough resources as you’re cranking up to that number, both in terms of Kubernetes services and other things, that if we hadn’t done the work to do things like garbage-collect old revisions and all kinds of other things on the Knative side, it would’ve fallen over long before that… And really not necessarily Knative falling over. It’s like your Kubernetes cluster would’ve been like, “No, you can’t create more services.”
So there’s these resource quotas that most vendors put on clusters around how many of certain types of resource you can have, like pods and services and other things. And at least in a context like GKE, that’s generally something that they control that’s proportional to the size of your control plane. And as your cluster gets bigger, they give you a bigger control plane, and you can have more of those resources.
[16:11] And so we still are running on a relatively small cluster, but if we were to just be creating all those things and not cleaning up behind it and stuff like that, we would’ve hit one of those limits by now and had a bad time. But really, we reached that high number really just by continuing to do what we needed to do and not worrying about the infrastructure, and it just reached that point, and I was like, “Great, that’s awesome.”
Is this number, 456, counting the updates to Knative? Is that what this number is? So you updated Knative 456 times in the staging environment?
Each service, each Knative service, when you update it– so in Kubernetes, resources have a generation. Every time you change their spec, generation in the metadata section gets incremented by the API server automatically. And so basically, what this number represents is how many times that’s been bumped. So how many times have we actually pushed a new image? How many times have we changed the configuration, whatnot, since the cluster effectively came up? Bear in mind that we’re using tooling like ko, which does reproducible builds. So the image doesn’t change unless the code changes. So that’s a fair amount of stuff that has been rolling out over time where there’s some meaningful change to those things. I think there were two that were at 456. There were some that were lower either because they got introduced later, or just haven’t had code changes to them.
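The generation counting described here can be illustrated with a small simulation. This is a minimal sketch, not the API server's code: `apply_update`, `spec_for`, and the `registry.example` image references are hypothetical names chosen for illustration. It only models the rule stated above, that the counter moves when the spec changes, and that a reproducible build yielding the same image digest therefore leaves it alone.

```python
import copy


def apply_update(obj, new_spec):
    """Return an updated copy of obj, bumping metadata.generation only when the spec differs."""
    updated = copy.deepcopy(obj)
    if new_spec != obj["spec"]:
        updated["spec"] = copy.deepcopy(new_spec)
        updated["metadata"]["generation"] += 1
    return updated


def spec_for(image):
    """Build a service spec that points at one container image (hypothetical helper)."""
    return {"template": {"spec": {"containers": [{"image": image}]}}}


# A freshly created resource starts at generation 1.
service = {
    "metadata": {"name": "example", "generation": 1},
    "spec": spec_for("registry.example/app@sha256:aaa"),
}

# Same code, reproducible build, same digest: the spec is identical, so no bump.
service = apply_update(service, spec_for("registry.example/app@sha256:aaa"))

# A code change produces a new digest: the spec differs, so the generation is bumped.
service = apply_update(service, spec_for("registry.example/app@sha256:bbb"))

print(service["metadata"]["generation"])  # 2
```

Under this model, a count of 456 means 455 meaningful spec changes since the service was first created on that cluster.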
So yeah, I think the fact that we stopped paying attention to the infrastructure for a while because it was just working and managed to crank along to that point was very exciting, because ultimately I think that was one of the goals with Knative, was to try and take all of those things that you need to worry about normally with Kubernetes to really build a production-ready service with, “Okay, I’ve got my deployment, I’ve got my service, I’ve got my L7 layer”, whether that’s Ingress or something like Istio VirtualServices, “I’ve got my autoscaler. I’ve got - how am I doing TLS? Am I using cert-manager resources?” And on, and on, and on. And even with those things, it’s hard to do things like request-based scaling, certainly in a performant way, if you want to scale to zero.
And so the fact that we can write – really, it’s a tiny amount of YAML for each of the services we stand up. I mean, at one point I was like, it fits in a tweet, right? It’s basically apiVersion, kind, metadata, and then the one bit that is a little tricky to remember - we wanted to align with what a lot of the Kubernetes apps resources look like. So it’s spec, template, spec, containers, and then the pointer to your image. And that’s literally all you need to stand up an HTTP service on Knative. And you get all of those things - you get auto-scaling, and if you’ve configured auto TLS, you’ll get a TLS-terminated service with request-based auto-scaling… And it does smart rollouts by default– because we know it’s HTTP serving, we can do things like default readiness probes… And if your container is crash-looping or your HTTP server is not coming up, we don’t roll traffic over to those things.
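The shape described above (apiVersion, kind, metadata, then spec, template, spec, containers, image) can be written out as a small sketch. The service name and image reference are placeholders, not anything from Chainguard's setup, and the manifest is printed as JSON only because Python's standard library has no YAML writer; the structure is the same either way.

```python
import json


def knative_service(name, image):
    """Build the minimal Knative Service manifest as a plain dictionary."""
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": name},
        # The nesting mirrors the Kubernetes apps resources: spec.template.spec.containers.
        "spec": {"template": {"spec": {"containers": [{"image": image}]}}},
    }


# "hello" and the image reference are hypothetical placeholders.
manifest = knative_service("hello", "registry.example/hello:latest")
print(json.dumps(manifest, indent=2))
```

Everything else mentioned in the conversation (autoscaling, readiness probing, rollout behavior) comes from Knative defaults rather than from fields in this manifest.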
One of the really nuanced things that is just incredibly hard to get right in Kubernetes is really nailing the life cycle around readiness probes, draining traffic, and accepting traffic.
[20:08] And in particular, there’s sort of two classes of apps. There’s apps that ignore signals and they stick around for what’s called the termination grace period, which is between the SIGTERM and the SIGKILL, which luckily defaults to like 30 seconds. So it’s not sitting around forever.
And then the other class of people are the people who – well, there’s a third, who do it properly, but that’s like super-niche. The second big category of people are like, “Okay, well, I’m going to do signal handling. When I get SIGTERM, I’m just going to quit”, right? And that’s actually not what you want to do, right? You want to handle SIGTERM by starting to fail readiness probes, while all your normal requests are still handled properly, because it takes time from when you start to fail readiness probes until your pod is marked not ready. That’s the failure threshold on the readiness probe. And then once your pod’s marked not ready, that has to roll out to all of the network programming, right? So your pod’s endpoint has to be removed from the endpoints on the API server. So the service controller has to see that your pod’s not ready, remove it from endpoints… But you’re not done there, right? Those endpoints then have to be propagated, in vanilla Kubernetes, to all of the nodes, which have to reprogram their iptables, or if you’re in mesh mode, every single pod sidecar now needs to know that like, okay, that endpoint is no longer available, right?
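The "third camp" behavior can be sketched as a tiny piece of state shared between the signal handler and the probe endpoint. This is an illustrative sketch only: `Lifecycle`, `begin_drain`, and `install_handler` are hypothetical names, and a real server would also wait out a drain period before exiting, which is omitted here.

```python
import signal
import threading


class Lifecycle:
    """Tracks whether the process has been asked to shut down."""

    def __init__(self):
        self._draining = threading.Event()

    def begin_drain(self, signum=None, frame=None):
        """SIGTERM handler: start failing readiness instead of exiting immediately."""
        self._draining.set()

    def readiness_status(self):
        """HTTP status the readiness probe endpoint should return."""
        return 503 if self._draining.is_set() else 200

    def handle_request(self):
        """Normal requests keep succeeding while endpoints are being removed."""
        return 200


def install_handler(lifecycle):
    """Register the handler; call this from the main thread of a real server."""
    signal.signal(signal.SIGTERM, lifecycle.begin_drain)


lifecycle = Lifecycle()
before = lifecycle.readiness_status()
lifecycle.begin_drain()  # simulate receiving SIGTERM
after = lifecycle.readiness_status()
print(before, after, lifecycle.handle_request())  # 200 503 200
```

The point of the design is the gap between the two status codes: readiness fails right away, while regular traffic is still served until the network programming has caught up.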
So in some cases and some scales of clusters, I don’t think that 30 seconds is even necessarily long enough. But the reason I bring it up is we did a whole bunch of magic in Knative, since we know it’s an HTTP based service, to make it so that it is really hard to get that wrong. Because it’s really hard to get it right in vanilla Kubernetes, but it’s actually really, really hard to get that wrong in Knative.
One of the things we do is we have a pre-stop hook where we do something somewhat magical where the pre-stop hook is on one container, but the place to send it is on the other container. So we have a proxy that sits in front of the application container, and when Kubernetes is going to go stop the pod, instead of actually sending any signal to the user container, it sends it to our sidecar first, and our sidecar starts to fail probes, and do it properly, so that you don’t have to.
So if you’re in the first camp of folks who don’t really handle the signaling at all and just continue to serve traffic normally, you will still drain properly, because our - what we call the queue-proxy - will actually handle that for you. And if you’re in the second camp, where you just do what I call a YOLO exit, you’re like, “I’ve got the SIGTERM. I’m out”, you’re still good as well, because since we have that pre-stop hook, we get the signal first, we make sure traffic has drained, and then by the time you’re actually getting that signal, traffic’s been routed away from that instance of your application. And so it’s really, really hard, actually, within the context of Knative, to handle that wrong. And I think that’s a really important thing to get right if you’re using any sort of auto-scaled application, because when you scale up, there’s a window where the new pod’s coming up, and if it reports ready before it’s really ready, you’re in trouble; you’re going to serve 500s. And when you’re scaling down, if traffic continues to go to those pods after they’ve started to shut down, you’re going to get 500s, right? So the goal is zero 500s, and we have all kinds of tests in Knative where we’re like, “No, there should be zero 500s.”
[24:05] The other thing that we do that is really hard is - and the networking layers make this incredibly hard to do, and we work around all kinds of stuff in basically every Ingress provider - is ready means ready, right? Everyone at the networking level is like, “Yeah, it’s eventually consistent. It’ll get there at some point.” But it’s like, no, if we roll out a new revision, we want to know, when we tell the user like, “Yeah, yeah, you’ve got your new code”, that we’re not lying, right? And so Knative does all of this fun stuff where we actually inject hashes of the network programming into the network configuration in ways that our elements of the data path will respond with the header that’s being injected by the network programming, and then the components we have can actually probe different things to understand what version of the network programming has been rolled out. And then once it’s been rolled out everywhere - we can’t do this in mesh mode, because we can’t probe mesh sidecars, but we do this for probing the pool of envoys if you’re running outside of mesh mode. So for instance, traffic serving off cluster, we can probe and make sure that once we fully roll things out and we say it’s rolled out, you should never get the old version. It is at the new version, because we’ve confirmed all the networking programming is there.
Listening to you unpack this, I have two things on the top of my mind that I have to get them out there. The first thing is a Knative Ship It episode with you is long overdue, and I think that’s what I’m picking from this. We really have to dive into Knative in a dedicated Ship It episode. That would be great. The second thing which I wanted to say, I think that it’s time for your timeline cleanser for this conversation. And because people are listening to this and they don’t know what it means, can you describe your timeline cleanser for us please?
Do you mean Charlie?
Yes, that’s exactly what I mean. Who is Charlie? [laughs]
He’s asleep under my desk here.
Okay. So everyone listening, Charlie is a dog. Charlie is Matt’s dog specifically, and he’s super-cute. And after Matt talks too much technical stuff and Chainguard and Knative, he brings Charlie, just to break the timeline. So that’s amazing.
See, that’s the picture. That’s the screenshot we should have taken. Okay.
He’s not thrilled that I woke him up.
How old is Charlie, I’m wondering?
Charlie is about a year and a half. So one of our previous dogs passed early in the pandemic. She was getting really old and had all kinds of health problems, and so that was sad.
I’m sorry to hear that.
She was my shadow during the work from home… And so we were like, “We’re never getting another dog.” It was too sad. And the house sort of felt empty. I mean, especially me working from home, right?
Because I was working from home when I was at VMware, and then when I took my break, it was just like, there was – I was never really home alone when I was working from home, because she was always there and she’d follow me around the house. One of her health problems - she was deaf, so she wouldn’t always follow me around, because she didn’t know I’d be wandering around if she was asleep. But so we actually got really lucky with Charlie, who’s a great dog, because we started to talk to folks with this breed of puppy… We love this kind of dog. They’re Cavalier King Charles Spaniels. They’re fantastic dogs. We started to talk to them not because we necessarily wanted one right away, because we were still getting over Lily… But during the pandemic, everyone seemed to get a dog, so there was this really long waiting period.
[28:00] So we weren’t telling our daughter that we were talking about this, or anything. And then one of the folks that my wife had reached out to called her one time when she was in the car, and it was on speaker phone, and she was like, “Well, I had someone fall through who was going to take a puppy home like this weekend, and would you be interested in meeting him?” And so this was on like a Thursday. And so we went out and met Charlie on, I think it was going to be Saturday, but she had another litter of puppies all over her house, so we ended up meeting him on Sunday. He was quite a bit smaller… And it was fantastic, both because of him, but also because of sort of - you walk in, you sit on the floor and you get mobbed by puppies. It’s a very therapeutic –
Special feeling, yes. Special feeling.
Thank you very much for sharing that, Matt. I will make sure to put images, the ones that you share with me, in the show notes for people to see who Charlie is. And if you want one of Lily, I wouldn’t mind including it. I think that would be nice.
I’m wondering about Chainguard, and I’ll start with a big one, a big why. Why does Chainguard do what it does?
That’s a great question. Why is Chainguard doing what it does? Well, we’re working on doing a lot of things. I think the what, I think in terms of the space and focusing on the supply chain space in particular, is rooted in– it’s, I think, one of the big problems facing our industry right now. The prevalence of attacks is just going up and up and up. And there are some point solutions for pieces of the problem, but it’s still really hard to do this holistically, end-to-end, from source all the way through to production. And I think one of the difficulties there is - even if you do that for your own stuff, it’s something like 90% of what folks ship to production these days isn’t something that they build. It’s something that they got off the shelf in the form of system packages from your Debian or your Alpine or whatever, right? And that’s in your image. But in your application itself, that’s true as well, because you are likely consuming, in Go, random - well, hopefully not completely random… Certain libraries to do what you want to do, right? Like if you’re rendering PDFs, there’s probably some library that you’re using to do it, and you haven’t rolled your own way of writing PDFs, right?
To pick just a completely random example. Or Java, you’re likely using JARs off of Maven Central or – one of the ones that gets a lot of attention is npm, right? npm, it may even be more than 90%, because people write these little one-liners and you pull in enough Node modules to fill your disk, right? So yeah. I mean, just increasingly, it’s become all about the package managers for really bootstrapping new languages.
One of the funny things that I always felt like it was sort of missing from C++ – I know there’s things like Conan now, but I really had an a-ha moment the first time I used Maven, and I was like, “Oh cool, there’s this repository of really useful libraries, and I can download things and use them”, but at the same time, how do you establish trust for those things and whatnot?
[34:20] So I think the point there is there’s a lot more to securing your supply chain than really just securing your own stuff. So that’s one of the reasons I think we’re so invested in open source as well, because ultimately, we need to make this incredibly easy to use and incredibly accessible to open source developers. And so one of the key projects that we are investing in and building around is a set of projects under an umbrella called Sigstore projects. The project names are Fulcio, Rekor, Cosign. So Cosign is the CLI. It’s what I think folks will probably have the most exposure to in terms of if they’re a developer day to day, they may invoke Cosign to attach what’s called an SBOM, or to sign things or to attest to things, which is a form of signing a claim…
But I think one of the most exciting things for me about what Sigstore is doing is not the sort of traditional modes of signing in tools like Cosign, where you generate a public-private key pair or use a KMS system, and then with those keys, you sign things, and then it’s verifiable with the verification key. You can do those with Cosign, but I think what’s really, really exciting to me and I think ties into that, making it accessible to and easy for developers, is this thing called keyless signing. So if you sort of rewind time ten or so years - maybe more now; I’m going to show my age… But before Let’s Encrypt existed, getting TLS certificates for your website was hard, it was expensive. And then Let’s Encrypt came along, and they got themselves registered as a certificate authority with all the groups that sort of matter in the space to become a certificate authority. Then they offered this public good service that had a challenge process, and you could get TLS certificates. And there’s fantastic graphs that show the number of websites with TLS before Let’s Encrypt, and then it launched, and it goes up and up and up. And at this point, when you’re viewing a website and it doesn’t have the little green lock in your browser or whatever, it’s very suspect, right?
Some browsers - I think this is a setting in Chrome - won’t even show websites if they don’t have TLS. It’s certainly something you can turn on. I mean, new top-level domains like .dev, for sure, basically require TLS in order to serve HTTP on them.
So the reason I bring it up is we like to make this analogy when we’re explaining Sigstore, since Let’s Encrypt is a fairly well-established thing nowadays that many folks use for TLS. One of the objectives of the Sigstore community - and they’re working really hard towards a GA of this public good infrastructure - is something like that, which instead of being a certificate authority for web traffic, is a certificate authority for signing. And so it’s this really cool process where if you enable this mode of keyless signing and you say, “Cosign, sign my OCI image digest”, it will send you through what I’ve been calling an identity challenge.
[38:04] So it pops open your browser, and you go through a 3LO, a web single sign-on OAuth flow with – it gives you a couple of choices on the public good instance. You can use GitHub, or Google, or I think there’s support for a form of Microsoft identity in there as well. And then once this identity challenge is complete, it’ll sign your image and route that into this public good instance’s sort of certificate authority, and folks can verify things.
And so what’s happening behind the scenes – we call this keyless, and it’s like serverless. There’s still servers in serverless. But what happens is it’s keyless in the sense that you don’t have to manage or think about keys. The keys are sort of ephemeral, and they exist for as long as you need them to exist, and then they’re gone.
So in keyless signing, what happens is Cosign generates a key pair in memory, and then it goes through this identity challenge, and there’s a few other elements to the challenge process, like proving to this Fulcio certificate authority that you actually have the private key… And a few things happen. One is that challenge process completes, but then you get back a certificate that basically says, “Okay, I’ve verified that this is the identity that I provisioned this certificate for.” And so if you were to run cosign verify, which is the command for verifying, on something that’s been keyless signed, what you’ll see encoded into this certificate - and you can do this the hard way with things like OpenSSL and stuff, too - is something like “Matt Moore at Chainguard.”
Fulcio is basically taking your key pair and actually routing it into this certificate chain. It includes the identity that the challenge process worked for. And so if I’m doing the human version of this and using single sign-on to prove my identity, that gets included into the certificate. I think it’s in the subject of the certificate. There’s two forms of it where we either use a URI if you’re using something like SPIFFE identity or something like that. But if there’s an email-based identity, like if you’re signed on with Google or - the GitHub identity can produce an email, too - then it’ll be included into the email section of the certificate. And then you can also start to include more and more information into some of the certificate extensions as well. I mentioned the human flow, right? There’s also a sort of workload flow. And so this is one of the really cool things about–
So I’ve become such a nerd about OIDC, in particular the prospects that it enables with things like federation and having a form of portable auth. But so many things have started to adopt this as a standard, right? You can produce identity tokens from most Google identities. So like GKE workload identity produces an OIDC-compatible token. But Kubernetes itself has support through what’s called – I never remember the order of the soup here, but it’s service account, projected volume, tokens, where you can take the service account that your workload is running as, and you can project the token with a particular audience and a particular – the lifespan is configurable. I don’t think it can be lower than ten minutes, but you can adjust it.
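To make the audience and lifespan concrete, here is a minimal sketch that decodes the claims of an OIDC-style token using only the standard library. The token is fabricated in the example and is not signed; the issuer URL, service account name, and the `sigstore` audience value are illustrative assumptions. Decoding without checking the signature is only suitable for inspection, never for authorization.

```python
import base64
import json


def b64url_encode(data):
    """Base64url-encode bytes without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def unverified_claims(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Fabricated example token; every value below is a placeholder.
header = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({
    "iss": "https://issuer.example/cluster",         # the cluster's issuer URL
    "sub": "system:serviceaccount:default:my-app",   # namespace and service account
    "aud": ["sigstore"],                             # audience requested for the projection
    "iat": 1700000000,
    "exp": 1700000600,                               # ten minutes after issuance
}).encode())
fake_token = f"{header}.{payload}.not-a-real-signature"

claims = unverified_claims(fake_token)
print(claims["sub"], claims["aud"], claims["exp"] - claims["iat"])
```

A service receiving such a token would fetch the issuer's published verification keys and check the signature, issuer, audience, and expiry before trusting any of these claims.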
And so what’s cool here is a lot of the major vendors - GKE does this by default, EKS I think does it by default, or it’s at least one of the standard ways of spinning up EKS clusters. And AKS now has this in preview, where the issuer for your cluster is world visible, which means you can send those tokens to services, and those services can actually verify those tokens, because they can hit your issuer and get the verification keys for verifying the signature on the OIDC tokens, and then you can do what’s called federation.
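As a rough sketch of how a service can "hit your issuer" for verification keys, the Python below follows the OIDC discovery convention: the issuer publishes a document at `/.well-known/openid-configuration`, and that document names a `jwks_uri` where the public keys live. The issuer URL and the example document are hypothetical; `fetch_json` is defined but not called here, since it needs a reachable issuer.

```python
import json
import urllib.request


def discovery_url(issuer: str) -> str:
    """Build the OIDC discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def jwks_uri_from(discovery_doc: dict, issuer: str) -> str:
    """Return the key-set URL, refusing documents published for another issuer."""
    if discovery_doc.get("issuer") != issuer:
        raise ValueError("issuer in discovery document does not match")
    return discovery_doc["jwks_uri"]


def fetch_json(url: str) -> dict:
    """Fetch and parse a JSON document (requires network access)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


# Hypothetical issuer and discovery document, for illustration only.
issuer = "https://issuer.example.com"
example_doc = {
    "issuer": issuer,
    "jwks_uri": issuer + "/openid/v1/jwks",
    "id_token_signing_alg_values_supported": ["RS256"],
}

url = discovery_url(issuer)
keys_url = jwks_uri_from(example_doc, issuer)
```

In practice the verifier would call `fetch_json(url)`, then `fetch_json(keys_url)`, and check the token signature against those keys with a JWT library; that step is left out of this sketch.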
[42:19] Federation is a process where you take the OIDC token and you basically send it to something like Google or Amazon… Both of them have a service they call STS, or the security token service, where you can exchange a third party token for a token that’s first party. So it’s now a Google token or an Amazon token, and then you can do things with them. And that’s super, super, super-cool, especially since Google and Amazon don’t need to know about who you are, right? Anyone with an identity provider that does OIDC can potentially integrate with these things and do federation, which is super, super-cool.
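The exchange step can be sketched as a form-encoded request in the style of the OAuth 2.0 Token Exchange standard (RFC 8693). This is an assumption about the general shape, not a description of any one provider's API: each security token service has its own endpoint, required parameters, and audience format, and the audience string and token below are hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def token_exchange_form(subject_token: str, audience: str) -> str:
    """Build an RFC 8693 style form body that trades a third-party token
    for a first-party access token."""
    return urlencode(
        {
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": subject_token,
            "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
            "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "audience": audience,
        }
    )


# Hypothetical values for illustration only.
body = token_exchange_form(
    subject_token="header.payload.signature",
    audience="//sts.example.com/my-workload-pool",
)
fields = parse_qs(body)
```

The body would be sent as an `application/x-www-form-urlencoded` POST to the provider's token endpoint, and the response carries the first-party token.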
So the workload-based signing basically builds around this. So the public good instance has a set of issuers that are sort of well established that it can accept tokens from, and then as workloads, you can sign these things. GitHub recently launched support for identity tokens. And so as part of GitHub Actions, when I’m producing things, I can now do keyless signing just by adding to my workflow – I think you need the id-token: write permission, which allows you to produce these things, and then you just say, “cosign sign image-name”, and it works like magic.
Sorry to jump in, Matt… I mean, it’s fascinating what you’re telling me. I wish I could listen to you for at least another hour, but I know that we’re hitting against your time limit. There’s a bunch of things which I wanted to ask you and we didn’t get to, so this is what I think we should do, and it’s a proposal. I think that we need to have Kim on to talk about Enforce, because we haven’t even cracked open that subject.
And when I talked to Puerco last time in the previous episode, he was saying, like I did, “Oh, I haven’t even mentioned this thing.” “That’s okay. Matt will come on. I will talk with Matt about Enforce.” And that hasn’t happened. And that’s okay. It’s not a problem. I was very curious to start unpacking with you, how can the CI/CD ecosystem help? What are the changes that need to happen there so they help the supply chain? But I don’t think we have time for that. We have time for one more question, and this is something I know that our audience really appreciates, and that is the key takeaway. So from all this conversation, I’m just very aware of all the things that we haven’t hit, because we could talk for hours and hours, and you have so much information in that head of yours. I don’t know where it fits. [laughs]
It falls out of my ears when I don’t have them corked.
[laughs] That’s a good one. So when it comes to the key takeaway from our conversation for the listeners that stuck with us all the way to the end, what would you say that would be?
That’s a great question. I would say signing is the beginning, it’s not the end goal. And I think that it would behoove most organizations to start looking at how you start signing things now, because it’s really a foundational thing. And if you can get to the point where you have that foundation, you can then start to do really interesting things with that, right? You sign things like provenance to say, “I produced this”, so think signed commits, but it goes so far beyond that once you start to talk about things like attestations, and all the other kinds of useful pieces of metadata you might sign. And I think to touch on and tease Enforce a little bit, Enforce is really about the complementary side of signing, which is, “I’ve now signed all my stuff. Why did I do that?”, right? And it’s so you can actually have policy around the types of signatures you want to allow into different contexts, and things like that. Now signing is really like a form of authentication. You’re saying, “This is my identity”, since now we have that identity element to this, right? And I am making some sort of statement about this thing that I have signed, right? So now what identities do you trust, in what context, to say what things, right? So these are the kinds of things where policy gets involved, and where you can actually start to leverage that metadata that we’re trying to make incredibly easy to attach, to actually then make judgment calls about these things. So yeah.
Yeah, that’s a great one. So I think what I’m going to do is cancel a few meetings that I had planned for KubeCon, so I have more time to talk to you, because I can see where this is going. I would love to have you back sometime soon, so we can continue this conversation. We have a bit more time. I think there is a lot to unpack here. It’s super-important, it’s going to affect us all, and I think the people that are not paying attention - that’s okay. You will have a wake-up call at some point. But this stuff is really important. I’m pretty sure it is, and I think we need to keep driving this home. It will take a whole town. I know it’s a cliché, but that’s exactly what it’ll take. It’ll take all of us to improve the supply chain of our software. Delaying it will only make it more urgent, and then we’ll be like headless chickens running around. That’s okay, it works for some, but I would like to get ahead of the game if we can. And I can definitely see where this is going.
Matt, thank you very much for joining me. I look forward to meeting you at KubeCon. Please bring all the swag, because I’ll be there, and I’ll get like a bag just for that. And yeah, I’m looking forward to talking to you next time. Thank you.
Alright. Yeah, thanks for having me.
Our transcripts are open source on GitHub. Improvements are welcome. 💚