Ship It! – Episode #8

Cloud Native fundamentals

with Katie Gamanji from CNCF

Why Cloud Native? What are the guiding principles that you should keep in mind as you are choosing a project from the Cloud Native Landscape? How do you build & ship an app in a Cloud Native way? Katie Gamanji, Ecosystem Advocate @ CNCF and former cloud engineer for American Express, Condé Nast and Microsoft, joins Gerhard to cover these topics in the context of the Cloud Native Fundamentals course that she developed. 15,000 students have already enrolled, and the initial feedback has been great. Tune in if you want to know why you should too, how to do it and when the course will become available for free.

Sponsors

Render – The Zero DevOps cloud that empowers you to ship faster than your competitors. Render is built for modern applications and offers everything you need out-of-the-box. Learn more at render.com/changelog or email changelog@render.com for a personal introduction and to ask questions about the Render platform.

LaunchDarkly – Ship fast. Rest easy. Deploy code at any time, even if a feature isn’t ready to be released to your users. Wrap code in feature flags to get the safety to test new features and infrastructure in prod without impacting the wrong end users.

Cockroach Labs – Scale fast, survive anything, thrive everywhere! CockroachDB is the most highly evolved database on the planet. Build and scale fast with CockroachCloud (CockroachDB hosted as a service) where a team of world-class SREs maintains and manages your database infrastructure, so you can focus less on ops and more on code. Get started for free with their 30-day trial or try their forever-free tier. Learn more at cockroachlabs.com/changelog.

Grafana Cloud – Our dashboard of choice. Grafana is the open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Transcript

I’d like to start with a story, and the story is how we met, because I thought that was a very interesting way… Do you remember how we met?

The way we actually met was during the End User Partner Summit, which I was involved in at the time. This was an event only for CNCF end users, pretty much; everyone who uses cloud-native, but they don’t sell it. And as part of that, we had a networking session, more like a breakout room, where we were just able to maybe interact a bit more with our people… So kind of still getting that networking vibe in a virtual space.

Yeah. What I remember is that during Priyanka’s happy hour - that was like a session that we did at KubeCon, the European one…

That’s right, yes.

[00:04:08.10] That was it. We had breakout rooms, you’re right. So 3-4 people would be randomly picked and they would have a breakout room and then they would chat. In one of those sessions we ended up in the same room, and there was like two more people, I think…

We were talking – I know Splunk was mentioned, I believe… Is that right? Splunk, Redis Labs, something like that, in that conversation.

There were many things mentioned, yeah.

Was Snyk mentioned as well…? So just a couple of technologies were mentioned, and what people use, and how it’s going, just like a general one… And then a few weeks later after KubeCon I found out about this course, the Cloud-Native Fundamentals that you’ve just launched… And in that tweet - which, by the way, will be in the show notes - you wrote that after four months of very early mornings and very late nights and a lot of hard work, it’s finally done, and you’re very happy for it. And I was thinking “Finally. A cloud-native course that people can take.” It’s a practical one, one that takes a while, and takes people from nothing all the way to understanding not just the landscape, but how to use specific tools. So a very practical approach, which was sorely needed… Because the cloud-native landscape - let’s be honest, it’s so confusing, even to those that know it. There’s like so many things there. And it’s not a bad thing, because it’s meant to be big, but how do you start? What is the first step that you take? I think that is less – my perspective; maybe you disagree. Do you disagree? It’s maybe less confusing to you, or…

Well, let’s put it this way - I do remember my journey when I started to use cloud-native tools. This was around when Kubernetes was around for two or three years. It was still very brand new. I do remember actually I had to set it up the hard way, before it was called the hard way… When I had to actually write the systemd units and files, and actually write all of the configuration there to actually make sure that the Kubelet is going to be up and running on the machine. And at the time, I managed to have a two-node cluster, but it took me a lot of Stack Overflow, a lot of waiting for the docs, and a lot of back-and-forth. It was not very concise.

Now, of course, the community has been working quite heavily on improving documentation. It’s out there, it’s splendid, it’s in very good condition, it’s maintained as well, up to date… But now the problem is that everything is overwhelming. Because when you talk about cloud-native, it’s not just Kubernetes; you have so many other tools around it. So what I’m actually trying to outline with this course is that cloud-native is a practice; the tools can vary from one organization to the other, but once you understand the fundamentals, once you understand what it actually brings to the table, then you’ll actually be able to choose the right tool for your use case. Then you’ll be able to explore and maybe even advance some of the technologies further.

One of the things I wanted to provide here is the fundamentals in making sure that the cloud-native principles are understood, and everyone will be able to use them to build further.

That is a very good way of putting it, because you’re right, some people, including myself - I think “Wow, there’s so much choice. This is so confusing.” But I’m spoiled. I’m spoiled for choice; there’s so many approaches, and there’s no one better than the other. It’s all contextual. So I’m complaining about the choice, but really, a lot of hard work went into creating those choices to begin with. And the reason why there is so much diversity in choosing - not only the diversity of choice, but also the community is very diverse - is because there’s so many approaches. So how do you surface all those approaches? And if anything, the cloud-native landscape does a really good job of highlighting and showing all these options, which I think is a great thing to have. So picking and mixing things is very interesting, and that in itself can be a job - curating these approaches.

[00:08:10.03] So in the Cloud-native Fundamentals, in the course, did you do any of this curation, or how did you pick basically the approach that you follow or that you recommend?

So when I was actually building the course, I really had the audience in mind who is actually gonna take this course. I was trying to make it easier for them to navigate the ecosystem, because as I mentioned, there’s a canvas full of different tooling, and you can pick and choose, you can make a great platform. But is it actually something that someone who wants to start with cloud-native needs to know? So I was trying to break it down to the bare fundamentals, and again, explain the principles, what cloud-native is. It is about being declarative, it is about the self-healing capabilities, containerization, you have interoperability that you’ve mentioned, you have multiple solutions for the same problem. That’s why we have such a diverse landscape.
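The "declarative" and "self-healing" principles mentioned here can be illustrated with a small reconciliation sketch. This is a simplified illustration under stated assumptions, not how any particular orchestrator is implemented: the `reconcile` function, the replica names and the `store-api` service name are all hypothetical. The idea is that you declare the desired state, and a loop compares it with what is actually running and works out the corrective actions.

```python
def reconcile(desired_replicas: int, running: list[str], name: str = "store-api") -> list[str]:
    """Compare the declared state with the observed state and list corrective actions."""
    actions = []
    if len(running) < desired_replicas:
        # Self-healing: a crashed replica shows up as a shortfall and gets replaced.
        for _ in range(desired_replicas - len(running)):
            actions.append(f"start {name}")
    elif len(running) > desired_replicas:
        # Scale down by stopping the surplus replicas.
        for replica in running[desired_replicas:]:
            actions.append(f"stop {replica}")
    return actions


# Three replicas are declared, but only one is still running after a failure.
print(reconcile(3, ["store-api-1"]))
```

Running the same function repeatedly against fresh observations is what turns a one-off comparison into a control loop.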

So when I was trying to choose the tooling, I was trying to make it easier for the students. At the beginning, I was trying to explain that you have an application. The only requirement a student will need is to have some programming experience, because based on an application, we’re gonna move it forward to different phases, the deployment is going to be within a production cluster.

So having this application, what do we do with it? We start thinking about its architecture. Is it a good application to be containerized? So we’re starting with those mindsets and perspectives around the service. Then we’re thinking about “How can we containerize it?” We usually look at Docker. Docker has been there for a very long time; it has been given [unintelligible 00:09:43.25] so it’s a very good kind of knowledge to have. Once you understand it, even if you use other tools which do this packaging of your application by default - for example we have tools such as Buildah or Podman; they package it for you quite nicely, without you even having to write one line of a Dockerfile. But for them to understand how to package it, it’s quite important; that’s why I went a bit more declarative when it comes to packaging an application, creating that artifact, the Docker image.

Kubernetes by itself was one of the focuses, as well. When you’re talking cloud-native, there’s gonna be an element of container orchestration, hence we talked about Kubernetes resources and how it actually schedules different resources in applications, how it can expose your application to the wide world using Ingress and services… So pretty much still trying to explain the basics, but not go too much further. So the bare minimum that they will need to deploy an application.

The interesting thing, however, is when you choose a CI/CD pipeline, because there you have so much choice around the tooling. One thing that I’ve purposely done - I’ve split the CI and the CD into two different lessons… Because most people cannot really differentiate the stages within a pipeline. When I was trying to choose the tools, again, one of the things was it has to be open source. So when I was choosing GitHub Actions, again, it’s something which is quite well maintained by the community; you have a lot of pre-built actions that you can use straight away… And with Argo CD - again, I wanted to make it easier for students; instead of all the time being in the terminal, I wanted that UI element of deploying and maybe visualizing your resources. So that’s why I went with that approach.

Again, I was trying to put the bare minimum of “How can you have an application package deployed, automate the deployment process, and have it running within a cluster?” I was trying to choose the tools purposely to kind of fit these fundamentals and make it very easy for them to move forward.

Okay, so you mentioned that you split CI and CD… And based on the tooling that you mentioned, the way I understand it is that you use GitHub Actions as the CI, and Argo as a CD.

Precisely.

Okay. And why is that? Why did you split CI and CD? That is interesting.

[00:12:00.06] I wanted to make it very clear what is continuous integration by itself and what is continuous delivery by itself. Because continuous integration usually focuses on the code - how can you actually integrate the latest features from your application within an artifact? So the end of the CI should be an artifact; I wanted to make that very clear.

With the CI you can have, for example, testing in different environments, you can build artifacts for different platforms. However, at the end of it, the result should be an artifact. So I wanted to make that very clear: within the cloud-native space, it’s just gonna be represented by a Docker image, which will be able to run on any platform that runs containers.

So I kind of separated that… And when it comes to continuous delivery, it’s how you actually ship that artifact to different environments. So I wanted to make it clear that a pipeline, in a way, should do continuous integration and continuous delivery, should contain all of these stages, but I don’t think there is a very good understanding of where exactly continuous integration finishes, and where the continuous delivery starts. I wanted to break it down into, again, bare fundamentals. You can still have two different tools, but you can still achieve a very functional CI/CD pipeline. You don’t necessarily need to bring one tool to achieve the end result. You can actually kind of have this puzzle, put two pieces together. And this is something, again, I wanted to maybe accentuate the nature of cloud-native - you have different tools, you put them together, it works… So that’s another thing I wanted to highlight.
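A minimal sketch of the split described above, assuming a hypothetical image name (`example/store-api`) and made-up environment names. It is not the course's pipeline or any tool's API; it only shows the boundary: CI ends by producing an artifact, and CD starts from that artifact without rebuilding it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """The single output of CI: an immutable, versioned container image reference."""
    image: str
    tag: str

    @property
    def reference(self) -> str:
        return f"{self.image}:{self.tag}"


def continuous_integration(commit_sha: str, tests_pass: bool) -> Artifact:
    """CI stage: validate the code and, only if it passes, produce an artifact."""
    if not tests_pass:
        raise RuntimeError(f"tests failed for commit {commit_sha}; no artifact produced")
    # Hypothetical image name; a real pipeline would build and push the image here.
    return Artifact(image="example/store-api", tag=commit_sha[:7])


def continuous_delivery(artifact: Artifact, environments: list[str]) -> dict[str, str]:
    """CD stage: ship the existing artifact to each environment, without rebuilding it."""
    return {env: artifact.reference for env in environments}


artifact = continuous_integration("9f2c1ab7d4e0", tests_pass=True)
rollout = continuous_delivery(artifact, ["staging", "production"])
print(rollout)
```

Because the only thing passed between the two functions is the artifact, either stage can be swapped for a different tool without touching the other.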

Again, it was not the main focus. I wanted, again, to make it clear, to differentiate it… But if we’re looking from a different perspective, we can see this interoperability, we can see this diversity of the tooling. We can easily switch the Argo CD, for example, with Flux. We can use Spinnaker instead; we can use any other third-party provider. You can actually change that with GitHub Actions; you can completely [unintelligible 00:13:50.21] You can really choose different tooling here. But I wanted to maybe emphasize what are the stages and what is the result of all the stages, and make it simple.

That is really interesting, because even though we don’t call it out like that to basically build and deploy Changelog.com the app itself, we do something very similar. We use CircleCI as the CI, where the end result of the pipeline is a container image. So if tests pass, if dependencies get built, we compile static assets, JavaScript, CSS, and the end result is just a container image. And for the CD part, it used to be five lines of Bash, which would be like [unintelligible 00:14:39.04] Docker Compose days. That’s all it took, literally - a Docker update service with the latest image. And we replaced that with something called Keel.sh. I’m not sure whether it’s part of the CNCF, but it runs in the context of the Kubernetes, so you deploy it… And first of all, it receives WebHooks from Docker Hub; you can configure it like that when there’s like an update to the image, and then it will update the deployment with the latest version. I know that this goes against the GitOps mindset and the GitOps philosophy; I think that’s a very interesting topic, which I would like us to dig into… But that’s all it takes. And this separation, just to highlight what you’ve been mentioning - this separation works really well, in that you don’t have to change your CI to change your CD as well. They’re like two separate things. And whether it’s five lines [unintelligible 00:15:32.17] or whether it’s GitOps, or whether it’s something else, it doesn’t really matter. The point is you have this choice to maybe change them independently and not have them coupled together.

So migrating, in our case, from Circle CI to GitHub Actions is easier than if we had this really huge pipeline that had all sorts of secrets, and it knew where the Kubernetes clusters were running, and it needed access to those… So it definitely works. I can definitely say that it works, and that’s why I was trying to understand a bit more why you do that, because it’s a very good approach; it’s a really good approach to separate the two.
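The webhook-driven update described here can be sketched roughly as follows. This is an illustration only: the webhook payload shape is a simplified, hypothetical one (not Docker Hub's real payload), the image names are made up, and the function is not Keel's implementation. It shows the core step, which is replacing the image tag in a deployment when a new image is pushed.

```python
import copy


def apply_image_update(deployment: dict, webhook: dict) -> dict:
    """Return a copy of the deployment with matching containers moved to the pushed tag."""
    repo = webhook["repository"]
    new_image = f"{repo}:{webhook['tag']}"
    updated = copy.deepcopy(deployment)
    for container in updated["spec"]["template"]["spec"]["containers"]:
        # Only touch containers that run the repository named in the webhook.
        # Simplification: assumes the image name has no registry port in it.
        if container["image"].rsplit(":", 1)[0] == repo:
            container["image"] = new_image
    return updated


deployment = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "image": "example/app:v1"},
        {"name": "proxy", "image": "example/proxy:1.0"},
    ]}}}
}
updated = apply_image_update(deployment, {"repository": "example/app", "tag": "v2"})
print([c["image"] for c in updated["spec"]["template"]["spec"]["containers"]])
```

Note that nothing here knows how the image was built, which is the decoupling between CI and CD being discussed.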

[00:16:06.18] Absolutely. Again, here it’s just preparing the mindset for – you actually build your own pipeline. And again, it’s more about the internal requirements you have within an organization. If you cannot choose open source tools, then probably you’ll need to run your own Jenkins servers, and run your CI/CD pipelines there… And that’s absolutely fine, because even if you use Jenkins, you still need an artifact; you still need to deploy it within an environment, be it in a data center… It doesn’t necessarily need to be a Kubernetes cluster, but the fundamentals are there. So I’m trying to provide this information that they can reuse in different environments.

So I guess this philosophy – I call it the Unix philosophy, with small utilities, and then combine and compose in infinite ways… Is that what cloud-native is to you? Or is it something else?

I think it becomes that. What actually draws me to cloud-native is the diversity of tooling. And not necessarily the tooling, but the strategies… Because I’ve been interacting with many organizations, I’ve been talking with many engineers, and when you’re talking about the infrastructure and their platform setup, not one platform is gonna be the same. Even if they use Kubernetes, the way they use Kubernetes, the way they deploy to Kubernetes, the way they bootstrap the cluster, maintain it - all of these answers are gonna be different for every single organization, pretty much. I haven’t seen too much overlap from one setup to the other… And I think this is the beauty of this environment, of cloud-native - the diversity, this inclusivity of multiple solutions. You can actually leverage your product further. You can use very good fundamentals; you have a platform that pretty much can schedule your application, it can take care of it, it can restart it… You have higher abstraction layers, it’ll maybe scale your application, and so forth. You have the fundamentals. What you need to think about is how can you further leverage your product, and maybe use the tooling that are right for your organization in terms of budget, in terms of the resources you have; sometimes outsourcing is gonna be the answer, sometimes building everything in-house is gonna be the answer.

So it really varies, but the interesting part about all of these platforms - they’re different, but they leverage open source at the same time, and they try to contribute… I’m kind of amazed, because this small adoption, this small integration builds up to this organic growth of the entire cloud-native ecosystem, and open source tooling at the same time. This is maybe something very miraculous to observe and actually see how it grows.

Thinking about Kubernetes - it’s been around for seven years; we’re actually marking seven years now. But it completely changed the way we have these application deployment strategies. We had the VMs, and this was the buzz thing maybe ten years ago, but within seven years everything just changed completely. And we have a lot of data, a lot of surveys and reports showcasing this. We see enterprises actually thinking about Kubernetes; they’re thinking about multi-cloud strategies. And because Kubernetes is agnostic, you can run it on any compute, it can be any cloud provider; as long as you have some compute, you have some networking components, and storage, of course, you will be able to build a cluster anywhere.

And this is the beauty of it, because we can leverage this and build with multiple clouds, and you can migrate applications quite easily using the same abstraction layer. So the beauty of it is it’s a pluggable system; it’s interoperable. It’s diverse and organically growing, and I think this is something which is quite important, and maybe is something which is not easy to replicate. I don’t think any organization will be able to maybe have the same success with an internal tool, kind of growing and gathering ideas from different communities, and build it up… So this is kind of, again, the beauty of the cloud-native, and maybe that’s why I’m within this space still.

[00:19:54.06] Yeah. That is a very good answer, and sometimes I think if you take away Kubernetes, what are you left with? So Kubernetes definitely was the center of it all, and many things are being built on top of it and around it… But if you remove it, I think we’re starting to see other scheduling platforms. I don’t know whether they slowly emerge, but there are other options. I think people sometimes say “You know what, Kubernetes is too complicated.” I say “Okay. Well, you can use something else, but the problems that you have to solve will still be more or less the same.” So - sure, you can use something else… I don’t know, I’m trying to think… If I wouldn’t use Kubernetes, I think I would try maybe Nomad out… What would your choice be, Katie, if maybe let’s say you couldn’t use Kubernetes? What would you use?

If I couldn’t use Kubernetes… That’s an interesting one; what is life without Kubernetes…? [laughter] I’ve been working with it for so long I’m quite biased here. But I think it really depends on what kind of applications I have here. I’m actually, again, biased from something that I would like to maybe deep-dive more… Serverless is something which has been extremely beneficial for many organizations. So if I have an application that has to be there for a certain amount of time, it’s something which is timed, or timeframed, let’s put it this way, then I think serverless is definitely something to have out there.

Again, it depends on the organization, but something in me would be like “I don’t wanna go the data center way again”, because I think – actually, I wouldn’t mind this, but I think this is a space which requires a bit of modernization as well, in terms of how we set it up. If the tooling is right, if the mindset is right, I think it can be a very good setup there as well. If you have an organization that is very restricted, and it has to deal with all the compliance, then definitely a data center is gonna be the way. But yeah, I think my answer here is I don’t have one specific platform. It really depends on what I would like to deploy. I think the cloud providers have very good offerings around anything that you’d like to deploy; they have stacks built for you straight away, so sometimes you just need to put your application in and then you’re gonna have it running somewhere, as long as – for example, if you don’t care which ones, you don’t care about managing that infrastructure, they will be able to run it for you somewhere, and it’s gonna be available to the customers.

So my answer here is - as you mentioned, there is a very good diversity to containers as well, [unintelligible 00:22:23.01] But it really depends on what you would like to deploy; based on that, you’d be able to choose maybe some more specific tooling, something which is right.

Yeah. A PaaS - now that you’ve mentioned, a PaaS may work, for sure… Because you’re right, I know there are many efforts to abstract Kubernetes away. It’s an implementation detail, as the load balancers - we used to configure them, and now they’re just an implementation detail; it’s just an Ingress. Same way, maybe you don’t need Kubernetes. Maybe what you actually want is just a PaaS… And in some cases, maybe not even that. Maybe you want a serverless platform that you just push your functions and you define your integrations and inputs and outputs and all that stuff, and that’s it; that’s all you need.

So that is really interesting, because I think, again, going back to the choice comment - I think we’re really spoiled for choice these days… And then if you’re trying to cling on to your bare metal machines - that’s okay, there’s nothing wrong with that, but I don’t think people can be as exclusivist anymore. It’s not like “This is the only way and that’s it.” You have to be blind to not see all the other ways, which - by the way, in some case may work better. You may be spending less time toiling away at your infrastructure and maybe focusing more on the business… I don’t know, that’s just a thought. I know it works well for some.

Break

[00:23:52.20]

Imagine, Katie, that this is your first day in a new job. You’re leading a team of five or seven developers; you’re the lead developer here, and also you have a bit of an architect role… And the brief is to design an online presence, an online store for a supermarket that needs to service mobile applications, web browser, web applications as well… So more about web applications. But you’re in charge of building the API, handling the data part as well, and then there are some other frontend teams which maybe build on top of it. And by the way, that may not be a good division, and you would point it out if it’s not. The frontend team should be separate from the backend team, I don’t know… But how do we build an online store, Katie?

Well, it depends in which stage we are of building that store… Because having a first MVP I think is more important than having the full architecture grounded down to every single service that is gonna be covered within this store.

So at the beginning, the first MVP - what is actually a store? Of course, we’re gonna have a web interface. This is what the customers will interact with. We’re gonna have a database, which is gonna be pretty much storing all of our items that we’d like to sell. We’re gonna have a backend which pretty much will connect the two things together, so pretty much will take any request coming from the frontend, make sure that it has all the information from the database, and kind of provide back the response.

So these are kind of the three “major” things - usually, the frontend, the backend, and then you have a database component as well. This is gonna be the bare minimum. But the fun part starts when the backend actually is not just about getting a request and providing a response back, it’s about maybe expanding to different functionalities as well.
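As a rough sketch of that bare-minimum MVP, assuming an in-memory SQLite database and a hypothetical `items` table with made-up rows: the backend function below stands between a frontend request and the database, reads the items and hands back a response. The function and field names are illustrative, not taken from the course.

```python
import sqlite3


def create_store_db() -> sqlite3.Connection:
    """Database: an in-memory store of the items we would like to sell."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER)")
    conn.executemany(
        "INSERT INTO items (name, price_cents) VALUES (?, ?)",
        [("apples", 250), ("bread", 180)],  # illustrative data only
    )
    conn.commit()
    return conn


def handle_list_items(conn: sqlite3.Connection) -> dict:
    """Backend: take a frontend request, read from the database, return a response."""
    rows = conn.execute("SELECT id, name, price_cents FROM items ORDER BY id").fetchall()
    items = [{"id": r[0], "name": r[1], "price_cents": r[2]} for r in rows]
    return {"status": 200, "items": items}


conn = create_store_db()
response = handle_list_items(conn)
print(response)
```

Everything lives in one process here, which is the monolithic starting point; the later splitting into services would move pieces like this behind their own APIs.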

So when thinking about a store, we have a shopping cart, we can have, for example, different currencies we’d like our prices to be displayed in. You might have different languages, we have different categories, maybe we have even different portals for different stores which are managed by the same company. One is gonna be the grocery store, one is gonna be the clothing… There are so many varied things. Maybe the interface is gonna be the same, but some of the functionalities can change, for example. So you can really go the extra mile and personalize the entire experience for your user.

But [unintelligible 00:27:17.08] have the first MVP. You don’t need to segregate everything across three repositories and have everything nailed down on an architectural level. Have it working - a basic frontend, a basic backend, and a database running maybe even locally, and that’s it. This is gonna be the monolithic approach. So if this is just for testing maybe, or this is just at the level of the MVP, I think having a monolith is fine at this stage. However, as you’d mentioned, if I’m in a team with a couple of engineers, [unintelligible 00:27:47.26] I might have a frontend team, or I might have even the ability to employ new people and create different teams across the organization. In that case, that’s when we start to think about “Do we need to split up this application into different components?” And most of the time, the answer is gonna be yes, because you want your application to be resilient. If one component goes down, you don’t want the entire application to be down.

[00:28:12.28] For example, if the shopping cart is not working, if that functionality specifically is not working, everything else on the store is still working; people will be able to visualize, maybe they will still be able to get an order confirmation, or do the payment and so forth…
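One way to picture this reduced blast radius, as a sketch under assumptions: the catalog and cart are modeled as two independent callables with hypothetical names, and the cart is made to fail on purpose. The page still renders with the products; only the cart is marked unavailable.

```python
def render_store_page(catalog_service, cart_service) -> dict:
    """Build the page from independent services; a cart failure must not take it down."""
    page = {"products": catalog_service(), "cart": None, "cart_available": True}
    try:
        page["cart"] = cart_service()
    except Exception:
        # Contain the failure: degrade the cart feature, keep everything else working.
        page["cart_available"] = False
    return page


def catalog_service() -> list[str]:
    return ["apples", "bread"]  # illustrative data only


def broken_cart_service() -> list[str]:
    raise ConnectionError("cart service unavailable")


page = render_store_page(catalog_service, broken_cart_service)
print(page)
```

In a monolith the same failure could crash the whole process, which is the motivation for splitting given in the answer above.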

So you really want to segregate things. You really want to reduce the blast area if something goes wrong. And this is where usually the answer is – “Are microservices good for me at this stage?” the answer is gonna be yes. But again, you have to think “How would you like to split those services?” We’ve been talking about the frontend, the backend, the databases, but then we can talk about the payment systems - you can go the extra mile there, because you have different payment methods nowadays. You can go with a shopping cart, with the currencies… Again, you can have different teams, different services for all of these applications.

The other thing that I want to mention here is that once you have a set of microservices, that is not the end. You always should consider and make sure that you reiterate on your application, on your codebases… Because if for example you’ve written one microservice using Java, because that was the main resource knowledge that you had within the organization at that time - it was working, it was perfectly fine, however, you would like to optimize.

For example, Java is gonna be very heavy on the CPU consumption. And then you realize you’re looking to your machines and you would like to save some of that compute… And you’re thinking “Is it the right time to rewrite this microservice using a different language?” Actually, there is a [unintelligible 00:29:42.28] write something from Java to Python, and they’ll observe this differentiation of CPU consumption.

Having these maybe different languages and segregation of services allows you to have independent management of these applications, as long as you have a very well-documented API and you know how different products should call each other, you have a standard; you will be able to pretty much have this independent development on the services.

One thing that I wanted to mention as well - I’ve started to talk about this - you have to reiterate. Something you’ve been segregating in different microservices can be too granular, so you might think of maybe putting some of the currency and payment microservices together, if you find this too granular and there’s too much management… Because once you split these services, you have usually a different codebase, you have maybe a different language, at one point you might have some other teams managing independently… Sometimes having them together is the answer as well. So merging two microservices - that’s an operation where you can further exploit an application or a service to make sure that you have a better management of those services.

Some of these services are completely stale. For example, you’ve developed a very - I don’t know, maybe a very personalized shopping cart experience that no one is using; that’s one of the microservices that is not used, so you might mark it as stale and completely retire it from your cluster and from your application.

What I’m trying to say for all of these operations and microservices - it’s dynamic. It’s not set in stone. So you have an application, you might split it in microservices, but that’s not the end. You always have to reiterate and consider what is the best for your application team, for your business and for your organization at the time, and always kind of try to optimize and improve. This is the answer to many of the technology advances that we have nowadays - how can we further optimize it?

So it’s a journey, but again, one thing that I want to mention here - do try to understand the requirements of your organization. Everything is gonna be driven by those. So if you’re an organization that really wants to scale up, and they have all of the resources in the world, maybe you’re not gonna think about optimizing the application.

[00:32:01.14] For you, the primary thing is gonna be to be available out there; you have enough scale, you have enough resources, so cost optimization is not gonna be something you’re gonna look at very frequently. However, an organization which is a startup will be trying to use all of the free-tier resources that a cloud provider offers, for example. You’ll have to be very thoughtful about the resources you use. You might choose some of the tooling that is free tier, just because it will get you in a position where you can still run, but be very efficient with the money you have. So really try to understand what you have at the moment and try to build with the resources that you have.

What I’m hearing is “It depends.”

Yes. [laughs]

It depends on your context, always.

The short answer, yes.

Yeah, that’s the first thing. But starting with a monolith as an MVP does sound like a sensible approach, especially if you’re trying to prove a concept. And then based on that, it depends on how things go. You may decide to break it down into microservices. Where would you run this application or sets of microservices? Do you have a specific preference? We keep mentioning Kubernetes… Would you run it on Kubernetes, or would you use something else?

I actually have many people asking me that… Some of my friends - they are actually developing products, startups, very small startup companies, and they’re asking “Is Kubernetes the right thing for me at the moment?” And usually, what I answer in that circumstance is gonna be probably no, because it’s just only two people, both of them developers, they don’t have any infrastructure developer or cloud platform engineer within their team… So for them, managing and completely deep-diving into the Kubernetes architecture and management maybe is not the answer.

So in that circumstance, I would usually maybe suggest a cloud provider. Again, you have free tiers you can use from different cloud providers… So I would usually recommend that. However, if you are in a circumstance where you have enough engineering resources and you have enough expertise of maybe understanding how to run an infrastructure - not necessarily Kubernetes, but the basics of what exactly an infrastructure is composed of, then probably Kubernetes is gonna be the answer moving forward. Because when we’re talking about Kubernetes, it’s not just about containers, it’s about how these containers are managed, and what kind of leverage you get in a production environment by using Kubernetes.

I’ve been mentioning the scheduling capabilities… For example, you have – maybe I should introduce Kubernetes very briefly. So Kubernetes is pretty much an orchestration platform for containers that is run across a distributed amount of machines. So you can have different instances, and all of them are gonna be put together to run your applications. Now, on which node, on which instance - that doesn’t really matter. That’s always gonna be abstracted by Kubernetes. So one of the capabilities is gonna be the scheduling.

So based on the requirements you have for your application - for example you can choose “This application should have this amount of memory and CPU at all time. This is the very minimum of the resources I need for it to be up and running.” The scheduler would take these requirements into account and place it on a specific node that will have all of these available resources.
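The scheduling idea described here can be sketched in a few lines of Python. This is only an illustration, not how the Kubernetes scheduler is implemented: the `Node` and `PodRequest` types, the field names, and the "most free CPU wins" scoring rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


@dataclass
class PodRequest:
    name: str
    cpu_millicores: int  # minimum CPU the application needs
    memory_mib: int      # minimum memory the application needs


def schedule(pod: PodRequest, nodes: List[Node]) -> Optional[Node]:
    """Pick a node that can satisfy the pod's minimum requests, or None."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= pod.cpu_millicores
        and n.free_memory_mib >= pod.memory_mib
    ]
    if not feasible:
        return None  # nothing fits, so the pod would have to wait
    # Score: this sketch simply prefers the node with the most free CPU.
    chosen = max(feasible, key=lambda n: n.free_cpu_millicores)
    # Reserve the resources so later pods see the reduced capacity.
    chosen.free_cpu_millicores -= pod.cpu_millicores
    chosen.free_memory_mib -= pod.memory_mib
    return chosen
```

The point of the sketch is that the person deploying the application only states the minimum resources; which machine ends up running it is decided for them.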

Now, the thing is if that particular instance goes down, usually, if you’d be working in a data center, you will need to migrate the application, or you will pretty much need to trigger your load balancer to point to a different data center. Now, with Kubernetes all of this is automated. So Kubernetes will be managing or monitoring the state of your application all the time. It will see if it’s no longer up and running and it will go back to the scheduler, and this will put it on a different instance, again, with enough resources and capacity to run your application. So all of these operations are automated… And this is just one of the functionalities it provides.

[00:35:56.28] We have scalability, we have resources which will allow you to scale an application based on different events. For example, if you reach the amount of – maybe the maximum of the CPU or memory your application can consume, you’ll be able to scale further. But now you can actually scale on external events as well. For example, maybe there is a queue you have in a – actually, an SQS messaging queue in AWS would be able to take those metrics, and based on that, maybe scale further. It can really go the extra mile [unintelligible 00:36:27.04] personalized scale mechanism. And all of this, again, is automated. You have a declarative representation of your application, so your application is represented as code. It’s YAML, it’s not necessarily the most readable thing out there, but it’s out there, and that’s gonna be representing the desires that you want to have within a cluster… And that desired state is always gonna be fulfilled. You have controller managers which will always make sure that what you want is actually gonna be in the cluster. And it’s always in a loop to reach that ideal condition that you defined within the manifest.
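As a rough sketch of the two ideas in this answer, the Python below pairs a proportional scaling rule with a reconcile step. The scaling rule follows the same shape as the one documented for the Horizontal Pod Autoscaler (current replicas times current metric over target metric, rounded up); the function names, the cap, and the string "actions" are assumptions made for illustration. The metric could be CPU usage or something external such as queue length.

```python
import math
from typing import List


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, max_replicas: int) -> int:
    """Proportional scaling rule, floored at 1 and capped at max_replicas."""
    wanted = math.ceil(current_replicas * current_metric / target_metric)
    return max(1, min(wanted, max_replicas))


def reconcile(desired: int, running: int) -> List[str]:
    """Return the actions needed to move the running count to the desired count."""
    if running < desired:
        return [f"start replica {i}" for i in range(running, desired)]
    if running > desired:
        return [f"stop replica {i}" for i in range(desired, running)]
    return []  # already converged, nothing to do
```

A controller would call `reconcile` over and over, so a replica that disappears is simply started again on the next pass; that repetition is the "always in a loop" part.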

So these are just some of the functionalities. I didn’t even talk about the Ingress and how you actually manage the reachability to your application, how you have this abstraction across a collection of pods of services, and how can you have granular control of how your application serves different HTTP endpoints with Ingress, or how your Ingress can actually point traffic to different services and different applications. So you have a lot of availability out there, and you have custom resource definitions, and you really can go the extra mile. It’s a tool that is very customizable. But more importantly, again, it has some of the basics very well set, so you don’t have to think about them anymore; you just take advantage of them straight away.
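To make the Ingress idea a bit more concrete, here is a simplified Python sketch of path-based routing: each rule maps a path prefix to a service, and the longest matching prefix wins. The rule table and service names are hypothetical, and real Ingress controllers support more than this (hosts, exact matches, TLS and so on).

```python
from typing import Dict, Optional


def route(path: str, rules: Dict[str, str]) -> Optional[str]:
    """Return the service for the longest rule prefix matching the path."""
    request_parts = [p for p in path.split("/") if p]
    best_service = None
    best_length = -1
    for prefix, service in rules.items():
        prefix_parts = [p for p in prefix.split("/") if p]
        # Match whole path elements, so "/api" does not match "/apiary".
        if request_parts[:len(prefix_parts)] == prefix_parts:
            if len(prefix_parts) > best_length:
                best_service, best_length = service, len(prefix_parts)
    return best_service


# Hypothetical rule table: three services behind one entry point.
RULES = {"/": "frontend", "/api": "backend", "/api/admin": "admin"}
```

With these rules, a request for `/api/users` would go to the `backend` service, while `/api/admin/settings` would go to `admin`.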

So in that case, if you have a team that would like to run an application within a production environment, would like to kind of take advantage of all of these capabilities that Kubernetes provides, and they have enough resources within an organization to run it, then probably the answer is “Yes, do look into Kubernetes and start rolling it out.”

I would definitely agree with everything you said from a practical perspective, because even though Changelog is a monolith, the reason why we chose Kubernetes is that it takes care of certain details in a very elegant way. We can declare everything from certificates, to DNS, to load balancers, to even cron jobs; it has even the concept of cron job. And these are just like the built-ins, never mind the specific custom resource definitions (CRDs). So you can enrich it in so many ways… And it’s a really nice tool to work with, which seems to do very many things really well out of the box. Maybe some of them you won’t even need. You have policies, you have the whole network policies, the built-in security model - I forget what it’s called; I think OPA is one of them, the Open Policy Agent. So you can define certain constraints, certain requirements that need to be present…

So what I’m saying is it scales really well, so you can do so many things that would be very difficult to do in a different platform, and it just takes a lot of resource and a lot of knowledge and a lot of just time. Now, the good thing is once you learn it, it applies to anywhere Kubernetes runs, because it’s literally the same API; a few differences, but the same API. Maybe the persistent volume that you get is slightly different, and the load balancer has some extra things based on the platform that you choose. But the language is the same. So you have this unified API, and it just makes things happen, which is amazing. And not just once, continuously. So that’s great.

So you’re right, you can have a single app and still get a lot of mileage out of it if you want to or can afford to invest that time. Otherwise, maybe a platform-as-a-service. Maybe that’s going to be all you need. Maybe something like Heroku, or Cloud Foundry, or I think Render - that’s like the new version of Heroku… Different options like that. So there are options.

Now, you wouldn’t start with microservices to begin with, would you?

Probably… It depends on the scale. But if I have a running MVP that’s a monolith, I’m happy to move it forward and create this automated pipeline for it, if needed.

[00:40:15.19] How would you get updates out into production? What would you use for that? Let’s say that you have a monolith… What would you choose to get updates out into production?

So here’s where I would actually have this pipeline. I was actually wondering what a pipeline is, because when I first heard about it, I was an intern at the time, and there was this magical pipeline that can push changes to the production, and sometimes it can take days, because you have [unintelligible 00:40:40.08] changes and so forth. But a pipeline pretty much is a mechanism that will be able to roll out changes that you have within an application to the production environment. Ideally, that’s gonna be automated. And this is what is nowadays known as the continuous integration and continuous delivery. So CI and CD.

With the CI and CD you usually have different stages that you would like to go through. So once you have your application, you developed a new functionality. The next thing is actually to have some tests. I think this is quite a natural thing to do if you want to have something secure in production. It’s very often overlooked, but definitely do write your tests and actually do think what are those gaps that you might want to catch before pushing to production. Some of them are quite easy. Maybe some linting is gonna be the answer, maybe syntax checking, and so forth. So there are tools that do that for you, so do look into integrating those tests.

So you have the application, you test it, it kind of passes everything you’ve been writing out there… The next stage is to package it, building that artifact. When we’re talking about an environment where we have data centers, usually the artifact is gonna be a binary. And it can have different formats as well, depending on where we run it, on which operating system, and so forth.

When we’re talking about cloud-native, there’s gonna be a container image, so usually a Docker image. And what’s very good about the Docker image is that you can have a set of instructions building your binary or your artifact. And that’s something, again, declarative. You can reuse that, you can change it accordingly, or if you don’t want to use Docker for example, you can - as mentioned before - use a tool such as Podman or Buildah, which will build the container image straight away for you.

And once you have this image, usually you will need to store it somewhere. That’s gonna be, again, different environments; it’s gonna be an Artifactory. With cloud-native you’ll be able to use something like Docker Hub, you can use Harbor, you can use Artifact Hub currently available… So there are options to store your image out there.

So all these stages, like building your functionality, testing it, packaging it and distributing it - this is gonna be the continuous integration. So you’ve integrated a new functionality and your end result is gonna be a binary.
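The continuous integration stages listed here (test, package, distribute) can be pictured as an ordered list of steps where a failure stops everything after it. The Python below is a minimal sketch under that assumption; the stage names and the stand-in checks are hypothetical, and a real pipeline would call a linter, a test runner, an image build and a registry push instead.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure. Returns a log."""
    log = []
    for name, step in stages:
        if step():
            log.append(f"{name}: ok")
        else:
            log.append(f"{name}: failed")
            break  # later stages never run on a broken build
    return log


# Hypothetical stages; each lambda stands in for a real tool invocation.
EXAMPLE_STAGES: List[Stage] = [
    ("lint", lambda: True),
    ("test", lambda: True),
    ("build image", lambda: True),
    ("push image", lambda: True),
]
```

The design choice worth noticing is the early `break`: an image is only built and pushed if linting and tests have passed first.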

Now, the next stage of it is “How do I push this binary? How do I push all of these changes to the production environment?” And this is where we have the continuous delivery. With the continuous delivery usually we have to pretty much propagate the application for different stages. When we’re talking about an organization that, again, has resources, usually there are 3-4 environments that you’ll need to pass it through. The first one is gonna be the dev environment, you might have a QA as well - debatable; some companies do, some companies don’t - definitely have a staging, and then the final one is gonna be the production.

The reason you actually have all of these environments - and more importantly, they should be set up similarly. So the difference between them is just maybe the endpoint you reach to that cluster, so the API endpoint. But everything else in terms of the setup internally is gonna be the same.

[00:43:54.00] So what you actually do within the continuous delivery process is propagating it from one environment to the other… So QA, staging, and production. The passing through all of these stages - the results should be the same; the application should be up and running. So you have at least two possibilities to verify your application, and how to respond to the other components within the cluster. It’s not just the application running, it’s about how it affects other components within a cluster. And if it doesn’t and everything is fine, even greater.
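A small Python sketch of this promotion idea, assuming every environment shares one base configuration and differs only in its API endpoint. The configuration keys, the `example.com` endpoints, and the `verify` callback are all hypothetical placeholders.

```python
from typing import Callable, Dict, List

BASE_CONFIG = {"replicas": 3, "image": "app"}  # shared by every environment


def environment_config(name: str) -> Dict[str, object]:
    """Build an environment's config; only the endpoint differs."""
    config: Dict[str, object] = dict(BASE_CONFIG)
    config["api_endpoint"] = f"https://{name}.example.com"  # hypothetical URL
    return config


def promote(version: str, environments: List[str],
            verify: Callable[[str, Dict[str, object], str], bool]) -> List[str]:
    """Roll the version through each environment until a check fails."""
    reached = []
    for name in environments:
        if not verify(name, environment_config(name), version):
            break  # do not carry a failing version any further
        reached.append(name)
    return reached
```

Because the environments are set up the same way, a version that behaves well in staging gives reasonable confidence that it will behave the same way in production.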

So once you reach the production stage, this is pretty much the continuous delivery process, and hopefully it’s gonna be up and running all the way through. So these are pretty much all of the stages that we have.

Yeah. That’s a good one. Argo CD is what you’re thinking for CD?

Yes. I’ve actually had to battle [unintelligible 00:44:41.26] project manager, because they wanted to use a more traditional tool here… But I was very set up to maybe promote, or maybe – not necessarily promote, but advocate for the GitOps strategy. It’s something which is there, and it’s been a buzzword for the last year for the practitioners and the experts within the industry. I think they are completely tired of hearing this term. However, for students that kind of are on the journey to start their cloud-native journey, I think it’s important to maybe set up the fundamentals of what GitOps is.

Now within the cloud-native space we have Argo CD and FluxCD providing these capabilities. Both of them currently are incubating CNCF projects, and Argo CD is currently undergoing a [unintelligible 00:45:27.01] vote, which means it’s stable, it’s been used by hundreds of customers, it has a very healthy community, it has a very healthy development velocity and so forth, so it’s a healthy project, it’s out there.
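For anyone new to the term, the core of GitOps can be sketched as a comparison between what Git says should exist and what is actually running. The Python below is a simplified illustration of that comparison only, not the implementation of Argo CD or Flux; the resource names and the choice to delete anything missing from Git are assumptions made for the example.

```python
from typing import Dict


def diff_state(desired: Dict[str, dict], live: Dict[str, dict]) -> Dict[str, str]:
    """Compare desired resources (from Git) with live ones (from the cluster)."""
    actions = {}
    for name, spec in desired.items():
        if name not in live:
            actions[name] = "create"
        elif live[name] != spec:
            actions[name] = "update"  # someone or something drifted from Git
    for name in live:
        if name not in desired:
            actions[name] = "delete"  # assumes removed-from-Git means remove
    return actions
```

An empty result means the cluster matches the repository, which is the "in sync" state a GitOps tool tries to keep.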

Argo CD, actually - the reason I’ve picked it up is mainly because it has this Web UI, so it will be easier for students to visualize their resources. Because once you have a cluster, the only chance for you to interact with it is gonna be through the CLI, the command line… Which is still something not very comfortable for someone who starts to understand Kubernetes. So I wanted maybe to provide an extra support, a visual support for them to visualize those resources.

So that was the main reason, because I think it’s gonna be easier for the intended audience here… But that doesn’t mean that one is better than the other. It’s pretty much - in the context, I think it’s the best tool at the moment.

Break

[00:46:15.23]

That’s really interesting, because you’re right: between FluxCD and Argo CD, I also prefer Argo CD because of that visual element. I think the UI is really nice. Not only that, but I’m seeing that other projects, like for example Kubeflow Pipelines, which is about machine learning - they also use Argo Workflows behind the scenes… So now we start seeing that other projects are building on top of projects which were not meant to be used like that, but they’re really flexible and they work really well, and they have nice UI elements; you just get the UI for free, so it makes sense for example for machine learning to use something where you can see a UI. I think that’s really powerful.

[00:48:10.27] So it just goes to show that a tool sometimes people use it in unexpected ways, which are good, and many people like… And this is where something new and unexpected just happens. Nobody planned for this to happen, but it’s a good thing. So yeah, another vote…

Yeah, [unintelligible 00:48:26.02] responsive to customer feedback here. I definitely agree on this one, yeah.

…yeah, another vote for Argo CD from here. And for CI, I think that’s maybe less important… And the reason why it’s less important is because, first of all, people have been doing this for such a long time, so you may already have a preference, so whatever you’re using is fine. GitHub Actions is there, and I think that’s what you’re recommending in the course for the CI part, because it’s just so simple…

I mean, you have to store the code somewhere, and then wherever you’re storing the code, having the CI part as close as possible to that I think makes a lot of sense. So that’s like fairly easy. And for those that use GitLab - well, you already know, but you’re taken care of, so that, again, doesn’t really matter; you know what to do. So that’s really interesting.

Okay, what about monitoring, telemetry, logs, traces, events, any such things? Would you introduce that at these early stages, or would you just maybe mention a couple? How would you approach this?

So this is a very good question, because one of the things that I’m trying to, again, advocate for is - as an application developer, you’re gonna still need to understand your infrastructure, but you need to know where it’s gonna be pushed, or where it’s actually gonna be running and executed. This is actually quite important, because when I was talking about – I’m talking within the Kubernetes context here. When you have an application, it’s quite important to have these readiness and liveness probes, which automatically can restart your application if something goes wrong. So instead of someone waking up in the middle of the night and doing that, the cluster will be able to do that for you, as long as you have a health checking point out there.
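A health-checking endpoint of the kind mentioned here can be quite small. The sketch below uses only the Python standard library; the `/healthz` and `/readyz` paths, the `APP_STATE` flag, and port 8080 are conventions assumed for the example rather than anything Kubernetes requires. In Kubernetes, a failing liveness probe leads to the container being restarted, while a failing readiness probe takes the pod out of traffic until it recovers.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

APP_STATE = {"ready": False}  # flipped to True once startup work is done


def health_response(path, state):
    """Map a probe path to an HTTP status code and a JSON-able body."""
    if path == "/healthz":  # liveness: the process is up and can answer
        return 200, {"status": "alive"}
    if path == "/readyz":   # readiness: is it safe to send traffic yet?
        if state["ready"]:
            return 200, {"status": "ready"}
        return 503, {"status": "not ready"}
    return 404, {"error": "not found"}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path, APP_STATE)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the probe server; blocks until interrupted."""
    APP_STATE["ready"] = True
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `serve()` starts the server; the probes in the pod's manifest would then be pointed at these two paths.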

So what I’m doing at the beginning of the course and kind of making sure that everyone understands is be aware if your application is gonna be executed. There I’m talking about the health checking points. I’m talking about the metrics endpoints; if you want to export any application-specific metrics - I’m talking about logs - be sure that you’re actually logging on different stages within your applications, different functions, when they are called and so forth. I’m even talking about traces as well, because some of the traces out there or some of the APMs at the moment - they require for the libraries to be used within the application to have these super-fine, granular representation of how a request is actually solved and how a request is actually getting the response from all of the functions that are called, and so forth, so you can actually build all of these components together and have the full journey.
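As a small illustration of logging when functions are called and exposing application-specific metrics, the Python sketch below wraps a function with a decorator that logs each call and counts it. The metric name, the `create_order` function, and the in-memory counter are hypothetical; the output lines imitate the plain-text style that Prometheus scrapes, but a real application would more likely use a metrics client library.

```python
import functools
import logging
from collections import Counter

logger = logging.getLogger("app")
CALLS = Counter()  # in-memory call counts, keyed by function name


def instrumented(func):
    """Log each call and count it so it can be exported as a metric."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALLS[func.__name__] += 1
        logger.info("calling %s", func.__name__)
        return func(*args, **kwargs)
    return wrapper


def render_metrics():
    """Render the counters as text, one line per instrumented function."""
    lines = [
        f'app_function_calls_total{{function="{name}"}} {count}'
        for name, count in sorted(CALLS.items())
    ]
    return "\n".join(lines) + "\n"


@instrumented
def create_order(item):
    """Hypothetical business function used to show the decorator."""
    return {"item": item, "status": "created"}
```

Serving the text from `render_metrics()` on a `/metrics` path would give a collector something to scrape, which is the "metrics endpoint" idea in its simplest form.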

So I’m talking about all of these components and kind of making sure that the students understand them. However, these are gonna be covered even further in the next course - I think it’s gonna be course three - which is gonna be focused solely on observability. They will talk about Grafana and Prometheus for metrics collection and visualization, they will talk about Jaeger for tracing, they will even touch upon how can you actually build these dashboards and panels and making sure that you have a very good representation of what’s going on inside your application. These - again, I’m just like covering them, kind of as beginning fundamentals; it’s just kind of making sure that you understand what it is about… But they’re gonna be used throughout the course, and the [unintelligible 00:51:38.03] project, which is at the end - and this is something that I’ve developed as well - the students will really need to kind of be very thorough in building their dashboards, because these are gonna be quite crucial for them to get these results of CPU consumptions between a microservice that has been rewritten in a different language. So they’ll have plenty of chances to practice their observability skills as well.

[00:52:01.10] That’s a very good one… Because people don’t think about that, and I think based on the platform that you choose, it can be either very easy, or very hard. Once you’re really well into your journey and you think you have it, and everything is looking good, then you discover, “Oh, hang on… So now I need to understand how my application behaves?” Yes, you do. That’s something that you should keep in mind from the beginning. And based on the platform that you choose, it can be either very easy, or very hard.

So sometimes it can be straightforward, and even then, the straightforwardness is in the approach… Because there’s those complexities associated to what you care about, what your application does, how it’s structured, or your microservices, how they’re structured, and it’s all very contextual. So it’s very difficult to maybe build something that is generic. I mean, you could have an APM, that may be good enough, and you would have some traces, but is that sufficient? Maybe it is, maybe it isn’t. I don’t know. That’s where it depends. It basically starts to depend more on the context, and less [unintelligible 00:53:01.26] because it’s less generic. So that’s the first thing - instrumenting your code, that started becoming more important, and only you know where to put those instrumentation calls. Nobody can tell you, because it’s your code. So there’s that.

I completely agree on this one. And one thing that I want to mention here is that this need of understanding where your application runs is kind of bringing this need for the dev ops practice. I know it’s been completely consumed as a topic, but for students that, again, are on the journey of understanding cloud-native - maybe they have been a programmer, they have been training to be a software developer, but they look into cloud-native and they wanted to transition… I think it’s important for them to understand that dev ops is not a tool, it’s not something physical, it’s a culture. It’s pretty much as you’ve mentioned - as an application developer and as an infrastructure engineer, how do we leverage the product further?

So this collaboration, for example, using a particular tracing application, or you need to kind of run or integrate some libraries to collect those logs, so you actually can visualize them in – I don’t know, it can be for example in Splunk, or Datadog; it depends on the tool of choice that you have within the house… So this collaboration is all about making sure that you have this full transparency of what’s happening at the application layer, and the infrastructure team will be able to leverage that with the tooling that they provide.

Okay. So as we’re approaching the end, I hope that you really enjoyed what Katie’s been saying, because I have… And the course is currently free, or will be free…? There’s something like “free” in the tweets; I think it’s not very clear. Some people are getting confused about that part, so can you help us, Katie, understanding it?

Absolutely. This is something that I’m clearing out myself as well. So the way I’ve built the course, it’s supposed to be free. However, because it’s part of a wider nano-degree - so I’m just kind of having a fourth of the entire nano-degree. So there are four courses; the first one is going to be Cloud-native Fundamentals, the second one is gonna be Message Passing, so we kind of talk about gRPC and Kafka, the third one is gonna be Observability, and the fourth one is gonna be Security. So I’m teaching the first one.

So this is kind of a wider nano-degree put together… So my course is free, but at the moment it’s not yet available as a standalone free material. It is gonna be available later on this summer; unfortunately, it’s taking longer than needed. But this is because currently we are having 15,000 students that are doing the first course that [unintelligible 00:55:37.24] Once they finish this course, they will be able to pretty much open it to the wider community as well.

[00:55:46.26] So it is gonna be free, it is intended to be free. I am not charging for it at all. I’ve built it purposely to – part of my motivation to make cloud-native ubiquitous is making it accessible and available to everyone… Even someone who doesn’t do any technology at all, I hope they’ll be able to have some programming experience before and they will be able to roll out for the entire course as well.

So currently it’s not yet available, but once I have the links, I’m gonna share it and make sure that everyone will have it. I’m actually open to further feedback from everyone as well.

Thank you. So that’s great. Maybe by the time you’re listening to this, the course will be free, or just like a week or two weeks away from becoming available, so you can take it. And that is the free part. I’m sure that there will be a course that people will be able to pay you for it if they want to, right? Hopefully? No? There should be a paid version as well, right?

Yeah, the entire nano-degree is paid. That’s why people are currently asking “Where is the free material?” So the entire nano-degree actually has a price; so if you wanna take it as part of the nano-degree, maybe your organization already has an affiliation to Udacity, which means you don’t have to pay for it… Which is great, because many organizations have these training programs internally, they have a lot of contracts with these vendors as well… So maybe you’ll be able to take it completely free because it’s already paid for.

However, if someone would like to do it, you’ll have to take the entire nano-degree and pay for it. That’s the only option. Once it’s available as a standalone material, I’m gonna be able to share it free for you… Because the feedback so far I’ve been receiving - it’s actually great, because I’ve been developing this course starting November 2020, and I finished it in January 2021… So it’s been quite intense, I would say. It’s been four months from the beginning to the actual end for me working on it, and now it’s been half a year since and actually I can see that the students do find it useful. It’s been difficult for me to realize the impact.

One of the motivations, again, for me, is to grow the next generation of cloud-native practitioners, to make it easier for them to transition within a role that has cloud-native elements. And I’ve been developing this, but I haven’t seen any results. Now I’m actually starting to see those, and it’s really delightful to see students from across the entire world sending me messages on LinkedIn, and requests, and being like “The material is great. I really understand everything you’re saying. I would like to learn further from you, that’s how I would like to connect.” So it really inspires me to – again, it’s been a great work that I’ve been doing, and I really hope it actually reaches as many students as possible.

Thank you, Katie. That sounds wonderful. Thank you for putting in the time, for caring enough about this, because it is important, but maybe many people don’t realize just how important it is… As time goes by, I’m sure this will become even more and more relevant, and people will enjoy that such great materials exist… So thank you, Katie, for taking the time.

If you have not heard of this course, go and check it out. It will be in the show notes. Give Katie feedback, what you liked, what you didn’t like, how she can improve it, because there’s always scope for that, to improve, to make it better… But I think you will really enjoy what’s already out there. If you just look at the blurb, the description, there’s a lot of very valuable content.

Katie, it’s been a pleasure. Thank you for sharing this with us. I look forward to speaking with you sometime soon.

Thank you for having me. There is one last thing I would like to mention - taking the course is just the first step. One thing that I’m actually calling out at the end of the course is “Do reach out to the community”, and I’m expecting everyone in the cloud-native community. I think this is an action that I would like every student to take, if possible. Being part of the community does not necessarily mean writing a thousand lines of code and being out there. It’s being present, it’s sharing your experience of using cloud-native, it’s reaching out to the community… So the community is out there, and we are expecting you as well, so please do reach out, and… Yeah, let’s grow the next generation of cloud-native practitioners. That’s the high purpose.

Thank you, Katie.

