Patrick DeVivo pointed tickgit at Kubernetes’ source code and discovered that the team has a lot TODO…
- 2,380 TODOs across 1,230 files from 363 distinct authors
- 489 TODOs were added in 2019 so far
- 860 days (or 2.3 years) is the average age of a TODO
That’s just a taste of what they found. The article has more info and some analysis to boot.
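tickgit does much richer analysis (authorship and age come from git history), but for a rough sense of what "counting TODOs" means, a crude shell approximation over a sample tree might look like this (the /tmp/todo-demo tree is a stand-in for a real checkout):

```shell
# Create a small sample tree to scan (stand-in for a real repo checkout).
mkdir -p /tmp/todo-demo/pkg
printf 'func main() {\n\t// TODO: handle errors\n}\n' > /tmp/todo-demo/main.go
printf '// TODO: add tests\n// TODO: refactor\n' > /tmp/todo-demo/pkg/util.go

# Count TODO lines, and the number of files containing at least one.
grep -r "TODO" /tmp/todo-demo | wc -l    # total TODO lines
grep -rl "TODO" /tmp/todo-demo | wc -l   # files containing a TODO
```

Unlike this sketch, tickgit can also tell you who added each TODO and when, which is where the age and author stats above come from.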
Chaos Mesh is a cloud-native Chaos Engineering platform that orchestrates chaos on Kubernetes environments. At the current stage, it has the following components:
- Chaos Operator: the core component for chaos orchestration. Fully open sourced.
- Chaos Dashboard: a visualized panel that shows the impact of chaos experiments on the online services of the system. Under development; currently it only supports chaos experiments on TiDB (https://github.com/pingcap/tidb).
For the uninitiated, chaos engineering is when you unleash havoc on your system to prove out its resiliency (or lack thereof).
What do you do when you have CronJobs running in your Kubernetes cluster and want to know when a job fails? Do you manually check the execution status? Painful. Or do you perhaps rely on roundabout Prometheus queries, adding unnecessary overhead? Not ideal… But worry not! Instead, let me suggest a way to immediately receive notifications when jobs fail to execute, using two nifty tools…
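For context on the "roundabout Prometheus queries" the author dismisses: that approach typically leans on kube-state-metrics. A sketch of such an alert rule (this assumes kube-state-metrics is installed and exporting its standard job metrics; it is not the article's solution) might look like:

```yaml
groups:
  - name: cronjob-alerts
    rules:
      - alert: KubeJobFailed
        # kube_job_status_failed is exported by kube-state-metrics
        expr: kube_job_status_failed > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Job {{ $labels.namespace }}/{{ $labels.job_name }} failed"
```

The article's point is that you can skip this machinery entirely and get notifications more directly.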
How do you know if your Kubernetes cluster is production-ready?
If you’re a beginner, it’s hard to tell what you’re missing. The subject is so vast, and it’s easy to lose sight of the right path to production.
And even if you’re an expert, remembering all networking, storage, cluster, and application development best practices is impossible. There are so many.
Here is a curated list of Kubernetes best practices to help you drive your roadmap to production.
Check things off the list and keep track as you go. ✅
You should have a plan to roll back releases that aren’t fit for production. In Kubernetes, rolling updates are the default strategy to release software.
In a nutshell, you deploy a newer version of your app and Kubernetes makes sure the rollout happens without disrupting live traffic. However, even with techniques such as rolling updates, there’s still a risk that your application won’t work the way you expect at the end of the deployment.
Kubernetes has a built-in mechanism for rollbacks. Learn how it works in this article.
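As a quick illustration of that built-in mechanism (these commands need a live cluster, and the Deployment name my-app is a placeholder), a rollback boils down to a few kubectl commands:

```shell
# Watch a rolling update as it progresses.
kubectl rollout status deployment/my-app

# Inspect previous revisions of the Deployment.
kubectl rollout history deployment/my-app

# Roll back to the previous revision...
kubectl rollout undo deployment/my-app

# ...or to a specific revision from the history.
kubectl rollout undo deployment/my-app --to-revision=2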
Have you ever created a Kubernetes cluster and wondered what type of worker nodes you should use? For example, if you’re on AWS, should you use many small, cheap t2.micro instances, or a few powerful m5.xlarge instances?
This article discusses the pros and cons of using different worker node sizes in your cluster.
In this workshop, we’re going to:
- Deploy Kubernetes services and an Ambassador API gateway.
- Examine the difference between Kubernetes proxies and a service mesh like Istio.
- Access the Kubernetes API from the outside and from a Pod.
- Understand what API to choose.
- See how Service Accounts and RBAC work.
- Discover some security pitfalls when building Docker images.
- Other interesting things :-)
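One of the topics above, accessing the Kubernetes API from inside a Pod, can be sketched as follows. The mount path and in-cluster DNS name are standard, but whether the request succeeds depends on what RBAC permissions the Pod’s Service Account has:

```shell
# The service account token and CA cert are mounted into every Pod by default.
SA=/var/run/secrets/kubernetes.io/serviceaccount
TOKEN=$(cat $SA/token)

# The API server is reachable in-cluster at this well-known DNS name.
curl --cacert $SA/ca.crt \
     -H "Authorization: Bearer $TOKEN" \
     https://kubernetes.default.svc/api/v1/namespaces/default/pods
```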
If you’ve ever wondered why exactly Kubernetes is a thing OR wondered what the root problem is that Kubernetes solves, then this post from Jef Spaleta is for you.
For organizations that operate at a massive scale, a single Linux container instance isn’t enough to satisfy all of their applications’ needs. It’s not uncommon for sufficiently complex applications, such as ones that communicate through microservices, to require multiple Linux containers that communicate with each other. That architecture introduces a new scaling problem: how do you manage all those individual containers?
…Enter Kubernetes, a container orchestration system — a way to manage the lifecycle of containerized applications across an entire fleet.
Octant aims to be part of the developer’s toolkit for gaining insight into and approaching the complexity found in Kubernetes. It offers a combination of introspective tooling, cluster navigation, and object management, along with a plugin system to further extend its capabilities.
The CNCF has been funding security audits of its projects since last year. With CoreDNS, Envoy, and Prometheus taken care of, Kubernetes itself recently received the treatment.
The assessment yielded a significant amount of knowledge pertaining to the operation and internals of a Kubernetes cluster. Findings and supporting documentation from the assessment have been made available today, and can be found here.
If you don’t want the full report, the linked announcement lists some of the major takeaways.
This isn’t just for business executives. It’s good knowledge for anyone who has heard the hype around K8s but never any of the potential problems:
This post will cover some hard truths of Kubernetes and what it means for your organization and business. You might have heard the term “Kubernetes” and you might have been led to believe that this will solve all the infrastructure pain for your organization. There is some truth to that, which will not be the focus of this post. To get to the state of enlightenment with Kubernetes, you need to first go through some hard challenges. Let’s dive in to some of these hard truths.
A proof-of-concept virtual Kubernetes control plane that lets you take one physical Kubernetes cluster and chop it up into smaller virtual clusters. The benefits of doing this are:
- Better security/multitenancy
- Better separation of concerns between infra and custom controllers (operators)
- Ability to package complex k8s-based applications
Learn from other people’s failure stories. This is a compiled list of public Kubernetes failure stories. Why?
Kubernetes is a fairly complex system with many moving parts. Its ecosystem is constantly evolving and adding even more layers (service mesh, …) to the mix. Considering this environment, we don’t hear enough real-world horror stories to learn from each other! This compilation of failure stories should make it easier for people dealing with Kubernetes operations (SRE, Ops, platform/infrastructure teams) to learn from others and reduce the unknown unknowns of running Kubernetes in production. For more information, see the blog post.
Polaris helps keep your cluster healthy. It runs a variety of checks to ensure that Kubernetes deployments are configured using best practices that will avoid potential problems in the future.
It provides a dashboard with an overview of how your clusters are doing, as well as an experimental “validating webhook” that can stop future deployments that don’t live up to those standards.
Why Kubernetes? Should you roll your own servers? Should you go off the cloud?
If you’ve listened to The Changelog #344 — where we cover the details of Changelog.com’s 2019 infrastructure with special guest Gerhard Lazu — then you’ll know the answer to these questions. But if not, as you might assume, I recommend listening to that episode and reading this post from Ev, in that order.
In this three-part blog series, we’ll try to address some of the fears and uncertainties faced by organizations who had successfully started their projects on public clouds, like AWS, but for one reason or another found themselves needing to replicate their cloud environment from scratch, starting with an empty rack in their own enterprise server room or a colocation facility.
Popeye is a utility that cruises Kubernetes cluster resources and reports potential issues with your deployment manifests and configurations. By scanning your clusters, it detects misconfigurations and ensures best practices are in place, thus preventing potential future headaches.
This is a read-only tool, which means it’s pretty safe to kick the tires. For the back story, check out Fernand’s announcement post.
Have you ever stared at the terminal window but couldn’t remember which Kubernetes cluster it was set up for? Are you bored of typing kubectl get pods for the millionth time? Learn how to boost your kubectl productivity with these 6 tips.
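The article has its own six tips; as a flavor of what such tweaks look like, here are a few common, generic ones (a shell-rc fragment, not necessarily the article’s list; requires kubectl to be installed):

```shell
# A short alias saves thousands of keystrokes over time.
alias k=kubectl

# Enable shell completion for kubectl (bash shown; zsh is analogous).
source <(kubectl completion bash)
complete -o default -F __start_kubectl k

# Show which cluster/context this terminal is pointed at.
kubectl config current-context
```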
There’s another gorilla to consider for container orchestration.
Kubernetes is the 800-pound gorilla of container orchestration. It powers some of the biggest deployments worldwide, but it comes with a price tag.
Especially for smaller teams, it can be time-consuming to maintain and has a steep learning curve. For what our team of four wanted to achieve at trivago, it added too much overhead. So we looked into alternatives — and fell in love with Nomad.
From the Nomad website:
HashiCorp Nomad is a single binary that schedules applications and services on Linux, Windows, and Mac. It is an open source scheduler that uses a declarative job file for scheduling virtualized, containerized, and standalone applications.
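To make “declarative job file” concrete, a minimal Nomad job looks roughly like the sketch below (the job, group, and task names, the datacenter, and the image are placeholders):

```hcl
job "web" {
  datacenters = ["dc1"]

  group "frontend" {
    count = 2

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:1.17"
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```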
Anyone from the community with experience using Nomad? Let us know in the discussion below.
Omer Levi Hevroni:
When we made the shift to Kubernetes, we wanted to keep our devs independent and put a lot of effort into allowing them to create services rapidly. It all worked like a charm – until they had to handle credentials…
The solution they came up with is called Kamus, which is:
an open source, GitOps, zero trust, secrets solution for Kubernetes applications. Kamus allows you to seamlessly encrypt secret values and commit them to source control
Jump over to the article for more on Kubernetes built-in secrets, an overview of some other alternatives, and a deep-dive on how Kamus works.
The intersection of service mesh and distributed tracing is exciting to me. This quick Kubernetes-based tutorial is a great way to see how it works in practice.
Submariner is a tool built to connect overlay networks of different Kubernetes clusters. While most testing is performed against Kubernetes clusters that have enabled Flannel/Canal, Submariner should be compatible with any CNI-compatible cluster network provider, as it utilizes off-the-shelf components such as strongSwan/Charon to establish IPsec tunnels between each Kubernetes cluster.
Pre-alpha so it’s not ready for production, but it is ready for a follow.
K3s is a fully compliant production-grade Kubernetes distribution with the following changes:
- Legacy, alpha, and non-default features are removed. Many of these features aren’t available in most Kubernetes clusters anyway.
- Removed in-tree plugins (cloud providers and storage plugins) which can be replaced with out-of-tree add-ons.
- Added sqlite3 as the default storage mechanism. etcd3 is still available, but not the default.
- Wrapped in a simple launcher that handles a lot of the complexity of TLS and options.
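That simple launcher is reflected in the install experience; per the K3s docs, a single-node setup is a one-liner (as always, inspect a script before piping it to sh):

```shell
# Download and run the K3s installer (sets up k3s as a service).
curl -sfL https://get.k3s.io | sh -

# The bundled kubectl talks to the local cluster.
sudo k3s kubectl get nodes
```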
Rancher is also doing an online meet-up and demo of K3s on March 13, 2019.
- Security: reduce your attack surface by practicing the Principle of Least Privilege (PoLP) and enforcing mutual TLS (mTLS).
- Predictability: remove needless variables and reduce unknown factors from your environment using immutable infrastructure.
- Evolvability: simplify and increase your ability to easily accommodate future changes to your architecture.
Hit up the README if you’re curious about the name, why there’s no shell/SSH access, or how it’s different from CoreOS/RancherOS/LinuxKit.