May 26, 2022

A little Chaos can go a Long Way, KubeCon & CloudNativeCon Europe 2022

Contributed by Idan Balisha, Solution Architect, following his attendance at KubeCon & CloudNativeCon Europe 2022.


As responsible DevOps engineers, we always test our production and development infrastructure and validate that our applications can survive disruptions and failures, right?! As it turns out, chaos may benefit DevOps more than you might think, and Chaos Engineering doesn’t have to wait for the entire build-test-release-deploy process. At least, not anymore!


I learned more about this growing trend at KubeCon Europe 2022 during a session called “Case Study: Bringing Chaos Engineering to the Cloud Native Developers,” presented by Uma Mukkara, CEO of ChaosNative, and Ramiro Berrelleza, CEO of Okteto.


Chaos Engineering began as a way to uncover unknown problems at scale. But according to Mukkara and Berrelleza, it has recently evolved into a practice area of its own: it now plays a major role in CI/CD, separately from Ops, by improving the developer experience.

By bringing chaos engineering to the developer’s level, before merging, ChaosNative was able to offer Okteto a new and higher level of reliability. The pair discussed how Okteto, an open-source tool that lets developers deploy environments directly in Kubernetes, used ChaosNative’s LitmusChaos tests, running them as part of the development process rather than only in CI. These tests, or experiments, are both basic and chaotic, and include repeatedly shutting down databases and deleting deployments. Okteto found that the process improved CI/CD efficiency and surfaced bugs and failures at earlier stages, before merging.
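To make the idea of a pre-merge chaos experiment concrete, here is a minimal sketch in Python. It is not LitmusChaos or Okteto code: `FakeDatabase`, `read_with_retry`, and `run_chaos_experiment` are hypothetical names invented for illustration, and the "database" is an in-memory stand-in. The experiment repeatedly shuts the dependency down and checks that a client with retries still recovers once the dependency comes back.

```python
import random


class FakeDatabase:
    """Hypothetical in-memory stand-in for a real database dependency."""

    def __init__(self):
        self.up = True
        self.rows = {"user:1": "alice"}

    def get(self, key):
        if not self.up:
            raise ConnectionError("database is down")
        return self.rows[key]


def read_with_retry(db, key, attempts=3, on_failure=None):
    """Read a key, retrying on connection errors.

    on_failure is a hook called after each failed attempt; a real client
    would typically sleep or back off there.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return db.get(key)
        except ConnectionError as err:
            last_error = err
            if on_failure is not None:
                on_failure()
    raise last_error


def run_chaos_experiment(rounds=20, seed=0):
    """Repeatedly shut the database down and count the reads that recover."""
    rng = random.Random(seed)
    db = FakeDatabase()
    survived = 0
    for _ in range(rounds):
        db.up = False  # inject the fault: the dependency disappears
        # Assumption: something (e.g. an orchestrator) restarts the
        # database after one or two failed client attempts.
        failures_before_restart = rng.randint(1, 2)
        seen = {"count": 0}

        def restart_when_due():
            seen["count"] += 1
            if seen["count"] >= failures_before_restart:
                db.up = True

        value = read_with_retry(db, "user:1", attempts=3,
                                on_failure=restart_when_due)
        if value == "alice":
            survived += 1
    return survived, rounds


if __name__ == "__main__":
    survived, rounds = run_chaos_experiment()
    print(f"{survived}/{rounds} reads survived the injected outages")
```

Running a check like this on a developer's branch, rather than waiting for a later pipeline stage, is the spirit of what the session described. A real experiment would target actual Kubernetes resources instead of an in-memory fake.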


At TeraSky, we have also helped our customers leverage chaos engineering to make their products more resilient. We have generally used public cloud services such as AWS Fault Injection Simulator (which improves resiliency and performance with controlled experiments) and service meshes such as Consul (which improves resilience through retries, circuit breaking, and timeouts), as well as other commercial and open-source tools. It was fascinating to hear about Okteto’s experience with LitmusChaos and see that this approach is catching on.
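For readers unfamiliar with circuit breaking, the pattern can be sketched in a few lines of Python. This is an illustration of the general technique only, not how Consul implements it (in a service mesh it is normally handled by proxy configuration, not application code). The class name, the threshold, and the cool-down values are assumptions chosen for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Minimal circuit breaker: stop calling a failing dependency for a while."""

    def __init__(self, failure_threshold=3, reset_after=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

A chaos experiment that takes a dependency offline is a practical way to confirm that protections like this actually trigger, and that callers fail fast instead of piling up behind a dead service.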

We would be happy to talk with you to help you adopt a chaos culture in your organization. There’s a lot more than killing servers and pods to discuss ✌.
