On July 23, 2021, we released Chaos Mesh 2.0 GA. Chaos Mesh 2.0 is an exciting release that takes a solid step toward a closed-loop ecosystem for chaos engineering.

Making chaos engineering easier has always been a firm goal of Chaos Mesh, and building a closed-loop chaos engineering ecosystem is a key step toward that goal. After nearly a year of hard work, we have made major improvements in three areas: ease of use, orchestration and scheduling of experiments, and the range of fault types.

Ease of use

We have been working hard to improve the usability of Chaos Mesh. With Chaos Mesh 1.0 GA we released Chaos Dashboard, which lets users run chaos experiments through a graphical interface. In Chaos Mesh 2.0, Chaos Dashboard brings major improvements:

  • Chaos Dashboard supports creating, viewing, and updating AWSChaos and GCPChaos, so that chaos experiments on the cloud work the same way as those in Kubernetes.
  • Chaos Dashboard now shows more detailed events for each chaos experiment, making what happens during an experiment easier to observe.

Native experiment orchestration and scheduling

A single chaos experiment is often not enough to simulate a realistic fault scenario, and starting and stopping experiments by hand is tedious and dangerous. Previously, users could combine Argo with Chaos Mesh to automate the injection and termination of experiments. In Chaos Mesh 2.0, we added native Workflow support, which makes it easy to orchestrate scenarios that run multiple experiments serially or in parallel, and to weave notifications and health checks into complex experiment scenarios.
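To make the serial/parallel idea concrete, here is a minimal sketch in Python. It is not the Chaos Mesh Workflow API: the `Node` class, the `plan` function, and the experiment names and durations are all hypothetical and exist only for illustration. It models a scenario as a tree and computes when each experiment would run, assuming serial children run one after another and parallel children start together.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical scenario node (not a Chaos Mesh type)."""
    name: str
    kind: str = "task"      # "task", "serial", or "parallel"
    duration: int = 0       # seconds; only meaningful for "task" nodes
    children: List["Node"] = field(default_factory=list)


def plan(node: Node, start: int = 0) -> Tuple[int, List[Tuple[str, int, int]]]:
    """Return (end_time, [(task_name, start, end), ...]) for the tree."""
    if node.kind == "task":
        end = start + node.duration
        return end, [(node.name, start, end)]

    spans: List[Tuple[str, int, int]] = []
    if node.kind == "serial":
        cursor = start
        for child in node.children:
            # Each child starts when the previous one has finished.
            cursor, child_spans = plan(child, cursor)
            spans.extend(child_spans)
        return cursor, spans

    if node.kind == "parallel":
        end = start
        for child in node.children:
            # Every child starts at the same time; the group ends
            # when the slowest child ends.
            child_end, child_spans = plan(child, start)
            end = max(end, child_end)
            spans.extend(child_spans)
        return end, spans

    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical scenario: a network delay, then CPU stress and a pod kill
# at the same time, then a health check.
scenario = Node("scenario", "serial", children=[
    Node("network-delay", duration=30),
    Node("stress-and-kill", "parallel", children=[
        Node("cpu-stress", duration=20),
        Node("pod-kill", duration=10),
    ]),
    Node("health-check", duration=5),
])

if __name__ == "__main__":
    total, spans = plan(scenario)
    for name, begin, end in spans:
        print(f"{name}: {begin}s -> {end}s")
    print(f"total: {total}s")
```

In this toy model, the health check only starts once both parallel experiments have finished, which is the kind of ordering that would otherwise have to be enforced by hand or by an external tool.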

When defining chaos that runs periodically, “cron: @every 10s” and “duration: 5s” alone are not enough to describe the behavior. For example, it is legal to define a single execution that lasts longer than the execution period, but nothing in such a definition says what should happen in that case. Following the design of CronJob, we introduced a new custom object, Schedule, and added more explicit attributes for regularly executed tasks, such as whether multiple experiments may run at the same time, so that the behavior is constrained.

Because these definitions have changed, we provide a migration tool, shipped with the release, to help users migrate and upgrade. To complete the upgrade from 1.x to 2.0, see Upgrading to Chaos Mesh 2.0.
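The ambiguity is easier to see with numbers. The following Python sketch is an illustration only, not how Chaos Mesh implements Schedule: the `simulate` and `max_concurrent` functions and the `allow_overlap` flag are hypothetical names. It assumes an experiment is due every 10 seconds but lasts 15 seconds, and compares allowing overlapping runs with skipping a run while the previous one is still active (similar in spirit to a CronJob concurrency policy).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    start: int  # seconds since the schedule began
    end: int


def simulate(period: int, duration: int, horizon: int,
             allow_overlap: bool) -> List[Run]:
    """Return the runs that start within `horizon` seconds.

    A run is due every `period` seconds and lasts `duration` seconds.
    When allow_overlap is False, a due run is skipped while the
    previous run is still active.
    """
    runs: List[Run] = []
    for start in range(0, horizon, period):
        previous_active = bool(runs) and runs[-1].end > start
        if previous_active and not allow_overlap:
            continue  # skip this tick instead of stacking experiments
        runs.append(Run(start, start + duration))
    return runs


def max_concurrent(runs: List[Run]) -> int:
    """Largest number of runs active at the same moment."""
    # At equal timestamps, ends (-1) sort before starts (+1), so a run
    # that ends exactly when another starts does not count as overlap.
    events = sorted([(r.start, 1) for r in runs] + [(r.end, -1) for r in runs])
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


if __name__ == "__main__":
    for allow in (True, False):
        runs = simulate(period=10, duration=15, horizon=60, allow_overlap=allow)
        starts = [r.start for r in runs]
        print(f"allow_overlap={allow}: starts={starts}, "
              f"max concurrent={max_concurrent(runs)}")
```

With these assumed numbers, allowing overlap starts a run at every tick and keeps two experiments active at once, while forbidding overlap runs only every other tick. Both outcomes are plausible readings of the same cron and duration pair, which is why an explicit attribute is needed.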

More fault types

Chaos Mesh already supports system-level fault injection such as NetworkChaos, IOChaos, and StressChaos, as well as cloud-service fault injection such as AWSChaos and GCPChaos. In Chaos Mesh 2.0, we have also added fault injection at the application layer.

JVMChaos

Java, Kotlin, and other JVM-based languages are widely used in the industry. With JVM bytecode enhancement and Java agent technology, fault injection into JVM applications is straightforward to implement. Chaos Mesh currently implements JVMChaos with chaos-exec-JVM, enabling application-level fault injection such as method delays, modified return values, out-of-memory conditions, and thrown exceptions. Refer to the documentation on simulating JVM application failures for more information.

HTTPChaos

HTTPChaos is a new chaos type supported in 2.0. It can hijack HTTP requests and responses on the server side, abort connections, inject delays, or modify headers and bodies. It is suitable for any scenario that uses HTTP as the communication protocol. Refer to the documentation on simulating HTTP failures for more information.

Chaosd, a fault injection tool for physical machines

Chaos Mesh is designed specifically for Kubernetes. For physical machine environments, we provide Chaosd. Chaosd evolved from chaos-daemon and adds some chaos experiment functions tailored to the characteristics of physical machines. It supports different types of fault injection on physical machines, such as process, network, JVM, stress, and disk faults.

Looking to the future

Chaos Mesh is still in active development, and we have planned more powerful features for the coming months, including:

  • Injecting JVMChaos at runtime, making JVMChaos cheaper and more convenient to use.
  • A plug-in mechanism that lets users build custom chaos experiments while still using the scheduling features of Chaos Mesh.

In addition, we have found that users’ chaos experiment scenarios are very valuable resources, and good scenarios can be reused in many places. In the future, we will launch a platform that allows users to share their chaos experiments.

Try it out

You can go to chaos-mesh.org/interactive… to quickly try Chaos Mesh 2.0 using resources on the cloud!

Thank you

Thanks to all Chaos Mesh contributors (github.com/chaos-mesh/…)! Chaos Mesh could not have gone from 1.0 to 2.0 without the efforts of every contributor.

Finally, you are welcome to submit an issue for Chaos Mesh, or to refer to the documentation and start contributing code. Chaos Mesh looks forward to your participation and feedback!