Chaos Engineering: Deleting a K8s Pod with ChaosToolkit

Today we will try out ChaosToolkit, an open source tool for chaos engineering.

Its goal is to provide a free, open, community-driven toolset and API.

Official source link: github.com/chaostoolki…

To understand this tool, you need to know the main points of the principles of chaos engineering. Keep the first of them in mind here: establishing the steady-state hypothesis.
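To make the idea concrete, here is a minimal Python sketch of how a steady-state hypothesis brackets an experiment: the probe is checked before the fault is injected and again afterwards. This is not ChaosToolkit's implementation. The names run_experiment, probe, and action, and the returned status strings, are hypothetical and chosen only for illustration.

```python
from typing import Any, Callable


def run_experiment(probe: Callable[[], Any],
                   action: Callable[[], None],
                   tolerance: Any) -> str:
    """Check the steady state, apply the action, then check again."""
    # Before: if the system is not in its normal state, do not inject a fault.
    if probe() != tolerance:
        return "aborted"
    # Method: vary a real-world condition (for example, delete a pod).
    action()
    # After: a deviation from the tolerance points at a potential weakness.
    if probe() != tolerance:
        return "deviated"
    return "completed"


if __name__ == "__main__":
    state = {"healthy": True}
    # Hypothetical stand-ins: the probe reads a flag and the action is a no-op,
    # as if the platform recovered immediately.
    print(run_experiment(lambda: state["healthy"], lambda: None, True))
```

The point of the sketch is only the ordering: the same probe and tolerance are evaluated on both sides of the action.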

Before running the tool, let’s take a look at its architecture.

ChaosToolkit operates on the system under test through drivers.

Its main features include the following:

Let’s set up the tool and try it out.

Environment: CentOS 7.8, k8s 1.19.5, and an example application

Install Python 3:
sudo yum install python3 python3-venv

Install pipenv:
gaolou@GaoMacPro ~ % pip3 install pipenv

Install ChaosToolkit:
pip3 install -U chaostoolkit
pip3 install -U chaostoolkit-kubernetes
pip3 install -U chaostoolkit-reporting

If you need to operate on other platforms, you can also install the corresponding extensions.

Create and activate a virtual environment:
python3 -m venv .bundler
source .bundler/bin/activate

The installation above was performed on the k8s master machine. If you are not installing on a k8s machine, you can configure the corresponding k8s context instead. For the specific steps, see: chaostoolkit.org/drivers/kub…

Discover capabilities with chaos discover

The test starts with the discover command. ChaosToolkit generates a discovery.json file based on the contents of ~/.kube/config; the file contains the collection of all activities that can be performed on k8s. The result is as follows:

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos discover chaostoolkit-kubernetes
[2021-06-23 12:18:07 INFO] Attempting to download and install package ‘chaostoolkit-kubernetes’
[2021-06-23 12:18:08 INFO] Package downloaded and installed in current environment
[2021-06-23 12:18:09 INFO] Discovering capabilities from chaostoolkit-kubernetes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.deployment.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.node.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.pod.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.pod.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.replicaset.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.actions
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.service.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.statefulset.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.statefulset.probes
[2021-06-23 12:18:09 INFO] Searching for actions in chaosk8s.crd.actions
[2021-06-23 12:18:09 INFO] Searching for probes in chaosk8s.crd.probes
[2021-06-23 12:18:09 INFO] Discovery outcome saved in ./discovery.json
(.bundler) [root@s5 chaostoolkit_scenarios]#

Generate the experiment with chaos init

Execute the initialization command to create a chaos experiment as prompted.

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos init
You are about to create an experiment. This wizard will walk you through each step so that you can build the best experiment for your needs.

An experiment is made up of three elements:

a steady-state hypothesis [OPTIONAL]
an experimental method
a set of rollback activities [OPTIONAL]

Only the method is required. Also your experiment will not run unless you define at least one activity (probe or action) within it

Experiment’s title: E2

A steady state hypothesis defines what ‘normality’ looks like in your system The steady state hypothesis is a collection of conditions that are used, at the beginning of an experiment, to decide if the system is in a recognised ‘normal’ state. The steady state conditions are then used again when your experiment is complete to detect where your system may have deviated in an interesting, weakness-detecting way

Initially you may not know what your steady state hypothesis is and so instead you might create an experiment without one. This is why the stead state hypothesis is optional.

Do you want to define a steady state hypothesis now? [y/N]: y # Create the steady-state hypothesis. Note that this is an important concept in chaos engineering, yet most other chaos tools do not expose it

You may now define probes that will determine the steady-state of your system.

Add an activity

all_microservices_healthy
deployment_is_fully_available
deployment_is_not_fully_available
microservice_available_and_healthy
microservice_is_not_available
read_microservices_logs
service_endpoint_is_initialized
count_pods
pod_is_not_available
pods_in_conditions
pods_in_phase
pods_not_in_phase
read_pod_logs
statefulset_fully_available
statefulset_not_fully_available
get_cluster_custom_object
get_custom_object
list_cluster_custom_objects
list_custom_objects

Activity (0 to escape): 1 # Select the probe for the steady-state hypothesis; in short, this defines the expected result

!!!!!!!!! DEPRECATED!!!

kill_microservice
remove_service_endpoint

Do you want to use this probe? [y/N]: y # Confirm that the probe selected above should be used

A steady-state probe requires a tolerance value, within which your system is in a recognised normal state.

What is the tolerance for this probe? : normal

You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.

Argument’s value for ‘ns’ [default]: chaosnamespace

Do you want to select another activity? [y/N]: y

Add an activity

all_microservices_healthy
deployment_is_fully_available
deployment_is_not_fully_available
kill_microservice
microservice_available_and_healthy
microservice_is_not_available
read_microservices_logs
service_endpoint_is_initialized
count_pods
pod_is_not_available
pods_in_conditions
pods_in_phase
pods_not_in_phase
read_pod_logs
statefulset_fully_available
statefulset_not_fully_available
get_cluster_custom_object
get_custom_object
list_cluster_custom_objects
list_custom_objects

Activity (0 to escape): 1 # Select specific action

!!!!!!!!! DEPRECATED!!!

Do you want to use this probe? [y/N]: y # Confirm that the activity selected above should be used

You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.

Argument’s value for ‘ns’ [default]:

Do you want to select another activity? [y/N]: N # Whether to add another activity; I won’t add one here

An experiment’s method contains actions and probes. Actions vary real-world events in your system to determine if your steady-state hypothesis is maintained when those events occur.

An experimental method can also contain probes to gather additional information about your system as your method is executed.

Do you want to define an experimental method? [y/N]: y # Select the specific method for the experiment

Add an activity

kill_microservice
remove_service_endpoint
scale_microservice
start_microservice
all_microservices_healthy
deployment_is_fully_available
deployment_is_not_fully_available
microservice_available_and_healthy
microservice_is_not_available
read_microservices_logs
service_endpoint_is_initialized
create_deployment
delete_deployment
scale_deployment
deployment_available_and_healthy
deployment_fully_available
deployment_not_fully_available
cordon_node
create_node
delete_nodes
drain_nodes
uncordon_node
get_nodes
delete_pods
exec_in_pods
terminate_pods
count_pods
pod_is_not_available
pods_in_conditions
pods_in_phase
pods_not_in_phase
read_pod_logs
delete_replica_set
create_service_endpoint
delete_service
service_is_initialized
create_statefulset
remove_statefulset
scale_statefulset
statefulset_fully_available
statefulset_not_fully_available
create_cluster_custom_object
create_custom_object
delete_cluster_custom_object
delete_custom_object
patch_cluster_custom_object
patch_custom_object
replace_cluster_custom_object
replace_custom_object
get_cluster_custom_object
get_custom_object
list_cluster_custom_objects
list_custom_objects

Activity (0 to escape): 24 # Here I select activity 24, delete_pods, which deletes pods

!!!!!!!!! DEPRECATED!!!

Do you want to use this action? [y/N]: y # Confirm the selection

You now need to fill the arguments for this activity. Default values will be shown between brackets. You may simply press return to use it or not set any value.

Argument’s value for ‘name’: DeleteRedisPOD

Argument’s value for ‘ns’ [default]:

Argument’s value for ‘label_selector’ [name in ({name})]: app=redis # Enter the label of the target objects so that they can be found

Do you want to select another activity? [y/N]: N # Whether to add another action; I won’t add one here

An experiment may optionally define a set of remedial actions that are used to rollback the system to a given state.

Do you want to add some rollbacks now? [y/N]: N # After the redis pods are deleted, k8s recreates them automatically, so no rollback is needed

…experiment.json’ # The generated experiment file
(.bundler) [root@s5 chaostoolkit_scenarios]#
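For orientation, the choices made in the wizard session can be sketched as an experiment document built in Python. This is a reconstruction under assumptions, not the actual file the wizard wrote. The titles E2 and H2, the activity names, and the module names chaosk8s.probes and chaosk8s.pod.actions come from the session and logs in this post. The description text, the tolerance of True (assuming the probe returns a boolean), and the helper name build_experiment are illustrative assumptions.

```python
import json


def build_experiment(ns: str, label_selector: str) -> dict:
    """Build an experiment that deletes the pods matching a label selector."""
    return {
        "version": "1.0.0",
        "title": "E2",
        # Assumption: the description is illustrative, not taken from the post.
        "description": "Delete the redis pods and check the system stays healthy",
        "steady-state-hypothesis": {
            "title": "H2",
            "probes": [{
                "type": "probe",
                "name": "all_microservices_healthy",
                # Assumption: the probe returns a boolean, so True is expected.
                "tolerance": True,
                "provider": {
                    "type": "python",
                    "module": "chaosk8s.probes",
                    "func": "all_microservices_healthy",
                    "arguments": {"ns": ns},
                },
            }],
        },
        "method": [{
            "type": "action",
            "name": "delete_pods",
            "provider": {
                "type": "python",
                "module": "chaosk8s.pod.actions",
                "func": "delete_pods",
                "arguments": {"ns": ns, "label_selector": label_selector},
            },
        }],
        # No rollbacks: k8s recreates the deleted pods on its own.
        "rollbacks": [],
    }


if __name__ == "__main__":
    experiment = build_experiment("chaosnamespace", "app=redis")
    print(json.dumps(experiment, indent=2))
```

Writing the printed JSON to a file gives something you can compare against the file generated by the wizard; argument names may differ between versions of the Kubernetes extension.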

Run the experiment with chaos run

(.bundler) [root@s5 chaostoolkit_scenarios]# chaos run experiment.json
[2021-06-28 23:03:23 INFO] Validating the experiment’s syntax
[2021-06-28 23:03:24 INFO] Experiment looks valid
[2021-06-28 23:03:24 INFO] Running experiment: E2
[2021-06-28 23:03:24 INFO] Steady-state strategy: default
[2021-06-28 23:03:24 INFO] Rollbacks strategy: default
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Playing your experiment’s method now…
[2021-06-28 23:03:24 INFO] Action: delete_pods
[2021-06-28 23:03:24 INFO] Steady state hypothesis: H2
[2021-06-28 23:03:24 INFO] Probe: all_microservices_healthy
[2021-06-28 23:03:24 WARNING] all_microservices_healthy function is DEPRECATED and will be removed in the next releases, please use all_pods_healthy instead
[2021-06-28 23:03:24 INFO] Steady state hypothesis is met!
[2021-06-28 23:03:24 INFO] Let’s rollback…
[2021-06-28 23:03:24 INFO] No declared rollbacks, let’s move on.
[2021-06-28 23:03:24 INFO] Experiment ended with status: Completed
(.bundler) [root@s5 chaostoolkit_scenarios]#

Check the results. Before the test:

[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide

NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ………………………

redis-master-b96c9795b-nqzmr 1/1 Running 0 3d9h 10.100.220.84 s6
redis-slave-6b8d456947-6r42k 1/1 Running 0 3d9h 10.100.220.86 s6
redis-slave-6b8d456947-z55m5 1/1 Running 0 3d9h 10.100.53.206 s7

After the test:

[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide

NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES ………………………….

redis-master-b96c9795b-92rc6 0/1 ContainerCreating 0 3s s6
redis-master-b96c9795b-nqzmr 0/1 Terminating 0 3d9h 10.100.220.84 s6
redis-slave-6b8d456947-5m2xt 0/1 ContainerCreating 0 2s s6
redis-slave-6b8d456947-6r42k 1/1 Terminating 0 3d9h 10.100.220.86 s6
redis-slave-6b8d456947-fj4xc 0/1 ContainerCreating 0 3s s7
redis-slave-6b8d456947-z55m5 1/1 Terminating 0 3d9h 10.100.53.206 s7

After the pods have fully started:

[root@s5 ~]# kubectl get pods -n chaosnamespace -o wide

NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES

redis-master-b96c9795b-92rc6 1/1 Running 0 5m43s 10.100.220.89 s6
redis-slave-6b8d456947-5m2xt 1/1 Running 0 5m42s 10.100.220.90 s6
redis-slave-6b8d456947-fj4xc 1/1 Running 0 5m43s 10.100.53.211 s7

[root@s5 ~]#

As the results above show, the experiment ran successfully: several redis pods were killed and then recreated by k8s.

Today we wrote just this one experiment; you can follow the same steps to generate others.
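The manual check of the kubectl output can also be scripted. The following is a minimal sketch that parses the text printed by kubectl get pods and reports whether every pod is Running with all of its containers ready. The function name pods_all_running is hypothetical, and the sketch assumes the column order shown in this post (NAME, READY, STATUS, and so on).

```python
def pods_all_running(kubectl_output: str) -> bool:
    """Return True if every pod row is Running with all containers ready."""
    rows = [line.split() for line in kubectl_output.strip().splitlines()]
    # Keep only pod rows: skip the header and anything too short to parse.
    pods = [r for r in rows if len(r) >= 3 and r[0] != "NAME"]
    if not pods:
        return False
    for _name, ready, status, *_rest in pods:
        ready_count, total = ready.split("/")
        if status != "Running" or ready_count != total:
            return False
    return True


if __name__ == "__main__":
    # Rows copied from the output after the pods had fully started.
    recovered = """
    NAME READY STATUS RESTARTS AGE IP NODE
    redis-master-b96c9795b-92rc6 1/1 Running 0 5m43s 10.100.220.89 s6
    redis-slave-6b8d456947-5m2xt 1/1 Running 0 5m42s 10.100.220.90 s6
    redis-slave-6b8d456947-fj4xc 1/1 Running 0 5m43s 10.100.53.211 s7
    """
    print(pods_all_running(recovered))
```

Feeding it the output captured right after the experiment, where pods are Terminating or ContainerCreating, makes the function return False.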
