Sunday, September 13, 2020

How to Start with Chaos Engineering

First, be clear about what Chaos Engineering is for.

Second, create the technical foundation for Chaos Engineering.

Third, run Chaos Engineering tests and analyze the results.

Fourth, go back to step one and repeat until you have reached your goal.


Let's start with the first point: what is Chaos Engineering for? Chaos Engineering is a software testing method that improves the reliability and resilience (or robustness) of software. If reliability and resilience are features of your product (ask your PO, the Product Owner), then at some point in the software lifecycle you will need to improve them, and that is when you need Chaos Engineering.

Chaos Engineering is the new top of the software test pyramid. With unit tests you verify that a method or function works as expected at certain points. With component tests you check that two modules work together at certain points. With integration tests you test your component or service in the context of the whole software landscape; the test system is often called PRELIVE or NONPROD. But here, too, you only test certain points. With all these tests you try to guarantee the absence of functional errors. Most of them are positive-path or edge-case tests. Regardless of how high your test coverage is, there are unknown problems in your software, and you will never find them with this kind of test. Or, as Edsger Wybe Dijkstra wrote:

Program testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence.

And this is exactly the point where Chaos Engineering comes into play. Chaos Engineering is able to find unknown problems or, better, unknown unknown problems.

Donald Rumsfeld: There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.
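
To make the difference between a point test and a chaos experiment more tangible, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the PriceService dependency, the FaultyPriceService wrapper that injects failures, the two total_* functions and the run_experiment helper are hypothetical names, not part of any real Chaos Engineering tool. Real experiments inject faults into running systems (network, hosts, services); this sketch only imitates the idea inside one process.

```python
import random


class PriceService:
    """Hypothetical downstream dependency (stands in for a remote call)."""

    def get_price(self, item: str) -> float:
        return {"apple": 1.0, "pear": 2.0}[item]


class FaultyPriceService(PriceService):
    """Same dependency, but each call fails with a given probability."""

    def __init__(self, failure_rate: float, rng: random.Random):
        self.failure_rate = failure_rate
        self.rng = rng

    def get_price(self, item: str) -> float:
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault: dependency unavailable")
        return super().get_price(item)


def total_naive(service, items):
    # Functionally correct, but has no answer to a failing dependency.
    return sum(service.get_price(i) for i in items)


def total_resilient(service, items, retries=3, fallback=0.0):
    # Retries each call a few times, then degrades to a fallback price
    # (the fallback value is an arbitrary choice for this sketch).
    total = 0.0
    for item in items:
        for _ in range(retries):
            try:
                total += service.get_price(item)
                break
            except ConnectionError:
                continue
        else:
            total += fallback
    return total


def run_experiment(func, failure_rate=0.3, runs=1000, seed=42):
    """Return the share of runs that finished without an unhandled error."""
    rng = random.Random(seed)
    service = FaultyPriceService(failure_rate, rng)
    ok = 0
    for _ in range(runs):
        try:
            func(service, ["apple", "pear"])
            ok += 1
        except ConnectionError:
            pass
    return ok / runs


if __name__ == "__main__":
    # Classic point test: both versions are correct under ideal conditions.
    assert total_naive(PriceService(), ["apple", "pear"]) == 3.0
    assert total_resilient(PriceService(), ["apple", "pear"]) == 3.0
    # Chaos-style experiment: only now does the difference become visible.
    print("naive success rate:    ", run_experiment(total_naive))
    print("resilient success rate:", run_experiment(total_resilient))
```

Both functions pass the functional test, so test coverage alone says nothing about how they behave when the dependency misbehaves. Only the experiment with injected faults shows that the naive version breaks while the version with retries and a fallback keeps answering.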

Chaos Engineering is a software testing method for addressing unknown unknown errors. From a more practical point of view, it is a way to improve reliability and resilience (robustness). In upcoming articles I will explain the remaining points from the list above.