Chaos theory is one of several theories that are constantly mentioned in academic and professional circles, although its definition, use and implications are often misunderstood. However, with technology infrastructures becoming increasingly dynamical and mission critical, an understanding of chaos theory will help with managing change.
So, what is chaos theory?
Chaos theory is the area of mathematics that looks at the behaviour of dynamical systems that are highly sensitive to their initial conditions. A dynamical system is defined as a system whose status or state changes over time according to a set of fixed ‘cause-and-effect’ rules.
This means that small changes in the initial conditions of the system could result in large changes in the system’s behaviours and/or outputs. The end result is that these systems appear unpredictable (or ‘chaotic’). In large real-world systems, this ‘chaos’ typically arises because they have a vast number of different parts, each with complex interactions which are often not fully visible to the users of the systems. Therefore, these systems are almost impossible to understand fully, which, in turn, means their behaviour cannot be predicted easily or accurately.
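To make the idea of sensitivity to initial conditions concrete, the sketch below uses the logistic map, a standard textbook example that is not taken from this article. The function name, the parameter `r=4.0`, the starting value of 0.2 and the offset of 1e-10 are all illustrative assumptions. It applies the same fixed rule to two almost identical starting values and prints how far apart the results drift.

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion (assumed values).
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Absolute difference between the two runs at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The rule itself is a single line and fully deterministic, yet the gap between the two runs grows from negligible to the same order of magnitude as the values themselves. Large infrastructures add many interacting parts on top of this basic effect.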
In fact, sometimes people refer to these systems as if they have a personality in order to explain these unpredictable behaviours. For example, people would say the ‘system is not behaving itself’ or ‘the system is having a bad day’ as opposed to saying that the ‘unpredictable behaviour is caused by a complex set of interactions which are impossible to fully understand’.
What are the uses of chaos theory at a general level?
One could argue that everything around us appears unpredictable, for example: weather, traffic control and technology infrastructures. This means the applications for chaos theory are almost unlimited. However, specific examples of where chaos theory is used are as follows:
- Weather - attempting to forecast medium to long-term weather patterns. While it is possible to predict high-level weather trends, it is not possible (due to the complexity of weather systems) to produce accurate long-term day-to-day forecasts.
- Medicine - attempting to model disease patterns so their spread and behaviours can be measured. For example, models are being developed to try and understand COVID-19’s behaviours.
- Economy and financial markets - attempting to understand what causes changes to local and global economies, as well as related financial markets. However, nobody has been able to predict the behaviour of financial markets consistently.
- Eco-systems - for example, trying to understand the behaviours of rainforests, understanding population growth or even modelling climate change.
- Space - understanding the formation and the behaviour of the universe and our solar system.
How chaos theory can impact technology infrastructure or ecosystems
Technology infrastructures can be defined as dynamical systems for the following reasons:
- They contain many different components covering business applications, hardware, databases, operating systems, datacentres (now including various types of clouds) and programming languages. There are often different versions of components.
- These components are often supported by a combination of vendors, in-house technology teams and end-user computing.
- These components are linked together by a combination of differing networking and application interfaces.
- There is often poor or (more worryingly) out-of-date documentation.
- There are many different ways the technology can be accessed by the users. For example, different ways to open accounts, process orders or produce reports.
- Finally, it is practically impossible for a single person to understand the whole system. Knowledge is often spread over a patchwork quilt of different people, teams and suppliers.
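As a rough sense of scale for the list above, the sketch below counts how many distinct point-to-point links could exist between a given number of components. This is an upper bound under the assumption that any component may talk to any other; real infrastructures will have fewer links, and the function name is hypothetical.

```python
from math import comb


def max_pairwise_links(n_components):
    """Upper bound on distinct point-to-point links between n components."""
    return comb(n_components, 2)  # n * (n - 1) / 2


for n in (5, 10, 20, 50):
    print(f"{n:3d} components -> up to {max_pairwise_links(n):5d} possible links")
```

The count grows roughly with the square of the number of components, which is one reason why no single person can hold the whole picture.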
Therefore, these infrastructures could be impacted by chaos theory in two main ways:
- Changes to how the users use the system could unintentionally cause problems in other parts of the infrastructure.
For example, the author worked on a client site that was having significant (albeit sporadic) issues downstream in its systems. These were eventually traced to some very minor business changes in how some inputs were being made at the top of the infrastructure. However, it took several weeks of painful investigation to find the cause of the problems because, effectively, all the links between the different components had to be checked in great detail. Once the problem was discovered, though, it was reasonably quick to change the business behaviours to fix the downstream issues.
- Changes to the actual components and their integrations can cause problems.
As technology infrastructures are now so dynamical and complex, a small change in one area could cause problems elsewhere, for example when upgrading an operating system version or replacing a retiring piece of application software. Those of a certain age will remember the ‘DLL hell’ phenomenon, where a DLL (dynamic-link library) on MS-Windows was changed for one application and it caused all sorts of weird behaviours in other, seemingly unconnected applications. Even if the offending DLL was then reversed out, it did not always seem to resolve the problems.
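One way to reason about how a change in one area can surface elsewhere is to walk a map of which components feed which. The sketch below is a minimal illustration under stated assumptions: the component names and the `FEEDS` map are entirely hypothetical, and the function only finds links that have been recorded, which is exactly what poor documentation undermines.

```python
from collections import deque

# Hypothetical map: each component feeds data or services to the listed ones.
FEEDS = {
    "order_entry": ["pricing", "crm"],
    "pricing": ["billing"],
    "crm": ["reporting"],
    "billing": ["ledger", "reporting"],
    "ledger": [],
    "reporting": [],
    "hr_portal": [],
}


def downstream_of(changed, feeds):
    """Return every component reachable from `changed`, in breadth-first order."""
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in feeds.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(downstream_of("order_entry", FEEDS))
```

In this made-up map, a change at `order_entry` can reach five other components, while a change at `hr_portal` reaches none. The weeks of investigation described in the client example amount to doing this walk by hand when the map does not exist.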
Managing the issues
The obvious next question is how these issues can be managed when they occur or, even better, be prevented completely.
Unfortunately, there is not an easy answer to this question.
Dynamical systems are now part of our normal lives. All technology systems are now dynamical, regardless of whether they are large infrastructures to support banks or small systems such as a mobile telephone. They all have many different components, multipart integrations and a general lack of understanding of how they operate. This is a fertile breeding ground for chaotic behaviour.
However, there are some best practices which could help mitigate the issues of chaos theory:
- Reduce the number of components in technology infrastructure.
To be fair, this is a lot easier to say than do because the business functionality required dictates a large number of components. However, it may be possible to merge duplicate components and/or remove any redundant ones. Remember that having fewer components means fewer integrations and a less dynamical system, which, in turn, reduces the infrastructure’s exposure to the challenges of chaos theory and makes predicting behaviour easier.
- Ensure that the latest versions of the components are in place.
For example, by upgrading to the latest version of the database software. Some firms will always use the ‘last but one’ version because there is a view that the latest version will have bugs in it. To be fair, this is a sensible approach, but if an organisation is several versions behind, then it may be worthwhile looking to upgrade. By installing the latest versions, you pick up the fixes for known issues, which should make behaviours easier to understand and therefore predict.
- Ensure the entire infrastructure, and how it is used, is documented in as much detail as possible.
This should cover all the components, how they integrate with each other and how the business actually uses the system. This documentation should allow firms to better understand the impacts of making changes to the technology infrastructure configuration and how the business could use it.
- Ensure there is sufficient time and expertise allocated to testing.
Testing is often an overlooked area in technology but it should not be. Testing should cover changes to the technology infrastructure (for example, software upgrades) as well as changes to how the business user uses the system (per the example above).
- Implement strong change control processes across both the technology infrastructure and business usage.
For example, these should cover network upgrades as well as business usage changes, such as inputting data in a different manner. This will allow all changes to be reviewed by all stakeholders, which should ensure both that they are thoroughly reviewed and that there is general awareness of the change being made. In the event of any problems, the change control log can be reviewed to see if any changes made could have caused them.
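As an illustration of reviewing a change control log after a problem, the sketch below lists the changes made in the few days before an incident, most recent first. The log entries, dates, field layout and the three-day window are all hypothetical assumptions; a real change control system will have its own format and tooling.

```python
from datetime import datetime, timedelta

# Hypothetical change control log: (timestamp, area, description).
CHANGE_LOG = [
    (datetime(2024, 3, 1, 9, 0), "network", "Core switch firmware upgrade"),
    (datetime(2024, 3, 4, 14, 30), "business", "Order input format changed"),
    (datetime(2024, 3, 5, 8, 15), "database", "Patch applied to database server"),
    (datetime(2024, 3, 9, 11, 0), "application", "Reporting module upgraded"),
]


def changes_before(incident_time, log, window_days=3):
    """Return changes made in the window before the incident, newest first."""
    start = incident_time - timedelta(days=window_days)
    hits = [entry for entry in log if start <= entry[0] <= incident_time]
    return sorted(hits, key=lambda entry: entry[0], reverse=True)


incident = datetime(2024, 3, 6, 10, 0)  # assumed time the problem was reported
for when, area, what in changes_before(incident, CHANGE_LOG):
    print(f"{when:%Y-%m-%d %H:%M}  [{area}]  {what}")
```

Note that the log only helps if business usage changes are recorded alongside technology changes. In this made-up example, the input format change would never appear in a log that tracked infrastructure work alone.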
Conclusion
Chaos theory is a fascinating subject, which impacts a range of areas in society, for example trying to predict long and medium-term weather patterns, or trying to model population growth.
As society continues to become increasingly dynamical, chaos theory’s importance and relevance will also increase, as more and more systems will be subject to the challenges of chaos.
However, there are a number of specific issues that are relevant to technology. While there is no ‘silver bullet’ to address these fully, there are some best practices that can be followed to provide some level of control and mitigation.
How is the butterfly effect connected to chaos theory?
It would not be possible to write an article on chaos theory without mentioning the butterfly effect. The term is credited to mathematician and meteorologist Edward Lorenz, who suggested in the 1960s that weather forecasts were very sensitive to tiny changes in their initial conditions.
In effect, if a butterfly were to flap its wings on one side of the world (i.e., a small change in initial conditions), it could set off a chain reaction of events which could cause a storm on the other side of the world (i.e., a large change in the system’s behaviours and/or outputs).
While the butterfly itself is an illustration rather than a proven example, the phrase ‘butterfly effect’ is often used in popular culture as shorthand for chaos theory.
About the author
Paul Taylor is a freelance consultant with over 30 years’ experience of implementing change across the various industries.