Director of Avenga Labs
Immutability means something is not supposed to change and once it’s created it cannot be modified.
Immutability sounds like it’s the opposite of the dynamic nature of today’s systems.
Everybody tries to move faster, sometimes breaking things and changing things all the time. Systems deal with new users, products, orders, and services. New ideas and digital product experiences are introduced in days, sometimes even in hours. How can immutability be a good idea?
Is it a futile attempt to slow things down or freeze them for a moment?
Recently, I wrote about evolutionary architectures, so you may ask whether immutability is the opposite of that, or whether it serves a different goal.
Let’s dive in a bit.
Many software and infrastructure problems are related to state and state management. Immutability is expected to simplify the very complicated world of today’s IT systems by reducing the number of mutable components and data, which in turn enables more robustness and scalability.
All the entities in business systems representing users, clients, products, services, and all the state that is necessary to deliver business functionality, have their identifiers.
In the old days, it was popular to assign them increasing integers, which was easy and human-readable, but in today’s more scalable and sophisticated systems it is no longer a good idea. The problem is that increasing integers are tightly bound to the local database server that has to generate them.
So, the client (API consumer, web browser) cannot create IDs and perform actions until the data storage system responds, which impairs scalability. And when multiple data storage nodes generate numbers independently, they are not guaranteed to produce different ones, even if the nodes eventually become consistent. The risk of introducing duplicate IDs is high, and that is unacceptable in any system.
Security is another problem. In the popular OWASP set of rules, this type of identifier is not recommended, and the reason is very simple. Sequential IDs are easily guessed, which lets attackers build scanners that enumerate IDs and then try to access the data behind each one.
One of the proven methods is to use globally unique identifiers (GUIDs). These very long randomized numbers can be created on the client side. Uniqueness is not strictly guaranteed, but the probability of a duplicate is so low that it can be ignored in practice, and the values are not easy to guess.
This technique is a viable way to achieve immutable and safe identifiers.
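As a minimal sketch of the idea, assuming a Python client and using the standard library's random (version 4) UUIDs as the GUID flavor, the client can assign the identifier itself before anything reaches the data store. The `new_order` helper and its fields are hypothetical and only serve the illustration.

```python
import uuid


def new_order(customer: str) -> dict:
    """Build an order whose identifier is assigned on the client side.

    No round trip to the database is needed to obtain the ID, and the
    random value gives an attacker no sequence to enumerate.
    """
    return {"id": str(uuid.uuid4()), "customer": customer}


first = new_order("alice")
second = new_order("bob")
print(first["id"], second["id"])  # two different 36-character strings
```

Once assigned, the ID never changes, so it can be handed to other services immediately, even before the order is persisted.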
Another modern type of identifier is the content-aware identifier. For instance, let’s imagine we are storing images of medical scans, car crashes, etc.; images that are critical for business processes. Filenames such as image0001.jpg, or similar names generated by smartphones and professional digital cameras, are not good identifiers. They could be duplicated and would be easy to guess.
Content-aware identifiers are not a new idea in the IT world. Heavily distributed and decentralized file systems have been using them for decades. Another example is the (in)famous BitTorrent, which uses hashes as identifiers, as do magnet links with their long, cryptic strings.
The same proven idea is perfectly applicable to business systems.
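A content-aware identifier can be sketched in a few lines. The example below assumes SHA-256 as the hash function (the choice of algorithm is mine, not something the systems mentioned above prescribe), and the `content_id` helper and the sample byte strings are hypothetical.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (SHA-256, hex encoded)."""
    return hashlib.sha256(data).hexdigest()


# Stand-ins for real image bytes.
scan_a = b"pixels of the first medical scan"
scan_a_copy = b"pixels of the first medical scan"
scan_b = b"pixels of a different scan"

# Identical content always maps to the same ID, whatever the file is named.
print(content_id(scan_a) == content_id(scan_a_copy))  # True
# Any change in the content yields a different ID.
print(content_id(scan_a) == content_id(scan_b))       # False
```

Because the ID is a function of the bytes, a stored object cannot be silently altered without its identifier no longer matching.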
Most of the data in the systems represent the results of business activities and events. For instance: a purchase order once it’s completed is immutable, bank transactions are immutable, etc.
Immutable data is easy to replicate and cache. It also represents facts that are not supposed to be modified, for both regulatory and system stability reasons.
When the object changes, a new version is created and is inserted into the database or other data store. It has a new identity, though it may possibly be linked with the previous version.
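The versioning approach can be illustrated with a frozen dataclass. This is only a sketch under assumptions: the `CustomerRecord` type, its fields, and the `revise` helper are invented for the example, and a real system would write each version to its data store rather than keep it in memory.

```python
import uuid
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class CustomerRecord:
    id: str
    name: str
    email: str
    version: int = 1
    previous_id: Optional[str] = None  # link to the version this one supersedes


def revise(record: CustomerRecord, **changes) -> CustomerRecord:
    """Return a new version with a new identity; the old record is untouched."""
    return replace(
        record,
        id=str(uuid.uuid4()),
        version=record.version + 1,
        previous_id=record.id,
        **changes,
    )


v1 = CustomerRecord(id=str(uuid.uuid4()), name="Alice", email="old@example.com")
v2 = revise(v1, email="new@example.com")

print(v1.email)                  # old@example.com
print(v2.email, v2.version)      # new@example.com 2
print(v2.previous_id == v1.id)   # True
```

Trying to assign to a field of `v1` raises `FrozenInstanceError`, so the only way to "change" a record is to create its successor.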
When the guarantee of immutability is the key for ensuring that data wasn’t tampered with, we can also use digital ledgers and blockchains for business.
When documents are exchanged between different parties, digital signatures or even simple one-way hash functions can be added to prove that a message wasn’t tampered with after it was created and sent. We usually only think about the proof that a message was sent by a given sender, but the immutability of the message is also achieved with a digital signature.
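To make the tamper check concrete, here is a small sketch that uses a keyed hash (HMAC with SHA-256) from the Python standard library. Note the assumption: an HMAC relies on a key shared by both parties, so it is a simpler stand-in for a true digital signature, which would use an asymmetric key pair. The key, the `seal` and `is_untampered` helpers, and the message are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice it would come from a key store.
SHARED_KEY = b"demo-key-not-for-production"


def seal(message: bytes) -> str:
    """Compute a keyed hash that travels alongside the message."""
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()


def is_untampered(message: bytes, tag: str) -> bool:
    """Recompute the keyed hash and compare it in constant time."""
    return hmac.compare_digest(seal(message), tag)


original = b"Pay 100 EUR to account 12345"
tag = seal(original)

print(is_untampered(original, tag))                         # True
print(is_untampered(b"Pay 900 EUR to account 12345", tag))  # False
```

Any modification of the message after sealing makes the check fail, which is exactly the immutability guarantee described above.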
The state of the system changes, but the actions and messages that lead to that change are immutable. This immutability tactic also helps with performance, scalability, debugging, auditing business operations and tracking human errors. Separating the messages that change data from those that only read it is standard practice nowadays (Command-Query Separation, for instance).
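One way to picture immutable messages driving mutable state is an append-only log of events from which the current state is derived. The sketch below is an assumption-laden illustration, not a reference design: the `Deposited` and `Withdrawn` events, the `record` command, and the `balance` query are invented names.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


Event = Union[Deposited, Withdrawn]
Log = Tuple[Event, ...]


def record(log: Log, event: Event) -> Log:
    """Command side: append a new fact; earlier events are never edited."""
    return log + (event,)


def balance(log: Log) -> int:
    """Query side: derive the current state by replaying the events."""
    total = 0
    for event in log:
        if isinstance(event, Deposited):
            total += event.amount
        elif isinstance(event, Withdrawn):
            total -= event.amount
    return total


log: Log = ()
log = record(log, Deposited(100))
log = record(log, Withdrawn(30))

print(balance(log))      # 70
print(balance(log[:1]))  # 100, the state as it was after the first event
```

Because the log is never rewritten, any past state can be reconstructed, which is what makes debugging and auditing easier.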
Another example of data immutability is the very principle by which data warehouses have operated for decades.
Once the data is loaded into the warehouse, it is versioned, timestamped, etc., and it is not supposed to be modified. There’s no update operation for archived data; it can only be added (or deleted, when necessary).
“No, this time it’s too much. How come the software is immutable?” you might say.
Let’s talk again about the versions of software and artifacts produced in the build systems. They are usually containers deployed to Kubernetes clusters as part of a routine CI/CD pipeline. Immutability means once the container with a given version is created, it is not supposed to be modified, and obviously, it does not possess any state of its own.
As most of us already know, this leads to more scalability, as the same immutable container can be deployed to multiple locations: the public cloud or on-premises in our hybrid cloud reality.
Of course, no matter how fast we type our code and how many developers work on it, the version which is running at the moment is immutable. And, it’s better to replace it with a new version containing bug fixes and new features, and then discard the old version.
Even when using feature toggles and other dynamic feature management methods, the software itself is immutable. It simply offers more configurability at run time, and it was intentionally built for that.
APIs, once published and used by external and internal developers, are immutable. Their contract should not be modified, as that would break many clients, many of which are unknown to the API creators or hard for them to reach.
This applies not only to API contracts (operations, messages), but also to the behavior of the APIs. For example, already deployed clients may depend upon a particular way of responding to invalid parameters in API calls. Even if the current implementation is not the best one, it’s safer not to change it, in order to avoid breaking the clients that depend on that behavior.
In these times when enterprises are building their own API management systems and marketplaces, again it is of great importance to embrace the idea of immutability for APIs.
The functional paradigm promotes immutability through pure functions and immutable data. The benefit is avoiding the state-related errors that programmers often introduce in object-oriented languages.
As we’ve noticed before, the functional programming paradigm has already influenced all of the most popular programming languages used for business applications. Of course, when mixed with other paradigms, it’s not as efficient and its immutability is somewhat impaired with additional constructs and options coming from other paradigms (like object-oriented).
Pure functional programming is hard for most programmers, because they are used to so much freedom in creating and modifying the state of objects. However, once mastered, pure functional programming delivers much more resilient and stable software.
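The difference between the two styles can be shown even in a multi-paradigm language. The sketch below uses Python with hypothetical `add_item_mutating` and `add_item_pure` functions; it illustrates the principle rather than pure functional programming as practiced in a dedicated functional language.

```python
from typing import List, Tuple


def add_item_mutating(cart: List[str], item: str) -> List[str]:
    """Changes the caller's list in place: a hidden side effect."""
    cart.append(item)
    return cart


def add_item_pure(cart: Tuple[str, ...], item: str) -> Tuple[str, ...]:
    """Returns a new tuple; the input is left exactly as it was."""
    return cart + (item,)


shared_list = ["book"]
add_item_mutating(shared_list, "pen")
print(shared_list)   # ['book', 'pen'], every holder of the list sees the change

shared_tuple = ("book",)
new_cart = add_item_pure(shared_tuple, "pen")
print(shared_tuple)  # ('book',)
print(new_cart)      # ('book', 'pen')
```

With the pure version, code that still holds `shared_tuple` cannot be surprised by a change made elsewhere, which is where many state-related bugs come from.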
Last but not least, let’s at least mention infrastructure immutability. In practice, it follows the same principle. For instance, once deployed, virtual machines contain multiple components, libraries, services and configuration files. An immutable infrastructure means that instead of tinkering with the existing running machines, new ones are deployed with new versions of components and configurations, then the state is migrated and the old machines are discarded.
The same principles apply to everything that can be expressed as code. So, the immutability principle may be applied to anything managed as code.
And let’s not forget that observability is heavily based on the immutability of application and infrastructure logs, as well as gathered data.
As you probably already know, immutability is almost everywhere in our software world. We sometimes don’t even think about it at all. It might sound counterintuitive at first, but in order to be even faster, truly agile, and efficient, this concept can be of great help.
There’s so much dynamism in the distributed systems and so many things happen because of API to API and user interaction, that the entire IT industry is embracing immutability as the principle that makes it all at least a bit simpler and more resilient.