Director of Avenga Labs
Digital products must be functional but, even more importantly, enjoyable to use. That covers everything from UX design, the safety of user data, and privacy to the actual performance.
The performance of digital products is critical for the user experience and it translates directly into revenue streams.
The human attention span has shortened from tens of seconds to mere seconds. Your users want to jump in, do what they came to do, and switch to other activities. Anything else is considered a waste of time and a lesson in frustration.
Performance testing, and even more importantly performance engineering, is an important part of the software development lifecycle which ensures that the users of your digital product won’t be negatively affected by performance issues, and that your blink-of-an-eye fast digital experience will win your clients over.
The meta challenge of performance testing can be expressed as “performance testing is hard”.
What does it mean in practice?
Performance testing requires proper load scripts, the right test data, and a proper configuration of agents hitting the APIs. If anything goes wrong, the performance measured by the agents will be far from the actual performance in the real world.
And, there are these additional challenges:
… and we could go on and on.
Performance testing is transforming from a traditional, separate activity into performance engineering.
It is in line with the global trend of full cycle development and developers. The current meta tendency is to use the best software engineering practices wherever possible, in this case in performance testing.
All the parts of the system come together. Metrics, test data, and scenarios are becoming inseparable parts of the digital product.
Testing late is too risky for digital products, as it easily becomes too little, too late. Changes to the system architecture may be too deep to make later on, so it is better to verify the performance characteristics of all the components of the design and implementation at a very early stage.
You, you and you. In the true full cycle development/developers spirit – it’s the entire team.
Any quality concern is each and every team member’s concern.
Reminding all the team members, as part of daily conversations, about the importance of the application performance and testing is one of the social techniques that helps avoid surprises later.
When performance testing is part of the SDLC, developers are involved from the start, using the very same tools and frameworks that they use for creating the applications.
Less switching of context, technology, and language helps to involve more professional developers, as tests tend to be even more complex than application logic.
The big, distant, and unfriendly testing platforms of the past were often rejected by developers as strange, complex, and unfamiliar.
For instance, Python devs should use a Python-based performance testing tool, and even run it inside the same PyCharm Integrated Development Environment (IDE) they use for development.
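As a minimal sketch of a developer-owned performance test written in the application's own language, the Python snippet below drives a target function from several worker threads and summarizes the latencies. It uses only the standard library. The names `fake_endpoint`, `run_load`, and `summarize` are hypothetical and introduced here for illustration, and the sleep merely stands in for a real request. A real project would more likely use a dedicated Python-based load testing tool.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_endpoint() -> None:
    """Hypothetical stand-in for a real request, such as an HTTP call."""
    time.sleep(0.01)


def timed_call(target) -> float:
    """Run target once and return its duration in seconds."""
    start = time.perf_counter()
    target()
    return time.perf_counter() - start


def run_load(target, users: int, requests_per_user: int) -> list[float]:
    """Issue users * requests_per_user calls from `users` worker threads."""
    total = users * requests_per_user
    with ThreadPoolExecutor(max_workers=users) as pool:
        return list(pool.map(lambda _: timed_call(target), range(total)))


def summarize(latencies: list[float]) -> dict[str, float]:
    """Reduce raw latencies (at least two samples) to headline numbers."""
    # n=100 yields 99 cut points; index 94 is the 95th percentile.
    p95 = statistics.quantiles(latencies, n=100, method="inclusive")[94]
    return {
        "count": len(latencies),
        "mean": statistics.mean(latencies),
        "p95": p95,
        "max": max(latencies),
    }


if __name__ == "__main__":
    print(summarize(run_load(fake_endpoint, users=5, requests_per_user=4)))
```

Because the test is plain Python, it can be run, debugged, and versioned in the same IDE and repository as the application code.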
All developers should use performance tools on a regular basis, as part of their everyday developer’s life; for instance, Lighthouse for Chrome in the case of frontend developers.
Developers should be constantly involved in performance engineering.
Performance goals and practices should be baked into the process, and nowadays that must include Continuous Integration and Continuous Deployment pipelines.
The current trend is to include performance testing results as critical criteria for acceptance of the builds. Performance regression monitoring is also essential for avoiding costly performance related bugs by providing fast feedback that results in corrective actions, such as withdrawing the service version which caused performance problems.
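A build acceptance gate of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed implementation: the transaction names, the latency figures, and the 10% tolerance are invented, and `find_regressions` and `gate` are hypothetical helpers.

```python
def find_regressions(current_ms, baseline_ms, tolerance=0.10):
    """Return one message per transaction that is slower than allowed."""
    failures = []
    for name, baseline in baseline_ms.items():
        current = current_ms.get(name)
        if current is None:
            failures.append(f"{name}: no measurement in this build")
        elif current > baseline * (1 + tolerance):
            failures.append(
                f"{name}: {current:.0f} ms vs baseline {baseline:.0f} ms"
            )
    return failures


def gate(current_ms, baseline_ms, tolerance=0.10) -> int:
    """Print regressions and return a process exit code (0 means pass)."""
    failures = find_regressions(current_ms, baseline_ms, tolerance)
    for line in failures:
        print(f"PERF REGRESSION {line}")
    return 1 if failures else 0


# Hypothetical p95 latencies in milliseconds, per business transaction.
BASELINE = {"login": 200.0, "checkout": 450.0}
CURRENT = {"login": 205.0, "checkout": 560.0}

if __name__ == "__main__":
    # A pipeline step would pass this value to sys.exit() to fail the build.
    print("exit code:", gate(CURRENT, BASELINE))
```

Here `login` stays within the tolerance, while `checkout` exceeds it and would cause the build to be rejected.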
Performance testing scripts have traditionally been based on protocols such as HTTP. With modern complex front-end frameworks, such as React or Angular, protocol-level testing is very hard.
Protocol-level tools are usually based on capturing messages, then editing and maintaining them, so a lot of effort is required to keep the scripts up to date and relevant.
It does not mean that protocol level testing frameworks are obsolete, as they are still useful for testing HTTP based API performance for example. However, they are not good enough anymore for testing the real world performance from the user’s perspective.
People interact with actual web browsers or mobile applications, not with networking protocols. So scripts that run in the browser and simulate user behavior are the best way of measuring real-world performance as perceived by the actual users.
It is the users who will complain first about the performance. They will switch your SaaS or digital sales product for your competitor’s if they suffer from a terrible performance experience.
Users tend to mix performance with responsiveness and even scrolling fluidity; they expect at least 60 Hz. The role of the engineering team is to address those needs and avoid the pointless discussions of performance being good but some UI elements stuttering a little bit.
There is no doubt about the need for these types of tests. But what you can experience on the ‘surface’ as the end user is just the tip of the iceberg, and below it is where protocol-level testing shines. So what really is the best solution?
The prevailing opinion is that there’s no single silver bullet to solve all the known limitations of the techniques from the past.
Therefore hybrid testing, a combination of protocol level API testing and client (web browser, mobile app) testing from the user perspective, is the best combination available today.
Functional tests that are already at hand, run for a single user, can serve as a very basic performance test, but they merely verify the performance for that single user.
The timer is always there, so the time taken to execute functional tests can be measured. This makes it possible to catch extreme performance dramas, such as the system being unable to perform properly even for a single user.
In the old times, functional testing was used for performance testing; later came the capability to use protocol-level (HTTP, TCP) performance tools and suites.
There’s the temptation to treat it as a two-in-one solution. Doing both things at the same time is very cost-effective, but it’s very easy to turn the good idea of simplified testing (as one of the many performance tests) into an antipattern (no other performance tests).
AI is dominating the IT world, so why not use it for performance testing?
Analyzing the detailed test results to find trends and complex dependencies between thousands of services (i.e., which service impacts the performance of which) is a role that can be offloaded to advanced algorithms. These algorithms combine rule-based systems for known rules with machine learning for undiscovered patterns.
Current generation AI technologies are excellent at finding patterns, including the usage patterns of applications. For example: what users are doing and in what order, and how long they had to wait for the pages to load or for the business transactions to actually be completed.
Automation of the creation of test models based on the observation of the actual system running in production is a new trend. The result is the creation of test scripts and flows which represent the actual usage pattern of the system.
This is much better than the traditional approach, in which test engineers try to anticipate the actual usage patterns of the applications and the actual users always tend to surprise them. The traditional approach makes performance test results less relevant to the reality of live system performance.
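The idea of deriving test flows from observed usage can be illustrated with a deliberately simple Python sketch. It counts page-to-page transitions per session and samples synthetic flows from those frequencies. This is plain frequency counting (a first-order Markov chain), used here as a stand-in for the machine learning approaches mentioned above. The event data, page paths, and function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_transitions(events):
    """Count page-to-page moves; events are (session_id, page) in time order."""
    last_page = {}
    counts = defaultdict(Counter)
    for session_id, page in events:
        previous = last_page.get(session_id, "START")
        counts[previous][page] += 1
        last_page[session_id] = page
    return counts


def to_probabilities(counts):
    """Turn raw transition counts into per-page probability tables."""
    probabilities = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        probabilities[source] = {t: n / total for t, n in targets.items()}
    return probabilities


def generate_flow(probabilities, rng, max_steps=10):
    """Sample one synthetic user flow that follows the observed frequencies."""
    flow, current = [], "START"
    for _ in range(max_steps):
        options = probabilities.get(current)
        if not options:
            break  # No observed continuation from this page.
        pages, weights = list(options), list(options.values())
        current = rng.choices(pages, weights=weights)[0]
        flow.append(current)
    return flow


# Hypothetical production events from two sessions, "a" and "b".
EVENTS = [
    ("a", "/home"), ("b", "/home"), ("a", "/search"),
    ("b", "/cart"), ("a", "/cart"), ("b", "/checkout"),
]

if __name__ == "__main__":
    model = to_probabilities(build_transitions(EVENTS))
    print(generate_flow(model, random.Random(42)))
```

Flows sampled this way can then be fed to a load generator, so that the simulated traffic mix follows what users actually did rather than what engineers guessed.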
Pattern analysis can also be applied to the metrics of resource utilization (CPU, GPU, Network IO, Disk IO) to find the weak performance spots of the application as well as resource overutilization. Again, properly implemented AI is able to find patterns which humans cannot predict, especially within complex systems containing tons of data and large numbers of concurrent users.
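As a very small illustration of pattern analysis on utilization metrics, the Python sketch below flags samples that deviate strongly from the rest of a series. It uses a plain z-score rule, which is a basic statistical technique and not the AI-based analysis described above. The CPU readings, the threshold of 3.0, and the name `find_outliers` are assumptions.

```python
import statistics


def find_outliers(samples, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` standard
    deviations away from the mean of the series."""
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    if stdev == 0:
        return []  # A perfectly flat series has no outliers.
    return [
        (index, value)
        for index, value in enumerate(samples)
        if abs(value - mean) / stdev > threshold
    ]


# Hypothetical CPU utilization readings in percent, one per minute.
CPU_PERCENT = [40, 42, 41, 39, 40, 43, 41, 40, 95, 42, 41, 40]

if __name__ == "__main__":
    print(find_outliers(CPU_PERCENT))
```

A rule like this only finds isolated spikes in a single metric; correlating many metrics across many services is where more advanced algorithms are needed.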
Elements of predictive analysis can be applied here to forecast the performance of the system in the following weeks and months, and how it will cope with spikes in demand from users (for instance, Black Friday in e-commerce systems). Performance issues can then be fixed before they actually become a production problem, similar to AI applications in manufacturing (predictive maintenance).
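The simplest possible form of such a forecast is a straight-line trend fitted to past measurements, sketched below in Python with ordinary least squares. The weekly p95 latency figures are invented, and `fit_trend` and `forecast` are hypothetical helpers. A linear trend ignores seasonality and sudden spikes such as Black Friday, so treat this as an illustration of the idea only.

```python
def fit_trend(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, future_x):
    """Extrapolate the fitted trend line to a future point."""
    slope, intercept = fit_trend(xs, ys)
    return slope * future_x + intercept


# Hypothetical p95 response times in milliseconds for weeks 1 to 4.
WEEKS = [1, 2, 3, 4]
P95_MS = [200.0, 210.0, 220.0, 230.0]

if __name__ == "__main__":
    print(f"forecast for week 8: {forecast(WEEKS, P95_MS, 8):.0f} ms")
```

If the extrapolated value crosses the agreed response time budget, the team has a few weeks of warning to act before users notice.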
Chaos engineering and other techniques prepare the system for unknown behavior.
Why? Why can’t we just prepare the system for the known behaviors, as we are the experts and we know what the system is supposed to do?
Reality shows that unexpected events will happen. It’s no longer ‘fashionable’ to be surprised by them, but there is a strong trend to be better prepared for the unknown. There are no more unexpected errors, just untested scenarios.
The tools generate unusual traffic spikes and API calls to show the bottlenecks of the systems. New techniques, such as circuit breakers and others, are here to prevent the domino effect which would result in a total performance degradation and the lack of the system’s availability for users.
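To make the circuit breaker idea concrete, here is a minimal Python sketch. After a configurable number of consecutive failures it stops calling the failing dependency and fails fast, and after a cool-down period it lets one trial call through. The class names, thresholds, and the injectable clock are assumptions chosen for illustration; production systems typically rely on an established library or on platform features instead.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # Consecutive failures before opening.
        self.reset_after = reset_after    # Cool-down in seconds.
        self.clock = clock                # Injectable for testing.
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed.

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # Any success closes the circuit again.
        return result
```

Failing fast keeps callers from piling up behind a dependency that is already struggling, which is what stops one slow service from dragging down the services that depend on it.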
There’s no lack of examples, especially in these pandemic times. Sudden spikes of traffic, the popularity of different usage patterns, increased purchases of SaaS products, and sudden switches from traditional products to digital products are everywhere.
Engineering teams should be aware that things will break. They should try to make the individual components more resilient and, even more importantly, ensure that the critical parts of the system keep working for as long as possible.
A typical optimization is to make sure that the parts related to purchasing services or products can keep taking orders from the clients, even when the more complex backend processing is overwhelmed by a spike in the load.
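One common way to achieve this is to decouple order intake from backend processing with a queue, so that accepting an order never waits for the slow part. The Python sketch below shows the idea in a single process. The class `OrderIntake`, its method names, and the capacity are hypothetical, and a real system would use a durable message broker or database rather than an in-memory queue.

```python
import queue


class OrderIntake:
    def __init__(self, max_pending: int = 1000):
        # Bounded buffer between the fast intake and the slow backend.
        self.pending = queue.Queue(maxsize=max_pending)

    def accept(self, order: dict) -> str:
        """Store the order and answer immediately, without calling the backend."""
        try:
            self.pending.put_nowait(order)
        except queue.Full:
            return "rejected: please retry later"
        return "accepted"

    def process_batch(self, handler, batch_size: int) -> int:
        """Hand up to batch_size stored orders to the slow backend handler."""
        processed = 0
        while processed < batch_size:
            try:
                order = self.pending.get_nowait()
            except queue.Empty:
                break
            handler(order)
            processed += 1
        return processed
```

During a spike the queue grows while customers still get an immediate confirmation, and the backend works through the backlog at its own pace.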
With a current microservices architecture done properly, especially in the context of the public cloud, scaling and maintaining response times despite increased traffic are much easier to achieve than before, but still not simple.
Expect the unexpected by proactively trying different scenarios and optimizing your system performance characteristics.
This also includes informing the end users about such issues as temporary performance degradation or unavailability of the services. Informed users, who are not surprised, react better to performance problems when they do occur.
These are not going anywhere. They are being created, updated, and managed more automatically, not only by humans but also with the help of machine learning. In theory the essence of testing remains the same, but the HOW is already very different from ten years ago. We now live in a different world of performance testing.
The promise of Automated Performance Management was a nice sales pitch, but the reality of complex digital products requires a more custom approach. The automation is a ‘yes’, but the promise of ‘just run it through our APM product and now everybody can be a performance optimization expert’ is a definite ‘no’; those times are over.
The meta trends in IT are clearly visible in the performance testing area. Full cycle trends are represented here by the transformation of mindsets and organizations from traditional performance testing to performance engineering, and by everything as code, which uses the same languages and similar environments as development.
Another great meta trend nowadays is adaptability, which is represented here by a proactive and holistic approach to performance engineering, treated as a part of everyday life and not as a separate, disconnected set of activities.
We, at Avenga, pay a lot of attention to the performance of the digital products which we help to envision, design, build and maintain for our customers.
Performance engineering is hard, but with the right IT consulting partner who has tons of experience, a modern mindset and the proper skill set, it can be much easier and smoother.
We also encourage you to check out our related products: Wao.Io, a web performance and security optimization as a service product, and Couper.IO, for the modernization (including performance improvements) of legacy systems.