Let’s start with the history of servers. They were the computers in server rooms, lights blinking, doing all the difficult jobs in the background. Initially, the idea was to set up a physical server for each application; then virtualization came and servers became virtual and logical, still managed by local IT but with more flexible management. The next step was the container revolution (Docker), supported by container fleet management solutions (Kubernetes).
Containerization is another version of the same story of managing software infrastructure, in different ways and with different philosophies: teams are now responsible for managing their pods and containers instead of virtual machines or physical servers. The management effort is still there, and new skill sets have to be acquired to do it in a new, modern way, but it is all the same chronological path of infrastructure management evolution.
But why do we even need servers or containers at all? Developers are supposed to create business value, not deal with deployment details.
The curse of the container revolution is the effort spent by the DevOps team, by all teams really, dealing with containers, pods, Kubernetes YAML files, etc. A lot of effort has to be expended to make it all work properly, and to keep it working all the time.
The idea of function as a service was born: the developer writes code and commits the function to the cloud. The NoOps movement was very happy about that, and hopes of reducing DevOps costs grew.
The practical implementation of the idea is serverless: a cloud service that enables the deployment of functions without the need to think about memory, disk space, CPU usage, scalability, pods, or containers.
Serverless is not just functions as a service; it’s an entire ecosystem of cloud services such as message queues, databases, logging, and authentication, all offered as managed services. There is no need to deal with engines or instances, or with the containers or servers hosting them.
It’s a totally different approach from the infrastructure path of evolution.
It’s not as new as many think. It was actually established in 2014, making it more than five years old, which is a significant amount of time in which it should have become more popular. And yet it has not. Kubernetes and containers are dominating the landscape, and serverless is something that many plan to adopt in the next year or so. Yet only 15%-25% actually use it in production, much less as a default technology.
Let’s try to figure out why this is: is it supposed to change in the near future, what are the common usage patterns for serverless, and what known limitations are slowing down serverless adoption?
→ Read more: what happened to NoOps?
Serverless has been around for some years already, so usage patterns have been identified and published. What is really interesting is that they usually refer strongly to concrete existing services (AWS Lambda, SQS, and DynamoDB, for instance), so I simplify and generalize here; it’s intentional.
The developer writes a function that performs a business operation. For instance, OrderProducts takes a list of product IDs and returns a message with an OrderID, confirming that the user placed the correct order.
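A minimal sketch of what such a handler could look like, assuming an API-Gateway-style event carrying a JSON body; the function name, event shape, and field names are illustrative assumptions, not a concrete platform API:

```python
import json
import uuid

def order_products(event, context=None):
    """Lambda-style handler sketch: validate product IDs and confirm the order."""
    body = json.loads(event["body"])
    product_ids = body.get("productIds", [])
    if not product_ids:
        # Reject an order with no products
        return {"statusCode": 400, "body": json.dumps({"error": "no products"})}
    order_id = str(uuid.uuid4())  # a real system would persist the order here
    return {
        "statusCode": 200,
        "body": json.dumps({"orderId": order_id, "productCount": len(product_ids)}),
    }
```

The handler owns only the business logic; scaling, routing, and hosting are the platform’s concern.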
A classical synchronous or asynchronous call means waiting for an API to return results and displaying them to the end user.
Serverless supports this scenario, and it’s very tempting to use it only for that (the most common) purpose. On the other hand, if it didn’t work for this purpose, it would have been doomed from the start.
→ Read more: generic API or back-end for front-end?
Another common model is to use serverless not as an API provider (let’s call it a user-facing API) but to respond to rare business events. The function waits for an event and then reacts, from time to time (like once per hour or once per week).
The cold start problem won’t be noticed by the users. In case of a sudden spike in the number of messages, the function will automatically scale to handle the increase in requests and then go back to sleep, waiting for another round.
Keeping an entire container running for that purpose would be a waste of time and resources.
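Such an event reaction can be sketched as follows, assuming an SQS-style batch of records; the event shape and field names are assumptions for illustration:

```python
import json

def handle_order_events(event):
    """Sketch: react to an infrequent batch of business events (SQS-style shape)."""
    processed = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])
        # The actual business reaction would go here; we just collect the order IDs
        processed.append(payload["orderId"])
    return {"processed": processed}
```

Between batches the function simply does not exist as a running process, which is exactly the point.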
Serverless functions can work as aggregators, by calling multiple other functions and APIs from different clouds and then aggregating the results.
The calls can be organized by external orchestration engines (as with classical enterprise integration patterns) or choreographed by the functions themselves using state machines, queues, and temporary storage.
Both approaches have their applications with the usual sets of pros and cons.
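A simplified, cloud-agnostic sketch of the aggregator pattern; the individual API calls are stood in for by plain callables, so the fan-out/merge logic is visible without any SDK details:

```python
import concurrent.futures

def aggregate(fetchers):
    """Call several independent data sources in parallel and merge their results.

    `fetchers` is a dict of name -> zero-argument callable, each standing in
    for a call to another function or external API.
    """
    results = {}
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # Fire all calls concurrently, then collect results under their names
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        for name, fut in futures.items():
            results[name] = fut.result()
    return results
```

In a real deployment each callable would wrap an HTTP or SDK call, possibly to a different cloud.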
In serverless, the functions are compiled or interpreted code running in invisible containers, on invisible virtual machines, on invisible physical servers.
When a function is inactive for a long period of time, its underlying container, or whatever is down there, is suspended or shut down to minimize resource usage. It’s a way for cloud providers to optimize their cloud performance and energy usage (in their words, to offer the best pricing for clients).
From the application standpoint, it means that the user of the application has to wait, sometimes for many seconds, to get the results of their business operation. It’s hard to imagine a scenario where that kind of behavior is acceptable.
The known workarounds include periodically calling the function just to keep it alive (via a scheduler), and putting many functions into one ‘function group’ (sometimes entire applications) to make it more probable that the group will be hit by a caller and stay awake. As a result, there is no cold start problem anymore, but often at the expense of other problems, like losing microservices decomposition and creating a serverless monolith.
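The scheduler workaround can be sketched as a handler that short-circuits on a synthetic warm-up event; the marker field is a hypothetical convention between the scheduler and the function, not a platform feature:

```python
def do_work(event):
    """Placeholder for the real business operation."""
    return event.get("value", 0) * 2

def handler(event, context=None):
    """Keep-warm sketch: a scheduler fires a synthetic ping periodically;
    the handler detects it and returns immediately, keeping the runtime warm
    without running any business logic."""
    if event.get("source") == "warmup-scheduler":  # hypothetical marker field
        return {"warmed": True}
    return {"result": do_work(event)}
```

The cost is a trickle of no-op invocations, paid purely to hide an infrastructure detail from users.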
For me, the cold start problem is a classical example of a leaky abstraction. We are not supposed to think or care about the underlying infrastructure, and yet the infrastructure reminds us of itself in a very painful way.
In one of our Java projects based on AWS Lambda, our team used GraalVM to minimize the cold start problem, and the entire application was a great success.
When you see the word stateless, it’s usually expected to mean something good: a stateless service scales easily, consumes fewer resources, does not require complex state transfer between nodes, and so on.
The problem is that all business apps and processes are stateful, so something always needs to be done when stateless functions collide with the stateful nature of business apps.
In the case of serverless, the usual idea is to pass some kind of session ID between calls and use a cloud database to store the state. The problems are performance, cost (additional calls), and the management of data waste (session data should eventually be removed).
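The idea can be sketched with an in-memory stand-in for the cloud key-value store (think a DynamoDB table with a TTL attribute); the class and its interface are illustrative assumptions:

```python
import time

class SessionStore:
    """In-memory stand-in for a cloud key-value store holding per-session state.

    Each write records an expiry time; reads drop stale entries, modeling the
    'data waste' cleanup that session storage requires.
    """
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}

    def put(self, session_id, state, now=None):
        now = time.time() if now is None else now
        self._data[session_id] = (state, now + self.ttl)

    def get(self, session_id, now=None):
        now = time.time() if now is None else now
        entry = self._data.get(session_id)
        if entry is None or entry[1] < now:
            # Expired session data is waste; drop it on access
            self._data.pop(session_id, None)
            return None
        return entry[0]
```

Each function invocation would read the state by session ID, do its step, and write the state back, paying the extra round trips the text mentions.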
In a more advanced version, serverless can be managed by an orchestration engine that maintains the state between calls, contexts, and the process flow. It also helps with debugging and logging of the operations.
Serverless functions have a short life span, can run on any server anywhere, and can also invoke external services from API providers. This makes proper log management a challenge. It’s doable, but it’s not any easier.
Functions have time and memory limits. They are supposed to be fast, not consume too much memory, and finish quickly, invoking other services asynchronously without waiting for the results.
On the other hand, there are complex operations that take a lot of time and memory to perform and simply cannot be avoided. For instance, complex algorithms that require lots of business data, and CPU time, to process, like a risk calculation or running ML models; I’m not even speaking of training ML models.
The function execution time limit can be raised through configuration settings, and functions can run even for minutes, contrary to the common myth that they are only allowed to run for seconds. So there is a significant amount of exaggeration here.
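When a job genuinely exceeds the limits, the usual pattern is to fan it out: split the work into chunks that each fit within one invocation and dispatch them asynchronously. A minimal sketch, with the real cloud SDK call replaced by an injected callable (an assumption for illustration):

```python
def fan_out(items, chunk_size, invoke_async):
    """Split a long-running job into chunks small enough for a function's
    time limit and dispatch each chunk via `invoke_async`, which stands in
    for an asynchronous cloud invocation. Returns the number of dispatches."""
    count = 0
    for i in range(0, len(items), chunk_size):
        invoke_async({"items": items[i:i + chunk_size]})
        count += 1
    return count
```

The dispatcher itself finishes quickly; the chunks run in parallel as separate invocations.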
Due to the parallel and asynchronous nature of serverless functions, there is no guarantee that the order of execution matches the order of the incoming data. Operations that require ordered execution need additional services, such as queues or state machines, to enforce it. Again, it’s doable, but it’s not well supported by the nature of serverless.
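The core of such an ordering mechanism can be sketched as a reorder buffer that releases messages strictly by an explicit sequence number, the kind of guarantee a FIFO queue or state machine would provide; the class is an illustrative assumption:

```python
class ReorderBuffer:
    """Buffer out-of-order deliveries and release them strictly by sequence
    number, re-establishing processing order for downstream functions."""
    def __init__(self, start_seq=0):
        self.next_seq = start_seq
        self.pending = {}

    def push(self, seq, body):
        """Accept one message; return the messages now releasable, in order."""
        self.pending[seq] = body
        released = []
        # Release every contiguous message starting from the expected sequence
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

Managed FIFO queues implement essentially this, at the cost of throughput and extra configuration.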
Not really an old habit but a current one: the art of letting go is one of the hardest things for developers and DevOps people. No Docker? No Kubernetes? What about all the investments made in learning those Dockerfiles and YAML configurations? Is it all for nothing?
This is almost an exact quote from one of my conversations with some developers.
Unlearning things is always a problem, as people hang on to what they have learned and don’t want to change (the fear of the unknown). This is especially true of tech people, who are very conservative, even the 20-year-olds.
Cloud adoption is progressing, but it’s not progressing as fast as anticipated.
Serverless can actually be done on premises as well, and there are open source platforms for that. But the true spirit of serverless is not managing infrastructure at all, which includes local serverless infrastructure.
Slower public cloud adoption makes serverless adoption a victim as well.
→ Read more: is the hybrid cloud here to stay forever?
Serverless is a black box that does not always behave as predicted, which thousands of IT operations people and developers find hard to accept.
To benefit from serverless, you have to let go of most of your need to control the infrastructure. It can be done, but it always takes time.
In a mixed environment, it adds an additional workload and another learning curve for DevOps teams, because they have to deal with both traditional container-based infrastructure and the serverless configuration.
It’s rare to see projects migrate from their current architecture to serverless. Serverless usually augments the current application (typically for handling rare business events) and is rarely an in-place replacement for containers.
This also slows the adoption rate, as application life cycles are not that short. Modules that are due to be rewritten because of increasing maintenance costs could be rewritten using serverless, but rarely are, because “if it works, don’t fix it” prevails.
In our Avenga example, the migration to a serverless model was part of the cloudification of an existing system. If we were moving to the cloud anyway, why wouldn’t we benefit from serverless? So we’ve proved more than once that serverless doesn’t have to mean writing apps from scratch; it can be part of a cloud migration.
Tech limitations were described above, and yes, they are there, but serverless vendors keep making them less painful as they respond to community needs. In the long run, the people-related things (trust, control mindsets) are the more important limiting factor.
Serverless adoption appears to be harder than anticipated.
I believe that the hope for it to be a quick in-place replacement for containers is no longer realistic. In the current state of affairs, it’s a technology worth considering for new applications, new modules or groups of services, or in the case of major refactorings.
We at Avenga use serverless (AWS, Azure, GCP) when it fits the usage patterns described above.
This approach is widely accepted as interesting and promising, but “maybe later” or in limited applications. The same results were shown in our CIO survey about the future of business application technologies.
Avenga is clearly ahead of the curve in the area of serverless adoption, even though serverless is not yet the dominant way of building business applications.
Let’s see what tomorrow brings for this interesting technology and if it will ever become the primary choice for building the majority of business applications.
Kacper Ryniec, Head of Technology Avenga Poland
There are at least two projects we’ve accomplished using a serverless architecture.
One based on AWS Lambdas and the other on Azure Functions v2, both in .NET.
Andrew Petryk, Head of Java Technologies
As has already been mentioned, we in the Java department used AWS Lambda in combination with a GraalVM runtime. I can’t say it was easy and straightforward (especially in combination with the Bitbucket Pipeline to build a native image). But what I can say is that it was totally worth it.
Our use case was multi-service calls; Lambda can work as an aggregator, calling multiple other APIs and aggregating the results.
I think serverless has a bright future ahead, and it is worth trying.
Vlad Litovka, DevOps Expert
We’ve delivered one serverless project to a client and are planning to deliver one more (the client is still thinking about the final scope in order to reduce the cost), but in fact AWS Lambda has its limitations and bottlenecks.
Anyway, we are using Lambda/Step Functions and GCP functions on real projects, especially when there is no need to have containers running all the time.
Personally, I do like the serverless approach, but it increases support costs (DevOps) a bit, and it is not suitable for all project types. In my case, most of the projects would not gain a lot of benefit from this approach.
We had the following setup for the API: AWS Lambda + AWS API Gateway + AWS RDS (the client’s choice), with the front end handled by S3 + CloudFront, plus Route 53.
This setup is more or less common in my understanding. It would probably be better to use DynamoDB instead of plain RDS instances, in order to follow the serverless approach in full.
Felix Hassert, Director of Products
In our products we don’t use serverless functions, yet.
I like the idea of reactive architectures a lot. But I also like on-premise hosting. We haven’t put in the effort yet to set up the infra for running ‘serverless’. I would like to look into Knative for that.
In the end, ‘serverless’ is not so serverless, if you are responsible for the infrastructure too. 😛