MLaaS: When Cloud Meets AI


Jacek Chmiel

Director of Avenga Labs

June 23, 2021



Machine learning and the public cloud

Every company wants to benefit from the data it generates and stores: to obtain valuable insights, make better decisions, learn more about its customers, and develop better plans for the future.


We’ve come a long way since the early days of the public cloud, when it was seen as just someone else’s computer. We increasingly treat public cloud offerings as a set of ready-to-use services that can save us a lot of time and effort.

One of the key areas that cloud computing players are quickly developing and enhancing is Machine Learning (ML).

Machine Learning as a Service (MLaaS) is an umbrella term for a set of services that enable machine learning in the cloud. It covers data storage and processing, model training and tuning, model deployment and execution, as well as access management, logging, and monitoring of the pipelines.

From the service consumer’s point of view, MLaaS is about using the cloud infrastructure to perform ML tasks instead of, or along with, local tools and infrastructure.

The benefits of MLaaS

Faster

Setting up ML pipelines in a local infrastructure is complicated, especially on an industrial scale with many processing nodes, access control, and security. MLOps is a daunting task, and the cloud promises to spare teams most of it by offering ready-to-use sets of services that are compatible with each other and can be used right away.

An ML pipeline needs data ingestion and preparation infrastructure, data storage infrastructure, model hosting, and development environments. All of these must be connected and work together to form an effective pipeline.

This time saving is the number one selling point of MLaaS.

Dedicated GPUs and TPUs

Access to hardware, especially GPUs and TPUs, is more limited than ever before, but with MLaaS these capabilities have been available from the beginning. The local alternative means paying for the hardware up front and facing potential problems with its availability.

Additionally, cloud providers heavily optimize their TPUs for their ML services, so the hardware is well tuned, and users pay only for the time they spend using the dedicated hardware.

When more scale is needed, it takes mere minutes to extend the capacity of the model training infrastructure. Results arrive faster, and the team can then either tune the model further or deploy it for execution in the cloud or on devices.

Ready to use datasets and models

Cloud services also provide access to ready-to-use models, which can be used as they are or as a starting point for transfer learning when training neural networks. This works best for image recognition: in some cases data scientists can skip the entire process of data preparation, cleansing, feature extraction, and labelling, and simply use the models to recognize their images with high accuracy.

The alternative is to use datasets and augment them with specific data, and then retrain the models to include specific features and labels.

In these scenarios, we can expect great time savings and good results.

Unfortunately, the deeper we go into a specific business domain, the less ready and useful the models and datasets become. They quickly turn into examples that will never be useful in production environments.
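The transfer learning option described above can be illustrated with a toy, framework-free sketch: a frozen "pre-trained" feature extractor is reused as-is, and only a small new classification head is trained on the team's own labelled data. Everything here is an assumption made for illustration: the function pretrained_features stands in for a real pre-trained network, and the data and hyperparameters are invented.

```python
import math


def pretrained_features(x):
    """Hypothetical stand-in for a frozen pre-trained network.

    A real MLaaS model would be far larger; the point is that it is reused
    unchanged and never updated during training.
    """
    return [x[0] + x[1], x[0] - x[1]]


def train_head(samples, epochs=200, lr=0.5):
    """Train only a logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in samples:
            features = pretrained_features(x)  # frozen part, no gradient step
            z = sum(w * f for w, f in zip(weights, features)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            error = p - label  # gradient of the log loss with respect to z
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(head, x):
    """Classify x with the frozen features and the trained head."""
    weights, bias = head
    features = pretrained_features(x)
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if z > 0 else 0


# Invented task-specific data: label 1 when the first value is the larger one.
own_data = [
    ([2.0, 1.0], 1), ([3.0, 0.0], 1), ([1.5, 0.5], 1),
    ([1.0, 2.0], 0), ([0.0, 3.0], 0), ([0.5, 1.5], 0),
]
head = train_head(own_data)
print([predict(head, x) for x, _ in own_data])
```

In a real project the frozen part would be a model supplied by the cloud service and the head would be trained with an ML framework, but the division of labour is the same: the expensive, generic part is borrowed and only the small, specific part is trained.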

Ephemeral workloads

Imagine that the ML team believes the model is working well and it is time to train it on much larger datasets. It is tempting to use much more processing power to make this step as short as possible.

In the cloud, we can take advantage of what is often called an ephemeral workload. We can start multiple powerful processing nodes for a few hours, spread the problem across them, and finish the job as quickly as possible while keeping it economically viable.

Such a spike in capacity is practically impossible in traditional infrastructures, especially when it involves dedicated ML hardware (TPUs or GPUs). This is where cloud solutions offer something that a local infrastructure simply cannot.
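To make the economics of an ephemeral workload concrete, here is a minimal sketch of the pay-per-use arithmetic. The hourly price, the node counts, and the scaling efficiency are hypothetical values chosen for illustration; they are not figures from any cloud provider.

```python
def ephemeral_job(total_node_hours, nodes, price_per_node_hour, efficiency=1.0):
    """Return (wall_clock_hours, cost) for a job spread over `nodes` nodes.

    `total_node_hours` is the work measured on a single node.
    `efficiency` below 1.0 models imperfect scaling across nodes.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    wall_clock_hours = total_node_hours / (nodes * efficiency)
    cost = wall_clock_hours * nodes * price_per_node_hour
    return wall_clock_hours, cost


# Hypothetical training run: 64 node-hours of work at 3.00 per node-hour.
single_node = ephemeral_job(64, nodes=1, price_per_node_hour=3.0)
burst = ephemeral_job(64, nodes=16, price_per_node_hour=3.0, efficiency=0.8)
print("1 node:   %.1f h, cost %.2f" % single_node)
print("16 nodes: %.1f h, cost %.2f" % burst)
```

Under these assumptions the burst finishes in a fraction of the time for a moderately higher bill, and the nodes cost nothing once they are shut down. A local cluster sized for the same spike would sit idle the rest of the time.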

Enable app developers to join the ML crowd

Setting up the environment and infrastructure was one of the blockers for many software engineers who wanted to work as machine learning engineers. Despite their interest, the runtime setup (and the math, of course) was one of the key factors discouraging them.

Even for tutorials, the choice is usually between setting up a local CUDA-enabled machine with all the toolsets and using cloud services.

One example is the choice between a local Jupyter installation and Google Colaboratory, a website that delivers a similar developer experience without the hassle of setting up anything locally. As we have stated before, never underestimate the significance of the Developer Experience (DX).

Cloud or Edge – we’ve got you covered

Some ML workloads, such as instant image recognition using security cameras, are much better suited for Edge computing than cloud computing.


The trend of Edge computing taking over more and more ML workloads is visible, especially in the manufacturing and automotive industries, but it is not limited to them. Whenever we need responses in milliseconds from energy-efficient devices on unreliable networks, and cannot tolerate any connectivity issues with the cloud, Edge is the better solution.

The same vendors who offer MLaaS solutions for their public clouds are also presenting their solutions for Edge and IoT.

There is also a separate set of Edge-specific vendors, so there are plenty of options to choose from to get the best possible outcome today and in the future.

There are also entirely new possibilities, such as Federated Learning, that combine the power of cloud and Edge.
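As a rough illustration of how Federated Learning combines the two, the sketch below shows the aggregation step commonly known as federated averaging: each Edge device trains locally and sends only its model weights, and the cloud combines them into a global model weighted by how much data each device saw. The function name and the numbers are assumptions made for this example.

```python
def federated_average(client_updates):
    """Combine per-device weights into one global weight vector.

    `client_updates` is a list of (weights, num_samples) pairs. Each weight
    vector is averaged in proportion to the number of local samples behind it.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total number of samples must be positive")
    size = len(client_updates[0][0])
    averaged = [0.0] * size
    for weights, num_samples in client_updates:
        if len(weights) != size:
            raise ValueError("all weight vectors must have the same length")
        for i, w in enumerate(weights):
            averaged[i] += w * num_samples / total_samples
    return averaged


# Two hypothetical Edge devices; the second one trained on three times more data.
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
print(global_weights)
```

The raw data never leaves the devices in this scheme; only the weights travel to the cloud, which is what makes the approach attractive where connectivity or data sensitivity is a concern.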

Reality check

As usual, nothing is perfect. Although MLaaS is maturing, it is still relatively new compared to more established cloud services for infrastructure or applications.

(Too) Well crafted examples

The cloud examples work really well because they are tuned and prepared in advance to showcase how well the cloud ML services work, which helps convince potential buyers.

Too often, data science teams are disappointed with readily available models that do not work well on their own datasets. The models tend not to generalize well, and their accuracy quickly drops below acceptable levels (to 60%-80%, for example).

The advice here is (as always) not to believe the sales pitches. Use a trial period to test the pipelines and models in your business situation and technology stack, and figure out which service, dataset, or model is useful in your particular case (if any). A small but well-defined POC, plus advice from experts, may save tens or even hundreds of thousands of dollars or euros.
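The core check of such a POC can be very small. The sketch below measures a ready-made model on a labelled sample of your own data and compares the result with the accuracy your business case requires. The vendor_model function, the sample transactions, and the 90% threshold are all hypothetical placeholders.

```python
def accuracy(predict, samples):
    """Return the fraction of (features, label) pairs that `predict` gets right."""
    if not samples:
        raise ValueError("need at least one labelled sample")
    correct = sum(1 for features, label in samples if predict(features) == label)
    return correct / len(samples)


def passes_poc(predict, samples, min_accuracy):
    """True if the model reaches the accuracy the business case requires."""
    return accuracy(predict, samples) >= min_accuracy


# Hypothetical stand-in for a vendor's pre-trained model: flags large amounts.
def vendor_model(transaction):
    return "fraud" if transaction["amount"] > 1000 else "ok"


# Hypothetical labelled sample taken from your own data, not the vendor's demo set.
own_samples = [
    ({"amount": 1500}, "fraud"),
    ({"amount": 1200}, "ok"),
    ({"amount": 300}, "ok"),
    ({"amount": 80}, "fraud"),
    ({"amount": 2500}, "fraud"),
]

print(accuracy(vendor_model, own_samples))
print(passes_poc(vendor_model, own_samples, min_accuracy=0.9))
```

A real evaluation would use a much larger sample and metrics suited to the problem, but even a check this simple exposes a model that looks good in the demo and fails on your data.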

Cost and time of data transmission

The cost of data transmission means both the actual money spent on transmitting the data and the time the transfer takes, during which the ML team has to wait before it can complete its work. A local infrastructure running at 10 Gbit/s or more (25 Gbit/s, for example) offers transmission speeds that are hard for the cloud to beat. In some applications the cloud is ruled out for this very reason, while in others our partners set up dedicated ingestion points with the cloud provider and their internet provider to maximize network throughput.

With smaller datasets and models this matters less than with many terabytes or petabytes of data.
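A quick back-of-the-envelope calculation shows why dataset size matters so much here. The sketch below estimates transfer time from dataset size and sustained link speed; the dataset sizes, link speeds, and utilization factor are assumptions for illustration, and real transfers also depend on protocol overhead and provider limits.

```python
def transfer_hours(dataset_terabytes, link_gbit_per_s, utilization=1.0):
    """Hours needed to move a dataset over a link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Gbit = 10**9 bits).
    `utilization` is the fraction of the nominal link speed actually sustained.
    """
    if link_gbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = dataset_terabytes * 1e12 * 8
    seconds = bits / (link_gbit_per_s * 1e9 * utilization)
    return seconds / 3600


# Hypothetical comparison: a 50 TB dataset over a local 10 Gbit/s network
# and over a 1 Gbit/s internet uplink sustained at 70%.
print("local:  %.1f hours" % transfer_hours(50, 10))
print("uplink: %.1f hours" % transfer_hours(50, 1, utilization=0.7))
```

With these example numbers the local copy takes about half a day while the upload takes several days, which is the kind of gap that leads teams either to rule out the cloud or to invest in dedicated ingestion links.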

Loss of flexibility

Tools and pipelines are generally set up so that they work properly from the get-go, which is great. But it also means that the choices have already been made and your flexibility is limited. The toolkit may require different skill sets than your ML team currently possesses, so the learning curve may consume or even exceed all the time savings of the ready-to-use tools.

In addition, some parameters and versions may not be accessible to the ML team at all, and sometimes this decides whether the entire project is a go or a no-go.

Models for everyone?

The new kids on the block are datasets and models for particular business sectors and domains. This sounds very promising, but the models usually describe some generalized company and its business processes, and there is no generic company in the real world. Customization is therefore needed, and it usually means creating new dedicated models for the business context. That leaves the ready-to-use domain models as training exercises only, because they are not production ready.

Longevity of the ML services

One of the risks, and the fears associated with it, concerns the cloud providers’ commitment to a particular MLaaS service. How long is it going to be supported? How often will it require forced upgrades as new versions of the service are launched? There is a general expectation that many new MLaaS services will hit the market soon, but not all of them will last long. How long is acceptable for you?

Trust issues

Cloud providers are better prepared for data protection and security than ever before, and will be even better prepared tomorrow, yet there are still doubts about their actual data access policies. Customers lack trust and fear that their data may be seen by other parties or that moving it to the cloud may violate applicable regulations. There are still decision-makers who will rule out the public cloud in particular cases because they don’t want their sensitive data to ever leave their local infrastructure. In those cases, it means a no-go for MLaaS.

Too many choices?

Kubernetes and Kubeflow open new opportunities for machine learning projects to borrow many proven DevOps practices and technologies and adapt them to the ML reality. This is a relatively new option for combining local infrastructure with cloud Kubernetes orchestration engines and ML services.

Industry-specific models and datasets are getting better, and there are more of them. There are also smaller companies specializing in specific industry domains (for instance, fraud detection for banking) that deliver ready-to-use services as APIs, which can be used in clients’ applications without any need to touch the interior of the black box. This option will be chosen more frequently in the near future because it avoids ML projects altogether (assuming the quality of service and data security meet expectations).

What we have now is something of a curse of too many choices. Which cloud provider offers the best ML services in your particular case? Does it make sense to start with pre-trained models, or should we skip this option and start from scratch? Which services should be combined, and how can you achieve the best results? What accuracy can we expect? Or maybe only locally designed, specifically crafted ML solutions will deliver the expected results?

These and many other questions can be answered by working with a trusted partner who is not biased towards any particular cloud provider or MLaaS vendor, who doesn’t exclude any options at the start, and who can quickly define the right ML architecture for your business case today and in the future.