Monday, October 31, 2016

Leverage Hybrid Infrastructure with SPOT instances or Preemptible VMs

Cloud computing allows companies to accelerate their business processes, optimize infrastructure costs and scale more quickly. However, integrating these new services with an existing infrastructure can be complex, and a poor integration will not fully leverage this new opportunity. This article focuses on unstable instances such as Spot instances (AWS) or Preemptible VMs (GCP), which offer cheaper computational power.

What are Spot instances and Preemptible VMs?

AWS offers a service called Spot instances, which lets customers bid on unused EC2 capacity in any availability zone. GCP offers a similar service called Preemptible VMs. In both cases, customers can use compute capacity with no upfront commitment at an hourly rate lower than the on-demand rate. The main drawback is that AWS and GCP can withdraw an instance at any time with little warning. On AWS, this depends on how the market price of the resource compares with the bid price.

Workload Requirements

As explained above, instances can be withdrawn from customers at any time. Workloads that use this computing capacity need an advanced error management tool to cope with the uncertain instance lifecycle. Otherwise, any work in progress is lost when an instance disappears, which defeats the overall purpose. These workloads also need to be split into smaller tasks so that each computation can be performed in a short period of time. This ensures that completed work is saved and available to dependent tasks.
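
The two requirements above (small tasks plus error handling) can be sketched in a few lines of Python. This is only an illustration and not ProActive code: the names `Preempted`, `run_chunks` and `make_flaky` are hypothetical, the preemption is simulated, and the checkpoint is a plain dictionary that a real workload would replace with durable storage.

```python
class Preempted(Exception):
    """Raised when the (simulated) instance is withdrawn mid-task."""


def run_chunks(chunks, process, checkpoint, max_attempts=5):
    """Process each chunk once, skipping saved results and retrying on preemption."""
    for index, chunk in enumerate(chunks):
        if index in checkpoint:
            continue  # result already saved by an earlier run
        for _ in range(max_attempts):
            try:
                checkpoint[index] = process(chunk)
                break
            except Preempted:
                continue  # only this small chunk is lost, so retry it
        else:
            raise RuntimeError(f"chunk {index} failed after {max_attempts} attempts")
    return [checkpoint[i] for i in range(len(chunks))]


def make_flaky(fail_first_n):
    """Build a task that is preempted on its first fail_first_n calls (simulation only)."""
    calls = {"n": 0}

    def process(chunk):
        calls["n"] += 1
        if calls["n"] <= fail_first_n:
            raise Preempted()
        return sum(chunk)

    return process


data = list(range(10))
chunks = [data[i:i + 3] for i in range(0, len(data), 3)]  # four small tasks
checkpoint = {}
totals = run_chunks(chunks, make_flaky(2), checkpoint)
```

Because each chunk is small and its result is stored as soon as it completes, a withdrawal costs at most one chunk of work instead of the whole job.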

When to leverage them?

One of the common use cases for these services is when time is less of a constraint than cost. For instance, a workload running at night can take as much time as required as long as it is completed by 8 a.m.

Another common use case is a demand spike on the current infrastructure that generates a large queue. This is often seen in R&D environments that rely on a single HPC system or a limited infrastructure. In that case, time constraints have to be balanced against price. Spot instances make it possible to drain the queue with instances that cost less than on-demand ones.

Some business applications require a stable environment to perform efficiently. However, these stable resources might already be taken when they are needed. Running other workloads on unstable resources ahead of time frees up the stable resources for the applications that depend on them.

Many other use cases can be found on the GCP and AWS websites or on the Netflix blog.

What does ProActive offer in these situations?

First, ProActive offers an expressive interface to create your own (automated) workloads or integrate with any existing system through its open REST API. It includes an advanced error management tool to handle errors according to each business's needs (number of attempts, pause on error, etc.). It also handles bidding in a single interface. Below are a few strategies that can be implemented in a few hours with ProActive and that leverage the features offered by the solution.

Strategy - Bet low everywhere

Bet on all AWS regions at a low price. Each AWS region has its own Spot market and Spot price. With ProActive, you can simply bet on all markets while we handle the complexity of distribution and fault tolerance. Latency should not be an issue in this case. ProActive includes a Resource Manager to visualize all your betting strategies in a single dashboard. Bonus: if the price goes up during an hour you have been charged for and the instance is withdrawn, AWS will reimburse you for that hour.
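
To make the idea concrete, here is a minimal Python sketch of what a single low bet across several markets amounts to. It does not use the ProActive or AWS APIs: `winning_markets` is a hypothetical helper, and the prices are made-up values for illustration, not real Spot prices.

```python
def winning_markets(spot_prices, bid):
    """Return the markets whose current price is at or below the bid, cheapest first."""
    eligible = [(price, market) for market, price in spot_prices.items() if price <= bid]
    return [market for price, market in sorted(eligible)]


# Hypothetical hourly prices per region (illustrative numbers only).
spot_prices = {
    "us-east-1": 0.031,
    "eu-west-1": 0.024,
    "ap-southeast-1": 0.052,
}

markets = winning_markets(spot_prices, bid=0.035)
```

With the same low bet placed everywhere, capacity is obtained wherever the local market happens to be cheap at that moment.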

Strategy - Bet relatively low and increase betting gradually

When launching a night job that needs to be completed by 8 a.m., you can afford to bet low at first and then increase your bet as the deadline approaches. With ProActive, you can create a betting strategy that starts at any given time to make sure you get more resources when you need them.
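
One possible way to express such a schedule is a linear ramp between a starting bet and a maximum bet. The sketch below is an assumption about how the increase could be computed, not the ProActive implementation: `bid_at`, the 10-hour window and the price values are all hypothetical.

```python
def bid_at(hours_elapsed, total_hours, start_bid, max_bid):
    """Linearly raise the bid from start_bid to max_bid as the deadline approaches."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    # Clamp progress to [0, 1] so the bid never leaves the configured range.
    progress = min(max(hours_elapsed / total_hours, 0.0), 1.0)
    return round(start_bid + (max_bid - start_bid) * progress, 4)


# Hypothetical night job: starts at 10 p.m., must finish by 8 a.m. (10 hours).
schedule = [bid_at(hour, 10, start_bid=0.02, max_bid=0.10) for hour in range(0, 11, 5)]
```

A linear ramp is the simplest choice. A steeper curve near the deadline would keep costs lower for longer at the price of a higher risk of finishing late.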

Strategy - Cloud bursting for demand spikes

When the workload queue crosses a given threshold, start betting for Spot instances or Preemptible VMs to accelerate job completion and reduce the overall queue. ProActive will also release these new resources according to another custom policy set up by the customer.
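
The threshold rule can be sketched as a small sizing function. This is an illustrative assumption rather than ProActive's policy engine: `instances_to_request` and its parameters (`threshold`, `tasks_per_instance`, `max_instances`) are hypothetical names.

```python
import math


def instances_to_request(queue_length, threshold, tasks_per_instance, max_instances):
    """Return how many extra instances to request for the backlog above the threshold."""
    if tasks_per_instance <= 0:
        raise ValueError("tasks_per_instance must be positive")
    backlog = queue_length - threshold
    if backlog <= 0:
        return 0  # the existing infrastructure can absorb the queue
    # Round up so the whole backlog is covered, but never exceed the cap.
    return min(math.ceil(backlog / tasks_per_instance), max_instances)


quiet = instances_to_request(50, threshold=100, tasks_per_instance=10, max_instances=20)
spike = instances_to_request(135, threshold=100, tasks_per_instance=10, max_instances=20)
flood = instances_to_request(1000, threshold=100, tasks_per_instance=10, max_instances=20)
```

The cap on the number of instances keeps a sudden flood of jobs from turning into an unbounded bill.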

Do you have more complex needs?

More advanced strategies can be implemented with our CloudWatch feature, which lets you define complex events and trigger actions that match your business requirements. However, it deserves its own article to fully present its capabilities.
