When launching an application, it's often hard to anticipate how much memory, CPU and IOps the virtual machines running it will need. For many applications the answer is to scale VMs horizontally. Typically this means adding more, smaller virtual machines as the current ones come under heavier load.

That works very well for things such as 12-factor apps. When it comes to data applications such as Elasticsearch, MongoDB, Cassandra or Postgres, however, horizontal scaling is less straightforward and the right VM size is hard to anticipate.

So what do you do? Usually DevOps teams just launch VMs with more resources than they probably need, to avoid running into problems in the future.

Often this leads to huge cloud deployments with oversized VMs. Figuring out the correct size is not very straightforward either, as every cloud platform offers a myriad of VM types, each with different characteristics.

VM Rightsizing

That's where VMPower's VM Rightsizing feature comes in. Rightsizing is the task of figuring out the best VM type on your cloud platform for your application workload.

Resize VM screenshot

VMPower collects key virtual machine data points, including IO transfer counts, disk read/write throughput, memory consumption and CPU utilization, to make data-driven recommendations about the type of VM that best fits your workload. Using this data, VMPower identifies possible monthly savings if you choose to resize your instances.

It's not just about cloud savings, though. VMPower will also detect if a VM is at risk of being under-resourced:

Resize VM screenshot 2

For each recommendation, we offer a detailed explanation of which utilization metric triggered it, as well as the cost or savings difference versus the current VM.

You can always view the detailed VM metrics to verify the recommendation and let us know how accurate it is using the feedback buttons.

Methodology

For every cloud platform, compute is generally categorized by:

  • Memory Capacity
  • CPU Capacity
  • IOps Capacity

Using the data points we collect, we can accurately fit your application load to an appropriate VM on your cloud provider, which will either optimize performance or help you save.

For every metric (Memory, CPU, IOps) we operate on a scale with 'rungs'. We always recommend a VM one 'rung' up or down (with respect to the metric).

Rightsizing Diagram

Each metric has 3 possible recommendations:

  • No Change: We think the current VM's configuration is either appropriate or not possible to change in the direction necessary.
  • Scale Up: We think the current VM is potentially pushing its limits on this particular metric.
  • Scale Down: Based on the time window, we think this VM is very far from ever pushing its capacity in this metric.

We combine the recommendations for the three metrics and match them to the instance type that fits the recommendation for each metric.
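To make the combination step concrete, here is a minimal sketch in Python. It is not VMPower's actual implementation: the catalog, the VM names and the rule that the match must be exact on all three metrics are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    name: str
    vcpus: int
    memory_gib: int
    max_iops: int


# Hypothetical catalog; names and numbers are invented for this example.
CATALOG = [
    VMType("small", 1, 2, 500),
    VMType("medium", 2, 4, 1000),
    VMType("medium-highmem", 2, 8, 1000),
    VMType("large", 4, 8, 2000),
]

METRICS = ("vcpus", "memory_gib", "max_iops")


def rung_target(current_value, all_values, action):
    """Return the value one rung up or down from current_value.

    If there is no rung in the requested direction, the current value
    is returned unchanged.
    """
    rungs = sorted(set(all_values))
    i = rungs.index(current_value)
    if action == "scale_up":
        i = min(i + 1, len(rungs) - 1)
    elif action == "scale_down":
        i = max(i - 1, 0)
    return rungs[i]


def recommend(current, actions, catalog=CATALOG):
    """Combine per-metric actions and look up the matching VM type.

    actions maps each metric name to "scale_up", "scale_down" or
    "no_change". Returns None when no catalog entry matches all three
    targets exactly (an assumption of this sketch).
    """
    target = {
        metric: rung_target(
            getattr(current, metric),
            [getattr(vm, metric) for vm in catalog],
            actions[metric],
        )
        for metric in METRICS
    }
    for vm in catalog:
        if all(getattr(vm, m) == value for m, value in target.items()):
            return vm
    return None
```

For example, a "medium" VM with a Scale Up on memory and No Change on CPU and IOps would map to "medium-highmem" in this made-up catalog.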

A Conservative Approach

The heuristics in VMPower will change based on user feedback, but in general the philosophy is to be conservative.

VMPower generally categorizes data points as high utilization, spikes or low utilization. When the ratio of high-utilization data points to all data points reaches a threshold, we may consider that VM to be pushing its limits.
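The ratio check can be sketched in a few lines of Python. The thresholds below (80% utilization counts as "high", and a quarter of all samples triggers the flag) are placeholder assumptions, not the values VMPower uses.

```python
def high_utilization_ratio(samples, high_threshold=80.0):
    """Fraction of samples at or above high_threshold (in percent)."""
    if not samples:
        return 0.0
    high = sum(1 for s in samples if s >= high_threshold)
    return high / len(samples)


def is_pushing_limits(samples, high_threshold=80.0, ratio_threshold=0.25):
    """True when enough of the window is spent at high utilization.

    Both thresholds are illustrative defaults.
    """
    return high_utilization_ratio(samples, high_threshold) >= ratio_threshold
```

With these defaults, a window of `[10, 90, 85, 20]` has a ratio of 0.5 and would be flagged, while `[10, 20, 30, 40]` would not.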

Recommendations generally go one 'rung' at a time. That is, if you are running on an 8 vCPU instance, we would only recommend scaling down to a 4 vCPU instance, not to a 2 or 1 vCPU instance.

IOps

We know that averages don't reflect how a VM is actually used in the real world, which is why VMPower explicitly does not make recommendations based simply on statistical averages.

Whenever we see spikes in the data window, our heuristic score is reduced heavily, because the spikes demonstrate that your VM experiences bursts of utilization that may seriously push the capacity of a smaller VM type.

For example, the chart below shows the IOps activity of an Azure Standard DS1 VM over a 1-day window. This machine is one of VMPower's CI machines, running Drone CI:
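A small Python example shows why the average alone is misleading. The sample values and the spike threshold are invented for illustration.

```python
from statistics import mean


def summarize(samples, spike_threshold):
    """Report the mean, the peak, and how many samples exceed spike_threshold."""
    return {
        "mean": mean(samples),
        "peak": max(samples),
        "spikes": sum(1 for s in samples if s > spike_threshold),
    }


# Hypothetical hourly IOps readings: mostly idle, with two short bursts.
iops = [40] * 22 + [480, 460]
summary = summarize(iops, spike_threshold=400)
```

Here the mean is below 80 IOps, yet the peak is 480. A smaller VM capped at, say, 250 IOps would look more than adequate on the average alone, but would be throttled during both bursts.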

Azure VM IOps Graph

If this were, for example, a regular Azure D-series VM with Standard (non-SSD) storage, VMPower would detect that the VM has only 1 disk and can therefore deliver a maximum of only 500 IOps. In that case we would see a Scale Up recommendation.

Similarly, if the IOps capacity were well above the peaks of this graph, a Scale Down recommendation would be produced.
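The comparison of observed peaks against capacity could look roughly like the Python sketch below. It is a deliberately simplified, peak-only version; the 90% and 25% cut-offs are assumptions chosen for the example, not VMPower's real heuristics.

```python
def iops_recommendation(samples, capacity, up_fraction=0.9, down_fraction=0.25):
    """Compare the observed peak IOps against the VM's IOps capacity.

    up_fraction and down_fraction are illustrative cut-offs: a peak near
    capacity suggests scaling up, a peak far below it suggests scaling down.
    """
    if not samples:
        return "No Change"
    peak = max(samples)
    if peak >= up_fraction * capacity:
        return "Scale Up"
    if peak <= down_fraction * capacity:
        return "Scale Down"
    return "No Change"
```

With peaks around 480 IOps, a 500 IOps disk yields "Scale Up", while a hypothetical 5000 IOps capacity yields "Scale Down".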

CPU

CPU capacity on cloud providers generally increases or decreases by a factor of 2. This means that when VMPower recommends scaling down CPU, it is because the CPU utilization of your current VM is so low that, even with half the compute capacity, you would still have plenty of headroom for spikes in utilization.
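The arithmetic behind halving can be sketched as follows. The sketch assumes that load spreads linearly across vCPUs, which is only a rough approximation, and the 60% headroom limit is a made-up value.

```python
def projected_utilization(current_pct, current_vcpus, target_vcpus):
    """Estimate utilization on a different vCPU count, assuming linear scaling."""
    return current_pct * current_vcpus / target_vcpus


def safe_to_halve(peak_pct, vcpus, headroom_pct=60.0):
    """True if the peak, projected onto half the vCPUs, stays under headroom_pct.

    headroom_pct is an illustrative limit, not a VMPower setting.
    """
    if vcpus < 2:
        return False  # nothing smaller to halve to
    return projected_utilization(peak_pct, vcpus, vcpus // 2) <= headroom_pct
```

For instance, a peak of 20% on 8 vCPUs projects to roughly 40% on 4 vCPUs, which still leaves room for spikes. A peak of 45% would project to 90% and would not qualify.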

Memory

Memory scaling is much more granular than CPU scaling, and memory usage tends to vary much less than IOps and CPU utilization. The same heuristic applies here, however: we recommend scaling either up or down to the next available size.

AWS Rightsizing

AWS Virtual Machines don't report memory statistics by default.

If we see that you have AWS instances without memory information, you'll see a prompt to copy and paste a one-line script that will set up your VM to report memory utilization to AWS CloudWatch:

AWS Script Dialog

Simply copy and paste this into your AWS VM and it will start reporting VM memory utilization every 5 minutes.
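VMPower's script itself isn't reproduced here. To give a sense of what reporting memory to CloudWatch involves, the Python sketch below reads /proc/meminfo on a Linux VM and publishes a custom metric with boto3's `put_metric_data`. The namespace and metric name are hypothetical, and the 5-minute schedule (for example via cron) is left out.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Used memory as a percentage, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def build_metric(instance_id, used_percent):
    """Build keyword arguments for CloudWatch put_metric_data.

    The namespace and metric name are hypothetical placeholders.
    """
    return {
        "Namespace": "Custom/Memory",
        "MetricData": [
            {
                "MetricName": "MemoryUtilization",
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Value": used_percent,
                "Unit": "Percent",
            }
        ],
    }


def publish(instance_id, meminfo_path="/proc/meminfo"):
    """Read memory usage and send one data point to CloudWatch."""
    import boto3  # imported here so the helpers above work without boto3

    with open(meminfo_path) as f:
        meminfo = parse_meminfo(f.read())
    payload = build_metric(instance_id, memory_used_percent(meminfo))
    boto3.client("cloudwatch").put_metric_data(**payload)
```

Calling `publish` requires AWS credentials that are allowed to put CloudWatch metric data, such as an instance role attached to the VM.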

Coming Soon: One-click Resizes

We'll soon have the ability to carry out resizes directly from VMPower if you provide Azure VM Contributor access or AWS EC2 write IAM access to VMPower.