One of the best features of Google Cloud Platform’s customer-friendly pricing is that with both Custom Machine Types and Committed Use Discounts, you pay independently for CPU and RAM instead of buying them in fixed ratios. This lets you optimize your cloud footprint to match your actual workload, with no wasted cores or RAM due to a poor fit between your workload and the available instance types. It also exposes the relative cost of CPU vs. RAM, which lets you make informed decisions about compute-memory tradeoffs in your architecture. Alas, other cloud platforms haven’t yet caught up to this approach, but with a bit of math we can work out their effective per-vCPU and per-GB RAM prices. So, what are Amazon Web Services’ and Microsoft Azure’s unit prices? Let’s use linear regression machine learning to find out!
NOTE: This is kind of a silly use of the term “machine learning”, but I hope it serves as an example of simple things a software engineer can do using ML tooling.
For reference, here are GCP’s list prices per vCPU and per GB of RAM, expressed per month:
term | $/vCPU/month | $/GB RAM/month |
---|---|---|
on-demand | $24.22 | $3.25 |
1-year commitment | $14.54 | $1.95 |
3-year commitment | $10.38 | $1.39 |
That works out to a ratio where one vCPU costs the same as 7.45 GB of RAM. Keep in mind that if an instance runs for more than 25% of the month, Sustained Use Discounts automatically kick in, saving you up to 30% off the on-demand list price without any commitment.
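That ratio is just the quotient of the two on-demand list prices above; as a quick check (nothing here beyond the table’s numbers):
gcp_cpu, gcp_ram = 24.22, 3.25   # GCP on-demand list prices, $/vCPU/mo and $/GB RAM/mo
print("GCP vCPU:RAM price ratio: {:.2f}".format(gcp_cpu / gcp_ram))
GCP vCPU:RAM price ratio: 7.45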
Anyway, let’s use a basic linear regression to see what AWS’s per-vCPU and per-GB RAM prices are:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
aws_ratecard = pd.DataFrame.from_records([
# We pick instances of roughly comparable size from each family so that the
# linear regression weights the families evenly.
['m5.12xlarge', 48, 173, 192, 2.304],
['c5.18xlarge', 72, 278, 144, 3.06],
['m4.16xlarge', 64, 188, 256, 3.20],
['c4.8xlarge', 36, 132, 60, 1.591],
['r4.16xlarge', 64, 195, 488, 4.256],
['x1.16xlarge', 64, 174.5, 976, 6.669],
['x1e.16xlarge', 64, 179, 1952, 13.344],
], columns=['instance type', 'vCPU', 'ECU', 'GB RAM', '$/hr'], index='instance type')
def pricing_regression(ratecard, cpu_unit='vCPU', print_result=True):
    # Fit $/hr as a linear combination of CPU units and GB of RAM (no intercept),
    # then convert the hourly coefficients to monthly prices at 730 hours/month.
    per_unit_costs = LinearRegression(fit_intercept=False)
    per_unit_costs.fit(ratecard[[cpu_unit, 'GB RAM']], ratecard['$/hr'])
    monthly_costs = per_unit_costs.coef_ * 730
    r2 = per_unit_costs.score(ratecard[[cpu_unit, 'GB RAM']], ratecard['$/hr'])
    cpu, ram = monthly_costs
    if print_result:
        print("{}: ${:.02f}/mo, RAM: ${:.02f}/GB/mo, ratio: {:.02f} R²={:.06f}".format(
            cpu_unit, cpu, ram, cpu/ram, r2))
    else:
        return monthly_costs[0], monthly_costs[1], r2
pricing_regression(aws_ratecard)
vCPU: $19.10/mo, RAM: $4.23/GB/mo, ratio: 4.51 R²=0.991502
With an R² better than 0.99, our model does a pretty good job of predicting AWS prices, even though we haven’t included local SSD or network throughput as features. At $4.23/GB/mo, RAM in AWS is, on average, ~30% more expensive than it is in GCP.
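That 30% figure is just the ratio of the fitted AWS per-GB price to GCP’s on-demand per-GB price from the table at the top; a quick check:
print("AWS RAM premium over GCP: {:.0%}".format(4.23 / 3.25 - 1))
AWS RAM premium over GCP: 30%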
If we look at AWS’s previous generation, though, the story gets a bit different:
pricing_regression(aws_ratecard.loc[['m4.16xlarge', 'c4.8xlarge', 'r4.16xlarge']])
vCPU: $25.54/mo, RAM: $2.97/GB/mo, ratio: 8.60 R²=0.995799
A ratio of 8.6 is much closer to GCP’s 7.45, and the model continues to hold up. What if we exclude the (somewhat exotic) X1 instance types?
pricing_regression(aws_ratecard.loc[['m4.16xlarge', 'c4.8xlarge', 'r4.16xlarge', 'c5.18xlarge', 'm5.12xlarge']])
vCPU: $24.81/mo, RAM: $3.04/GB/mo, ratio: 8.17 R²=0.991503
Hm, not much different from the previous-generation numbers. So what we can conclude here is that the X1 instance types give you a ton of RAM, but it doesn’t come cheap.
Of course, if your workload is strictly RAM-limited, then you probably want to be using the instance type with the most GB RAM per vCPU, which would be the X1e types.
x1e = aws_ratecard.loc['x1e.16xlarge']
print("x1e.16xlarge: ${:.02f}/GB/mo".format(x1e['$/hr'] * 730 / x1e['GB RAM']))
x1e.16xlarge: $4.99/GB/mo
So if your workload is truly RAM-limited, your RAM costs are going to be higher than the norm. That makes some intuitive sense, because you’re probably stranding other resources like vCPUs, network capacity, or power. For comparison, GCP’s Extended Memory price, which kicks in on Custom Machine Types past 6.5 GB per vCPU, is $0.009550/GB/hr, or $6.97/GB/mo. So even though AWS’s average per-GB RAM price is ~30% higher than GCP’s, the incremental price for RAM on AWS is actually roughly 30% cheaper than GCP’s.
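Here’s that comparison worked out with the numbers above (the GCP extended-memory rate is the list price quoted in the previous paragraph):
gcp_extended_ram = 0.009550 * 730                # GCP extended memory price, $/GB/mo
x1e_ram = x1e['$/hr'] * 730 / x1e['GB RAM']      # x1e.16xlarge all-in price, $/GB/mo
print("GCP extended memory: ${:.02f}/GB/mo, x1e.16xlarge: ${:.02f}/GB/mo, {:.0%} cheaper".format(
    gcp_extended_ram, x1e_ram, 1 - x1e_ram / gcp_extended_ram))
GCP extended memory: $6.97/GB/mo, x1e.16xlarge: $4.99/GB/mo, 28% cheaper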
Just for fun, what if we do the same regression as we did at the beginning, but normalize the compute power using ECU, instead of vCPU?
pricing_regression(aws_ratecard, cpu_unit='vCPU')
pricing_regression(aws_ratecard, cpu_unit='ECU')
vCPU: $19.10/mo, RAM: $4.23/GB/mo, ratio: 4.51 R²=0.991502
ECU: $5.53/mo, RAM: $4.38/GB/mo, ratio: 1.26 R²=0.921549
Odd! I would’ve expected the R² to go up when using the normalized CPU measure (ECU), but instead the model actually got worse. Regardless, RAM still looks comparatively expensive on AWS.
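For what it’s worth, the two CPU measures do diverge quite a bit across these instance types, so the two regressions are working from noticeably different inputs:
ecu_per_vcpu = aws_ratecard['ECU'] / aws_ratecard['vCPU']
print("ECU per vCPU ranges from {:.2f} ({}) to {:.2f} ({})".format(
    ecu_per_vcpu.min(), ecu_per_vcpu.idxmin(), ecu_per_vcpu.max(), ecu_per_vcpu.idxmax()))
ECU per vCPU ranges from 2.73 (x1.16xlarge) to 3.86 (c5.18xlarge)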
Since we have all this infrastructure built, let’s take a peek at Azure to see how it compares. Azure is a bit harder to model, given its variety of instance types and the dramatic price differences between CPU generations, but if we stick to just the newer Broadwell-based instances, we can get a pretty clear result:
azure_ratecard = pd.DataFrame.from_records([
# ['A2 v2', 2, 4, 20, 0.091]
['D2 v3', 2, 8, 50, 0.096, 'Xeon® E5-2673 v4', 'Broadwell'],
['D1 v1', 2, 7, 100, 0.146, 'Xeon® E5-2673 v3', 'Haswell'],
['E2 v3', 2, 16, 50, 0.133, 'Xeon® E5-2673 v4', 'Broadwell'],
['F2', 2, 4, 32, 0.10, 'Xeon® E5-2673 v3', 'Haswell'],
], columns=['instance type', 'vCPU', 'GB RAM', 'temp storage', '$/hr', 'CPU', 'Generation'], index='instance type')
pricing_regression(azure_ratecard[azure_ratecard['Generation'] == 'Broadwell'])
vCPU: $21.53/mo, RAM: $3.38/GB/mo, ratio: 6.38 R²=1.000000
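With only two Broadwell rows and two coefficients (and no intercept), the fit is an exactly determined 2×2 linear system, which is why R² comes out as exactly 1. A quick sanity check, solving the same system directly:
# Two instances, two unknowns: 2*cpu + 8*ram = 0.096 (D2 v3), 2*cpu + 16*ram = 0.133 (E2 v3).
cpu_hr, ram_hr = np.linalg.solve(np.array([[2.0, 8.0], [2.0, 16.0]]),
                                 np.array([0.096, 0.133]))
# Converting to monthly prices should reproduce the ~$21.53/vCPU and ~$3.38/GB figures above.
print("vCPU: ${:.2f}/mo, RAM: ${:.2f}/GB/mo".format(cpu_hr * 730, ram_hr * 730))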
If we include the previous-generation Haswell instances (the D1 v1 and the compute-optimized F2), we get a different picture:
pricing_regression(azure_ratecard)
vCPU: $36.78/mo, RAM: $1.50/GB/mo, ratio: 24.53 R²=0.183072
With an R² of just 0.18, something is clearly wrong here. Let’s dig in:
# Compare each Azure instance type's actual price against what the fitted model predicts.
per_unit_costs = LinearRegression(fit_intercept=False)
per_unit_costs.fit(azure_ratecard[['vCPU', 'GB RAM']], azure_ratecard['$/hr'])
df = azure_ratecard.copy()
df['predicted'] = per_unit_costs.predict(azure_ratecard[['vCPU', 'GB RAM']])
df['error'] = (df['predicted'] - df['$/hr']).abs() / df['$/hr']
df = df[['vCPU', 'GB RAM', '$/hr', 'predicted', 'error']]
df.style.format({'error': "{:.2%}"})
instance type | vCPU | GB RAM | $/hr | predicted | error |
---|---|---|---|---|---|
D2 v3 | 2 | 8 | 0.096 | 0.11721 | 22.09% |
D1 v1 | 2 | 7 | 0.146 | 0.115156 | 21.13% |
E2 v3 | 2 | 16 | 0.133 | 0.133641 | 0.48% |
F2 | 2 | 4 | 0.1 | 0.108994 | 8.99% |
It looks like the D1 v1 instance type is really throwing off the regression: it has essentially the same shape as the D2 v3 but costs about 50% more. What if we drop it?
pricing_regression(azure_ratecard.drop(['D1 v1']))
vCPU: $29.75/mo, RAM: $2.20/GB/mo, ratio: 13.50 R²=0.824604
That’s more like it! So, to sum up, we have:
gcp_cpu, gcp_ram = 24.22, 3.25
aws_cpu, aws_ram, _ = pricing_regression(aws_ratecard, print_result=False)
azure_cpu, azure_ram, _ = pricing_regression(azure_ratecard.drop(['D1 v1']), print_result=False)
df = pd.DataFrame([
[gcp_cpu, gcp_ram, gcp_cpu / gcp_ram],
[aws_cpu, aws_ram, aws_cpu / aws_ram],
[azure_cpu, azure_ram, azure_cpu / azure_ram]
], index=['GCP', 'AWS', 'Azure'], columns=['$/vCPU/month', '$/GB RAM/month', 'ratio'])
df.style.format({'$/vCPU/month': "${:.2f}", '$/GB RAM/month': "${:.2f}", 'ratio': "{:.2f}"})
 | $/vCPU/month | $/GB RAM/month | ratio |
---|---|---|---|
GCP | $24.22 | $3.25 | 7.45 |
AWS | $19.10 | $4.23 | 4.51 |
Azure | $29.75 | $2.20 | 13.50 |
Even though AWS bundles CPU and RAM into fixed instance types, its pricing reveals underlying unit prices for these resources that are quite consistent across instance types, even the “extra memory” X1 types. On average, RAM in AWS is about 30% more expensive than in GCP, while Azure charges almost 50% less for RAM relative to CPU than GCP does.
This means that if your workload skews toward the high end of GB per vCPU, but doesn’t exceed 8 GB per vCPU, Azure might be cheaper for you. At the other end, if your workload skews compute-heavy (but not below 2 GB per vCPU, where you’d be stranding RAM), you might be better off with AWS’s lower per-vCPU price. GCP sits in between the two, but with its low minimum of 0.9 GB per vCPU on Custom Machine Types, it wins for truly compute-constrained workloads by wasting less RAM.
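To make that concrete, here’s a rough sketch of the monthly cost of one vCPU plus its share of RAM at a few GB-per-vCPU ratios, using the unit prices from the summary table above. It ignores sustained-use and committed-use discounts, minimum instance shapes, and regional price differences, so treat it as an illustration rather than a quote:
# Unit prices ($/vCPU/mo, $/GB RAM/mo) from the summary table above.
unit_prices = {'GCP': (24.22, 3.25), 'AWS': (19.10, 4.23), 'Azure': (29.75, 2.20)}
for gb_per_vcpu in [2, 4, 8]:
    costs = {cloud: cpu + gb_per_vcpu * ram for cloud, (cpu, ram) in unit_prices.items()}
    print("{} GB/vCPU: ".format(gb_per_vcpu) +
          ", ".join("{} ${:.2f}".format(c, costs[c]) for c in ['GCP', 'AWS', 'Azure']) +
          " -> cheapest: {}".format(min(costs, key=costs.get)))
2 GB/vCPU: GCP $30.72, AWS $27.56, Azure $34.15 -> cheapest: AWS
4 GB/vCPU: GCP $37.22, AWS $36.02, Azure $38.55 -> cheapest: AWS
8 GB/vCPU: GCP $50.22, AWS $52.94, Azure $47.35 -> cheapest: Azure
Which lines up with the conclusion: AWS’s cheap vCPUs win at the compute-heavy end, Azure’s cheap RAM wins at the memory-heavy end, and GCP wins when its flexible shapes let you avoid paying for resources you’d otherwise strand.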