In my time as a freelancer I’ve come across a number of clients using T2 instances for their infrastructure requirements.
In my experience, these instances often seem to be chosen based largely on their low price compared to other instance types and are often poorly understood.
While T2 instances can offer great value, they come with a number of advantages and disadvantages that must be considered (and understood) before choosing them for your infrastructure.
Let’s examine what T2 instances are.
What Are T2 Instances?
EC2 T2 instances are CPU “burstable” virtual machine instances offered by AWS. Unlike most other instance types, which provide a fixed level of CPU performance, a T2 instance runs at a baseline level of CPU performance and can burst above it by spending CPU credits.
The baseline CPU performance, maximum amount of CPU credits that can be earned, and the rate at which these CPU credits are earned are all based on the size of the T2 instance in question.
In the following sections, we’ll use a t2.micro as an example to illustrate these important concepts.
What Is a CPU Credit?
A (single) CPU credit allows 1 vCPU to operate at 100% usage for 1 minute.
It’s important to remember this for some of the math in the following sections.
When Are CPU Credits Used?
Anytime your instance uses any amount of CPU, for any reason, CPU credits will be used.
Yes, this means even if the CPU usage is below the baseline performance rate, you will use CPU credits.
To understand this (already confusing) subject, let’s look at how AWS calculates the use and earning of CPU credits.
How Is CPU Credit Usage Calculated?
Remember from above that a single CPU Credit allows 1 vCPU to run at 100% usage for one minute.
This means that 10% CPU usage (the baseline performance of a t2.micro instance) for 1 minute would use 1/10th (0.1) of a CPU credit. Over an hour that adds up to 6 credits (the number of credits a micro earns each hour).
20% CPU usage for a minute would be 1/5th (0.2) of a CPU credit, and so forth.
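The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the rule "one credit = one vCPU at 100% for one minute"; the function name `credits_used` and its parameters are my own, not anything AWS provides.

```python
def credits_used(cpu_percent: float, minutes: float, vcpus: int = 1) -> float:
    """CPU credits spent when `vcpus` vCPUs run at `cpu_percent` for `minutes`.

    Hypothetical helper: it only encodes the rule that one credit equals
    one vCPU at 100% usage for one minute.
    """
    return cpu_percent * minutes * vcpus / 100


print(credits_used(10, 1))   # 0.1 -> one minute at the t2.micro baseline
print(credits_used(10, 60))  # 6.0 -> a full hour at the baseline
print(credits_used(20, 1))   # 0.2 -> one minute at 20% usage
```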
How Are CPU Credits Accrued?
CPU credits are ALWAYS earned while your instance is running, but your credit balance only grows when your T2 instance is using less CPU than its baseline performance.
Continuing our t2.micro example, assuming your server uses no CPU for an hour (An unlikely scenario but this is an example), you would earn 6 CPU credits.
You could continue to earn these credits until you held 144, the maximum amount that can be held for the micro instance type.
The 6 CPU credits earned every hour allow the t2.micro instance to maintain its baseline CPU performance indefinitely.
0.1 (10% baseline CPU performance) * 60 (minutes in an hour) = 6 (The number of CPU credits earned in an hour)
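Here is the same relationship as a small sketch, using the t2.micro figures from this article (10% baseline, 144 credit maximum). The constant and function names are made up for illustration. It also shows how long a completely idle instance would take to fill an empty balance.

```python
BASELINE_PERCENT = 10  # t2.micro baseline, as stated in this article
MAX_BALANCE = 144      # t2.micro maximum credit balance, as stated in this article


def credits_earned_per_hour(baseline_percent: float) -> float:
    """Credits earned in one hour: the baseline fraction times 60 minutes."""
    return baseline_percent * 60 / 100


earned = credits_earned_per_hour(BASELINE_PERCENT)
# Hours an idle instance needs to go from 0 credits to the maximum.
hours_to_fill = MAX_BALANCE / earned

print(earned, hours_to_fill)  # 6.0 24.0
```

In other words, with these numbers the maximum balance works out to one full day of earned credits.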
Let’s finally look at a basic calculation of the usage and earning over a multi-hour period.
An Example Calculation
Let’s do a basic illustration that can show both the disadvantages and advantages of this instance type.
Note: This is not looking at other metrics of server performance. We are simply looking at how CPU credits are used and calculated.
For this example, assume the following:
- The t2.micro instance has 138 CPU credits accrued.
- The CPU usage will be consistent within each hour. This does not happen in reality but does help us illustrate the concepts easily.
- You have a t2.micro instance hosting a really awesome website about cats.
In the first hour your micro instance has no traffic at all and uses 0% of the CPU for the entire hour (maybe the server was taking a nap?). Since no CPU time was used, you keep all 6 CPU credits you earned.
138 (CPU credits you already accrued) + 6 (The credits you earned over the hour) = 144 CPU Credits Accrued
You now have the maximum amount of CPU credits you can accrue for a t2.micro so even if you continue to have CPU usage below the baseline performance, you’ll no longer earn credits above this amount.
In the second hour your micro instance maintains 10% CPU usage (the baseline performance). Since it earns enough CPU credits to maintain this rate, no credits are burned from the 144 you already have and no credits are accrued (because you used the same amount of credits that you earned and because you already have the maximum allowable amount).
In the third hour you get featured on a popular site and you’re getting hammered with traffic, causing 100% CPU usage for the entire hour (Yay! Traffic!).
1.0 (100% CPU usage) * 60 (number of minutes in the hour) = 60 credits used
144 (Accrued CPU credits) – 60 = 84 credits remaining
But wait, there’s more!
Since you always earn credits, you would actually have more than 84.
84 (Number of remaining CPU credits) + 6 (The amount you earn in an hour) = 90 credits remaining
Assuming there were no other issues with the server and that everything was fine, your website likely continued to work without issue.
In hour 4, everyone has had their fill of your really awesome cat website and your server sits largely unused again (who can get enough cat pictures?!). The instance uses 5% of the CPU for the entire hour.
0.05 (5% CPU usage) * 60 = 3 (Credits used that hour)
90 (Accrued credits) + 6 (The number of credits you accrue an hour) – 3 (The number of credits you used that hour) = 93 credits remaining
Since 5% CPU usage is less than the 10% baseline CPU performance, the server accrues a net 3 CPU credits.
In hour 5, it turns out I was right and people really can’t get enough cat pictures. Your website experiences an increase in traffic, causing 100% CPU usage for the entire hour.
1.0 (100% CPU usage) * 60 (Minutes in an hour) = 60 CPU credits used
93 (Accrued credits) – 60 (CPU credits used) + 6 (The CPU credits you earn in an hour) = 39 CPU credits remaining
As we can see above, our credits are really starting to dry up but thankfully the server is still holding up.
Better hope all that traffic dies down.
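If you’d like to replay the first five hours yourself, here is a rough sketch. It assumes the t2.micro numbers used in this article and the same simplification as the example (constant CPU usage within each hour). The names `next_balance` and `simulate` are hypothetical, and stepping a whole hour at a time says nothing about when within an hour the balance would run out.

```python
BASELINE_PERCENT = 10    # t2.micro baseline from this article
MAX_BALANCE = 144.0      # t2.micro maximum credit balance from this article
MINUTES_PER_HOUR = 60


def next_balance(balance: float, cpu_percent: float) -> float:
    """Credit balance after one hour at a constant `cpu_percent`."""
    earned = BASELINE_PERCENT * MINUTES_PER_HOUR / 100
    used = cpu_percent * MINUTES_PER_HOUR / 100
    # The balance can never exceed the maximum or drop below zero.
    return max(0.0, min(MAX_BALANCE, balance + earned - used))


def simulate(start_balance: float, hourly_cpu_percent: list) -> list:
    """Return the balance at the end of each hour."""
    balances = []
    balance = start_balance
    for cpu_percent in hourly_cpu_percent:
        balance = next_balance(balance, cpu_percent)
        balances.append(balance)
    return balances


# Hours 1 through 5 of the cat website example.
print(simulate(138, [0, 10, 100, 5, 100]))
# [144.0, 144.0, 90.0, 93.0, 39.0]
```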
Hour 6 (uh oh!)
Unfortunately, people REALLY love those cat pictures and the traffic remains at the same level. Your instance continues to utilize 100% of its CPU for hour 6.
1.0 (100% CPU usage) * 60 (Minutes in an hour) = 60 CPU credits used
39 (Accrued credits) – 60 (CPU credits used) = OH NO! We don’t have any credits left!
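About 43 minutes into Hour 6 your instance no longer has any credits remaining (you burn 1 credit per minute but still earn 0.1, so the 39 credits drain at a net 0.9 credits per minute). At this point, you’re limited to the baseline performance of the instance, in this case 10% of the vCPU.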
At this point your application grinds to a halt. Because it’s under such heavy load, the instance uses credits as fast as it earns them, resulting in almost complete downtime until the traffic drops off (since your website is no longer working properly, that’s going to happen pretty quickly).
You will likely find it is difficult or impossible to access the server to take any measures to solve the issue until the CPU credits have accrued.
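To estimate when the balance runs out during hour 6, you can divide the remaining credits by the net burn rate per minute. This is a back-of-the-envelope sketch that assumes constant CPU usage and the 10% t2.micro baseline; `minutes_until_empty` is a name I made up for illustration.

```python
def minutes_until_empty(balance: float, cpu_percent: float,
                        baseline_percent: float = 10):
    """Minutes until the credit balance reaches zero at constant usage.

    Returns None when usage is at or below the baseline, because the
    balance does not drain in that case.
    """
    # Credits are spent at cpu_percent/100 per minute and earned at
    # baseline_percent/100 per minute; the difference is the net burn.
    net_burn_per_minute = (cpu_percent - baseline_percent) / 100
    if net_burn_per_minute <= 0:
        return None
    return balance / net_burn_per_minute


print(round(minutes_until_empty(39, 100), 1))  # 43.3
```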
So Why Is All This Important?
The example above is contrived, but it does illustrate the danger of not properly architecting your AWS system and monitoring it once it’s in place (especially if you’re using T2 instances).
I often see this issue with a website that receives increases in the baseline traffic over long periods of time with occasional spikes. If the instance type is not changed and sized appropriately, this eventually leads to all the accrued CPU credits being used up, followed by downtime and a significant amount of head scratching.
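As one way to catch this slow drain before it bites, you could project when the balance will hit zero from a handful of recent readings. The sketch below assumes you already have evenly spaced credit balance samples (for example, from the CPUCreditBalance metric that CloudWatch reports for T2 instances). The function name and the simple straight-line projection are my own assumptions, not an AWS feature.

```python
def hours_until_depleted(samples: list, interval_hours: float = 1.0):
    """Project hours until the balance hits zero from evenly spaced samples.

    `samples` is ordered oldest to newest. Returns None when there is not
    enough data or the balance is flat or growing.
    """
    if len(samples) < 2:
        return None
    elapsed_hours = (len(samples) - 1) * interval_hours
    # Average credits lost per hour across the sampled window.
    drain_per_hour = (samples[0] - samples[-1]) / elapsed_hours
    if drain_per_hour <= 0:
        return None
    return samples[-1] / drain_per_hour


# A t2.micro sitting at a steady 15% CPU loses a net 3 credits per hour.
print(hours_until_depleted([144, 141, 138, 135]))  # 45.0
```

A projection like this is crude, since real traffic is spiky, but even a rough number can tell you whether you have days or hours before the instance gets throttled.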
A Note On T2 Unlimited
Amazon (somewhat) recently added a new feature to T2 instances called T2 Unlimited.
T2 Unlimited instances can burst above baseline performance for as long as required, at an additional cost.
I’ll likely be writing an article explaining T2 Unlimited concepts in the near future so stay tuned!
Thanks go to rosege over at Hackernews for the suggestion!
Hopefully you walk away with a slightly better understanding of T2 instances and how using them can affect your infrastructure if they are not properly managed.
If you have any questions or comments, please don’t hesitate to mention them below. Thanks for reading!