Amazon Machine Images (AWS AMI) offers two types of virtualization: Paravirtual (PV) and Hardware Virtual Machine (HVM). Each solution offers its own advantages.
When we’re using AWS, it’s easy for someone — almost without thinking — to choose which AMI flavor seems best when spinning up a new EC2 instance. Maybe you’re just doing some quick testing, or maybe you know all you need and your AMI has a relatively recent version of Microsoft SQL Server on it.
However, when you dig a little deeper, you’ll see that the AMIs offer a choice of virtualization type: PV and HVM. What is this and how much do you really need to be concerned with it?
In this article, we’ll cover the basics about AWS AMI Paravirtual and Hardware Virtual Machine. To dive even deeper and learn how to create a customized OS image through an Amazon Machine Image (AMI), check out Cloud Academy’s Create an EBS-Backed Linux AMI Lab. During this lab, you’ll setup a webserver EC2 instance starting from a Linux AMI, and then generate a new AMI.
What’s the difference between PV and HVM?
Choosing an AWS AMI virtualization type may not seem critical or relevant at first, but I believe everyone should have at least a basic understanding of how the different virtualization options function.
How many times have you actually thought about which kind of virtualization is best suited to your needs before you select your AWS AMI? Or better: how often have you thought about it, but ignored it and just started working anyway? When you select an AWS AMI to launch an instance you will see something like this:
What are these highlighted terms all about? I’ll explain.
The AWS AMI and the Xen hypervisor
Every AWS AMI uses the Xen hypervisor on bare metal. Xen offers two kinds of virtualization: HVM (Hardware Virtual Machine) and PV (Paravirtualization). But before we discuss these virtualization capabilities, it’s important to understand how Xen architecture works. Below is a high-level representation of Xen components:
Virtual machines (also known as guests) run on top of a hypervisor. The hypervisor takes care of CPU scheduling and memory partitioning, but it is unaware of networking, external storage devices, video, or any other common I/O functions found on a computing system.
These guest VMs can be either HVM or PV.
The AWS AMI and HVM vs. PV
HVM guests are fully virtualized. It means that the VMs running on top of their hypervisors are not aware that they are sharing processing time with other clients on the same hardware. The host should have the capability to emulate underlying hardware for each of its guest machines. This virtualization type provides the ability to run an operating system directly on top of a virtual machine without any modification — as if it were run on the bare-metal hardware. The advantage of this is that HVMs can use hardware extensions which provide very fast access to underlying hardware on the host system.
Paravirtualization, on the other hand, is a lighter form of virtualization. This technique is fast and provides near native speed in comparison to full virtualization. With Paravirtualization, the guest operating system requires some modification before everything can work. These modifications allow the hypervisor to export a modified version of the underlying hardware to the VMs, allowing them near-native performance. All PV machines running on a hypervisor are basically modified operating systems like Solaris or various Linux distributions.
This is in contrast to HVM, which requires no modifications to the guest OS, and the host OS is completely unaware of the virtualization. This may add to the performance penalty because it places an extra burden on the hypervisor.
Let’s extend this discussion to the AWS AMI. AWS supports Hardware Virtual Machine (HVM) for Windows instances as well as Paravirtualization (PV) for Linux instances. Years ago, AWS would encourage users to use Paravirtualized guest VMs, because they were then considered more efficient than HVM. We’ll talk later about how this has changed, but it’s useful to know the history and the strengths of each type of virtualization.
With that in mind, it’s helpful to take note that there is one major disadvantage with Paravirtualization: You need a region-specific kernel object for each Linux instance. Consider a scenario where you want to recover or build an instance in some other AWS region. In that scenario, you need to find a matching kernel — which can be tedious and complex. Nevertheless, I can’t say that this is the only reason that Amazon now recommends using the HVM virtualization versions of the latest generation of their instances: There are a number of additional recent enhancements in HVM virtualization which have improved its performance greatly.
Here are some key factors that contributed to Hardware Virtual Machine’s closing the performance gap with Paravirtualization:
- Improvements to the Xen Hypervisor.
- Newer generation CPUs with new instruction sets.
- EC2 driver improvements.
- Overall infrastructure changes in AWS.
Consider upgrading if you are using an older instance type.
PV vs HVM choices used to require more research
This table shows which AWS AMI (Amazon Linux) is recommended for each Amazon EC2 instance type:
Amazon currently recommends users to choose HVM instead of PV. Ignoring their advice can have very real consequences. For example, in the AWS Frankfurt region, if you try to select an AWS AMI (Amazon Linux) using PV, you will be greatly restricted in your choice of instance types:
As you can see, the cheapest instance type you can select here is m3.medium. But going with the Amazon Linux AMI on HVM, the cheapest instance type available to you is t2.micro.
As time has shown, it now works this way in all AWS regions, and this should serve to make you aware about the relevance of virtualization type — which we ignore at our own peril.
The PV vs HVM debate is much clearer today
As we’ve seen above, the main difference between PV and HVM AMIs is the way in which they relate to the underlying hardware. However, with the current (as of July 2019) EC2 offerings, HVMs are no longer at a performance disadvantage compared to PV. HVMs can run PV drivers and the correlating EC2 instances have improved such that HVM-based AMIs are faster than PV-based AMIs.
HVM AMIs rule the roost for Windows-based AMIs.
One distinction though: HVM AMIs rule the roost for Windows-based AMIs. PV AMIs are still available for Linux but the same debate carries through — all the newest Linux EC2 instances offer types that run HVM and can be faster, due to specialized hardware access such as enhanced network and GPU access.
Traditionally, Paravirtualized guests performed better with storage and network operations than HVM guests, because they could avoid the overhead of emulating network and disk hardware. This is no longer the case with HVM guests. They must translate these instructions (I/O) every time to effectively emulated hardware. Things have also improved since the introduction of PV drivers for HVM guest. HVM guests will also experience performance advantages in storage and network I/O.
Because Amazon has changed their approach toward the AWS AMI, we have no choice but to address this topic. A few years ago we saw the writing on the wall: It looked like HVM types would completely replace PV types. While that has not completely and utterly happened, it has — in effect — since HVM types are most prevalent among cheap and new instances. So that is why it’s critical that you make informed decisions today.
If you have experience with either AWS AMI instance, share your thoughts in the comments below.