In late 2016, Amazon Web Services released enhanced networking features for EC2 that allow some instance types to achieve networking speeds of up to 20 gigabits per second (Gbps). For the many applications whose performance is limited by network I/O, enhanced networking uses single root I/O virtualization (SR-IOV) to give instances more direct access to the physical network interface. The result is greater bandwidth, lower latency, and reduced jitter, at no additional cost.

This sounded like a quick and easy win, and we were interested in exploring these enhanced networking features for our Docker hosts, NAT boxes, and other instances. This post describes how we built enhanced networking–ready Amazon Machine Images (AMIs) using HashiCorp Packer, and then how we measured the performance impact.

Building enhanced networking AMIs with Packer

As of this writing, there are two types of enhanced networking available on AWS: the Intel 82599 Virtual Function Interface (VFI) and the Elastic Network Adapter (ENA). The type available—and maximum bandwidth—depends on the instance type, image attributes, kernel version, hypervisor type, and OS distribution. The instances must also be running inside a virtual private cloud (VPC). It’s a good idea to consult Amazon’s official documentation for the latest supported instances and specific requirements, but this feature is generally available only on larger, more powerful instance types.
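
One quick way to see whether these features are already flagged on an existing instance is the AWS CLI; here's a sketch, with a placeholder instance ID:

# Check whether the ENA attribute is set on an instance
$ aws ec2 describe-instance-attribute --instance-id i-0123456789abcdef0 --attribute enaSupport

# Check whether the Intel 82599 VF attribute (sriovNetSupport) is set
$ aws ec2 describe-instance-attribute --instance-id i-0123456789abcdef0 --attribute sriovNetSupport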

To support enhanced networking for the greatest number of instance types, we started with the latest CentOS 7 AMI (7.3 x86_64 with HVM virtualization and EBS storage) and created two new images: one with ENA support and another with Intel VFI support by following the official AWS installation instructions. If you’re using the latest version of AWS Linux on the correct instance type, enhanced networking may already be enabled.
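
The full procedure lives in the AWS documentation, but the ENA flow on CentOS 7 looked roughly like the sketch below (the driver source is Amazon's public amzn-drivers repo; the instance ID is a placeholder, and the instance must be stopped before changing the attribute). The VFI image follows the same pattern, building the ixgbevf module and setting --sriov-net-support simple instead:

# Build and install the ENA kernel module on the build instance
$ sudo yum install -y gcc kernel-devel-$(uname -r) git
$ git clone https://github.com/amzn/amzn-drivers.git
$ cd amzn-drivers/kernel/linux/ena && make
$ sudo cp ena.ko /lib/modules/$(uname -r)/ && sudo depmod
$ echo "ena" | sudo tee /etc/modules-load.d/ena.conf

# From a host with AWS credentials, flag the stopped instance as ENA-capable
$ aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --ena-support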

With the two new CentOS-based AMIs in hand, we added new builders to New Relic's main Packer template. This allows the AMIs to be easily used with our configuration management tools to launch new enhanced networking–ready CentOS hosts. Here's what the ENA Packer builder looks like, following our internal conventions for naming and tagging:


{
   "ssh_pty": "true",
   "name": "ebs-ena",
   "type": "amazon-ebs",
   "region": "us-east-1",
   "source_ami": "ami-id-created-from-centos2-with-ena",
   "subnet_id": "subnet-id-example",
   "instance_type": "r4.large",
   "ssh_username": "centos",
   "ssh_timeout": "10m",
   "ami_name": "packer {{user `box_name`}}_{{user `box_version`}}_{{user `cm_tool`}}{{timestamp}}",
   "ami_description": "Packer built {{user `box_name`}}_{{user `box_version`}}_{{user `cm_tool`}} EBS VFI ENA",
   "security_group_id": "sg-id-example",
   "associate_public_ip_address": "true",
   "launch_block_device_mappings": [
      {
         "device_name": "/dev/sda1",
         "volume_size": 20,
         "delete_on_termination": true,
         "volume_type": "gp2"
      }
   ],
   "run_tags": {
      "Name": "packer-ami-builder",
      "environment": "staging",
      "department": "product",
      "product": "packer",
      "project": "packer",
      "owning_team": "site_engineering"
   }
}

From here, it’s as easy as executing a new Packer build to produce ENA- or VFI-ready AMIs (the ebs-vfi builder is defined analogously to the ebs-ena builder shown above):

$ packer build -var 'box_name=nr-c7-vfi' -var 'box_version=1.0.201701.001' -var 'cm_tool=puppet' --only=ebs-vfi template.json
$ packer build -var 'box_name=nr-c7-ena' -var 'box_version=1.0.201701.001' -var 'cm_tool=puppet' --only=ebs-ena template.json

After each build completes, we have a new AMI from which we can launch EC2 instances with enhanced networking enabled (optionally configured using Puppet).
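
Note that the user variables passed with -var must also be declared at the top level of template.json; a minimal sketch of that block, with illustrative defaults:

"variables": {
   "box_name": "nr-c7-ena",
   "box_version": "1.0.0",
   "cm_tool": "puppet"
}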

Verifying an instance is ready for enhanced networking

With the New Relic Infrastructure agent running on the EC2 hosts created from the Packer template, we're able to easily verify, using the agent's inventory feature, that a new image has the correct Linux kernel module enabled for enhanced networking. It's straightforward to check a single host from the command line by running modinfo ixgbevf, but here we're able to identify every host (out of more than one thousand running in our account) with the Intel 82599 kernel module, and the version installed, just by searching for the kernel module named ixgbevf:

Enhanced Networking Search example
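
For a spot check on a single host, the same information is available from the shell (eth0 below is a placeholder for whichever interface the instance uses):

# Confirm the Intel 82599 VF driver is present and note its version
$ modinfo ixgbevf

# On an ENA image, confirm the ena module instead
$ modinfo ena

# Verify which driver the active interface is actually bound to
$ ethtool -i eth0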

At this point, we had created enhanced networking–ready AMIs and confirmed they were configured correctly. Next, we wanted to measure the performance impact.

Monitoring EC2 performance of enhanced networking images

With new instances running in the same availability zone, we ran a few simple tests using the iperf3 tool to measure how network throughput was affected. Since we were interested specifically in TCP network throughput, we compared hosts with enhanced networking against hosts without enhanced networking enabled.
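
We first installed iperf3 on each test host; on CentOS 7 it typically comes from the EPEL repository (an assumption about your repository setup):

$ sudo yum install -y epel-release
$ sudo yum install -y iperf3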

On c4.8xlarge EC2 test hosts with and without enhanced networking, we started a server listening on port 80:

$ sudo iperf3 -s -p 80

Next, from another EC2 host (also with enhanced networking enabled), we started a five-minute throughput test against port 80 on the server:

$ sudo iperf3 -c ec2-host-with-vfi.ip-address -i 1 -t 300 -V -p 80

During the throughput tests, we immediately saw the number of transmitted bytes spike on one of the host network interfaces in the Infrastructure network dashboard:

AWS EC2 Monitoring Infrastructure Dashboard Screen

The iperf3 tool tested bandwidth by generating a large amount of network traffic between two hosts with enhanced networking enabled, as seen here in New Relic Infrastructure.

However, processing all of those packets comes at a cost: CPU usage peaked at around 12%, and iperf3 reported that after 5 minutes, with 316 gigabytes transferred, the average bandwidth was 9.04 Gbps.

On identically sized instances without enhanced networking enabled, the bandwidth results were markedly different: CPU usage peaked at around 22%, and with 154 gigabytes transferred, the bandwidth was nearly halved, at 4.40 Gbps.
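
As a sanity check, these rates line up with the transfer totals if we assume iperf3 reports the totals in gibibytes:

$ echo "316 * 2^30 * 8 / 300 / 10^9" | bc -l    # ≈ 9.05 Gbps with enhanced networking
$ echo "154 * 2^30 * 8 / 300 / 10^9" | bc -l    # ≈ 4.41 Gbps without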

AWS EC2 monitoring CPU usage dashboard

Hosts without enhanced networking on identical instance sizes saw greater CPU utilization.

Final thoughts

If you’re using, and already paying for, EC2 instances that can take advantage of the high-bandwidth networking features offered by AWS, we found that taking the time to verify and enable enhanced networking is well worth it. Because the installation and configuration process for EC2 hosts running non-AWS Linux operating systems is not trivial, automating it with pre-baked Packer images was especially useful in driving broader adoption.

This is not a feature that should be turned on without measurement, however. Understanding the relationship between CPU usage, EC2 instance type, and network bandwidth is critical in building high-performance network services or applications.


Roger Torrentsgenerós is a Senior Site Reliability Engineer at New Relic in Barcelona. He has deep experience building, designing, and operating a wide variety of compute architectures and studied engineering at Universitat Politècnica de Catalunya.

