Written by Kosta Barba.
Introduction
When integrating your corporate data centre with the AWS Cloud, you will need to choose the most appropriate connection for transferring data. There are two main methods to connect AWS with your on-premises network: AWS Direct Connect, which is a dedicated physical connection, and AWS Site-to-Site VPN, which is a virtual tunnel over the internet. Data can be encrypted in transit with both solutions, though Direct Connect does not encrypt traffic by default.
Direct Connect is the better option when your traffic needs more bandwidth than an AWS Site-to-Site VPN can provide, although it is more costly and time-consuming to set up, given that it is a dedicated optical link. It supports up to 100 Gbps Ethernet, whereas an AWS Site-to-Site VPN supports up to 1.25 Gbps per tunnel. This article covers when to use each type of Direct Connect connection for data transfers, including migrations.
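To put the bandwidth difference into perspective, here is a rough back-of-the-envelope sketch in Python. The 100 TB dataset is a hypothetical figure, and the calculation assumes the link is fully utilised with no protocol overhead, so real transfers will take longer.

```python
def transfer_time_hours(data_terabytes: float, link_gbps: float) -> float:
    """Estimate the hours needed to move data over a link at full line rate.

    Assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second)
    and ignores protocol overhead, so this is a best-case figure.
    """
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


# Hypothetical 100 TB migration over each connection type.
vpn_hours = transfer_time_hours(100, 1.25)  # one Site-to-Site VPN tunnel
dx_hours = transfer_time_hours(100, 100)    # 100 Gbps Direct Connect link
print(f"VPN: {vpn_hours:.1f} hours, Direct Connect: {dx_hours:.1f} hours")
```

Under these assumptions the VPN transfer takes a little over a week, while the 100 Gbps link finishes in a couple of hours.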
Transferring data to public AWS services
For transferring data to an AWS service that isn't in a VPC, such as S3, you should use a public virtual interface. Public virtual interfaces use public IP addresses to reach AWS public service endpoints. Consider S3 for storing static data for web applications, such as images and high quality videos, or for archiving data if you would like to free up storage space on-premises.
Other services with public endpoints, such as DynamoDB (a NoSQL database), are also reached over a public virtual interface, though relatively few data transfer targets fall into this category compared with services that run inside a VPC.
The diagram below shows how a typical Direct Connect setup to S3 would look.
Transferring data to services within a VPC
When you have terabytes of data in application, web or relational database servers that you would like to transfer to the cloud, you should set up a private virtual interface.
Private interfaces use private IP addresses to connect to your VPC where your resources reside, such as EC2 or RDS instances. For migrating databases from on-premises to AWS you can use AWS Database Migration Service (DMS). DMS is used to migrate relational and non-relational databases to a target location in a VPC. For example, you can migrate an on-premises SQL Server to RDS SQL Server.
For an application or web server migration you can use EC2 as your destination. AWS Application Migration Service can be used for rehost migrations of on-premises servers to EC2. A rehost migration moves an on-premises server to a cloud server, such as EC2, in the same “as-is” state it had on-premises. Below is a simple example of an on-premises connection to an EC2 instance in a private subnet.
Note that a Direct Connect Gateway is connected to the Virtual Private Gateway. A Virtual Private Gateway is an endpoint that acts like a router at the edge of the VPC and interfaces with your Direct Connect connection. The Direct Connect Gateway allows a single virtual interface from the Direct Connect location to support connections to multiple VPCs and regions.
Transferring data via Transit Gateway
Transit Gateway is used for more complex AWS environments with multiple VPCs in different regions and accounts. Imagine that you had five VPCs in various regions to connect to on-premises. You would need a Virtual Private Gateway for each VPC. Further, if you would like all of these VPCs to communicate with one another, you would need many VPC peering connections (point-to-point connections between VPCs) in a mesh topology. At this point setup and management become very complex. This is where the Transit Gateway comes in.
Transit Gateways provide a single connection from on-premises using a transit virtual interface (which reaches a transit gateway associated with a Direct Connect gateway). A transit gateway can connect all the VPCs in your AWS accounts, and the connections between the VPCs are managed for you within a region or cross-region.
It uses a hub-and-spoke topology, which has a central connection point that connects to each VPC and on-premises connection. The topology makes it simpler to set up and it’s very scalable as well when adding more VPCs, compared to multiple Virtual Private Gateways.
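The difference between the two topologies can be made concrete with a small counting sketch. It assumes every VPC must reach every other VPC, and that the hub is a single transit gateway with one attachment per VPC. The function names are illustrative only.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so that every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_attachments(vpc_count: int) -> int:
    """Attachments needed when each VPC connects once to a central hub."""
    return vpc_count


for vpcs in (5, 10, 20):
    print(
        f"{vpcs} VPCs: {full_mesh_peerings(vpcs)} peering connections "
        f"vs {hub_and_spoke_attachments(vpcs)} hub attachments"
    )
```

For the five VPCs in the earlier example, a full mesh needs 10 peering connections while the hub needs 5 attachments, and the gap widens quickly as VPCs are added.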
For larger scale data transfers and migrations of servers, databases and applications that need their own VPCs across regions and accounts, a Transit Gateway gives you one hub to configure instead of a separate gateway and set of peering connections for each VPC.
It can also be worth the cost in the long term compared with individual Virtual Private Gateway connections to the VPCs, although it can get costly depending on how much traffic is sent across it.
The image below shows a conventional Transit Gateway implementation for a single region.
Scalability
Additionally, you can scale up the bandwidth for your transfers if you have a very large amount of data that you would like to migrate more quickly than a single Direct Connect link allows. Direct Connect supports link aggregation, which combines multiple physical links into a single logical link. AWS documents a limit of four links per aggregation group for port speeds below 100 Gbps, and two links at 100 Gbps. For example, if you have four 10 Gigabit links, your maximum bandwidth would be 40 Gbps.
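The aggregate bandwidth in the example above is simply the sum of the member links. A minimal sketch follows, assuming all links in the group run at the same speed (which AWS requires for a link aggregation group). The function name is hypothetical.

```python
from typing import Sequence


def lag_bandwidth_gbps(link_speeds_gbps: Sequence[float]) -> float:
    """Return the combined bandwidth of the links in one aggregation group."""
    if not link_speeds_gbps:
        raise ValueError("a link aggregation group needs at least one link")
    if len(set(link_speeds_gbps)) != 1:
        raise ValueError("all links in the group must run at the same speed")
    return sum(link_speeds_gbps)


# Four 10 Gigabit links, as in the example above.
total_gbps = lag_bandwidth_gbps([10, 10, 10, 10])
print(f"Aggregate bandwidth: {total_gbps} Gbps")
```

Keep in mind that this is the total across all flows. Link aggregation typically places each individual flow on one member link, so a single stream may not exceed the speed of one link.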
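Jumbo frames are also supported on private and transit virtual interfaces. These are larger than regular frames (roughly 9000 bytes compared to 1500 bytes for standard frames; AWS documents a maximum MTU of 9001 for private virtual interfaces and 8500 for transit virtual interfaces). Fewer frames to transfer means less network processing overhead.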
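As a rough illustration of the saving, the sketch below counts how many frames are needed to carry a given payload with a 1500-byte frame versus a 9000-byte jumbo frame. The 40 bytes of header per frame (20 for IPv4 plus 20 for TCP) and the 1 GB payload are assumptions chosen for the example.

```python
def frames_needed(payload_bytes: int, mtu_bytes: int, header_bytes: int = 40) -> int:
    """Count the frames required to carry payload_bytes at a given MTU.

    header_bytes is the per-frame IP and TCP header overhead (assumed to be
    20 bytes of IPv4 plus 20 bytes of TCP, with no options).
    """
    data_per_frame = mtu_bytes - header_bytes
    # Integer ceiling division, so a partly filled last frame is counted.
    return (payload_bytes + data_per_frame - 1) // data_per_frame


payload = 1_000_000_000  # hypothetical 1 GB transfer
standard_frames = frames_needed(payload, 1500)
jumbo_frames = frames_needed(payload, 9000)
print(f"Standard frames: {standard_frames}, jumbo frames: {jumbo_frames}")
```

With these assumptions the jumbo frame transfer needs roughly six times fewer frames, which is where the reduction in per-frame processing comes from.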
Redundancy
Redundancy of the Direct Connect link eliminates a single point of failure in the connection between on-premises and AWS. Link aggregation, for example, provides redundancy. So if one of the aggregated links goes down, the other links will still be used. You can also use multiple routers on-premises and in the Direct Connect locations for increased fault tolerance, though you would have to implement this yourself from your on-premises/Direct Connect location, which increases the cost. AWS will have redundancy already set up from their side of the Direct Connect location and their Regional data centres. A less expensive option is to set up a backup AWS Site-to-Site VPN, to route traffic over the internet when the Direct Connect link goes down.
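The backup behaviour described above can be sketched as a simple decision. In practice this failover is handled by route preference on your routers and on the AWS side, not by application code, so the function below is only an illustration of the logic, and its names are hypothetical.

```python
def choose_path(direct_connect_links_up: int, vpn_available: bool) -> str:
    """Pick the path traffic should take between on-premises and AWS."""
    if direct_connect_links_up > 0:
        # Any surviving link keeps traffic on the dedicated connection,
        # for example after one member of an aggregated group fails.
        return "direct-connect"
    if vpn_available:
        # Every Direct Connect link is down, so fall back to the internet.
        return "site-to-site-vpn"
    return "no-connectivity"


print(choose_path(direct_connect_links_up=3, vpn_available=True))
print(choose_path(direct_connect_links_up=0, vpn_available=True))
```

The first call models one of four aggregated links failing, where traffic stays on Direct Connect at reduced capacity. The second models a full Direct Connect outage with the backup VPN taking over.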
Security
Direct Connect doesn’t provide built-in end-to-end encryption. However, you can run an AWS Site-to-Site VPN with IPsec over your Direct Connect link if you require encryption in transit. Additionally, TLS is utilised by most AWS services and other web-based technologies to provide encryption in transit.
The table below summarises the data transfer pathways discussed above.
Summary of data transfer pathways
Summary
There are many options when transferring data from on-premises to AWS. It really comes down to what data you have and what your target resources are: static data that is suited for S3, relational and non-relational data for RDS or DynamoDB, or application and web servers that are suited for EC2. If you have a large amount of data to transfer and require high bandwidth, Direct Connect is an ideal option. You just need to choose the connection type that suits your situation, whether it’s a public interface, private interface or Transit Gateway.
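The choices discussed in this article can be condensed into a small decision helper. This is a simplified sketch and not an official AWS rule. The service lists cover only the examples used in this article, and the cut-off of more than one VPC for choosing a Transit Gateway is an assumption made for illustration.

```python
PUBLIC_TARGETS = {"s3", "dynamodb"}  # services reached through public endpoints
VPC_TARGETS = {"ec2", "rds"}         # resources that live inside a VPC


def choose_connection(target_service: str, vpc_count: int = 1) -> str:
    """Suggest a Direct Connect connection type for a target service."""
    service = target_service.lower()
    if service in PUBLIC_TARGETS:
        return "public virtual interface"
    if service in VPC_TARGETS:
        # Assumed cut-off: more than one VPC favours a central hub.
        if vpc_count > 1:
            return "transit gateway"
        return "private virtual interface"
    raise ValueError(f"no recommendation for service: {target_service}")


print(choose_connection("S3"))
print(choose_connection("RDS"))
print(choose_connection("EC2", vpc_count=5))
```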