Connecting tens of thousands of virtual instances within a cloud environment is no easy task. You need a solid and well-designed network to support this kind of scale. Additionally, data center networking is one of the fastest-evolving areas of network engineering. Some of the solutions available today were not available ten years ago.
Initially, to support its growth, Scaleway implemented a highly available NAT (Network Address Translation) layer so that IP addresses could move between physical machines along with the instances to which they were assigned. While that was the right choice at the time, it has since created technical debt that we need to remove in order to keep growing.
In the coming months, we will start rolling out some important changes in the way our internal network operates. We will remove the NAT to simplify how the network is handled. The planned changes will bring several improvements for our users, among them IP stability, support for IPv6, and enhanced security.
Tackling technical debt without judgment
It’s important to say that the solution that we’ve been using so far is not a bad solution. It’s just an old solution, and now we have other options.
The choices made a lot of sense when Scaleway was created because we used to manage hardware servers and not virtual machines. Under those circumstances, NAT was an efficient and sufficient method to quickly switch public IPs from one hardware server to another and also allowed for our product range to grow quickly.
But time has passed and our product catalog has evolved, so it is time for this part of the stack to catch up with today’s needs. More contemporary solutions will help us solve some problems and make things easier for us and our customers.
The problem: We need to move IP addresses
In a cloud environment, virtual machines (VMs) are hosted on physical hypervisors, and the network must know to which physical host it should send packets in order to deliver them to the VM.
It might seem simple: a virtual machine has its IP address routed like any other on the internet using standard network routing protocols. However, VMs might need to move from one physical hypervisor to another for plenty of reasons:
- The customer stops the VM, which “archives” it: the VM’s snapshot is sent to the remote storage while the compute resources of the hypervisor are freed up. If the customer wants to restart the archived instance after some time has passed, a new slot of compute resources is allocated on a potentially different hypervisor, and the snapshot is sent back from the archive to be run on the new host.
- All hardware can fail, and hypervisors are no exception. When this happens, all customer instances are moved away from the faulty machine.
- There are other types of corrective maintenance and incidents that might require Scaleway to move instances from one hypervisor to another.
When an instance moves to another hypervisor (HV), its IP address has to move as well. Doing this for hundreds of thousands of VMs running on thousands of hypervisors is not trivial. Simple routing and bridging techniques used in enterprise networking wouldn’t scale.
The old solution: One-to-One NAT
Years ago, at the beginning of the Scaleway Elements Cloud ecosystem, we chose to address this problem using the principle of indirection.
Instead of assigning a publicly routable IP address to an instance, we provided it with a private one from the RFC1918 space (10.x.x.x) and then mapped it to a publicly routable IP using a centralized NAT solution.
Looking at the above schematic, if we want to move VM2 from HV10 to HV20 and VM5 from HV20 to HV10, we’ll need to assign new hypervisor-bound RFC1918 addresses to them and change the mapping on the NAT to preserve their public IP addresses.
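The remapping described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Scaleway's implementation: the table layout, the `move_instance` helper, and every address (documentation-range and RFC1918 examples) are made up for the example.

```python
# Illustrative model of a centralized 1:1 NAT table (not Scaleway's actual code).
# Public addresses come from the 203.0.113.0/24 documentation range; the
# hypervisor-bound private addresses are made-up RFC1918 examples.

# public IP -> current private (hypervisor-bound) IP
nat_table = {
    "203.0.113.2": "10.10.0.2",  # VM2, currently on HV10
    "203.0.113.5": "10.20.0.5",  # VM5, currently on HV20
}


def move_instance(table, public_ip, new_private_ip):
    """Point an existing public IP at the instance's new private address."""
    if public_ip not in table:
        raise KeyError(f"no NAT mapping for {public_ip}")
    table[public_ip] = new_private_ip


# VM2 moves HV10 -> HV20 and VM5 moves HV20 -> HV10: both receive new
# private addresses, but the public side of each mapping never changes.
move_instance(nat_table, "203.0.113.2", "10.20.0.2")
move_instance(nat_table, "203.0.113.5", "10.10.0.5")
```

The public keys stay constant while only the private values are rewritten, which is the indirection that lets an instance keep its public IP across hosts.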
While these addresses from RFC1918 space are often colloquially called “private” due to their non-publicly routable nature, they don’t have much to do with privacy. These IP addresses are reachable by all Scaleway customers and don’t provide any additional security compared to public IP addresses.
If you are interested in private communication between instances, you should consider using Scaleway Private Networks, which provide a communication channel isolated from other customers.
Meet Natasha, our high-performance stateless NAT engine
NAT is a well-known technique widely used in enterprise and ISP networks. There, its primary goal is to reduce the use of public IPv4 addresses by mapping many “private” IPv4 addresses to a single public one in a 1:N fashion.
Such 1:N address translation requires so-called stateful packet processing with port mapping, sometimes called NAPT (Network Address Port Translation). While inevitable in some network applications, this is not what Scaleway wanted to use because of its many limitations and poor scaling. So we decided to implement 1:1 NAT, sometimes called basic NAT, which doesn’t require stateful packet processing.
One of the pitfalls of this approach is that most commercial and open-source NAT implementations don’t support stateless 1:1 mapping and focus on the stateful 1:N translation, as this is the most common NAT use case. So we had to write our own high-performance stateless NAT engine, Natasha.
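To make the stateless property concrete, here is a minimal Python sketch of 1:1 translation. It is not Natasha's code (which is built on DPDK); the `Packet` type, the function names, and the addresses are assumptions chosen for illustration. The point is that each packet is rewritten using only a static table, with no per-connection state and no port rewriting.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified IPv4 packet."""
    src: str
    dst: str
    payload: bytes = b""


# Static 1:1 mapping (private -> public) and its inverse. Nothing is ever
# added to these tables while packets are being processed.
PRIVATE_TO_PUBLIC = {"10.10.0.2": "203.0.113.2"}
PUBLIC_TO_PRIVATE = {pub: priv for priv, pub in PRIVATE_TO_PUBLIC.items()}


def translate_outbound(pkt):
    """Rewrite the source address of a packet leaving an instance."""
    public = PRIVATE_TO_PUBLIC.get(pkt.src)
    return replace(pkt, src=public) if public else None  # None means "drop"


def translate_inbound(pkt):
    """Rewrite the destination address of a packet coming from the internet."""
    private = PUBLIC_TO_PRIVATE.get(pkt.dst)
    return replace(pkt, dst=private) if private else None  # None means "drop"
```

Because the result depends only on the packet and the static table, any machine in a cluster can translate any packet. A stateful 1:N translator would instead have to remember which internal host and port each connection belongs to.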
Despite many doubts and internal debates at the beginning, it turned out that Natasha did the job pretty well, and the performance we achieved was measured in hundreds of gigabits per second. Each Scaleway AZ is equipped with a highly available cluster of 4 to 32 Natasha machines, depending on the size of the AZ.
Known limitations of the NAT solution
But even with great performance, stateless NAT has a number of inconveniences. Let’s look at them more closely.
Lack of transparency for the customer
Customers need to consider that the IP addresses assigned directly to an instance's network interface might change. For example, if you use these addresses for frontend-backend communication with Access Control Lists (ACLs) or security groups, you have to update these ACLs when the addresses change. Scaleway provides a number of tools, for example, dynamic DNS records for public and private addresses, which are updated automatically, but all this still requires action at the customer end.
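As an illustration of the kind of action this implies, the sketch below rebuilds an allow list from DNS names rather than hard-coded addresses. The hostname, the `build_allow_list` helper, and the in-memory stand-in for DNS are all hypothetical; a real setup would resolve against the actual records and push the result to whatever ACL or security group mechanism is in use.

```python
import socket


def build_allow_list(hostnames, resolve=socket.gethostbyname):
    """Return the set of IPv4 addresses the given names resolve to right now."""
    return {resolve(name) for name in hostnames}


# Stand-in for DNS so the example runs offline; the record is "updated"
# when the backend instance moves and receives a new private address.
fake_dns = {"backend.example.internal": "10.10.0.2"}
before = build_allow_list(fake_dns, resolve=fake_dns.__getitem__)

fake_dns["backend.example.internal"] = "10.20.0.2"  # the instance has moved
after = build_allow_list(fake_dns, resolve=fake_dns.__getitem__)

stale_rules = before - after  # addresses to remove from the ACL
new_rules = after - before    # addresses to add to the ACL
```

Even with automatically updated records, something on the customer side still has to notice the change and reapply the rules.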
Long switchover time
It takes time to update the NAT mapping tables. When an instance moves to another host, it takes up to 20 seconds until the new private address is mapped to its public address.
1:1 NAT isn’t useful for IPv6 addresses
While not technically impossible, adopting 1:1 NAT for IPv6 would be against the very idea of IPv6 and the best practices in the industry. So we don’t use this technique for IPv6 addresses; instead, we route the IPv6 range directly to the hypervisor running the instance, meaning that the instance’s IPv6 address changes when the instance is moved. This, of course, significantly reduces the number of use cases of IPv6 for customer applications.
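The effect of routing an IPv6 range to each hypervisor can be sketched as follows. The prefixes (taken from the 2001:db8::/32 documentation range), the /64 size, and the slot-based addressing are assumptions for illustration only; the article does not describe the actual allocation scheme.

```python
import ipaddress

# Hypothetical per-hypervisor IPv6 prefixes (documentation range).
HV_PREFIXES = {
    "HV10": ipaddress.IPv6Network("2001:db8:0:10::/64"),
    "HV20": ipaddress.IPv6Network("2001:db8:0:20::/64"),
}


def instance_ipv6(hypervisor, slot):
    """Address of the instance occupying a given slot on a given hypervisor."""
    return HV_PREFIXES[hypervisor][slot]


before = instance_ipv6("HV10", 2)  # the instance runs on HV10
after = instance_ipv6("HV20", 2)   # the same instance after moving to HV20
```

Since the address is derived from the hypervisor's prefix, `before` and `after` differ, and anything that referenced the old address (DNS records, firewall rules, peers) has to be updated.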
Issues with some applications
Natasha supports only the TCP, UDP, and ICMP protocols on top of IPv4. While most modern protocols and applications work well with NAT, there are some legacy or corner-case technologies, like bare ESP IPsec or GRE tunnels, that Natasha does not support.
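In terms of the IPv4 header's protocol field, the restriction looks roughly like the check below. The protocol numbers are the IANA-assigned values; the `is_translatable` helper is a hypothetical name, and the sketch says nothing about how Natasha actually handles unsupported traffic.

```python
# IANA-assigned IP protocol numbers.
ICMP, TCP, UDP, GRE, ESP = 1, 6, 17, 47, 50

# Protocols the article lists as supported by Natasha.
SUPPORTED_PROTOCOLS = {ICMP, TCP, UDP}


def is_translatable(ip_protocol):
    """Return True if a packet with this protocol number would be handled."""
    return ip_protocol in SUPPORTED_PROTOCOLS
```

A GRE tunnel (protocol 47) or bare ESP (protocol 50) fails this check, which is why such traffic does not pass through the NAT.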
Low feature development velocity
As Natasha is based on very low-level “Data Plane Development Kit” (DPDK) code, adding new features and rolling them out to production is no easy task. This slows the development of new products and features we would like to introduce to our customers.
Mandatory public interface
While not directly related to NAT, one of the side effects of the current architecture is that you can’t have an instance without a public network interface and only use Private Networks for all communication.
So, our plan is to improve the customer experience by eliminating these limitations.
What is changing with IP Mobility, and what will these changes bring for Scaleway users?
No more NAT! It’s as simple as that. In the future, instances will have a network interface with either a globally routable IP address or a VPC (Virtual Private Cloud) address. This has some important implications for Scaleway users.
Immediate changes for Scaleway users
- IP stability: When an instance is moved from one hypervisor to another, for whatever reason, it will keep the same IP addresses.
- Support for IPv6: Without the need for 1:1 NAT, IPv6 addresses can now stay directly connected to the instance.
- Support for multiple IPs: Multiple IPv4 addresses can be attached to a single instance (IP-Failover).
- Improved resource isolation: Instances will either be connected to the internet or inside of a VPC. The Scaleway network infrastructure will no longer be directly visible to users. Hence, instances without a public IP will be 100% isolated from other Scaleway users.
Future changes with IP Mobility
Users have some future changes to look forward to as well. Here’s part of our roadmap for further improvements.
With these new network capabilities, we will be able to implement live migration of our instances between hypervisors. This will drastically reduce the interruptions our clients experience when hypervisors need to be taken out of service for maintenance, and it will increase the uptime of your services.
Improved IP management
Currently, we can attach an IPv4 address to different products such as Instances, Load Balancers, and Public Gateways. However, each IP is tied to its product family, which makes migration impossible. In simple terms, if you create an Instance with a public IPv4 address and later want to use the same address on a Load Balancer, there is no way to move the IP from one product to the other. This is a source of frustration for our users.
But with the new implementation of the IP Mobility stack, we will be able to provide better management of IPs for all products. In the future, we will be able to reuse the same IPs on different products.
Availability and rollout timeline
Deploying IP Mobility will be done in two major steps:
Step 1: Deployment
We will roll out an upgraded dual-stack network on our hypervisors to make new features available while also maintaining compatibility with our old NAT stack.
Whenever an availability zone gets IP Mobility, users can start creating or migrating their instances to the new stack to benefit from its improvements.
Step 2: Migration
We will migrate all internal products and all existing customer VMs to the new IP Mobility stack and deactivate the NAT.
After the rollout in all of our availability zones, we will start the full migration process. The goal is to have all instances running on the IP Mobility stack as soon as possible so that we can start developing the exciting new features that IP Mobility will enable.
The rollout will proceed zone by zone throughout 2023 to minimize any impact:
- WAW-2: available since June 2023 (check out the documentation!)
- Beginning of Q3: PAR-3, PAR-2, and AMS-2
- End of Q4: AMS-1 and PAR-1
Should you wait for the automatic migration or migrate yourself?
The automatic migration will simply replace your current network interface with a new one using IP Mobility while keeping your attached IPv4 address (if you had one).
We strongly recommend that you migrate your instance yourself as soon as IP Mobility is fully deployed to avoid any problems that might arise during the automatic migration. For example:
- Migration implies a reboot, which you might prefer to perform yourself.
- Internal Scaleway private IP addresses will no longer work; use Scaleway VPC instead.
- Your services should bind to “0.0.0.0” or to your public IP to be sure they work as expected after the automatic migration.
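One way to check the binding recommendation is sketched below. The `binding_is_safe` helper and the example addresses are hypothetical; the sketch simply encodes the rule stated above (wildcard address or public IP) and shows what binding to the wildcard address looks like with the standard `socket` module.

```python
import ipaddress
import socket


def binding_is_safe(bind_addr, public_ip):
    """True if a service listens on the wildcard address or on the public IP."""
    addr = ipaddress.ip_address(bind_addr)
    return addr.is_unspecified or addr == ipaddress.ip_address(public_ip)


# Binding to the wildcard address: the service accepts connections on every
# local IPv4 address. Port 0 asks the OS for any free port.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
    srv.bind(("0.0.0.0", 0))
    bound_addr = srv.getsockname()[0]
```

A service bound to an old internal address such as `10.10.0.2` would fail this check and is worth reconfiguring before the migration.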