Private Endpoints Revisited: NSGs and UDRs

Update September 2022 – Route summarization and NSGs are now generally available for Private Endpoints!

Welcome back fellow geeks!

It’s been a while since my last post. For the past few months I’ve been busy renewing some AWS certificates and putting together some Azure networking architectures on GitHub. A new post was long overdue, so I thought it would be fun to circle back to Private Endpoints yet again. I’ve written extensively about the topic over the past few years, yet there always seems to be more to learn.

Historically there have been two major pain points with Private Endpoints: routing complexity when trying to inspect traffic to Private Endpoints, and a lack of NSG (network security group) support. Late last year Microsoft announced a public preview of a feature to help with the routing, along with support for NSGs. I typically don’t bother tinkering with features in public preview because they often change by GA (general availability) or never make it to GA. Now that these two features are further along and likely close to GA, it was finally a good time to experiment with them.

I built out a simple lab with a hub and spoke architecture. There were two spoke VNets (virtual networks). The first spoke contained a single subnet with a VM, a private endpoint for a storage account, and a private endpoint for a Key Vault. The second spoke contained a single VM. Within the hub I had two subnets. One subnet contained a VM which would be used to route traffic between spokes, and the other contained a VM for testing routing changes. All VMs ran Ubuntu. Private DNS zones were defined for both Key Vault and blob storage and linked to all VNets to keep DNS simple for this test case.

Lab setup

Each spoke subnet had a custom route table assigned with a route to the other spoke set with a next hop as the VM acting as a router in the hub.

Once the lab was set up, I had to enable the preview features in the subscription I wanted to test with. This was done using the az feature command.

az feature register --namespace Microsoft.Network --name AllowPrivateEndpointNSG

The feature took about 30 minutes to finish registering. You can use the command below to track the registration process. While the feature is registering the state will report as Registering, and when complete the state will be Registered.

az feature show --namespace Microsoft.Network --name AllowPrivateEndpointNSG
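Rather than rerunning that command by hand for half an hour, the check can be wrapped in a small polling loop. This is only a sketch: wait_for_feature is a hypothetical helper name, the 60-second default interval is an arbitrary choice, and it leans on the CLI's global --query and --output options to pull out just the state string.

```shell
# Poll a preview feature until it reports "Registered".
# wait_for_feature is a hypothetical helper, not part of the Azure CLI.
# $1 = provider namespace, $2 = feature name, $3 = seconds between checks (default 60)
wait_for_feature() {
  while true; do
    state="$(az feature show --namespace "$1" --name "$2" \
      --query properties.state --output tsv)"
    echo "state: $state"
    [ "$state" = "Registered" ] && return 0
    sleep "${3:-60}"
  done
}

# Example (requires an authenticated Azure CLI session):
#   wait_for_feature Microsoft.Network AllowPrivateEndpointNSG
```

Once the state flips to Registered, Microsoft's general guidance for preview features is to re-register the resource provider with az provider register --namespace Microsoft.Network so the change propagates.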

Once the feature was ready to go, I first decided to test the new UDR feature. As I’ve covered in a prior post, creation of a private endpoint in a VNet creates a /32 route for the private endpoint’s IP address in both the VNet it is provisioned into as well as any directly peered VNets. This can be problematic when you need to route traffic coming from on-premises through a security appliance like a Palo Alto firewall running in Azure to inspect the traffic and perform IDS/IPS. Since Azure has historically selected the most specific prefix match for routing, and the GatewaySubnet in the hub would contain the /32 routes, you would be forced to create unique /32 UDRs (user-defined routes) for each private endpoint. This could create a lot of overhead and even risk hitting the maximum of 400 UDRs per route table.

With the introduction of these new features, Microsoft has made it easier to deal with the /32 routes. You can now create a more summarized UDR and that will take precedence over the more specific system route. Yes folks, I know this is confusing. Personally, I would have preferred Microsoft had gone the route of a toggle switch which would disable the /32 route from propagating into peered VNets. However, we have what we have.

Let’s take a look at it in action. The image below shows the effective routes on the network interface associated with the second VM in the hub. The two /32 routes for the private endpoints are present and active.

Effective routes for second VM in the hub

Before the summarized UDR can be added, you need to set a property on the subnet containing the Private Endpoint. There is a property on each subnet in a virtual network named PrivateEndpointNetworkPolicies. When a private endpoint is created in a subnet this property is set to disabled. It needs to be set to enabled, which is done by setting the --disable-private-endpoint-network-policies parameter to false as seen below.

az network vnet subnet update \
  --disable-private-endpoint-network-policies false \
  --name snet-pri \
  --resource-group rg-demo-routing \
  --vnet-name vnet-spoke-1
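To confirm the change actually landed, you can read the property back from the subnet. A minimal sketch, assuming the subnet JSON exposes the value as privateEndpointNetworkPolicies; show_pe_policies is a hypothetical helper name.

```shell
# Print the current PrivateEndpointNetworkPolicies value for a subnet.
# show_pe_policies is a hypothetical helper, not an Azure CLI command.
# $1 = resource group, $2 = VNet name, $3 = subnet name
show_pe_policies() {
  az network vnet subnet show \
    --resource-group "$1" \
    --vnet-name "$2" \
    --name "$3" \
    --query privateEndpointNetworkPolicies \
    --output tsv
}

# Example (requires an authenticated Azure CLI session):
#   show_pe_policies rg-demo-routing vnet-spoke-1 snet-pri
```

After the update above, this should report an enabled value. Newer Azure CLI releases appear to favor a --private-endpoint-network-policies parameter over the --disable-... form, so check az network vnet subnet update --help for the version you are running.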

I then created a route table, added a route for 10.1.0.0/16 with a next hop of the router at 10.0.0.4, and assigned the route table to the second VM’s subnet. The effective routes on the network interface for the second VM now show the two /32s as invalid with the new route now active.

Effective routes for second VM in the hub after routing change
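For anyone who wants to repeat the routing change from the CLI, the steps look roughly like the sketch below. The prefix 10.1.0.0/16 and next hop 10.0.0.4 come from the lab; the route table, hub VNet, and subnet names are placeholders I made up, and I am assuming the hub lives in the same rg-demo-routing resource group.

```shell
# Create a summarized UDR and attach it to the test VM's subnet.
# Every resource name except rg-demo-routing is a placeholder (assumption).
create_summary_route() {
  rg="rg-demo-routing"   # assumed to hold the hub resources as well
  rt="rt-hub-test"       # hypothetical route table name
  vnet="vnet-hub"        # hypothetical hub VNet name
  subnet="snet-test"     # hypothetical subnet holding the second hub VM

  az network route-table create --resource-group "$rg" --name "$rt" &&
  az network route-table route create \
    --resource-group "$rg" \
    --route-table-name "$rt" \
    --name to-spoke-1 \
    --address-prefix 10.1.0.0/16 \
    --next-hop-type VirtualAppliance \
    --next-hop-ip-address 10.0.0.4 &&
  az network vnet subnet update \
    --resource-group "$rg" \
    --vnet-name "$vnet" \
    --name "$subnet" \
    --route-table "$rt"
}

# Show the effective routes on a NIC; $1 = resource group, $2 = NIC name.
show_effective_routes() {
  az network nic show-effective-route-table \
    --resource-group "$1" --name "$2" --output table
}
```

Running show_effective_routes against the second VM's network interface before and after create_summary_route is the CLI equivalent of the two effective-routes screenshots.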

A quick tcpdump on the router shows the traffic flowing through the router as we have defined in our routes.

tcpdump of router

For fun, let’s try that same wget on the Key Vault private endpoint.

Uh-oh. Why aren’t I getting back a 404 and why am I not seeing the other side of the conversation on the router? If you guessed asymmetric routing you’d be spot on! To fix this I would need to set up iptables on my router to NAT (network address translation) to the router’s address. Attaching a route table to the spoke 1 subnet containing the private endpoints wouldn’t work because private endpoints do not honor UDRs. I imagine you’re scratching your head asking why it worked with the storage account and not with Key Vault? Well folks, it’s because Microsoft does something funky at the SDN (software defined networking) layer for storage that is not done for any other service’s private endpoints. I bring this up because I wasted a good hour scratching my head as to why this was working without NAT until I came across that buried issue in the Microsoft documentation. So take this nugget of knowledge with you: storage account private endpoint networking works differently from all other PaaS service private networking in Azure. Sadly, when things move fast, architectural standards tend to be one of the things that fall into the “we’ll get back to that on a later release” bucket.
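The NAT fix on the Ubuntu router can be sketched as follows. This is a minimal example under a few assumptions: it is run as root, the spoke 1 range is the 10.1.0.0/16 prefix used for the summarized UDR, and the router's address is 10.0.0.4. The helper name enable_spoke_snat is hypothetical.

```shell
# Source-NAT traffic forwarded to spoke 1 so replies come back through the router.
# Run as root on the router VM. enable_spoke_snat is a hypothetical helper name.
# $1 = destination prefix (default 10.1.0.0/16), $2 = router address (default 10.0.0.4)
enable_spoke_snat() {
  dest_prefix="${1:-10.1.0.0/16}"
  router_ip="${2:-10.0.0.4}"

  # Let the kernel forward packets between networks.
  sysctl -w net.ipv4.ip_forward=1 &&
  # Rewrite the source address of packets headed to the spoke.
  iptables -t nat -A POSTROUTING -d "$dest_prefix" -j SNAT --to-source "$router_ip"
}

# Example:
#   enable_spoke_snat
```

With the source rewritten to the router's address, the private endpoint replies to the router rather than directly to the client, which restores symmetry. The trade-off is that the private endpoint side no longer sees the original client IP.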

So wonderful, the pain of /32 routes is gone! Sure we still need to NAT because private endpoints still don’t honor UDRs attached to their subnet, but the pain is far less than it was with the /32 mess. One thing to take from this gain is that there is now a caveat to Azure routing precedence. When it comes to routing with private endpoints, UDRs take precedence over the system route even if the system route is more specific.
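To make that precedence exception concrete, here is a toy model in plain shell. It is a simplification based only on the behavior observed in this lab, not a description of how Azure implements route selection. All function names are hypothetical, and 10.1.0.5 is a made-up private endpoint address.

```shell
# Toy model of the private endpoint routing exception; not an Azure API.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  _old_ifs=$IFS; IFS=.
  set -- $1            # split the address into four octets
  IFS=$_old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 falls inside CIDR prefix $2 (for example 10.1.0.0/16).
in_cidr() {
  _ip=$(ip_to_int "$1")
  _net=$(ip_to_int "${2%/*}")
  _len=${2#*/}
  _mask=$(( (0xFFFFFFFF << (32 - _len)) & 0xFFFFFFFF ))
  [ $(( _ip & _mask )) -eq $(( _net & _mask )) ]
}

# pick_route DEST POLICIES UDR_PREFIX
# Prints which route wins for a private endpoint at DEST.
# A covering UDR wins if network policies are enabled, or if it is itself a /32
# (the old workaround). Otherwise the /32 system route wins.
pick_route() {
  _dest=$1; _policies=$2; _udr=$3
  if in_cidr "$_dest" "$_udr" &&
     { [ "$_policies" = "enabled" ] || [ "${_udr#*/}" -eq 32 ]; }; then
    echo "udr $_udr"
  else
    echo "system $_dest/32"
  fi
}

# Examples:
#   pick_route 10.1.0.5 disabled 10.1.0.0/16   # prints "system 10.1.0.5/32"
#   pick_route 10.1.0.5 enabled  10.1.0.0/16   # prints "udr 10.1.0.0/16"
```

The first example is the historical longest-prefix behavior, where the summarized UDR loses to the /32. The second is what the effective routes showed once PrivateEndpointNetworkPolicies was enabled.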

Now let’s take a look at NSG support. I next created an NSG and associated it to the private endpoint’s subnet. I added a deny rule blocking all https traffic from the VM in spoke 2.

NSG applied to private endpoint subnet
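The NSG setup above can be reproduced from the CLI along these lines. Treat it as a sketch: the NSG name, rule name, priority, and the spoke 2 VM address 10.2.0.4 are all placeholders of mine, while the resource group, VNet, and subnet names are the ones used earlier in this post.

```shell
# Create an NSG that denies HTTPS from the spoke 2 VM and attach it to the
# private endpoint subnet. create_pe_deny_rule is a hypothetical helper name.
# $1 = source address of the spoke 2 VM (default 10.2.0.4, a made-up value)
create_pe_deny_rule() {
  rg="rg-demo-routing"
  nsg="nsg-snet-pri"          # hypothetical NSG name
  src_ip="${1:-10.2.0.4}"

  az network nsg create --resource-group "$rg" --name "$nsg" &&
  az network nsg rule create \
    --resource-group "$rg" \
    --nsg-name "$nsg" \
    --name deny-https-from-spoke-2 \
    --priority 100 \
    --direction Inbound \
    --access Deny \
    --protocol Tcp \
    --source-address-prefixes "$src_ip" \
    --destination-port-ranges 443 &&
  az network vnet subnet update \
    --resource-group "$rg" \
    --vnet-name vnet-spoke-1 \
    --name snet-pri \
    --network-security-group "$nsg"
}

# Example (requires an authenticated Azure CLI session):
#   create_pe_deny_rule 10.2.0.4
```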

Running a wget on the VM in spoke 2 to the blob storage endpoint on the private endpoint in spoke 1 returns the file successfully. The NSG does not take effect simply by enabling the feature. The PrivateEndpointNetworkPolicies property I mentioned above must be set to enabled.

After setting the property, the change takes a minute or two to complete. Once complete, running another wget from the VM in spoke 2 failed to make a connection, validating that the NSG is working as expected.

NSG blocking connection

One thing to be aware of is NSG flow logs will not log the connection at this time. Hopefully this will be worked out by GA.

Well folks, that’s it for this post. The key things you should take away are the following:

  • Testing these features requires registering the feature on the subscription AND setting the PrivateEndpointNetworkPolicies property to enabled on the subnet. Keep in mind setting this property is required for both UDR summarization and enabling NSG support. (Thank you to my peer Silvia Wibowo for pointing out that it is also required for UDR summarization).
  • NAT is still required to ensure symmetric routing when traffic is coming from on-premises or another spoke. The only exception is private endpoints for Azure Storage, because storage operates differently at the SDN layer.
  • The UDR feature for private endpoints makes less-specific UDRs take precedence over the more specific private endpoint system route.
  • NSG and summarized UDR support for private endpoints are still in public preview and are not recommended for production until GA.
  • NSG Flow Logs do not log connection attempts to private endpoints at this time.

See you next post!