10/12/22 Update – Private Resolver is now Generally Available!
Today I’m going to cover the new Azure DNS Private Resolver feature that recently went into public preview. I’ve written extensively about Azure DNS in the past and I recommend reading through that series if you’re new to the platform. It has grown to be significantly important in Azure architectures due to its role in name resolution for Private Endpoints. A common pain point for customers using Private Endpoints from on-premises is the requirement to have a VM in Azure capable of acting as a DNS proxy. This is explained in detail in this post. The Azure DNS Private Resolver seeks to ease that pain by providing a managed DNS solution capable of acting as a DNS proxy and conditional forwarder facilitating hybrid DNS resolution (for those of you coming from AWS, this is Azure’s Route 53 Resolver). Alexis Plantin beat me to the punch and put together a great write-up on the basics of the feature so my focus instead be on some additional scenarios and a pattern that I tested and validated.
I’m a big fan of keeping infrastructure services such as DNS centralized and under the management of central IT. This is one reason I’m partial to a landing zone with a dedicated shared services virtual network attached to the transit virtual network as illustrated in the image below. In this shared services virtual network you put your DNS, patching/update infrastructure, and potentially identity services such as Windows Active Directory. The virtual network and its resources can then be dropped into a dedicated subscription and locked down to central IT. Additionally, as an added bonus, keeping the transit virtual network dedicated to firewalls and virtual network gateways makes the eventual migration to Azure Virtual WAN that must easier.
The design I had in mind would place the Private Resolver in the shared services virtual network and would funnel all traffic to and from the resolver and on-premises or another spoke through the firewall in the transit virtual network. This way I could control the conversation, inspect the traffic if needed, and centrally log it. The lab environment I built to test the design is pictured below.
The first question I had was whether or not the inbound endpoint would obey the user defined routes in the custom route table I associated with the inbound endpoint subnet. To test this theory I made a DNS query from the VM running in spoke 2 to resolve an A record in a Private DNS Zone. This Private DNS Zone was only linked to the virtual network where the Private Resolvers were. If the inbound endpoint wasn’t capable of obeying the custom routes, then the return traffic would be dropped and my query would fail.
Success! The inbound endpoint is returning traffic back through the firewall. Logs on the firewall confirm the traffic flowing through.
Next I wanted to see if traffic from the outbound endpoint would obey the custom routes. To test this, I configured a DNS forwarding rule (conditional forwarding component of the service) to send all DNS queries for jogcloud.com back to the domain controller running in my lab. I then performed a DNS query from the VM running in spoke 2.
Success again as the query was answered! The traffic from the outbound endpoint is seen traversing the firewall on its way to my domain controller on-premises. This confirmed that both the inbound and outbound endpoints obey custom routing making the design I presented above viable.
Beyond the above, I also confirmed the Private Resolver is capable of resolving reverse lookup zones (for PTR records). I was happy to see reverse zones weren’t forgotten.
One noticeable gap today is the Private Resolver does not yet offer DNS query logging. If that is important to you, you may want to retain your existing DNS Proxy. If you happen to be using Azure Firewall, you could make use of the DNS Proxy feature which allows for logging of DNS queries. Azure Firewall could then be configured to use the Private Resolver as its resolver providing that conditional forward capability Azure Firewall’s DNS Proxy feature lacks.
It’s been a while since my last post. For the past few months I’ve been busy renewing some AWS certificates and putting together some Azure networking architectures on GitHub. A new post was long overdue, so I thought it would be fun to circle back to Private Endpoints yet again. I’ve written extensively about the topic over the past few years, yet there always seems more to learn.
There have historically been two major pain points with Private Endpoints which include routing complexity when trying to inspect traffic to Private Endpoints and a lack of NSG (network security groups) support. Late last year Microsoft announced in public preview a feature to help with the routing and support for NSGs. I typically don’t bother tinkering with features in public preview because the features often change once GA (generally available) or never make it to GA. Now that these two features are further along and likely close to GA, it was finally a good time to experiment with them.
I built out a simple lab with a hub and spoke architecture. There were two spoke VNets (virtual networks). The first spoke contained a single subnet with a VM, a private endpoint for a storage account, and a private endpoint for a Key Vault. The second spoke contained a single VM. Within the hub I had two subnets. One subnet contained a VM which would be used to route traffic between spokes, and the other contained a VM for testing routing changes. All VMs ran Ubuntu. Private DNS zones were defined for both Key Vault and blob storage and linked to all VNets to keep DNS simple for this test case.
Each spoke subnet had a custom route table assigned with a route to the other spoke set with a next hop as the VM acting as a router in the hub.
Once the lab was setup, I had to enable the preview features in the subscription I wanted to test with. This was done using the az feature command.
az feature register --namespace Microsoft.Network --name AllowPrivateEndpointNSG
The feature took about 30 minutes to finish registering. You can use the command below to track the registration process. While the feature is registering the state will report as registering, and when complete the state will be registered.
az feature show --namespace Microsoft.Network --name AllowPrivateEndpointNSG
Once the feature was ready to go, I first decided to test the new UDR feature. As I’ve covered in a prior post, creation of a private endpoint in a VNet creates a /32 route for the private endpoint’s IP address in both the VNet it is provisioned into as well as any directly peered VNets. This can be problematic when you need to route traffic coming from on-premises through a security appliance like a Palo Alto firewall running in Azure to inspect the traffic and perform IDS/IPS. Since Azure has historically selected the most specific prefix match for routing, and the GatewaySubnet in the hub would contain the /32 routes, you would be forced to create unique /32 UDRs (user-defined routes) for each private endpoint. This can created a lot of overhead and even risked hitting the maximum of 400 UDRs per route table.
With the introduction of these new features, Microsoft has made it easier to deal with the /32 routes. You can now create a more summarized UDR and that will take precedence over the more specific system route. Yes folks, I know this is confusing. Personally, I would have preferred Microsoft had gone the route of a toggle switch which would disable the /32 route from propagating into peered VNets. However, we have what we have.
Let’s take a look at it in action. The image below shows the effective routes on the network interface associated with the second VM in the hub. The two /32 routes for the private endpoints are present and active.
Before the summarized UDR can be added, you need to set a property on the subnet containing the Private Endpoint. There is a property on each subnet in a virtual network named PrivateEndpointNetworkPolicies. When a private endpoint is created in a subnet this property is set disabled. This property needs to be set to enabled. This is done by setting the –disable-private-endpoint-network-policies parameter to false as seen below.
I then created a route table, added a route for 10.1.0.0/16 with a next hop of the router at 10.0.0.4, and assigned the route table to the second VM’s subnet. The effective routes on the network interface for the second VM now show the two /32s as invalid with new route now active.
A quick tcpdump on the router shows the traffic flowing through the router as we have defined in our routes.
For fun, let’s try that same wget on the Key Vault private endpoint.
Uh-oh. Why aren’t I getting back a 404 and why am I not seeing the other side of conversation on the router? If you guessed asymmetric routing you’d be spot on! To fix this I would need to setup the iptables on my router to NAT (network address translation) to the router’s address. The reason attaching a route table to the spoke 1 subnet the private endpoints wouldn’t work is because private endpoints do not honor UDRs. I imagine you’re scratching your head asking why it worked with the storage account and not with Key Vault? Well folks, it’s because Microsoft does something funky at the SDN (software defined networking) layer for storage that is not done for any other service’s private endpoints. I bring this up because I wasted a good hour scratching my head as to why this was working without NAT until I came across that buried issue in the Microsoft documentation. So take this nugget of knowledge with you, storage account private endpoint networking works differently from all other PaaS service private networking in Azure. Sadly, when things move fast, architectural standards tend to be one of the things that fall into the “we’ll get back to that on a later release” bucket.
So wonderful, the pain of /32 routes is gone! Sure we still need to NAT because private endpoints still don’t honor UDRs attached to their subnet, but the pain is far less than it was with the /32 mess. One thing to take from this gain is there is a now a disclaimer to Azure routing precedence. When it comes to routing with private endpoints, UDRs take precedence over the system route even if the system route is more specific.
Now let’s take a look at NSG support. I next created an NSG and associated it to the private endpoint’s subnet. I added a deny rule blocking all https traffic from the VM in spoke 2.
NSG applied to private endpoint subnet
Running a wget on the VM in spoke 2 to the blob storage endpoint on the private endpoint in spoke 1 returns the file successfully. The NSG does not take effect simply by enabling the feature. The PrivateEndpointNetworkPolicies property I mentioned above must be set to enabled.
After setting the property, the change takes about a minute or two to complete. Once complete, running another wget from the VM in spoke 2 failed to make a connection validating the NSG is working as expected.
Well folks that’s it for this post. The key things you should take aware are the following:
Testing these features requires registering the feature on the subscription AND setting the PrivateEndpointNetworkPolicies property to enabled on the subnet. Keep in mind setting this property is required for both UDR summarization and enabling NSG support. (Thank you to my peer Silvia Wibowo for pointing out that it is also required for UDR summarization).
NAT is still required to ensure symmetric routing when traffic is coming from on-premises or another spoke. The only exception are private endpoints for Azure Storage because it operates different at the SDN.
The UDR feature for private endpoints makes less-specific UDRs take precedence over the more specific private endpoint system route.
NSG and summarized UDR support for private endpoints are still in public preview and are not recommended for production until GA.
NSG Flow Logs do not log connection attempts to private endpoints at this time.
I recently had a customer reach out to me with an interest in learning more about ARS (Azure Route Server). The customer hoped it might ease some of the burden of managing routing across a large Azure implementation. I had yet to mess around with ARS since it only recently went GA (generally available), so I figured this would be a great opportunity to build out a lab and give it a whirl. Let’s get to it!
ARS is one of Microsoft’s new networking offerings in Azure offering a managed routing service that is hooked into Azure’s SDN (software defined network). The way I like to think of the service is a couple of VMs (virtual machines) that are managed by Microsoft, running a BGP (Border Gateway Protocol) service, and which have the ability to program routes directly into VNets (virtual networks). In addition to introducing some pretty cool new networking patterns such as an SD WAN and ExpressRoute pattern and dual-homed network, the feature that most interested my customer was its capability of BGP peering with a customer’s NVA (network virtual appliance) such as a Palo Alto or Cisco appliance and injecting those learned dynamic routes into the VNet. This is the feature which I’ll cover in this blog post.
I did some thinking about how I wanted to lab this out and what I wanted to use as an NVA. Since the behavior I wanted to test was primarily BGP, I figured I’d keep it simple and run a Linux VM which would host a lightweight BGP service. I found a wonderful post from Adam Stuart (seriously a great read and really cool Azure networking pattern in his post) which mentioned Exabgp. After doing a bit of research on it, it looked relatively easy to setup and use so the choice was made (Thanks Adam!) In addition to the NVA, I decided to build out a hub-and-spoke architecture and make use of my home pfSense appliance for S2S VPN (site-to-site virtual private network) connectivity and to replicate an on-premises environment. The result was the lab pictured below (image 1).
One thing you may notice in the above image is ARS has a public IP address associated to it. Remember when I mentioned this is a couple of Microsoft-managed VMs? Well this public IP facilitates Microsoft management of the VMs similar to other managed services such as Azure SQL MI (Managed Instance). Before you ask, no you can’t associate an NSG (Network Security Group) to the subnet ARS is provisioned into, and the subnet has to be named RouteServerSubnet.
I setup a S2S VPN connection and BGP peering between my pfSense appliance and Azure VPN Gateway and advertised the set of routes documented in the lab diagram (image 1). I then provisioned a couple of Ubuntu VMs with two in the hub, one in the first spoke, and one in the second spoke.
Exabgp was a bit painful to setup because the documentation out there is a bit sparse and I had to piece it together from multiple sources. Given that, I’m going to spend a few minutes walking through the setup.
I first setup my ARS using the instructions in this link. Ensure you use a valid ASN for your NVA (autonomous system number).
Once ARS is setup, you can begin setting up Exabgp on the VM using the commands below.
sudo apt update
sudo apt install exabgp
You’ll then need to modify the service file located in /lib/systemd/system/exabgp.service to uncomment the lines below.
Each instance of ARS is configured to be highly available and is deployed across availability zones if the region supports it. It comes with two BGP peering IPs you’ll need to peer with and advertise your routes to. If you only peer one with or advertise different routes, you’ll get funky behaviors. In the configuration file above, each JSON object represents one of these peers. You’ll notice I’m advertising a number of routes which will demonstrate different behaviors I’ll walk through with this post.
Once you’ve done the above you can start the service and validate that it successfully started.
sudo systemctl start exabgp
sudo systemctl status exabgp
Give it a minute and then you can run the following command to output the routes the exabgp service has learned from ARS.
sudo exabgpcli show adj-rib in extensive
In the output below (image 2) you can see ARS advertising the address space of the hub and spoke1 Vnet.
So why isn’t spoke 2 being advertised? Well ARS requires the peerings between the VNets to be configured with UseRemoteGateway and AllowGatewayTransit properties (referred to in the Portal as Use the remote virtual network’s gateway or Route Server and Use this virtual network’s gateway or Route Server) in order for ARS to propagate routes between the Vnets. As you’ll note from the lab image, I enabled these settings for spoke 1 but not spoke 2.
This requirement does present a challenge in that when it’s enabled ARS propagates routes to the peered Vnets, but so does your ExpressRoute or VPN Gateway. My customer base typically requires all traffic leaving the spoke to flow through a security appliance. If you do not enable this setting the spoke only knows about itself and the hub VNet. To direct the traffic through an appliance in a hub you only need two UDRs (one for 0.0.0.0/0 and one for the hub VNet address space). This makes it easy to stamp out spokes from a networking perspective and optionally audit and enforce with Azure Policy.
Looking at the effective routes of the VM in spoke 1 (image 3), you can see routes highlighted in red are routes being propagated into the spoke by my VPN Gateway which is receiving them from my pfSense appliance. To ensure traffic between on-premises and Azure flows through a security appliance you’d need to define all of the routes you’re propagating over your ExpressRoute/VPN in your NVA. This way ARS propagates them to the peered VNets and overrides the ones coming in from the gateway.
Recall from my lab image (image 1), I’m sending 192.168.0.1/24, 192.168.2.0/24, 192.168.3.0/24, and 192.168.4.0/24 over BGP to Azure. In the image above (image 3) you’ll notice three out of four of these routes are coming from my VPN Gateway (10.0.3.5), but 192.168.0.1/24 is coming from Exabgp VM (10.0.0.4). The documentation says that when two routes for the same address space but different AS PATH lengths are received from different NVAs, ARS will only program the route with the shorter AS PATH. Apparently this isn’t just a trait of ARS and is a trait of the Azure SDN.
To validate this is the case, I edited my Exabgp config file and appended a longer AS PATH length to the route as seen below.
Exabgp config with modified AS PATH
After about a minute, the route table on my spoke 1 VM now displays the route for 192.168.0.1 coming from the VPN Gateway because the route coming from my Exabgp now has a longer AS PATH. This indicates that when two routes for the same address space are advertised, the route with the shortest AS PATH gets programmed to the VNet while the route with the longer AS PATH is seemingly discarded. I would have expected to see the 192.168.0.1/24 coming from on-premises with an Active value of Invalid, but instead it’s discarded.
The next route I want to look at it is 10.1.0.0/16 which is the address space of spoke 1. I configured the Exabgp VM to advertise this route to ARS. Looking at the effective routes for the VM in the hub (VM-DEMO) the only route for this address space is the system route for the peering. This is because system routes for the VNet, VNet peering, or service endpoints are the preferred routes, even if BGP routes are more specific. This means you can’t use ARS to push routes that would force traffic from a spoke to flow through an NVA in the hub because of the VNet peering system route. For that use case it looks like you will need to continue to use UDRs (user defined routes)
I have the default route of 0.0.0.0/0 which I am advertising from the Exabgp VM. You can see in the above images is propagating to both the hub and spoke. Take note that if you advertise this route, you’re going to want to ensure you place a route table and UDR on the subnet your NVAs are in. Otherwise you’ll run into a scenario where you’ll have a routing loop because the default route would be received by the NVAs subnet.
Let’s pull the VPN Gateway into the mix. ARS does support BGP peering with an ExpressRoute or VPN Gateway. This allows you to propagate the routes ARS is learning from the NVA back on-premises. You enable this functionality by enabling the Branch-to-branch feature of ARS. In my lab environment, I commented out the default route in my Exabgp conf file because I didn’t want to send that back on-premises because I don’t know how to filter out routes in FRRoute (the BGP service running on pfSense). I then ran an az network vnet-gateway list-learned-routes and confirmed my VNG is now receiving routes from ARS as seen in the image below (image 5).
On the pfSense side I was now able to see the routes coming from my NVA coming over the VPN Gateway and back my on-premises appliance. These are the routes with an AS PATH of 65510 65515 65010 in the image below (image 6).
The above shows that the routes coming from the NVA are being propagated back on-premises. What about the routes coming from on-premises? Are they being propagated all the way to the NVA? Let’s check it out!
To verify this I printed the routes ARS is advertising to the NVA about using az network routeserver peering list-advertised-routes. The routes in bold are the routes coming from the pfSense appliance showing that ARS is receiving the routes and advertising them to the Exabgp VM.
Lastly, I wanted to see the routes and details on the Exabpg VM. For that I ran the command below on the VM.
sudo exabgpcli show adj-rib in extensive
In the image above (image 7) the on-premises routes for the 192.168.0 address spaces have been learned and have the AS PATH leading back on-premises. Note the community value of 65517:65517 being attached to the routes. That is not coming from my pfSense appliance but rather the VPN Gateway. The community tag is used to identify these routes as coming from a VPN Gateway and are used by Microsoft for filtering.
Well folks that about wraps it up. The key takeaways of my time with ARS were the following:
ARS requires the UseRemoteGateway and AllowGatewayTransit properties be configured on VNet peering. This makes managing traffic flow between an spoke and on-premises a bit more complicated. Instead of defining a 0.0.0.0/0 route and calling it a day, you would need to define all the routes in your NVA and propagate them using ARS.
This isn’t necessarily a bad thing, you’re just shifting management of routing outside the Azure control plane and into the NVA data plane. That may be your preference.
When multiple routes for the same address space and different length AS PATHs are propagated to a VNet, the Azure SDN will only program the route the shortest AS PATH. The other route with the longer AS PATH is discarded.
You can’t use an NVA and ARS to propagate the hub and spoke VNet address spaces back on-premises because system routes for the VNet, VNet peering, and Service Endpoints supersede routes propagated via BGP. Instead, you could use a larger summarized route encompassing the entire address space for the VNets in your region and propagate that back on-premises using the NVA and ARS.
You can use an NVA and ARS to propagate routes from Azure back to on-premises. A use case might be that you want all Internet-bound traffic to egress out of Azure because you get better performance than you’re getting from your ISP on-premises. I’ve never seen it, but nevertheless. 🙂
Hopefully some of the above helps you become more familiar with Azure Route Server and what the benefits and considerations are.
Recently I was giving a customer an overview of Azure Managed Identities and came across an interesting find while building a demo environment. If you’re unfamiliar with managed identities, check out my prior series for an overview. Long story short, managed identities provide a solution for non-human identities where you don’t have to worry about storing, securing, and rotating the credentials. For those of you coming from AWS, managed identities are very similar to AWS Roles. They come in two flavors, user-assigned and system-assigned. For the purposes of this post, I’ll be focusing on system-assigned.
Under the hood, a managed identity is essentially a service principal with some orchestration on top of it. Interestingly enough, there are a number of different service principal types. Running the command below will spit back the different types of service principals that exist in your Azure AD tenant.
az ad sp list --query='.servicePrincipalType' --all | sort | uniq
If you’re interested in seeing the service principals associated with managed identities in your Azure AD tenant, you can run the command below.
az ad sp list --query="[?servicePrincipalType=='ManagedIdentity']" --all
Managed identities include a property called alternativeNames which is an array. In my testing I observed two values within this array. The first value is “isExplicit=True” or “isExplicit=False” which is set to True for user-assigned managed identities and False when it’s a system-assigned managed identity. If you want to see all system-assigned managed identities for example, you can run the command below.
az ad sp list --query="[?servicePrincipalType=='ManagedIdentity' && alternativeNames[?contains(@,'isExplicit=False')]]" --all
The other value in this array is the resource id of the managed identity in the case of a user-assigned managed identity. With a system-assigned managed identity this is the resource id of the Azure resource the system-assigned managed identity is associated with.
So why does any of this matter? Before we get to that, let’s cover the major selling point of a system-assigned managed identity when compared to a user-assigned managed identity. With a system-assigned managed identity, the managed identity (and its service principal) share the lifecycle of the resource. This means that if you delete the resource, the service principal is cleaned up… well most of the time anyway.
Sometimes this cleanup process doesn’t happen and you’re left with orphaned service principals in your directory. The most annoying part is you can’t delete these service principals (I’ve tried everything including calls direct to the ARM API) and the only way to get them removed is to open a support ticket. Now there isn’t a ton of risk I can think of with having these orphaned service principals left in your tenant since I’m not aware of any means to access the credential associated with it. Without the credential no one can authenticate as it. Assuming the RBAC permissions are cleaned up, it’s not really authorized to do anything within Azure either. However, beyond dirtying up your directory, it’s an identity with a credential that shouldn’t be there anymore.
I wanted an easy way to identify these orphaned system-assigned managed identities so I could submit a support ticket and get it cleaned up before it started cluttering up my demonstration tenant. This afternoon I wrote a really ugly bash script to do exactly that. The script uses some of the az cli commands I’ve listed above to identify all the system-assigned managed identities and then uses az cli to determine if the resource exists. If the resource doesn’t exist, it logs the displayName property of the system-assigned managed identity to a text file. Quick and dirty, but does the job.
Interestingly enough, I had a few peers run the script on their tenants and they all had some of these orphaned system-assigned managed identities, so it seems like this problem isn’t restricted to my tenants. Again, I personally can’t think of a risk of these identities remaining in the directory, but it does point to an issue with the lifecycle management processes Microsoft is using in the backend.