VirtualNetwork Service Tag and Network Security Groups

Hello fellow geeks!

Earlier this week I was messing around with Kubernetes, SSHing into the nodes, and I ran into an interesting quirk of NSGs (Network Security Groups). I noticed that traffic I did not expect to be allowed through the NSG was making it through. A bit of digging led me down the path of a documented, but not well known, behavior of the VirtualNetwork service tag when used in NSG security rules. Today I’m going to walk through that behavior, why you should care, and what you can do to avoid being surprised like I was.

NSGs are layer-four stateful firewalls that operate at the SDN (software-defined network) layer. They serve a similar purpose and function in much the same way as AWS Security Groups. NSGs are used for microsegmentation within and across Virtual Networks, typically supplementing the centralized control provided by a security appliance such as Azure Firewall or a Palo Alto firewall. They are associated with a subnet (the best practice) or a NIC (network interface), though there are few use cases for the latter. Each contains a collection of security rules, which includes default rules and user-defined rules. NSG security rules are processed by priority and are matched based on a 5-tuple.

As described in the previous link, service tags can be used within NSG security rules to simplify access to Azure resources. Service tags contain a summarized list of IPs that is managed by Microsoft. This makes life far easier, because whitelisting the IPs of something like Azure Storage would be a nightmarish task requiring customer-created automation to keep the rules up to date as IPs are added to or removed from the underlying service. The benefit of service tags does come with a consideration, as we’ll see in this post.

Each subnet or NIC can have one NSG applied to it, but the NSG can be applied to multiple subnets or NICs. In the instance of NSGs being applied at both the subnet and NIC, the processing for inbound traffic is detailed here and for outbound here.

Now that you know the basics of NSGs, let me talk a bit about the lab. For this lab I used my simple hub and spoke lab with a few modifications. I have added an Ubuntu VM running in the application subnet (snet-app) in the workload spoke virtual network. I’ve also temporarily removed the UDR from the custom route table on the application subnet. The NSG applied to the spoke contains only the default NSG rules. The lab architecture can be seen below.

Lab environment

Reviewing the NSG applied to the application subnet, the three default inbound rules are present as expected. The rule I’m going to look more deeply at is the AllowVnetInBound rule highlighted below. Specifically, I’m going to show you how to look at the IPs behind a service tag.

Default Inbound NSG Security Rules

To see the IPs associated with a service tag, I’m going to use the Effective security rules tool in Azure’s Network Watcher. If you’re unfamiliar with Network Watcher, you’re missing out. It contains a plethora of useful tools to help diagnose network connectivity. The Effective security rules tool looks at the NSGs applied to a NIC at both the subnet and NIC level to provide you with a holistic view of what traffic is allowed once the rules from both levels are combined.

Effective security rules tool in Network Watcher

One of the lesser known features of the tool is that it gives you the ability to look at the IPs included within a service tag for a specific NSG security rule. In the image below you will see that the IPs included in the VirtualNetwork service tag are the workload virtual network IP range (10.2.0.0/16), the peered transit virtual network IP range (10.0.0.0/16), and the Azure “magic IP” 168.63.129.16. This is likely what you expected to see in the VirtualNetwork tag.

VirtualNetwork service tag contents without UDR
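If you prefer the CLI, Network Watcher exposes the same data. A minimal sketch, assuming hypothetical NIC and resource group names; the JSON output includes the expanded list of prefixes behind each service tag:

# Show the effective security rules for the NIC attached to the Ubuntu VM.
# Names below are placeholders for this lab.
az network nic list-effective-nsg \
  --name nic-vm-app \
  --resource-group rg-workload \
  --output json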

Remember when I said I removed the UDR for the default route from the custom route table applied to the application subnet? I then added that route back in, pointed it to the Azure Firewall, waited about 2 minutes, then re-ran the Effective security rules tool.

VirtualNetwork service tag contents with UDR of default route

My first reaction to seeing all IP addresses now allowed through the VirtualNetwork tag was pretty much the Scanners head explosion GIF (classic if you haven’t seen it). It turns out this behavior is documented. The VirtualNetwork service tag has the following explanation:

The virtual network address space (all IP address ranges defined for the virtual network), all connected on-premises address spaces, peered virtual networks, virtual networks connected to a virtual network gateway, the virtual IP address of the host, and address prefixes used on user-defined routes. This tag might also contain default routes.

https://docs.microsoft.com/en-us/azure/virtual-network/service-tags-overview#available-service-tags

The part of that excerpt you need to care about is that the tag includes the address prefixes used on user-defined routes. This means the prefixes in the UDRs you place on a custom route table applied to the subnet are added to the VirtualNetwork service tag in the security rules of the NSGs applied to your resource. I’m not sure why this behavior was implemented, but it can impact separation of duties where you’d have a networking team managing the routing within route tables and the security team managing which traffic is allowed in or out with NSGs. If someone has control over the route tables, they can influence the VirtualNetwork service tag prefixes, which in turn influences the behavior of the default NSG security rules and any others using that tag.

If you’re like me, your first level of panic was around the risk of this allowing traffic from the public Internet inbound to the resource if the resource had a public IP. You can rest easy: my testing showed this is not possible, even with an additional UDR in place to assure symmetric flow for traffic coming in directly via the public IP. It’s likely Microsoft is doing some type of filtering at the SDN layer that excludes traffic identified as sourced from the Internet from matching this security rule.

It gets more interesting when you use the IP Flow Verify tool in Network Watcher. Here I picked a random public IP and tested an inbound flow. The tool reports the flow as being allowed by the default AllowVnetInBound rule. Take note of this behavior because it could lead to confusion with your Information Security team or third-party auditors.

IP Flow Verify showing flow is allowed
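If you want to script the same check, the CLI equivalent of IP Flow Verify is az network watcher test-ip-flow. A rough sketch with placeholder names and IPs:

# Test an inbound flow from an arbitrary public IP to the VM's private IP on port 22.
# The output reports the access decision and the NSG rule that matched.
az network watcher test-ip-flow \
  --vm vm-ubuntu-app \
  --resource-group rg-workload \
  --direction Inbound \
  --protocol TCP \
  --remote 198.51.100.10:50000 \
  --local 10.2.0.4:22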

The second level of panic I had was that this rule would allow any endpoint that has connectivity to my Virtual Network (such as other Virtual Networks attached as spokes to the hub Virtual Network) full connectivity to the endpoints behind the NSG. This concern is legitimate and was the reason I originally went down the rabbit hole. Traffic from a VM in the Shared Services Virtual Network is allowed full network connectivity to the VM in the application subnet, since the VirtualNetwork service tag now includes all IPv4 addresses (note this traffic was also allowed through the Azure Firewall).

So why should you care about any of this? You should care because the programmed behavior of adding prefixes from UDRs to the VirtualNetwork service tag means those with control over the custom route tables (typically the networking team) have the ability to affect which traffic is allowed through an NSG if any NSG security rules use the VirtualNetwork service tag. From a separation of duties perspective, this is very far from optimal. Additionally, since most hub and spoke architectures use a UDR with a default route of 0.0.0.0/0, unless you have a user-defined deny security rule in place, you are affected by this. Lastly, it goes to show that tools such as IP Flow Verify, which work by evaluating the SDN rule set, can produce confusing results.

Thankfully, there are some great ways to mitigate this risk. You could use Azure Policy to audit, deny, or remediate NSGs that are deployed without a default deny rule. There are some great examples of remediation in the community GitHub. Funneling workload-to-workload and user-to-workload traffic through a security appliance such as Azure Firewall running in the transit Virtual Network is another great risk mitigator. Lastly, tightly controlling access to your route tables and limiting use of the VirtualNetwork service tag are other options.
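As an example of that first mitigation, an explicit deny rule sitting just above the default rules removes the reliance on AllowVnetInBound entirely. A minimal sketch with placeholder names; priority 4000 leaves room for your intentional allow rules above it:

# Explicit deny-all inbound rule placed below intentional allows but above
# the default rules. Names and priority are illustrative.
az network nsg rule create \
  --resource-group rg-workload \
  --nsg-name nsg-snet-app \
  --name DenyAllInBound-Custom \
  --priority 4000 \
  --direction Inbound \
  --access Deny \
  --protocol '*' \
  --source-address-prefixes '*' \
  --source-port-ranges '*' \
  --destination-address-prefixes '*' \
  --destination-port-ranges '*'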

Well folks, that wraps up this post. Hopefully the information was useful and you can leverage some of it to more tightly secure your Azure environment.

Have a great week!

A look at the Azure DNS Private Resolver

Hello again!

Today I’m going to cover the new Azure DNS Private Resolver feature that recently went into public preview. I’ve written extensively about Azure DNS in the past and I recommend reading through that series if you’re new to the platform. The service has become increasingly important in Azure architectures due to its role in name resolution for Private Endpoints. A common pain point for customers using Private Endpoints from on-premises is the requirement to have a VM in Azure capable of acting as a DNS proxy. This is explained in detail in this post. The Azure DNS Private Resolver seeks to ease that pain by providing a managed DNS solution capable of acting as a DNS proxy and conditional forwarder to facilitate hybrid DNS resolution (for those of you coming from AWS, this is Azure’s answer to the Route 53 Resolver). Alexis Plantin beat me to the punch and put together a great write-up on the basics of the feature, so my focus will instead be on some additional scenarios and a pattern that I tested and validated.

I’m a big fan of keeping infrastructure services such as DNS centralized and under the management of central IT. This is one reason I’m partial to a landing zone with a dedicated shared services virtual network attached to the transit virtual network as illustrated in the image below. In this shared services virtual network you put your DNS, patching/update infrastructure, and potentially identity services such as Windows Active Directory. The virtual network and its resources can then be dropped into a dedicated subscription and locked down to central IT. As an added bonus, keeping the transit virtual network dedicated to firewalls and virtual network gateways makes the eventual migration to Azure Virtual WAN that much easier.

Common landing zone design

The design I had in mind would place the Private Resolver in the shared services virtual network and would funnel all traffic to and from the resolver and on-premises or another spoke through the firewall in the transit virtual network. This way I could control the conversation, inspect the traffic if needed, and centrally log it. The lab environment I built to test the design is pictured below.

Lab environment

The first question I had was whether or not the inbound endpoint would obey the user defined routes in the custom route table I associated with the inbound endpoint subnet. To test this theory I made a DNS query from the VM running in spoke 2 to resolve an A record in a Private DNS Zone. This Private DNS Zone was only linked to the virtual network where the Private Resolvers were. If the inbound endpoint wasn’t capable of obeying the custom routes, then the return traffic would be dropped and my query would fail.
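For reference, the test itself was just a plain DNS query from the spoke 2 VM; something along these lines, where the record name and inbound endpoint IP are placeholders:

# Resolve an A record that only exists in the Private DNS Zone linked to the
# shared services virtual network (record name is hypothetical).
dig web01.jogcloud.com +short

# Or point the query directly at the inbound endpoint (placeholder IP).
dig @10.1.2.4 web01.jogcloud.com +short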

Result of query from VM in another spoke

Success! The inbound endpoint is returning traffic back through the firewall. Logs on the firewall confirm the traffic flowing through.

Firewall logs showing DNS traffic from spoke

Next I wanted to see if traffic from the outbound endpoint would obey the custom routes. To test this, I configured a DNS forwarding rule (conditional forwarding component of the service) to send all DNS queries for jogcloud.com back to the domain controller running in my lab. I then performed a DNS query from the VM running in spoke 2.

Firewall logs showing DNS traffic to on-premises

Success again as the query was answered! The traffic from the outbound endpoint is seen traversing the firewall on its way to my domain controller on-premises. This confirmed that both the inbound and outbound endpoints obey custom routing making the design I presented above viable.

Beyond the above, I also confirmed the Private Resolver is capable of resolving reverse lookup zones (for PTR records). I was happy to see reverse zones weren’t forgotten.

One noticeable gap today is the Private Resolver does not yet offer DNS query logging. If that is important to you, you may want to retain your existing DNS Proxy. If you happen to be using Azure Firewall, you could make use of the DNS Proxy feature which allows for logging of DNS queries. Azure Firewall could then be configured to use the Private Resolver as its resolver providing that conditional forward capability Azure Firewall’s DNS Proxy feature lacks.
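If you go that route, the firewall policy just needs the DNS proxy enabled and pointed at the Private Resolver’s inbound endpoint. A rough sketch with placeholder names and IP; double-check the current az syntax before relying on it:

# Enable the Azure Firewall DNS proxy and set the Private Resolver's inbound
# endpoint as the custom DNS server (names and IP are placeholders).
az network firewall policy update \
  --name afwp-hub \
  --resource-group rg-transit \
  --enable-dns-proxy true \
  --dns-servers 10.1.2.4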

That wraps up this post. Keep in mind that this feature is still in public preview, so do not go deploying it into production. I ran into some bugginess when I initially deployed it where it locked up a virtual network after a failed deployment, so you have been warned. Fingers crossed the existing capabilities make it to GA and DNS query logging comes in the future.

Thanks!

Private Endpoints Revisited: NSGs and UDRs

Welcome back fellow geeks!

It’s been a while since my last post. For the past few months I’ve been busy renewing some AWS certificates and putting together some Azure networking architectures on GitHub. A new post was long overdue, so I thought it would be fun to circle back to Private Endpoints yet again. I’ve written extensively about the topic over the past few years, yet there always seems more to learn.

There have historically been two major pain points with Private Endpoints: routing complexity when trying to inspect traffic destined for them, and a lack of NSG (network security group) support. Late last year Microsoft announced public preview features to help with the routing and to add support for NSGs. I typically don’t bother tinkering with features in public preview because the features often change once GA (generally available) or never make it to GA. Now that these two features are further along and likely close to GA, it was finally a good time to experiment with them.

I built out a simple lab with a hub and spoke architecture. There were two spoke VNets (virtual networks). The first spoke contained a single subnet with a VM, a private endpoint for a storage account, and a private endpoint for a Key Vault. The second spoke contained a single VM. Within the hub I had two subnets. One subnet contained a VM which would be used to route traffic between spokes, and the other contained a VM for testing routing changes. All VMs ran Ubuntu. Private DNS zones were defined for both Key Vault and blob storage and linked to all VNets to keep DNS simple for this test case.

Lab setup

Each spoke subnet had a custom route table assigned with a route to the other spoke set with a next hop as the VM acting as a router in the hub.

Once the lab was setup, I had to enable the preview features in the subscription I wanted to test with. This was done using the az feature command.

az feature register --namespace Microsoft.Network --name AllowPrivateEndpointNSG

The feature took about 30 minutes to finish registering. You can use the command below to track the registration process. While the feature is registering the state will report as Registering, and when complete the state will be Registered.

az feature show --namespace Microsoft.Network --name AllowPrivateEndpointNSG
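One extra step worth calling out: once the state flips to Registered, the resource provider generally needs to be re-registered for the feature to take effect. Something like this:

# Check just the registration state
az feature show \
  --namespace Microsoft.Network \
  --name AllowPrivateEndpointNSG \
  --query properties.state \
  --output tsv

# Re-register the provider once the feature reports Registered
az provider register --namespace Microsoft.Network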

Once the feature was ready to go, I first decided to test the new UDR feature. As I’ve covered in a prior post, creation of a private endpoint in a VNet creates a /32 route for the private endpoint’s IP address in both the VNet it is provisioned into as well as any directly peered VNets. This can be problematic when you need to route traffic coming from on-premises through a security appliance like a Palo Alto firewall running in Azure to inspect the traffic and perform IDS/IPS. Since Azure has historically selected the most specific prefix match for routing, and the GatewaySubnet in the hub would contain the /32 routes, you would be forced to create unique /32 UDRs (user-defined routes) for each private endpoint. This creates a lot of overhead and even risks hitting the maximum of 400 UDRs per route table.

With the introduction of these new features, Microsoft has made it easier to deal with the /32 routes. You can now create a summarized UDR, and it will take precedence over the more specific system route. Yes folks, I know this is confusing. Personally, I would have preferred Microsoft had gone the route of a toggle switch to stop the /32 route from propagating into peered VNets. However, we have what we have.

Let’s take a look at it in action. The image below shows the effective routes on the network interface associated with the second VM in the hub. The two /32 routes for the private endpoints are present and active.

Effective routes for second VM in the hub

Before the summarized UDR can be added, you need to set a property on the subnet containing the Private Endpoint. There is a property on each subnet in a virtual network named PrivateEndpointNetworkPolicies. When a private endpoint is created in a subnet, this property is set to Disabled. It needs to be set to Enabled, which is done by setting the --disable-private-endpoint-network-policies parameter to false as seen below.

az network vnet subnet update \
  --disable-private-endpoint-network-policies false \
  --name snet-pri \
  --resource-group rg-demo-routing \
  --vnet-name vnet-spoke-1

I then created a route table, added a route for 10.1.0.0/16 with a next hop of the router at 10.0.0.4, and assigned the route table to the second VM’s subnet. The effective routes on the network interface for the second VM now show the two /32s as invalid, with the new route now active.
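For anyone following along, the route table work looked roughly like this (resource names other than the resource group are placeholders):

# Create a route table and a summarized route for spoke 1's address space
# pointing at the VM acting as a router.
az network route-table create \
  --name rt-hub-test \
  --resource-group rg-demo-routing

az network route-table route create \
  --route-table-name rt-hub-test \
  --resource-group rg-demo-routing \
  --name to-spoke-1 \
  --address-prefix 10.1.0.0/16 \
  --next-hop-type VirtualAppliance \
  --next-hop-ip-address 10.0.0.4

# Associate the route table with the second VM's subnet
az network vnet subnet update \
  --name snet-test \
  --resource-group rg-demo-routing \
  --vnet-name vnet-hub \
  --route-table rt-hub-test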

Effective routes for second VM in the hub after routing change

A quick tcpdump on the router shows the traffic flowing through the router as we have defined in our routes.

tcpdump of router
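For anyone reproducing this, the capture on the router was nothing more elaborate than something like the following (the interface name is a placeholder):

# Watch the HTTPS traffic to the private endpoint pass through the router
sudo tcpdump -ni eth0 tcp port 443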

For fun, let’s try that same wget on the Key Vault private endpoint.

Uh-oh. Why am I not getting back a 404, and why am I not seeing the other side of the conversation on the router? If you guessed asymmetric routing you’d be spot on! To fix this I would need to set up iptables on my router to NAT (network address translation) the traffic to the router’s address. The reason attaching a route table to the spoke 1 subnet containing the private endpoints wouldn’t work is that private endpoints do not honor UDRs. I imagine you’re scratching your head asking why it worked with the storage account and not with Key Vault. Well folks, it’s because Microsoft does something funky at the SDN (software-defined networking) layer for storage that is not done for any other service’s private endpoints. I bring this up because I wasted a good hour scratching my head as to why this was working without NAT until I came across that note buried in the Microsoft documentation. So take this nugget of knowledge with you: storage account private endpoint networking works differently from all other PaaS service private networking in Azure. Sadly, when things move fast, architectural standards tend to be one of the things that fall into the “we’ll get back to that on a later release” bucket.
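For completeness, the NAT fix on a Linux router boils down to enabling IP forwarding and adding a MASQUERADE rule; a minimal sketch, with the interface name as a placeholder:

# Enable IP forwarding on the router VM
sudo sysctl -w net.ipv4.ip_forward=1

# Source NAT traffic leaving eth0 to the router's own IP so the return
# traffic comes back through the router (interface name may differ)
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE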

So wonderful, the pain of /32 routes is gone! Sure, we still need to NAT because private endpoints still don’t honor UDRs attached to their subnet, but the pain is far less than it was with the /32 mess. One thing to take from this change is that there is now a caveat to Azure routing precedence. When it comes to routing with private endpoints, UDRs take precedence over the system route even if the system route is more specific.

Now let’s take a look at NSG support. I next created an NSG and associated it to the private endpoint’s subnet. I added a deny rule blocking all https traffic from the VM in spoke 2.
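The rule itself was nothing fancy; roughly the following, with names and the spoke 2 prefix as placeholders:

# Deny HTTPS from the spoke 2 address space to the private endpoint subnet
az network nsg rule create \
  --resource-group rg-demo-routing \
  --nsg-name nsg-snet-pri \
  --name DenyHttpsFromSpoke2 \
  --priority 100 \
  --direction Inbound \
  --access Deny \
  --protocol Tcp \
  --source-address-prefixes 10.2.0.0/16 \
  --destination-port-ranges 443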

NSG applied to private endpoint subnet

Running a wget from the VM in spoke 2 against the blob storage private endpoint in spoke 1 returns the file successfully. The NSG does not take effect simply by enabling the feature; the PrivateEndpointNetworkPolicies property I mentioned above must be set to enabled.

After setting the property, the change takes about a minute or two to complete. Once complete, running another wget from the VM in spoke 2 failed to make a connection validating the NSG is working as expected.

NSG blocking connection

One thing to be aware of is NSG flow logs will not log the connection at this time. Hopefully this will be worked out by GA.

Well folks, that’s it for this post. The key things you should take away are the following:

  • Testing these features requires registering the feature on the subscription AND setting the PrivateEndpointNetworkPolicies property to enabled on the subnet. Keep in mind setting this property is required for both UDR summarization and enabling NSG support. (Thank you to my peer Silvia Wibowo for pointing out that it is also required for UDR summarization).
  • NAT is still required to ensure symmetric routing when traffic is coming from on-premises or another spoke. The only exception is private endpoints for Azure Storage, because storage operates differently at the SDN layer.
  • The UDR feature for private endpoints makes less-specific UDRs take precedence over the more specific private endpoint system route.
  • NSG and summarized UDR support for private endpoints are still in public preview and are not recommended for production until GA.
  • NSG Flow Logs do not log connection attempts to private endpoints at this time.

See you next post!

A Deep Dive Into Azure Route Server

Hello again fellow geeks.

I recently had a customer reach out to me with an interest in learning more about ARS (Azure Route Server). The customer hoped it might ease some of the burden of managing routing across a large Azure implementation. I had yet to mess around with ARS since it only recently went GA (generally available), so I figured this would be a great opportunity to build out a lab and give it a whirl. Let’s get to it!

ARS is one of Microsoft’s newer networking offerings in Azure: a managed routing service that is hooked into Azure’s SDN (software-defined network). The way I like to think of the service is as a couple of VMs (virtual machines) that are managed by Microsoft, running a BGP (Border Gateway Protocol) service, with the ability to program routes directly into VNets (virtual networks). In addition to introducing some pretty cool new networking patterns such as an SD-WAN and ExpressRoute pattern and a dual-homed network, the feature that most interested my customer was its ability to BGP peer with a customer’s NVA (network virtual appliance), such as a Palo Alto or Cisco appliance, and inject those learned dynamic routes into the VNet. This is the feature I’ll cover in this blog post.

I did some thinking about how I wanted to lab this out and what I wanted to use as an NVA. Since the behavior I wanted to test was primarily BGP, I figured I’d keep it simple and run a Linux VM hosting a lightweight BGP service. I found a wonderful post from Adam Stuart (seriously a great read and a really cool Azure networking pattern in his post) which mentioned Exabgp. After doing a bit of research on it, it looked relatively easy to set up and use, so the choice was made (thanks Adam!). In addition to the NVA, I decided to build out a hub-and-spoke architecture and make use of my home pfSense appliance for S2S VPN (site-to-site virtual private network) connectivity to replicate an on-premises environment. The result was the lab pictured below (image 1).

Image 1 – Azure Route Server Lab

One thing you may notice in the above image is ARS has a public IP address associated to it. Remember when I mentioned this is a couple of Microsoft-managed VMs? Well this public IP facilitates Microsoft management of the VMs similar to other managed services such as Azure SQL MI (Managed Instance). Before you ask, no you can’t associate an NSG (Network Security Group) to the subnet ARS is provisioned into, and the subnet has to be named RouteServerSubnet.

I setup a S2S VPN connection and BGP peering between my pfSense appliance and Azure VPN Gateway and advertised the set of routes documented in the lab diagram (image 1). I then provisioned a couple of Ubuntu VMs with two in the hub, one in the first spoke, and one in the second spoke.

Exabgp was a bit painful to set up because the documentation out there is sparse and I had to piece it together from multiple sources. Given that, I’m going to spend a few minutes walking through the setup.

I first set up my ARS using the instructions in this link. Ensure you use a valid ASN (autonomous system number) for your NVA.
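If you’re more of a CLI person, provisioning ARS and the NVA peering looks roughly like the following; resource names are placeholders and the exact flags may vary by CLI version, so check az network routeserver -h first:

# Create the Route Server in the dedicated RouteServerSubnet (a full subnet
# resource ID is required) along with its mandatory public IP.
az network routeserver create \
  --name ars-hub \
  --resource-group rg-hub \
  --hosted-subnet "/subscriptions/<sub-id>/resourceGroups/rg-hub/providers/Microsoft.Network/virtualNetworks/vnet-hub/subnets/RouteServerSubnet" \
  --public-ip-address pip-ars-hub

# Peer ARS with the Exabgp VM (NVA) using the ASN from the exabgp config.
az network routeserver peering create \
  --name exabgp-nva \
  --routeserver ars-hub \
  --resource-group rg-hub \
  --peer-ip 10.0.0.4 \
  --peer-asn 65010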

Once ARS is set up, you can begin setting up Exabgp on the VM using the commands below.

sudo apt update
sudo apt install exabgp

You’ll then need to modify the service file located in /lib/systemd/system/exabgp.service so that the relevant lines match what’s shown below.

...
[Service]
#User=exabgp
#Group=exabgp
Environment=exabgp_daemon_daemonize=false
PermissionsStartOnly=true
ExecStartPre=-mkfifo /run/exabgp.in
...

Now you’ll need to create a configuration file for exabgp and save it to /etc/exabgp/exabgp.conf. Below is the configuration file I used for the lab.

neighbor 10.0.2.4 {
        router-id 10.0.0.4;
        local-address 10.0.0.4;
        local-as 65010;
        peer-as 65515;

        static {
                route 192.168.10.0/24 next-hop 10.0.0.4;
        #       route 192.168.1.0/24 next-hop 10.0.0.4 as-path [65010 65010] community [65010:2];
                route 192.168.1.0/24 next-hop 10.0.0.4 community [65010:2];
                route 10.1.0.0/16 next-hop 10.0.0.4;
                route 10.10.0.0/16 next-hop 10.0.0.4;
                route 0.0.0.0/0 next-hop 10.0.0.4;
        }
}
neighbor 10.0.2.5 {
        router-id 10.0.0.4;
        local-address 10.0.0.4;
        local-as 65010;
        peer-as 65515;

        static {
                route 192.168.10.0/24 next-hop 10.0.0.4;
        #       route 192.168.1.0/24 next-hop 10.0.0.4 as-path [65010 65010] community [65010:2];
                route 192.168.1.0/24 next-hop 10.0.0.4 community [65010:2];
                route 10.1.0.0/16 next-hop 10.0.0.4;
                route 10.10.0.0/16 next-hop 10.0.0.4;
                route 0.0.0.0/0 next-hop 10.0.0.4;
        }
}

Each instance of ARS is configured to be highly available and is deployed across availability zones if the region supports it. It comes with two BGP peering IPs that you’ll need to peer with and advertise your routes to. If you only peer with one, or advertise different routes to each, you’ll get funky behavior. In the configuration file above, each neighbor block represents one of these peers. You’ll notice I’m advertising a number of routes, which will demonstrate different behaviors I’ll walk through in this post.

Once you’ve done the above you can start the service and validate that it successfully started.

sudo systemctl start exabgp
sudo systemctl status exabgp

Give it a minute and then you can run the following command to output the routes the exabgp service has learned from ARS.

sudo exabgpcli show adj-rib in extensive

In the output below (image 2) you can see ARS advertising the address spaces of the hub and spoke 1 VNets.

Image 2 – Exabgp output

So why isn’t spoke 2 being advertised? Well, ARS requires the peerings between the VNets to be configured with the UseRemoteGateways and AllowGatewayTransit properties (referred to in the Portal as Use the remote virtual network’s gateway or Route Server and Use this virtual network’s gateway or Route Server) in order for ARS to propagate routes between the VNets. As you’ll note from the lab image, I enabled these settings for spoke 1 but not spoke 2.
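These peering properties can be flipped from the CLI as well; a quick sketch with placeholder names:

# Hub side of the peering: allow gateway/Route Server transit
az network vnet peering update \
  --resource-group rg-hub \
  --vnet-name vnet-hub \
  --name peer-hub-to-spoke1 \
  --set allowGatewayTransit=true

# Spoke side of the peering: use the hub's gateway or Route Server
az network vnet peering update \
  --resource-group rg-spoke1 \
  --vnet-name vnet-spoke1 \
  --name peer-spoke1-to-hub \
  --set useRemoteGateways=true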

This requirement does present a challenge: when it’s enabled, ARS propagates routes to the peered VNets, but so does your ExpressRoute or VPN Gateway. My customer base typically requires all traffic leaving the spoke to flow through a security appliance. If you do not enable this setting, the spoke only knows about itself and the hub VNet. To direct the traffic through an appliance in the hub you only need two UDRs (one for 0.0.0.0/0 and one for the hub VNet address space). This makes it easy to stamp out spokes from a networking perspective and optionally audit and enforce with Azure Policy.

Looking at the effective routes of the VM in spoke 1 (image 3), you can see the routes highlighted in red are being propagated into the spoke by my VPN Gateway, which is receiving them from my pfSense appliance. To ensure traffic between on-premises and Azure flows through a security appliance, you’d need to define all of the routes you’re propagating over your ExpressRoute/VPN in your NVA. That way ARS propagates them to the peered VNets and they override the ones coming in from the gateway.

Image 3 – Spoke 1 Effective Routes

Recall from my lab image (image 1), I’m sending 192.168.1.0/24, 192.168.2.0/24, 192.168.3.0/24, and 192.168.4.0/24 over BGP to Azure. In the image above (image 3) you’ll notice three out of four of these routes are coming from my VPN Gateway (10.0.3.5), but 192.168.1.0/24 is coming from the Exabgp VM (10.0.0.4). The documentation says that when two routes for the same address space but different AS PATH lengths are received from different NVAs, ARS will only program the route with the shorter AS PATH. Apparently this isn’t just a trait of ARS but of the Azure SDN as a whole.

To validate this is the case, I edited my Exabgp config file and appended a longer AS PATH length to the route as seen below.

Exabgp config with modified AS PATH

After about a minute, the route table on my spoke 1 VM now displays the route for 192.168.1.0/24 coming from the VPN Gateway because the route coming from my Exabgp VM now has a longer AS PATH. This indicates that when two routes for the same address space are advertised, the route with the shortest AS PATH gets programmed to the VNet while the route with the longer AS PATH is seemingly discarded. I would have expected to see the 192.168.1.0/24 route coming from on-premises with an Active value of Invalid, but instead it’s discarded.

Image 4 – VPN Gateway with shorter AS PATH

The next route I want to look at is 10.1.0.0/16, which is the address space of spoke 1. I configured the Exabgp VM to advertise this route to ARS. Looking at the effective routes for the VM in the hub (VM-DEMO), the only route for this address space is the system route for the peering. This is because system routes for the VNet, VNet peering, or service endpoints are the preferred routes, even if BGP routes are more specific. This means you can’t use ARS to push routes that would force traffic from a spoke to flow through an NVA in the hub because of the VNet peering system route. For that use case it looks like you will need to continue to use UDRs (user-defined routes).

I also have a default route of 0.0.0.0/0 which I am advertising from the Exabgp VM. You can see in the above images that it is propagating to both the hub and spoke. Take note that if you advertise this route, you’re going to want to ensure you place a route table and UDR on the subnet your NVAs are in. Otherwise you’ll run into a routing loop, because the default route would also be received by the NVA’s subnet.

Let’s pull the VPN Gateway into the mix. ARS supports BGP peering with an ExpressRoute or VPN Gateway, which allows you to propagate the routes ARS is learning from the NVA back on-premises. You enable this functionality by turning on the Branch-to-branch feature of ARS. In my lab environment, I commented out the default route in my Exabgp conf file because I didn’t want to send that back on-premises (I don’t know how to filter out routes in FRR, the BGP service running on pfSense). I then ran az network vnet-gateway list-learned-routes and confirmed my VNG is now receiving routes from ARS, as seen in the image below (image 5).
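Both steps can be done from the CLI; a rough sketch with placeholder names (the branch-to-branch flag name may differ slightly between CLI versions):

# Enable branch-to-branch so ARS exchanges routes with the VPN/ExpressRoute gateway
az network routeserver update \
  --name ars-hub \
  --resource-group rg-hub \
  --allow-b2b-traffic true

# Dump the routes the VPN gateway has learned, including those from ARS
az network vnet-gateway list-learned-routes \
  --name vgw-hub \
  --resource-group rg-hub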

Image 5 – VPN Gateway learned routes

On the pfSense side I was now able to see the routes coming from my NVA arriving over the VPN Gateway back at my on-premises appliance. These are the routes with an AS PATH of 65510 65515 65010 in the image below (image 6).

Image 6 – pfSense learned routes

The above shows that the routes coming from the NVA are being propagated back on-premises. What about the routes coming from on-premises? Are they being propagated all the way to the NVA? Let’s check it out!

To verify this, I printed the routes ARS is advertising to the NVA using az network routeserver peering list-advertised-routes. The entries with the 65515-65510-65501 AS PATH are the routes coming from the pfSense appliance, showing that ARS is receiving the routes and advertising them to the Exabgp VM.
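The full command looks roughly like this (resource names are placeholders):

# List the routes ARS is advertising to a specific BGP peer (the Exabgp NVA)
az network routeserver peering list-advertised-routes \
  --routeserver ars-hub \
  --resource-group rg-hub \
  --name exabgp-nva \
  --output yaml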

RouteServiceRole_IN_0:
- asPath: '65515'
  localAddress: 10.0.2.4
  network: 10.0.0.0/16
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: '65515'
  localAddress: 10.0.2.4
  network: 10.1.0.0/16
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.4.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.2.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.3.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0

Lastly, I wanted to see the routes and details on the Exabgp VM. For that I ran the command below on the VM.

sudo exabgpcli show adj-rib in extensive

Image 7 – Exabgp learned routes

In the image above (image 7) the on-premises 192.168.x.0/24 routes have been learned and have the AS PATH leading back on-premises. Note the community value of 65517:65517 attached to the routes. That is not coming from my pfSense appliance, but rather from the VPN Gateway. The community tag is used to identify these routes as coming from a VPN Gateway and is used by Microsoft for filtering.

Well folks that about wraps it up. The key takeaways of my time with ARS were the following:

  • ARS requires the UseRemoteGateways and AllowGatewayTransit properties be configured on the VNet peerings. This makes managing traffic flow between a spoke and on-premises a bit more complicated. Instead of defining a 0.0.0.0/0 route and calling it a day, you would need to define all the routes in your NVA and propagate them using ARS.

    This isn’t necessarily a bad thing; you’re just shifting management of routing outside the Azure control plane and into the NVA data plane. That may be your preference.
  • When multiple routes for the same address space but with different AS PATH lengths are propagated to a VNet, the Azure SDN will only program the route with the shortest AS PATH. The other route with the longer AS PATH is discarded.
  • You can’t use an NVA and ARS to propagate the hub and spoke VNet address spaces back on-premises because system routes for the VNet, VNet peering, and Service Endpoints supersede routes propagated via BGP. Instead, you could use a larger summarized route encompassing the entire address space for the VNets in your region and propagate that back on-premises using the NVA and ARS.
  • You can use an NVA and ARS to propagate routes from Azure back to on-premises. A use case might be that you want all Internet-bound traffic to egress out of Azure because you get better performance than you’re getting from your ISP on-premises. I’ve never seen it, but nevertheless. 🙂

Hopefully some of the above helps you become more familiar with Azure Route Server and what the benefits and considerations are.

See you next post!