Hello fellow geeks!
Earlier this week I was messing around with Kubernetes SSHing into the nodes and I ran into an interesting quirk of NSGs (Network Security Groups). I noticed that traffic I did not expect to be allowed through the NSG was making it through. A bit of digging let me down the path of a documented, but not well known, behavior of the VirtualNetwork service tag when used in NSG security rules. Today I’m going to walk through that behavior, why you should care, and what you can do to avoid being surprised like I was.
NSGs are layer four stateful firewalls that operate at the SDN (software-defined network). They serve a similar purpose and function in much the same way as AWS Security Groups. NSGs are used for microsegmentation within and across Virtual Networks typically supplementing the centralized control that is provided by a security appliance such as Azure Firewall or a Palo Alto firewall. They are associated to a subnet (best practice) or NIC (network interface) (few use cases for this). Each contains a collection of security rules, which includes default rules and user-defined rules. NSG security rules are processed by priority and are matched based on a 5-tuple.
As described in the previous link, service tags can be used within NSG security rules to simplify access to Azure resources. Service tags contain a summarized list of IPs that is managed by Microsoft. This makes life far easier, because whitelisting the IPs to something like Azure Storage Rules would be a nightmarish task that would require customer-created automation to keep up to date as IPs are added or removed to the underlining service. The benefit of service tags does come with a consideration as we’ll see in this post.
Each subnet or NIC can have one NSG applied to it, but the NSG can be applied to multiple subnets or NICs. In the instance of NSGs being applied at both the subnet and NIC, the processing for inbound traffic is detailed here and for outbound here.
Now that you know the basics of NSGs, let me talk a bit about the lab. For this lab I used my simple hub and spoke lab with a few modifications. I have added an Ubuntu VM running in the application subnet (snet-app) in the workload spoke virtual network. I’ve also temporarily removed the UDR from the custom route table on the application subnet. The NSG applied to the spoke contains only the default NSG rules. The lab architecture can be seen below.
Reviewing the NSG applied to the application subnet, the three default inbound rules are present as expected. The rule I’m going to look more deeply at is the AllowVnetInBound rule highlighted below. Specifically, I’m going to show you how to look at the IPs behind a service tag.
To see the IPs associated with a service tag, I’m going to use the Effective security rules tool in Azure’s Network Watcher. If you’re unfamiliar with Network Watcher, you’re missing out. It contains a plethora of useful tools to help diagnose network connectivity. The Effective security rules tool looks at the NSGs applied to a NIC at both the subnet and NIC level to provide you with a holistic view of the what traffic is allowed and combined between NSGs applied at each level.
One of the lesser known features of the tool is it gives you the ability to look at the IPs included within a service tag for a specific NSG security rule. In the image below you will see that the IPs included in the VirtualNetwork service tag are the workload virtual network IP range (10.2.0.0/16), the peered transit virtual network IP range (10.0.0.0/16), and the Azure “magic IP” 126.96.36.199. This is likely what you expected to see in the VirtualNetwork tag.
Remember when I said I removed the UDR for the default route from the custom route table applied to the application subnet? I then added that route back in, pointed it to the Azure Firewall, waited about 2 minutes, then re-ran the Effective security rules tool.
My first reaction to seeing all IP addresses now allowed through the VirtualNetwork tag was pretty much the Scanners head explosion GIF (classic if you haven’t seen it). It turns out this behavior is documented. The VirtualNetwork service tag has the following explanation:
The virtual network address space (all IP address ranges defined for the virtual network), all connected on-premises address spaces, peered virtual networks, virtual networks connected to a virtual network gateway, the virtual IP address of the host, and address prefixes used on user-defined routes. This tag might also contain default routes.https://docs.microsoft.com/en-us/azure/virtual-network/service-tags-overview#available-service-tags
The part of that excerpt you need to care about is the piece about it includes the address prefixes on user-defined routes. This means that the prefixes in the UDRs you place on a custom route table applied to the subnet are added to the VirtualNetwork service tag in the NSG security rules used by the NSGs applied to your resource. I’m not sure why this behavior was implemented, but it can impact separation of duties where you’d have a networking team managing the routing within route tables and the security team managing which traffic is allowed in or out with NSGs. If someone has control over the routing tables, they can influence the VirtualNetwork service tag prefixes, which will influence the behavior of the default NSG security rules and others using that tag.
If you’re like me, your first level of panic was around the risk of this allowing traffic from the public Internet inbound to the resource if the resource had a public IP. You can rest easy in that my testing showed this is not possible even with an additional UDR in place to assure symmetric flow of traffic to the Internet endpoint coming in directly via the public IP. It’s likely Microsoft is doing some type of filtering at the SDN layer excluding traffic identified as being sourced from the Internet from being included in this security rule.
It gets more interesting when you use the IP Flow Verify tool in Network Watcher. Here I picked a random public IP and tested an inbound flow. The tool reports the flow as being allowed by the default AllowVnetInBound rule. Take note of this behavior because it could lead to confusion with your Information Security team or third-party auditors.
The second level of panic I had was that this rule would allow any endpoint that has connectivity to my Virtual Network (such as other Virtual Networks attached as spokes to the hub Virtual Network) full connectivity to the endpoints behind the NSG. This concern is actually legitimate and was the reason I originally went down the rabbit hole. Traffic from a VM in the Shared Services Virtual Network is allowed full network connectivity the VM in the application subnet since the Virtual Network service tag includes the all IPv4 addresses (note this traffic was allowed through the Azure Firewall).
So why should you care about any of this? You should care because the programmed behavior of adding prefixes from UDRs to the VirtualNetwork service tag means those with control over the custom route tables (typically the networking team) have the ability to affect which traffic is allowed through an NSG if any NSG security rules use the VirtualNetwork service tag. From a separation of duties perspective, this is very far from optimal. Additionally, since most hub and spoke architectures use a UDR with a default route of 0.0.0.0/0, unless you have a user-defined deny security rule in place, you are affected by this. Lastly, it goes to show that tools such as IP Flow Verify which work on evaluating the SDN rule set can produce confusing results.
There are some great ways to mitigate this risk thankfully. You could use Azure Policy to audit, deny, or remediate NSGs that are deployed without a default deny option. There are some great examples of remediation in the community GitHub. Funneling workload-to-workload and user-to-workload traffic through a security appliance such as Azure Firewall running in the transit Virtual Network is another great risk mitigator. Lastly, tightly controlling access to your route tables and limiting use of the VirtualNetwork service tags are other options.
Well folks, that wraps up this post. Hopefully the information was useful and you can leverage some of it to more tightly secure your Azure environment.
Have a great week!