DNS in Microsoft Azure – DNS Security Policies

This is part of my series on DNS in Microsoft Azure.

Hi there folks! After a busy July packed with a vacation and an insane amount of work, I’m back with a new post. Today I’m going to cover a new feature that has been years in the making. Yes folks, DNS query logging is now native to the platform with DNS Security Policies reaching GA (general availability) last month. No longer will you have to engineer around this long, painful gap. In this post I’ll walk through what this new resource is, what it can do (beyond DNS query logging), cover the use cases I’ve tested with it, show you some samples of the logs, and finally cover some potential designs to incorporate it. Let’s dive in!

A long time coming

If you’ve ever spent time troubleshooting a connection error or trying to detect, block, and analyze malware, you’re likely familiar with the value of DNS query logs. The former makes them a must for day-to-day operations and the latter makes them a critical piece of data for information security. Historically, it’s been a pain to gather these in Microsoft Azure. The wire server (magic IP, 168 address, whatever your favorite nickname) that is made available within a virtual network to use Azure’s built-in DNS resolution service has lacked the capability to capture DNS queries. This meant queries from compute within your virtual network that were resolving to Azure Private DNS zones or a public DNS zone via Azure-provided DNS weren’t captured. Even the introduction of the Azure Private Resolver didn’t address this gap. This led to customers with requirements to capture DNS query logs having to get fancy.

The most common pattern customers used to address this gap was to introduce a third-party DNS service like an Infoblox, Bluecat, BIND server, or even Windows DNS Server that all compute running within Azure would use for resolution. While customers were able to use this pattern to get the logs, it meant more virtual machines, more costs, more overhead, and it was typically too expensive to implement for workloads that may require complete isolation and didn’t fit into a typical hub and spoke pattern.

Example design for BYODNS for query logging

When the Azure Private Resolver service was introduced along with DNS Forwarding Rule Sets, customers using Azure Firewall had the option of ditching the third-party DNS service and using Azure Firewall’s DNS proxy service, which included DNS query logging (kind of odd it went there first, right?). This was another common pattern I saw pop up in that Azure Firewall customer base.

Example design using Azure Firewall for DNS query logging and Azure Private DNS Resolver

Beyond whatever other creative ways customers were addressing this gap, it was a gap and it was costing customers extra money. Enter DNS Security Policies to save the day.

DNS Security Policies Components

DNS Security Policies provide 2 core functions today:

  1. DNS query filtering
  2. DNS query logging

Before I dive into those features in depth, I’m a fan of looking at the resource as a whole from the API layer to get an idea of the components, their purpose, and their relationships.

DNS Security Policies and related resources

DNS Security Policies fall under the Microsoft.Network resource provider and are regional resources. The simplest way to understand a resource provider is to think of a namespace in traditional programming. Within a namespace there are resource types (think classes) with specific resource operations. Within the Microsoft.Network resource provider, there are three direct child resources that are key here.

You’ll notice the Microsoft Learn documentation uses different terminology from what the API uses for some of the resources. To keep things simple, I’ll be using the Microsoft Learn terminology. Here is a quick cheat sheet:

  • DNS Resolver Policies -> DNS Security Policy
  • DNS Security Rules -> DNS Traffic Rules
  • DNS Resolver Domain Lists -> Domain Lists

Each DNS Security Policy has two child resources: DNS Traffic Rules and Virtual Network Links. DNS Traffic Rules are the guts of your logic for the DNS Security Policy. Each policy can have up to 10 rules (as of August 2025). Each rule consists of a priority (100 – 65000), an action (block, allow, alert), and a related domain list (I’ll cover these in a few). You can create multiple rules and order them by priority similar to the screenshot below.

DNS Traffic Rules example

Based on the above logic, the DNS Security Policy triggers a rule when the requested domain matches the rule’s associated domain list. If the domain being requested is in the list associated with the priority 100 rule, the query is blocked. If not, it’s then processed by the alert rule (which seems to do nothing in my experience, as I’ll cover later). Finally, it hits the last rule, which allows it through but logs it.

As I covered above, each rule is associated with one or more domain lists. Domain lists are sibling resources to DNS Security Policies. By being a sibling vs a child, they can be re-used across multiple DNS Security Policies (and whatever other use Microsoft comes up with). This allows you to define your domain lists centrally and re-use them across multiple rule sets if, for example, you wanted to maintain your domain lists consistently across environments (test/qa/prod/etc). Domain lists are pretty simple resources consisting of a domain name or wildcard (denoted by a period). It’s important to understand how the domains will be processed. For example (I’m going to steal this direct from the docs), if you allow contoso.com at rule 100 but block bad.contoso.com at rule 110, the query to bad.contoso.com will be allowed because it falls under contoso.com, which was allowed by a higher priority rule.

Example of a domain list
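To make the priority-plus-matching behavior concrete, here’s a rough mental model in Python. To be clear, this is my own simplification for illustration (the rule set, the catch-all entry, and the matching function are all made up, not the service’s actual implementation), but it captures why the higher priority allow on contoso.com wins over the lower priority block on bad.contoso.com.

# Rough mental model of how DNS Traffic Rules evaluate a query.
# Illustrative only; the rule set and the catch-all "." entry are made up.
RULES = [
    # (priority, action, domain list entries)
    (100, "Allow", ["contoso.com."]),      # trailing period = wildcard, covers subdomains
    (110, "Block", ["bad.contoso.com."]),
    (65000, "Allow", ["."]),               # assumed catch-all used for "log everything"
]

def matches(query: str, entry: str) -> bool:
    """A query matches an entry if it equals it or falls underneath it."""
    entry, query = entry.rstrip("."), query.rstrip(".")
    return entry == "" or query == entry or query.endswith("." + entry)

def evaluate(query: str) -> str:
    """Rules are processed in priority order; the first matching rule wins."""
    for priority, action, domains in sorted(RULES, key=lambda rule: rule[0]):
        if any(matches(query, entry) for entry in domains):
            return f"{action} (rule priority {priority})"
    return "Allow (no rule matched)"

print(evaluate("bad.contoso.com"))  # Allow (rule priority 100), the gotcha from the docs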

The virtual network link resource is the other child of the DNS Security Policy. It functions similarly to the virtual network links on Private DNS Zones in that it associates the DNS Security Policy with a virtual network where it will process queries sent through the wire server (Azure-provided DNS). Each virtual network can be linked to one DNS Security Policy, but each DNS Security Policy can be linked to multiple virtual networks. This allows you to use them for virtual networks connected in a hub-and-spoke-like architecture with centralized DNS as well as for virtual networks that may require complete network isolation.

Example of DNS Security Policy virtual network links

DNS Security Policies support diagnostic logging. This allows you to send each query captured by the policy to storage, an event hub, or a Log Analytics workspace. If using a Log Analytics workspace, the logs are written to a table named DNSQueryLogs. Log entries will look like the below. You’ll get the key pieces of information such as the source IP address of the query and the action taken on it. Here you’ll see the query was denied, which is indicated by the ResolverPolicyRuleAction field. The values here will be “Deny” for blocks, “None” for alerts, and “Allow” for anything allowed.

Example of DNS Query Logs log entry
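If you’re sending these to a Log Analytics workspace, you can also pull the denies programmatically. Here’s a minimal sketch using the azure-monitor-query Python package; the workspace ID is a placeholder, and aside from ResolverPolicyRuleAction and TimeGenerated the projected column names are my guesses at the schema, so adjust them to what you see in your own workspace.

from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# Pull the last 24 hours of queries blocked by a DNS Security Policy.
# Column names other than ResolverPolicyRuleAction and TimeGenerated are guesses.
kql = """
DNSQueryLogs
| where ResolverPolicyRuleAction == "Deny"
| project TimeGenerated, SourceIpAddress, QueryName, ResolverPolicyRuleAction
| order by TimeGenerated desc
"""

response = client.query_workspace(
    workspace_id="<log-analytics-workspace-guid>",  # placeholder
    query=kql,
    timespan=timedelta(days=1),
)

for table in response.tables:
    for row in table.rows:
        print(list(row))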

When the query is denied, instead of getting back an NXDOMAIN, the machine making the query receives a CNAME of blockpolicy.azuredns.invalid indicating the query has been blocked by a DNS Security Policy. This is much better behavior than an NXDOMAIN because now we know the culprit behind the failed DNS query.

Example of DNS query being denied by DNS Security Policy
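You can verify this behavior yourself from a VM inside a linked virtual network. Here’s a quick sketch with dnspython that sends an A query straight to the wire server and checks the answer for the block sentinel; the domain being tested is a placeholder for something in one of your block lists.

import dns.message
import dns.query
import dns.rdatatype

WIRE_SERVER = "168.63.129.16"  # the Azure wire server (the "168 address")
BLOCK_SENTINEL = "blockpolicy.azuredns.invalid"

def is_blocked_by_policy(name: str) -> bool:
    """Send an A query to the wire server and look for the block CNAME in the answer."""
    query = dns.message.make_query(name, dns.rdatatype.A)
    response = dns.query.udp(query, WIRE_SERVER, timeout=5)
    for rrset in response.answer:
        if rrset.rdtype == dns.rdatatype.CNAME and any(
            str(rdata.target).rstrip(".") == BLOCK_SENTINEL for rdata in rrset
        ):
            return True
    return False

# "blocked.example.com" is a stand-in for a domain in one of your Block rules.
print(is_blocked_by_policy("blocked.example.com"))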

To visualize how the allow and deny work, I threw together two quick and dirty visual representations.

Example of how Allow and Block DNS Traffic Rules work

Scenarios you may be wondering about

Like many of you, I’m curious to see what does and doesn’t work. I went through and tested a variety of scenarios. Here are a few below along with my results when using these policies:

  • Machine using an external DNS server and not using the wire server (magic IP, 168 address, etc)
    • Query is not logged by DNS Security Policies
  • Machine using the wire server in its virtual network
    • Query is captured
  • Machine using Private DNS Resolver in the same virtual network
    • Query is captured
  • Machine using a DNS Proxy which sits in front of the Private DNS Resolver
    • Query is captured
  • Machine queries an A record or PTR record
    • Query is captured
  • Machine queries AAAA record
    • Query is captured
  • Machine queries using TCP-based query instead of UDP-based query
    • Query is captured
  • PaaS Services tested successfully
    • Azure Bastion
    • Azure Firewall

How might you use this?

So now you better understand how the service works and what it does. I’ll now tell you how I’d use it. I’m sure folks smarter than me will come out with more effective ways, but here is how I’m envisioning it now.

Based on the testing I’ve done (and testing done by one of my wonderful peers, Chris Jasset), the DNS Security Policies seem to take effect at the wire server. This means you’ll want to link the policies to the virtual networks where DNS packets are directed to the wire server. In a centralized DNS design such as the one below, this would be linked to the virtual network containing the Azure Private DNS Resolver or third-party DNS solution. You would need one DNS Security Policy per region given they are regional in nature.

Sample design for centralized DNS resolution

If you’re using a distributed DNS model, or have isolated virtual networks, your design would look something more like below. Here the DNS Security Policies are linked to each virtual network to ensure the packet is captured at the wire server of the virtual network where the query originates.

Sample design for distributed DNS Security Policy

As for domain lists, I think most organizations will end up with three separate ones: one for block, one for alert (again, I don’t find this super useful as of now), and one for allow. These domain lists could be established in a production subscription and shared across lower environments to ensure consistency of blocked domains across environments.

Summing it up

There are a few big takeaways for you this post:

  • It’s time to revisit how you’re capturing DNS query logs. If your only reason for implementing a third-party DNS service was DNS query logging, you may want to revisit that to see if this new solution is more cost effective.
  • Just like Azure Private DNS, don’t forget to link your policy to the right virtual network. Whichever virtual network is sending DNS queries to the wire server is where these should be linked.
  • DNS query logs are very chatty. You may want to look at ways of optimizing what you capture (if you’re sending it to a third-party logging solution) or how much you retain (if you’re keeping it in a Log Analytics Workspace). This is especially true if you use a wildcard in the allow rule to capture everything. PaaS especially is very chatty. If you aren’t careful about this, you’ll owe Microsoft a big fat check by the end of that first month.

Lastly, I threw together some samples of the creation of these resources in Terraform if you’re curious. You can find the code here.

Well folks, hopefully you learned something new today. Thanks as always for taking the time to read the content!


Azure Authorization – Azure ABAC (Attribute-based Access Control)

This is part of my series on Azure Authorization.

  1. Azure Authorization – The Basics
  2. Azure Authorization – Azure RBAC Basics
  3. Azure Authorization – actions and notActions
  4. Azure Authorization – Resource Locks and Azure Policy denyActions
  5. Azure Authorization – Azure RBAC Delegation
  6. Azure Authorization – Azure ABAC (Attribute-based Access Control)

Welcome back fellow geeks.

I do a lot of learning and educational sessions with my customer base. The volume pretty much demands reusable content which means I gotta build decks and code samples… and worse maintain them. The maintenance piece typically consists of me mentally promising myself to update the content and kicking the can down the road for a few months. Eventually, I get around to updating the content.

This month I was doing some updates to my content around Azure Authorization and decided to spend a bit more time with Azure ABAC (attribute-based access control). For those of you unfamiliar with Azure ABAC, well, it’s no surprise because the use cases are so very limited as of today. Limited as the use cases are, it’s worthwhile functionality to understand because Microsoft uses it in its own products and you may have use cases where it makes sense.

The Dream of ABAC

Let’s first touch briefly on the differences between role-based access control (RBAC) and attribute-based access control (ABAC). Attribute-based access control has been the dream for the security industry for as long as I can remember. RBAC has been the predominant authorization mechanism in the majority of applications over the years. The challenge with RBAC is it has typically translated to basic group membership, where an application authorizes a user solely on whether or not the user is in a group. Access to these groups would typically come through some type of request for membership and implementation by a central governance team. Those processes have tended to be not super user friendly and the access has tended to be very coarse-grained.

ABAC meanwhile promised more fine-grained access based upon attributes of the security principal, resource, or whatever your mind can dream up. Sounds awesome right? Well it is, but it largely remained a dream in the mainstream world with a few attempts such as Windows Dynamic Access Control (Before you comment, yeah I get you may have had some cool apps doing this stuff years ago and that is awesome, but let’s stick with the majority). This began to change when cloud came around with the introduction of more modern protocols and standards such as SAML, OIDC, and OAuth. These protocols provide more flexibility with how the identity provider packages attributes about the user in the token delivered to the service provider/resource provider/what have you.

When it came to the Azure cloud, Microsoft went the traditional RBAC path for much of the platform. A user or group gets placed in an Azure RBAC role and gets access. I explain Azure RBAC in my other posts on RBAC. There is a bit of flexibility on the Entra ID side for the initial access token via Entra ID Conditional Access, but it’s pure RBAC in the Azure realm. This was the story for many years of Azure.

In 2021 Microsoft decided something more flexible was needed and introduced Azure ABAC. The world rejoiced… right? Nah, not really. While the introduction of ABAC was awesome, its scope of use was and still is extremely limited. As of the date of this blog, ABAC is only usable for Azure Storage blob and queue operations. All is not lost though, there are some great use cases for this feature so it’s important to understand how it works.

How does ABAC work?

Alright, history lesson and complaining about limited scope aside, let’s now explore how the feature works.

ABAC is facilitated through additional properties on Azure RBAC role assignment resources. I’m going to assume you understand the ins and outs of role assignments. If you don’t, check out my prior post on the topic. In its simplest sense, an Azure RBAC role assignment is the association of a role to a security principal, granting that principal the permissions defined in the role over a particular scope of resources. As I’ve covered previously, role assignments are Azure resources with defined sets of properties. The properties we care about for the scope of this discussion are the conditionVersion and condition properties. The conditionVersion property will always have a value of 2.0 for now. The condition property is where we work our ABAC magic.
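Before digging into the condition language itself, here’s a minimal sketch of what attaching a condition to a role assignment looks like programmatically using the azure-mgmt-authorization Python package. The subscription, scope, and principal are placeholders, the condition string is a simplified version of the examples we’ll unpack below, and I’m assuming a recent SDK version where RoleAssignmentCreateParameters exposes condition and conditionVersion, so double-check against your SDK before relying on it.

import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters

SUBSCRIPTION_ID = "<subscription-guid>"  # placeholder
client = AuthorizationManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Scope of the assignment: a hypothetical storage account
scope = (
    f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/rg-data"
    "/providers/Microsoft.Storage/storageAccounts/stdemo"
)

# Simplified condition: only allow blob reads when the blob carries an
# access_level index tag of "low" (same expression syntax unpacked below).
condition = (
    "((!(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'}"
    " AND !SubOperationMatches{'Blob.List'}))"
    " OR "
    "(@Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>]"
    " ForAnyOfAnyValues:StringEquals {'low'}))"
)

client.role_assignments.create(
    scope=scope,
    role_assignment_name=str(uuid.uuid4()),  # role assignment names are GUIDs
    parameters=RoleAssignmentCreateParameters(
        # Storage Blob Data Reader built-in role (verify the ID against the docs)
        role_definition_id=f"/subscriptions/{SUBSCRIPTION_ID}/providers/Microsoft.Authorization"
        "/roleDefinitions/2a2b9908-6ea1-4ae2-8e65-a410df84e7d1",
        principal_id="<object-id-of-user-or-group>",  # placeholder
        condition=condition,
        condition_version="2.0",
    ),
)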

The condition property is made up of a series of conditions which each consist of an action and one or more expressions. The logic for conditions is kinda weird, so I’ll walk you through it using some of the examples from the documentation as well as a complex condition I threw together. First, let’s look at the general structure.

Structure of conditions used in ABAC

In the above image you can see the basic building blocks of a condition. Looks super confusing and complicated right? I know it did to me at first. Thankfully, the kind souls who write the public documentation broke this down in a more friendly programming-like way.

Far more simple explanation of conditions

In each condition property we first have the action line, where the logic looks to see if the action being performed by the security principal doesn’t (note the exclamation point, which negates what’s in the parentheses) match the action we’re applying the conditions to. You’ll commonly see a line like:

!(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})

This line is saying that if the action isn’t blobs/read (which would be a data plane call to read the contents of the blob) then the line should evaluate to true. If it evaluates to true, then the access is allowed and the expressions are not evaluated any further.

After this line we have the expressions, which are only evaluated when the first line evaluates to false (which, in the example I just covered, would mean the security principal is trying to read the content of a blob). The expressions support four categories of what Microsoft refers to as condition features, each in various states of GA (general availability) or preview (refer to the documentation for those details). These categories include:

  • Requests
  • Environment
  • Resource
  • Principal (security principal)

These four categories give you a ton of flexibility. Requests covers the details of the request to storage, such as limiting a user to specific blob prefixes based on the prefix within the request. Environment can be used to limit the user to accessing the resource from a specific Private Endpoint or over Private Link in general (think defense-in-depth here). The resource feature exposes properties of the resource being accessed, and the one I find most flexible is blob index tags. Lastly, we have the security principal, and this is where you can muck around with custom security attributes in Entra ID (very cool feature if you haven’t touched it).

In a given condition we can have multiple expressions and within the condition property we can string together multiple conditions with AND and OR logic. I’m a big believer in going big or going home, so let’s take a look at a complex condition.

Diving into the Deep End

Let’s say I have a whole bunch of data I need to make available via blobs in an Azure Storage Account. I have a strict requirement to use a single storage account, and the blobs I’m going to store have different data classifications denoted by a blob index tag key named access_level. Blobs without this key are accessible by everyone, while blobs classified high, medium, or low are only accessible by users with approval for the same or higher access levels (example: a user with a high access level can access high, medium, low, and data with no access level). Lastly, I have a requirement that data at the high access level can only be accessed during business hours.

I use a custom security attribute in Entra ID called accesslevel under an attribute set named organization to denote a user’s approved access level.
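For the blob side of this scenario, the data needs to carry the access_level blob index tag in the first place. Here’s a minimal sketch using the azure-storage-blob package to upload a document with that tag; the account, container, and file names are all placeholders.

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)

container = service.get_container_client("classified-docs")  # placeholder container

# Upload a document and stamp it with the access_level blob index tag the
# conditions below key off of.
with open("q3-financials.pdf", "rb") as data:  # placeholder file
    container.upload_blob(
        name="q3-financials.pdf",
        data=data,
        overwrite=True,
        tags={"access_level": "high"},
    )

# Tags can also be set or changed after the fact.
container.get_blob_client("q3-financials.pdf").set_blob_tags({"access_level": "medium"})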

Here is how that policy would break down.

My first condition is built to allow users to read any blobs that don’t have the access_level tag.

# Condition that allows users within scope of the assignment access to documents that do not have an access level tag
(
  (
    # If the action being performed doesn't match blobs/read then result in true and allow access
    !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
  )
  OR 
  (
    # If the blob doesn't have a blob index tag with a key of access_level then allow access
    NOT @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags&$keys$&] ForAnyOfAnyValues:StringEquals {'access_level'}
  )
)

If the blob does have an access tag, I want to start incorporating my logic. The next condition I include allows users with the accesslevel security attribute set to high to read blobs with a blob index tag of access_level equal to low or medium. I also allow them to read blobs tagged with high if it’s between 9AM and 5PM EST.

# Condition that allows users within scope of the assignment to access medium and low tagged data if they have a custom 
# security attribute of accesslevel set to high. High data can also be read within working hours
OR
(
 (
   # If the action being performed doesn't match blobs/read then result in true and allow access
   !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
 )
 OR 
 (
   # If the blob has an index tag of access_level with a value of medium or low allow the user access if they have a custom security
   # attribute of organization_accesslevel set to high
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'medium', 'low'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'high'
 )
 OR
 (
   # If the blob has an index tag of access_level with a value of high allow the user access if they have a custom security
   # attribute of organization_accesslevel set to high and it's within working hours
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'high'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'high'
   AND
   @Environment[UtcNow] DateTimeGreaterThan '2025-06-09T12:00:00.0Z'
   AND
   @Environment[UtcNow] DateTimeLessThan '2045-06-09T21:00:00.0Z'
 )
)

Next up is users with medium access level. These users are granted access to data tagged medium or low.

# Condition that allows users within scope of the assignment to access medium and low tagged data if they have a custom 
# security attribute of accesslevel set to medium
OR
(
  (
    # If the action being performed doesn't match blobs/read then result in true and allow access
    !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
  )
  OR 
  (
    # If the blob has an index tag of access_level with a value of medium or low allow the user access if they have a custom security
    # attribute of organization_accesslevel set to medium
    @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'medium', 'low'}
    AND
    @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'medium'
 )
)

Finally, I allow users with low access level to access data tagged as low.

# Condition that allows users within scope of the assignment to access low tagged data if they have a custom 
# security attribute of accesslevel set to low
OR
(
 (
   # If the action being performed doesn't match blobs/read then result in true and allow access
   !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
 )
 OR 
 (
   # If the blob has an index tag of access_level with a value of low allow the user access if they have a custom security
   # attribute of organization_accesslevel set to low
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'low'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'low'
 )
)

Notice how I separated each condition using OR. If the first condition resolves to false, then the next condition is evaluated until access is granted or all conditions are exhausted. Neat right?

Summing it up

So why should you care about this if its use case is so limited? Well, you should care because that is ABAC’s use case today, and it will likely be expanded in the future. Furthermore, ABAC allows you to be more granular in how you grant access to data in Azure Storage (again, blob or queue only). You likely have use cases where this can provide another layer of security to further constrain a security principal’s access. You’ll also see these conditions used in Microsoft’s products such as AI Foundry.

The other reason it’s helpful to understand the language used for conditions is that conditions are expanding into other services such as Azure RBAC Delegation (which, if you aren’t using, you should be). While the language can be complex, it does make sense once you muck around with it a bit.

A final bit of guidance here: don’t try to write conditions by hand. Use the visual builder in the Azure Portal as seen below. It will help you get some basic conditions in place that you can further modify directly via the code view.

Azure Portal Condition Builder

Next time you’re locking down an Azure storage account, think about whether or not you can further restrict humans and non-humans alike based on the attributes discussed today. The main places I’ve seen this used are for user profiles, further restricting user access to specific subsets of data (similar to the one I walked through above), or even adding an additional layer of network security baked directly into the role assignment itself.

See you next post!

Simple Patterns for Chatting with Your Data – Using the Microsoft public backbone

Hello again folks! Recently, I’ve been working with some far more intelligent peers (such as my buddy Jose Medina Gomez, you should definitely check out his repos because he has some awesome stuff there) on getting some new-to-Azure customers up and running in the GenAI (generative AI) space. Specifically, these customers had some custom data they wanted LLMs (large language models) to reason over and answer questions about. This called for using a RAG (retrieval-augmented generation) pattern to provide an LLM access to an external knowledge base. I thought it would be helpful to other folks out there like myself who are new to this world to document some simple patterns for doing this type of thing that keep security in mind. I’ll cover these over a few posts, with this being the first.

The Pattern

The first pattern I want to cover is what I call the “Microsoft public backbone” pattern. This pattern is ideal for customers with minimal to no Azure presence who need something up and running quickly with some basic security guardrails. The pattern looks like what you see below:

Microsoft-backbone Pattern

The key benefits of this pattern are:

  • All traffic between Microsoft PaaS (platform-as-a-service) services flows over the Microsoft public backbone and the organization’s application communicates with the services over the Microsoft public backbone.
  • All Microsoft PaaS services use the built-in service firewall to control inbound traffic.
  • Microsoft PaaS services that support outbound network controls use those controls to mitigate the risk of data exfiltration.
  • Authentication between each component uses Entra ID based authentication and Azure RBAC authorization.
  • Minimizes costs by choosing more affordable SKUs where possible.
  • Captures logs where available.

What I like about this pattern is it’s super simple to get up and running (it can take less than one hour) and provides decent security controls with minimal headache. It’s by no means a production-ready pattern for reasons I’ll discuss further in this post, but for a quick proof-of-concept or for getting your feet wet with RAG-like patterns, this is a great choice.

I’ll now spend a few moments providing detail to each of the benefits I outlined above.

The Benefits and Considerations

Benefit 1: All traffic between Microsoft PaaS (platform-as-a-service) services flows over the Microsoft public backbone and the organization’s application communicates with the services over the Microsoft public backbone

Simplicity is the name of the game here folks. By keeping all communication on the Microsoft public backbone you avoid the complexities of Private Link integration. For organizations that are new to Azure and don’t have a platform landing zone (hybrid connectivity, network inspection, Internet egress support for Azure resources, DNS forwarding for Private Link) this pattern can be done without much effort. As an added benefit, PaaS to PaaS stays on the Microsoft public backbone, providing you with the security controls Microsoft provides across their public backbone.

Benefit 2: All Microsoft PaaS services use the built-in service firewall to control inbound traffic

Almost all (I’m sure there are some exceptions that aren’t top of mind) Microsoft PaaS services provide a basic built-in firewall I refer to as the service firewall. The service firewall is off by default, but it can be toggled on to restrict inbound traffic to the public endpoint of the PaaS (which every PaaS has). Most commonly (every PaaS service seems to work a bit differently) you can create exceptions to the firewall based on IP address or allow “trusted” Microsoft services to bypass the firewall. Additionally, Azure Storage has its own capability that allows you to configure specific resource instances to bypass the firewall based on the resource instance identifier and its managed identity.

The “trusted” Microsoft service exception needs a bit more explaining. Most Azure PaaS services (again, there are always exceptions because Microsoft loves its snowflakes) have a checkbox in the Portal with text like what you see in the screenshot below. This checkbox allows traffic from a specific set of Azure services (identified by their public IP addresses) to bypass the service firewall. Today, this will be a box you will often need to check whenever you are doing PaaS to PaaS. The key thing to understand about this checkbox is that it covers all public IPs associated with whatever “trusted” services the specific product group identifies. This means instances not owned by you could be allowed to bypass the service firewall (making authentication and authorization critical). Thankfully, the upcoming Network Security Perimeters feature will likely address this gap and make this box a thing of the past.

Trusted services bypass option
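To make those service firewall settings concrete, here’s a minimal sketch using the azure-mgmt-storage Python package that denies public traffic by default, allows a known IP, and checks that trusted services box programmatically. The subscription, resource group, account name, and IP are placeholders.

from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import (
    IPRule,
    NetworkRuleSet,
    StorageAccountUpdateParameters,
)

client = StorageManagementClient(DefaultAzureCredential(), "<subscription-guid>")  # placeholder

client.storage_accounts.update(
    resource_group_name="rg-genai-poc",  # placeholder
    account_name="stgenaipoc",           # placeholder
    parameters=StorageAccountUpdateParameters(
        network_rule_set=NetworkRuleSet(
            default_action="Deny",                                  # block public traffic by default
            ip_rules=[IPRule(ip_address_or_range="203.0.113.10")],  # your office/egress IP
            bypass="AzureServices",                                 # the "trusted Microsoft services" checkbox
        )
    ),
)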

Benefit 3: Microsoft PaaS services that support outbound network controls use those controls to mitigate the risk of data exfiltration

While controlling inbound traffic for a PaaS is typically a Private Endpoint or service firewall (or eventually Network Security Perimeters) use case, controlling outbound traffic tends to be a bit more tricky. For many compute-based services (AKS, App Services, Azure Container Apps, etc) you are able to force outbound traffic through your virtual network, allowing you to get visibility into the traffic and control what that service can make outbound network connections to.

With PaaS services like the ones used in this architecture, these types of virtual network integration aren’t an option. For most non-compute-based PaaS you are essentially SOL (I’ll let you figure out that acronym yourself). However, the services that fall under the Cognitive Services framework (such as the Azure OpenAI Service and AI Services) support outbound traffic controls. You can check out my prior post for the details on those controls. In this architecture we use the Azure OpenAI Service so we can take advantage of those outbound controls.

Restricting outbound access in Cognitive Services

Controlling outbound access from a PaaS will be another place Network Security Perimeters will become the predominant control mechanism.

Benefit 4: Authentication between each component uses Entra ID based authentication and Azure RBAC authorization

In this pattern, Entra ID-based authentication and Azure RBAC authorization are used at each hop for human-to-service and service-to-service communication. Users interacting with these services will use their Entra ID user identities, which are typically synchronized from an on-premises Windows Active Directory. Non-humans (applications and services) will use Entra ID service principals to authenticate to each other. This will either be a standard service principal identified by a client id and client secret (for you AWS folks, this is essentially your IAM User) or a special type of service principal called a managed identity (for those of you coming from AWS, this is as close to an IAM Role as Azure gets).

Azure RBAC roles are assigned with least privilege in mind. Users (or groups) are assigned the minimal permissions they need to upload data to the storage account and perform needed functions with the PaaS to load and query the data. Services are provided the necessary permissions they need to interact.

Benefit 5: Minimizes costs by choosing more affordable SKUs where possible

Costs are already pretty low with this pattern. This pattern minimizes them further by sacrificing the Shared Private Access feature of AI Search. Yeah, you lose the warm fuzzy feeling of the communication between AI Search and Azure Storage or the Azure OpenAI Service happening over Private Link, but you save some money with the more basic SKU and still get the security of the Microsoft public backbone and the service firewalls.

Note that this design choice is made to optimize costs. Performance within the Basic SKU may not be sufficient for your use case.

Benefit 6: Captures logs where available

Finally, let’s look at logging. In this pattern you’ll get your management plane activities (actions on the resources) via Azure Activity Logs and you’ll get data plane (actions on the data held by the resources) activities via diagnostic settings delivering logs to a Log Analytics Workspace.

Each of these resources has a selection of logs available. Some are “ok” (Azure OpenAI Service) and some are “meh” (Azure AI Search). However, you will want all of these logs for both security and operational use.

The Considerations

There can’t only be benefits, right? The major consideration of this pattern is that it’s very much built for proof-of-concept. You get basic network security controls with the service firewall, but no inspection of traffic unless you have an inspection point on-premises in front of the developer. Additionally, before communication from the developer gets to Azure, it will have to traverse the public Internet before it reaches the Microsoft public backbone. While all of your communication will happen over TLS, you don’t get the security benefits of wrapping that encrypted session in an IPSec tunnel or funneling it over a known path, nor the operational benefits of consistent latency that come with ExpressRoute.

Scalability of AI Search is another consideration. The Basic SKU will offer you a limited amount of scale.

On the LLM front, this pattern only allows you to deploy models available within an Azure OpenAI Service (or AI Services) instance (thanks to Jose for highlighting this consideration). There are options to adjust this pattern to use other LLMs, but it will require the introduction of AI Foundry which is quite the beast.

There are likely others I’m missing, but this is still a great little pattern to see what the LLMs can do that comes wrapped with decent security controls and requires minimal coding.

Loading Data

So you’ve decided that the benefits and considerations make sense to you and you want to move ahead, or maybe you’re just dipping your toes into this world and you want to muck around with things. Now you’re left wondering, “Ok I set this thing up like you documented above, how the heck do I use it?”

Alrighty, I’m going to show you the quick and dirty way. Do not assume the way I’m going to show you is the only way to do this pattern. There are lots of variations, especially in how you chunk and load the data into AI Search. My advice to you in that department would be to work with the data folks at your organization or engage a Microsoft solutions architect on the optimal way to chunk and load your data. Do it wrong, and the responses from the LLMs will be crappy. After watching my buddy Jose and many of his peers, it’s very much an art form that requires experience and experimentation.

For the less experienced folks like myself, there is a built-in wizard within AI Search that helps to chunk and vectorize the data. If you open the Azure Portal you’ll see an option called Import and Vectorize as seen in the screenshot below.

The easy button

Clicking that option will open up the wizard (yes, Microsoft still loves its wizards). On the first screen you’ll select the Azure Blob Storage option. On the next screen you’ll configure the options below. If you’ve set things up as I’ve outlined them in the initial pattern diagram (RBAC and network controls), this will work like a champ (don’t forget to deploy a chat model like gpt-4o and an embedding model like text-embedding-3-large to the AOAI (Azure OpenAI Service) instance). I’m assuming you already created a container in the Azure Storage account and uploaded data (like some PDFs). I’ve found this useful when referencing and consuming RFCs to confirm my understanding.

Here you’re specifying that the AI Search instance grab the data you’ve uploaded to the blob container using its system-assigned managed identity.

Connect to your data

The next screen provides us with the options to vectorize (or create embeddings for) our data. We can then use AI Search to query both the text-based chunks and the vectors to optimize the results we return to the LLM. Here I’m selecting to use an embedding model deployed to an Azure OpenAI instance. In more advanced scenarios you may choose to incorporate other embeddings you’ve built yourself or sourced from the model marketplace and deployed to AI Foundry (thanks to Jose for mentioning this).

I also select to use the deployed text-embedding-3-large embedding model and am again using the AI Search managed identity to call the Azure OpenAI service to create the embeddings.

Vectorize your text

The data I’m using (10K financial reports) doesn’t have any images so I ignore the Vectorize and enrich your images option.

Finally, I opt to use the semantic ranker (great article on this) to improve the results of my queries to AI Search and leave the other options as the default since this is a one time operation. If you are doing this regularly, getting a good data pipeline in place (either push or pull) is mission critical (another learning from my buddy Jose). Someone smarter than me can help you with that.

Review the settings and opt to create the indexer. The full run will only take a few minutes if you don’t have a ton of content. For larger data volumes, get yourself some coffee and work on something else while you wait. If you have any failures at this step, it will likely be that you don’t have the networking controls set up correctly. Review the image I posted at the beginning of this post and get busy with the resource logs (it’s good experience!).

Indexer in progress

Once it’s complete, you’ll see a screen like this if you select the indexer that was created. It will show you how many of the docs it pulled from the container and how many were successfully indexed.

Successful run. Yay!

Next you can navigate to the index and run a test search. Here you’ll get back the relevant records and you can muck around with direct searches against the index to get a feel for the structure of the chunked data. If you don’t get any responses it’s likely an RBAC or networking issue. For RBAC, ensure you granted yourself both management plane (Search Service Contributor) and data plane (Search Index Data Reader or Contributor) roles.

Directly searching chunked data

Chatting with your data

Alright, your data has been pulled into AI Search. How do you go about extending this knowledge base to the LLM? There are a ton of ways to do it, but for something quick and dirty, I’m a fan of either writing some simple Python code or using the Chat Playground via your Azure OpenAI Services or AI Services instance. For this blog, I’m going to be lazy and focus on the latter.
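That said, if you do want to go the Python route, the call looks roughly like the sketch below. I’m using the openai package’s AzureOpenAI client with Entra ID authentication and the data_sources extension body; the endpoint, index name, and api-version are placeholders, and the exact payload shape varies by api-version (older versions used a different schema), so treat this as a starting point rather than gospel.

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://<your-aoai-instance>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-02-01",  # use an api-version that supports data_sources
)

response = client.chat.completions.create(
    model="gpt-4o",  # your chat model deployment name
    messages=[{"role": "user", "content": "How many shares did Microsoft buy back in 2024?"}],
    extra_body={
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    "endpoint": "https://<your-search-instance>.search.windows.net",  # placeholder
                    "index_name": "<your-index-name>",  # placeholder
                    # Hybrid/semantic query types and the embedding deployment can be
                    # layered on here; check the On Your Data reference for your api-version.
                    "authentication": {"type": "system_assigned_managed_identity"},
                },
            }
        ]
    },
)

print(response.choices[0].message.content)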

For this you’ll want to navigate to the AOAI instance and select the “Explore Azure AI Foundry portal” link. No, this isn’t actually AI Foundry; it’s the rebranded (and standardized) Azure OpenAI Playground incorporated into a Foundry-like interface.

Entering the Azure AI Foundry portal

Once you enter the new portal you’ll be dropped into the Chat playground. Here you’ll want to use the Add your data link and then the Add a data source link as seen below.

Importing index to Chat Playground

On the next screen I choose to add the index I created earlier during my data import, also choosing to use vector-based searches to improve the quality of the search results returned to the LLM. This is where the embedding model I deployed earlier comes into play, as seen in the image below.

Adding data source

On the next screen I opt to do a hybrid + semantic search to ensure I get the best results out of a typical keyword search, vector search, and semantic search. A default semantic search configuration was created for you when you imported the data into AI Search.

Data management screen

Lastly, I choose to use the system-assigned managed identity of the AOAI instance when calls are made from the AOAI instance to AI Search. This is where the Azure RBAC assignments I show in the original diagram come into play. Any missing permissions on the managed identity will pop up for you here during the validation stage.

Data Connection screen

After saving and closing I’m good to go! In the chat window I can ask a question such as “How many shares did Microsoft buy back in 2024?” The LLM optimizes my query for AI Search, creates vector-based embeddings of my question, performs the hybrid and semantic search against the AI Search instance, summarizes the results, and returns them to the Chat Playground.

Chat with your data process

Below you see the answer to my question with citations back to the original chunked data in AI Search. Cool shit right?

LLM results

If you’re just dipping your toes into this world or you’re an organization validating that the Azure platform’s AI Services can do what you need them to do before you invest heavily into the platform, this is a great pattern to mess around with. It’s super easy to get up and running, doesn’t require a deep understanding of Azure, and still provides foundational security controls that every organization should have in place. All this in a quick few hours of work.

In upcoming posts I’ll showcase some variations of this pattern such as the incorporation of Private Link and using CoPilot Studio as a frontend to build a quick and simple Teams bot using a small variation of this pattern (this was a really fun one Denis Rougeau, Aga Shirazi, Jose and I have been rolling out to a few customers. Super excited to talk more about that one!).

Until next time!

Azure OpenAI Service – Controlling Outbound Access

Hello again folks! Work has been complete insanity and has been preventing me from posting as of late. Finally, I was able to carve out some time to get a quick blog post done that I’ve been sitting on for a while.

I have blogged extensively on the Azure OpenAI Service (AOAI) and today I will continue with another post in that series. These days you folks are more likely using AOAI through the new Azure AI Services resource versus using the AOAI resource. The good news is the guidance I will provide tonight will be relevant to both resources.

One topic I often get asked to discuss with customers is the “old person” aspects of the service, such as the security controls Microsoft makes available to customers when using the AOAI service. Besides the identity-based controls, the most common security control that pops up in the regulated space is the available networking controls. While inbound network controls exercised through Private Endpoints or the service firewall (sometimes called the IP firewall) are common, one of the often missed controls within the AOAI service is the outbound network controls. Oddly enough, this is often missed across non-compute PaaS.

You may now be asking yourself, “Why the heck would the AOAI service need to make its own outbound network connections, and why should I care about it?” Excellent question, and honestly, not one I thought about much when I first started working with the service because the use cases I’m going to discuss either didn’t exist yet or weren’t commonly used. There are two main use cases I’m going to cover (note there are likely others; these are simply the most common):

The first use case is easily the most common right now. The “chat with your data” feature within AOAI allows you to pass some extra information in your ChatCompletion API call that instructs the AOAI service to query an AI Search index you have populated with data from your environment, extending the model’s knowledge base without completely retraining it. This is essentially a simple way to muck with a retrieval-augmented generation (RAG) pattern without having to write the code to orchestrate the interaction between the two services such as detailed in the link above. Instead, the “chat with your data” feature handles the heavy lifting for you if you have a populated index and are willing to add a few additional lines of code. In a future article I’ll go into more depth on this pattern because understanding the complete network and identity interaction is pretty interesting and often misconfigured. For now, understand the basics of it with the flow diagram below. Here also is some sample code if you want to play around with it yourself.

The second use case is when using a multimodal model like GPT-4 or GPT-4o. These models allow you to pass them other types of data besides text, such as images and audio. When requesting an image be analyzed, you have the option of passing the image as base64 or passing it a URL. If you pass it a URL, the AOAI service will make an outbound network connection to the endpoint specified in the URL to retrieve the image for analysis.

import os

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Client setup added for completeness; the endpoint env var and api-version are illustrative.
client = AzureOpenAI(
    azure_endpoint=os.getenv("AOAI_ENDPOINT"),
    azure_ad_token_provider=get_bearer_token_provider(
        DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
    ),
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    # Model must be a multimodal model
    model=os.getenv("LLM_DEPLOYMENT_NAME"),
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe the image"
                },
                {
                    "type": "image_url",
                    # The service fetches this URL itself, which is the outbound call in question
                    "image_url": {"url": "{{SOME_PUBLIC_URL}}"}
                }
            ]
        }
    ],
    max_tokens=100
)

In both of these scenarios the AOAI service establishes a network connection from the Microsoft public backbone to the resource (such as AI Search in scenario 1 or a public blob in scenario 2). Unlike compute-based PaaS (App Services, Functions, Azure Container Apps, AKS, etc), today Microsoft does not provide a means for you to send this outbound traffic from AOAI through your virtual network with virtual network injection or virtual network integration. Given that you can’t pass this traffic through your virtual network, how can you mitigate potential data exfiltration risks or poisoning attacks? For example, let’s say an attacker compromises an application’s code and modifies it such that the “chat with your data” feature uses an attacker’s instance of AI Search to capture sensitive data in the queries or to poison the responses back to the user with bad data. Maybe an attacker decides to use your AOAI instance to process images stolen from another company and placed on a public endpoint. I’m sure someone more creative could come up with a plethora of attacks. Either way, you want to control what your resources communicate with. The positive news is there is a way to do this today, and likely an even better way to do it tomorrow when it comes to the AOAI service.

The AOAI (and AI Services) resources fall under the Cognitive Services framework. The benefit of being within the scope of this framework is they inherit some of its security capabilities. Some examples include support for Private Endpoints and the ability to disable local key-based authentication. Another feature available to AOAI is an outbound network control. On an AOAI or AI Services resource, you can configure two properties to lock down the service’s ability to make outbound network calls. These two properties are:

  • restrictOutboundNetworkAccess – A Boolean; set it to true to block outbound access to everything but the exceptions listed in the allowedFqdnList property
  • allowedFqdnList – A list of FQDNs the service should be able to communicate with for outbound network calls

Using these two controls you can prevent your AOAI or AI Services resource from making outbound network calls except to the list of FQDNs you include. For example, you might whitelist your AI Search instance’s FQDN for the “chat with your data” feature or your blob storage account endpoint for image analysis. This is a feature I’d highly recommend you enable by default on any new AOAI or AI Services resource you provision moving forward.

The good news for those of you in the Terraform world is this feature is available directly within the azurerm provider as seen in a sample template below.

resource "azurerm_cognitive_account" "openai" {
  name                = "${local.openai_name}${var.purpose}${var.location_code}${var.random_string}"
  location            = var.location
  resource_group_name = var.resource_group_name
  kind                = "OpenAI"

  custom_subdomain_name = "${local.openai_name}${var.purpose}${var.location_code}${var.random_string}"
  sku_name              = "S0"

  public_network_access_enabled = var.public_network_access

  # Maps to restrictOutboundNetworkAccess - block outbound calls except to the FQDNs below
  outbound_network_access_restricted = true

  # Maps to allowedFqdnList (e.g. your AI Search or storage account FQDN)
  fqdns = var.allowed_fqdn_list

  network_acls {
    default_action = "Deny"
    ip_rules = var.allowed_ips
    bypass = "AzureServices"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = var.tags

  lifecycle {
    ignore_changes = [
      tags["created_date"],
      tags["created_by"]
    ]
  }
}

If a user attempts to circumvent these controls they will receive a descriptive error stating that outbound access is restricted. For those of you operating in a regulated environment, you should be slapping this setting on every new AOAI or AI Service instance you provision just like you’re doing with controlling inbound access with a Private Endpoint.

Alright folks, that sums up this quick blog post. Let me summarize the lessons learned:

  1. Be aware of which PaaS Services in Azure are capable of establishing outbound network connectivity and explore the controls available to you to restrict it.
  2. For AOAI and AI Services use the restrictOutboundNetworkAccess and allowedFqdnList properties to block outbound network calls except to the endpoints you specify.
  3. Make controlling outbound access of your PaaS a baseline control. Don’t just focus on inbound, focus on both.

Before I close out, recall that I mentioned a way to do this today (the above) and a way in the future. The latter will be the new feature Microsoft announced into public preview, Network Security Perimeters. As that feature matures and its list of supported services expands, controlling inbound and outbound network access for PaaS (and getting visibility into it via logs) is going to get far easier. Expect a blog post on that feature in the near future.

Thanks folks!