Microsoft Foundry – BYO AI Gateway – Part 3

Hello once again folks! Today I’m going to add yet another post to my series on the BYO AI Gateway feature of Microsoft Foundry. In my first post I gave a background on the use case for this feature, and in the second post I walked through the concepts required to understand the feature, the resources involved in the setup, and the schema of those resource objects. In this post I’m going to walk through the architecture I set up to play with this feature, why I made the choices I did, and dig into some of the actual Terraform code I put together to set this whole thing up. Let’s dive in!

The foundational architecture

When I wanted to experiment with this feature I wanted to test it in an architecture that is typical of my customer base. For this I chose the classic tried and true hub and spoke architecture. I opted out of VWAN and went with a traditional virtual network model because I prefer the visibility and control of that model during experimentation. When the hub becomes a managed VWAN Hub, I get that fancy overlay which hides some of the magic of what is happening underneath. The traditional model enables me to do packet captures at every step and manage routing at a very granular level, which is a must when playing with cutting edge features.

For this setup I have a lab I built out in Terraform which gives me that hub and spoke architecture, centralized DNS resolution, logging, and access to multiple regions. The multiple regions piece of the puzzle is key because feature availability across Foundry features and APIM v2 SKUs is still in flux. The lab also uses three spoke virtual networks. This allows me to plop pieces in different spokes to see how things behave and track traffic patterns. It also gives me flexibility when I need to wait for purge operations, like when purging a Microsoft Foundry resource configured with a standard agent setup and clearing the lock on the delegated subnet for the VNet injection model. If you’ve mucked around with this you know sometimes it can be 15 minutes and sometimes it can be 2 days.

I drop one of the three spokes into one of the “hero” regions. This is a region that gets new features sooner than others. For example, in this lab I drop it into East US 2 while the hub and other two spokes go in West US 3 (where I’m less likely to run into quota or capacity issues). East US 2 gives me the option to deploy the APIM v2 Standard SKU. In the next section I’ll explain why I’m going with v2 for this experimentation.

Foundational architecture

AI Gateway Architecture

For an AI Gateway I decided to use APIM. My buddy Piotr Karpala has a great repository of 3rd-party AI Gateway solutions if you want to test this with something outside of APIM. I’m going to plop this into the “hero” region spoke in East US 2 so I can deploy a v2 Standard SKU. The reason I’m using a v2 SKU is it provides another networking model that the classic SKUs do not, and that is Private Endpoint and VNet integration. In this model I block public traffic to the APIM service, create a Private Endpoint to enable private inbound access, and set up VNet integration to a delegated subnet to keep outbound traffic from any of the APIM instances flowing through my virtual network so I can mediate it and optionally inspect it. While the Private Endpoint is only supported for the Gateway and not the Developer Portal, I don’t care in this instance because I don’t plan on using the Developer Portal on an APIM acting as an AI Gateway. You can also create a private endpoint for an APIM v2 service instance that uses VNet injection, but it requires the Premium SKU and I’m super cheap, so I opted out of that.

APIM v2 with Private Endpoint and VNet Integration

The reason I picked this networking model for APIM is it makes it easy for me to inject the service into a Microsoft Foundry account configured with a standard agent and the managed virtual network model. In a future post I’ll dive more into the managed virtual network model. For now, just be aware that it exists, it’s in preview, and it doesn’t have many of the limitations the Foundry Agent Service VNet injection model has. There are considerations no doubt, but my personal take is it’s the better of the two strategically.

On the APIM instance I configured two backend objects, one for each Foundry instance. The backends are organized into a pooled backend so I could load balance across the two Foundry instances to maximize my TPM (tokens per minute). I defined four APIs. Two APIs support the Azure OpenAI inferencing and authoring API, one supports the Azure OpenAI v1 API, and the last is a simple custom Hello World API I use to test connectivity. I use two APIs for the Azure OpenAI inferencing and authoring API because one is designed to support APIM as an AI Gateway and uses some custom policy snippets, while the other is very generic and is used to test model gateway connections from Foundry purely so I’m familiar with the basics of them.

APIM APIs
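To make the pooled backend idea above a bit more concrete, here is a minimal sketch of what a backend pool could look like with the AzApi provider. This is not the module from my lab. The variable names, the resource name foundry-pool, and the API version are assumptions for illustration, and it assumes AzApi v2 (where body is a native HCL object) and that the two per-Foundry backend objects already exist.

```hcl
terraform {
  required_providers {
    azapi = {
      source  = "Azure/azapi"
      version = "~> 2.0"
    }
  }
}

# Hypothetical input: resource ID of the APIM instance
variable "apim_id" {
  description = "Resource ID of the APIM instance"
  type        = string
}

# Hypothetical input: resource IDs of the existing APIM backend objects,
# one per Foundry instance
variable "foundry_backend_ids" {
  description = "Resource IDs of the APIM backends to place in the pool"
  type        = list(string)
}

resource "azapi_resource" "foundry_backend_pool" {
  # The API version is an assumption; use one that supports pool backends
  type      = "Microsoft.ApiManagement/service/backends@2024-05-01"
  name      = "foundry-pool"
  parent_id = var.apim_id

  # Skip local schema validation in case the embedded schema lags the service
  schema_validation_enabled = false

  body = {
    properties = {
      description = "Load-balanced pool of Foundry backends"
      type        = "Pool"
      pool = {
        # Same priority and weight for every member spreads requests evenly
        services = [
          for id in var.foundry_backend_ids : {
            id       = id
            priority = 1
            weight   = 1
          }
        ]
      }
    }
  }
}
```

An API policy can then point at the pool with set-backend-service and the pool's backend id, the same way the v1 API policy later in this post references the pool created by my module.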

Foundry Architecture

The Foundry architecture is quite simple. I deployed a single instance of Foundry configured to support standard agents and using a VNet injection model. A subnet is delegated in a different spoke to support the agent VNet injection, and the supporting Private Endpoints are deployed to a separate subnet in that same virtual network.

The whole setup looks something like the below:

Lab setup

Setting up the AI Gateway

At this point you should have a good understanding of what I’m working with. Let’s talk button pushing. The first thing you’ll need to do is get your AI Gateway set up. To set up the APIM instance I used the Terraform AzureRM and AzApi providers. Like I mentioned above, it was set up as a v2 instance with the Standard SKU, public network access disabled, inbound access restricted to private endpoints, and outbound access configured for VNet integration. You can find the whole of the code in my lab repository if you’re curious. For the purposes of the post, I’ll only be including the relevant snippets.

One critical thing to take note of is whatever networking model you choose for APIM for this integration, you need to use a certificate issued by a trusted public CA (certificate authority). This is required because at the date of this post, the agent service does not support certificates issued by private CAs. Reason being, you have no ability to inject the root and intermediate certs into the trusted store of the agent compute. For this lab I used the Terraform Acme and Cloudflare providers. It’s actually not bad at all to have a fresh cert provisioned directly as part of the pipeline for labbing and the like, and the best part is it’s free for cheap people like myself. There is a sample of that code in the repo.
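If you haven’t provisioned a public cert from Terraform before, the rough shape looks something like the sketch below. This is not the code from my repo. It assumes the vancluever/acme provider with its built-in Cloudflare DNS challenge, Let’s Encrypt as the CA, and hypothetical variables for the hostname, contact email, and a Cloudflare API token scoped to edit DNS for the zone.

```hcl
terraform {
  required_providers {
    acme = {
      source  = "vancluever/acme"
      version = "~> 2.0"
    }
    tls = {
      source = "hashicorp/tls"
    }
  }
}

# Hypothetical inputs
variable "acme_email" {
  description = "Contact email for the ACME account"
  type        = string
}

variable "gateway_hostname" {
  description = "Hostname the certificate is issued for"
  type        = string
}

variable "cloudflare_api_token" {
  description = "Cloudflare API token allowed to edit DNS records for the zone"
  type        = string
  sensitive   = true
}

# Let's Encrypt production directory; swap for the staging URL while testing
provider "acme" {
  server_url = "https://acme-v02.api.letsencrypt.org/directory"
}

# Private key for the ACME account itself (not the certificate key)
resource "tls_private_key" "acme_account" {
  algorithm = "RSA"
}

resource "acme_registration" "account" {
  account_key_pem = tls_private_key.acme_account.private_key_pem
  email_address   = var.acme_email
}

# Proves domain ownership by creating a DNS TXT record through Cloudflare
resource "acme_certificate" "gateway" {
  account_key_pem = acme_registration.account.account_key_pem
  common_name     = var.gateway_hostname

  dns_challenge {
    provider = "cloudflare"
    config = {
      CF_DNS_API_TOKEN = var.cloudflare_api_token
    }
  }
}
```

The resulting certificate and private key are exposed as attributes on the acme_certificate resource, so they can be handed to whatever holds the gateway’s certificate without leaving the pipeline.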

As I mentioned in my last post, the BYO AI Gateway integration with Foundry supports static or dynamic setup. In the static model you define the models directly in the connection metadata you want to be made available to the connection (see my last post for an example). In the dynamic model the models can be fetched by an API call to the management.azure.com API. This latter option requires additional operations be defined in the API such as what you see below.

## Create an operation to support getting a specific deployment by name when using the Foundry APIM connection
##
resource "azurerm_api_management_api_operation" "apim_operation_openai_original_get_deployment_by_name" {
  depends_on = [
    azurerm_api_management_api.openai_original
  ]

  operation_id        = "get-deployment-by-name"
  api_name            = azurerm_api_management_api.openai_original.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  display_name        = "Get Deployment by Name"
  method              = "GET"
  url_template        = "/deployments/{deploymentName}"

  template_parameter {
    name     = "deploymentName"
    required = true
    type     = "string"
  }
}

## Create an operation to support enumerating deployments when using the Foundry APIM connection
##
resource "azurerm_api_management_api_operation" "apim_operation_openai_original_list_deployments_by_name" {
  depends_on = [
    azurerm_api_management_api_operation_policy.apim_policy_openai_original_get_deployment_by_name
  ]

  operation_id        = "list-deployments"
  api_name            = azurerm_api_management_api.openai_original.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  display_name        = "List Deployments"
  method              = "GET"
  url_template        = "/deployments"
}

You then define a policy for each operation to configure it to call the correct endpoint via the ARM API like below. Notice I used the authentication-managed-identity policy snippet to use the APIM managed identity to call ARM and fetch deployment information for the Foundry resource. This requires your APIM instance managed identity to have at least the Azure RBAC Reader role over the Foundry resources. If you’re sharing the API across backends, make sure all backends have all the same models deployed. If not, you’ll need to incorporate some additional logic to hit each backend in the pool to ensure you don’t return models that don’t exist in a specific backend.

## Create a policy for the get deployment by name operation to route to the Foundry APIM connection
##
resource "azurerm_api_management_api_operation_policy" "apim_policy_openai_original_get_deployment_by_name" {
  depends_on = [
    azurerm_api_management_api_operation.apim_operation_openai_original_get_deployment_by_name,
  ]

  api_name            = azurerm_api_management_api.openai_original.name
  operation_id        = azurerm_api_management_api_operation.apim_operation_openai_original_get_deployment_by_name.operation_id
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content         = <<XML
<policies>
  <inbound>
    <!-- Authenticate to ARM with the APIM managed identity -->
    <authentication-managed-identity resource="https://management.azure.com/" />
    <rewrite-uri template="/deployments/{deploymentName}?api-version=${local.ai_services_arm_api_version}" copy-unmatched-params="false" />
    <!--Specify a Foundry deployment that has the models deployed -->
    <set-backend-service base-url="https://management.azure.com${azurerm_cognitive_account.ai_foundry_accounts[keys(local.ai_foundry_regions)[0]].id}" />
  </inbound>
  <backend>
    <base />
  </backend>
  <outbound>
    <base />
  </outbound>
  <on-error>
    <base />
  </on-error>
</policies>
XML
}

## Create a policy for the list deployments operation to route to the Foundry APIM connection
##
resource "azurerm_api_management_api_operation_policy" "apim_policy_openai_original_list_deployments_by_name" {
  depends_on = [
    azurerm_api_management_api_operation.apim_operation_openai_original_list_deployments_by_name
  ]

  api_name            = azurerm_api_management_api.openai_original.name
  operation_id        = azurerm_api_management_api_operation.apim_operation_openai_original_list_deployments_by_name.operation_id
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content         = <<XML
<policies>
  <inbound>
    <!-- Authenticate to ARM with the APIM managed identity -->
    <authentication-managed-identity resource="https://management.azure.com/" />
    <rewrite-uri template="/deployments?api-version=${local.ai_services_arm_api_version}" copy-unmatched-params="false" />
    <!--Azure Resource Manager-->
    <set-backend-service base-url="https://management.azure.com${azurerm_cognitive_account.ai_foundry_accounts[keys(local.ai_foundry_regions)[0]].id}" />
  </inbound>
  <backend>
    <base />
  </backend>
  <outbound>
    <base />
  </outbound>
  <on-error>
    <base />
  </on-error>
</policies>
XML
}
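The Reader role requirement for the APIM managed identity could be sketched like this. It reuses the resource names from the snippets above and assumes two things that aren’t shown in this post: that azurerm_cognitive_account.ai_foundry_accounts is created with for_each, and that the APIM instance has a system-assigned managed identity. The resource name apim_reader_foundry is made up for the example.

```hcl
# Grant the APIM system-assigned identity Reader on every Foundry account so the
# get/list deployment operations can read deployment metadata through ARM.
resource "azurerm_role_assignment" "apim_reader_foundry" {
  # Assumes the Foundry accounts are created with for_each (one per region)
  for_each = azurerm_cognitive_account.ai_foundry_accounts

  scope                = each.value.id
  role_definition_name = "Reader"

  # Assumes a system-assigned identity is enabled on the APIM instance
  principal_id = azurerm_api_management.apim.identity[0].principal_id
}
```

If your APIM uses a user-assigned identity instead, the principal_id would come from that identity resource, and the authentication-managed-identity policy would need its client-id attribute set to match.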

In my lab, I defined these two operations for both the classic (OpenAI Inferencing and Authoring API) and v1 API. This allowed me to mess around with both static and dynamic APIM and Model Gateway connections.

Once you get Foundry hooked into APIM using this integration (and I’ll cover the Foundry part in the next post), you get access to some pretty neat information in the headers. As of the date of this post, these will be some of the headers you’ll see. You’ll notice my x-forwarded-for header includes my endpoint’s IP address as well as the IP of the container running in the Microsoft-managed compute environment (notice that it uses CGNAT IP space, which clears up why customers can’t use CGNAT address space when using agents with VNet injection). The x-ms-foundry-project-id is the unique project GUID of the project the agent was created under (could be useful for throttling and logging). The x-ms-foundry-agent-id is the unique agent identifier of the specific revision of the agent (again useful for logging and throttling). The x-ms-client-request-id is actually the Foundry project managed identity, not the agent identity, which is important to note. If you want to use Entra for the BYO AI Gateway APIM connection, you’re going to be limited to this or API key. There is a connection authentication option to use the agent’s actual Entra ID Agent Identity, but I’ve only used that for the MCP Server feature of Foundry, never for this, so I’m not sure if it works or is supported.

{
"Authorization": "Bearer REDACTED",
"Content-Length": "474",
"Content-Type": "application/json; charset=utf-8",
"Host": "apimeusXXXXX.azure-api.net",
"Max-Forwards": "10",
"Correlation-Context": "leaf_customer_span_id=173926958944XXXXXX",
"traceparent": "00-62ff160923b2c1724242c037be40e7cb-4f1b402461aXXXXX-01",
"X-Request-ID": "96534855-a35a-481a-886d-XXXXXXXXXXXX",
"x-ms-client-request-id": "76ddf586-260b-4e37-8f4c-XXXXXXXXXXXX",
"openai-project": "sampleproject1",
"x-ms-foundry-agent-id": "TestAgent-ai-gateway-static:5",
"x-ms-foundry-model-id": "conn1apimgwstaticopenai/gpt-4o",
"x-ms-foundry-project-id": "455cbebf-a0bc-425e-99f6-XXXXXXXXXXX",
"x-forwarded-for": "100.64.9.87;10.0.9.213:10095",
"x-envoy-external-address": "100.64.9.87",
"x-envoy-expected-rq-timeout-ms": "1800000",
"x-k8se-app-name": "j8820ec0658b4aeXXXXX-dataproxy--vuww7ja",
"x-k8se-app-namespace": "wonderfulsky-a2fXXXXX",
"x-k8se-protocol": "http1",
"x-k8se-app-kind": "web",
"x-ms-containerapp-name": "j8820ec0658b4aeXXXXX-dataproxy",
"x-ms-containerapp-revision-name": "j8820ec0658b4aeXXXXX-dataproxy--vuww7ja",
"x-arr-ssl": "2048|256|CN=Microsoft Azure RSA TLS Issuing CA 04;O=Microsoft Corporation;C=US|CN=*.azure-api.net;O=Microsoft Corporation;L=Redmond;S=WA;C=US",
"x-forwarded-proto": "https",
"x-forwarded-path": "/v1/https/apimeusXXXXX.azure-api.net/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
"X-ARR-LOG-ID": "76ddf586-260b-4e37-8f4c-XXXXXXXXXXXX",
"CLIENT-IP": "10.0.9.213:10095",
"DISGUISED-HOST": "apimeusXXXXX.azure-api.net",
"X-SITE-DEPLOYMENT-ID": "apimwebappXXXXXX6OTVsZqxOcTZLpubQ9iNmzQ8kzMOmkEhw",
"WAS-DEFAULT-HOSTNAME": "apimwebappXXXXXX6otvszqxoctzlpubq9inmzq8kzmomkehw.apimaseXXXXXXX6otvszqxoctz.appserviceenvironment.net",
"X-AppService-Proto": "https",
"X-Forwarded-TlsVersion": "1.3",
"X-Original-URL": "/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
"X-WAWS-Unencoded-URL": "/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
"X-Azure-JA4-Fingerprint": "t13d1113h2_d3731e0d3936_XXXXXXXXXXXX"
}

Using the information above, I crafted the policy below. It’s nothing fancy, but shows an example of throttling based on the project id and logging the agent identifier via the token metrics policy to potentially make chargeback more granular. Either way, these additional headers give you more to play with.

## Create an API Management policy for the OpenAI v1 API
##
resource "azurerm_api_management_api_policy" "apim_policy_openai_v1" {
  depends_on = [
    azurerm_api_management_api.openai_v1
  ]

  api_name            = azurerm_api_management_api.openai_v1.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content         = <<XML
<policies>
  <inbound>
    <base />
    <!-- Evaluate the JWT and ensure it was issued by the right Entra ID tenant -->
    <validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden">
      <openid-config url="https://login.microsoftonline.com/${var.entra_id_tenant_id}/v2.0/.well-known/openid-configuration" />
      <issuers>
        <issuer>https://sts.windows.net/${var.entra_id_tenant_id}/</issuer>
      </issuers>
    </validate-jwt>
    <!-- Extract the Entra ID application id from the JWT -->
    <set-variable name="appId" value="@(context.Request.Headers.GetValueOrDefault("Authorization",string.Empty).Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", "none"))" />
    <!-- Extract the Agent ID from the x-ms-foundry-agent-id header. This is only relevant for Foundry native agents -->
    <set-variable name="agentId" value="@(context.Request.Headers.GetValueOrDefault("x-ms-foundry-agent-id", "none"))" />
    <!-- Extract the project GUID from the x-ms-foundry-project-id header. This is only relevant for Foundry native agents -->
    <set-variable name="projectId" value="@(context.Request.Headers.GetValueOrDefault("x-ms-foundry-project-id", "none"))" />
    <!-- Extract the Foundry Project name from the "openai-project" header. This is only relevant for Foundry native agents -->
    <set-variable name="projectName" value="@(context.Request.Headers.GetValueOrDefault("openai-project", "none"))" />
    <!-- Extract the deployment name from the uri path -->
    <set-variable name="uriPath" value="@(context.Request.OriginalUrl.Path)" />
    <set-variable name="deploymentName" value="@(System.Text.RegularExpressions.Regex.Match((string)context.Variables["uriPath"], "/deployments/([^/]+)").Groups[1].Value)" />
    <!-- Set the X-Entra-App-ID header to the Entra ID application ID from the JWT -->
    <set-header name="X-Entra-App-ID" exists-action="override">
      <value>@(context.Variables.GetValueOrDefault<string>("appId"))</value>
    </set-header>
    <!-- Pass the Foundry agent and project details to the backend as headers -->
    <set-header name="X-Foundry-Agent-ID" exists-action="override">
      <value>@(context.Variables.GetValueOrDefault<string>("agentId"))</value>
    </set-header>
    <set-header name="X-Foundry-Project-Name" exists-action="override">
      <value>@(context.Variables.GetValueOrDefault<string>("projectName"))</value>
    </set-header>
    <set-header name="X-Foundry-Project-ID" exists-action="override">
      <value>@(context.Variables.GetValueOrDefault<string>("projectId"))</value>
    </set-header>
    <choose>
      <!-- If the request isn't from a Foundry native agent and is instead an application or external agent -->
      <when condition="@(context.Variables.GetValueOrDefault<string>("agentId") == "none" && context.Variables.GetValueOrDefault<string>("projectId") == "none")">
        <!-- Throttle token usage based on the appid -->
        <llm-token-limit counter-key="@(context.Variables.GetValueOrDefault<string>("appId","none"))" estimate-prompt-tokens="true" tokens-per-minute="10000" remaining-tokens-header-name="x-apim-remaining-token" tokens-consumed-header-name="x-apim-tokens-consumed" />
        <!-- Emit token metrics to Application Insights -->
        <llm-emit-token-metric namespace="openai-metrics">
          <dimension name="model" value="@(context.Variables.GetValueOrDefault<string>("deploymentName","None"))" />
          <dimension name="client_ip" value="@(context.Request.IpAddress)" />
          <dimension name="appId" value="@(context.Variables.GetValueOrDefault<string>("appId","00000000-0000-0000-0000-000000000000"))" />
        </llm-emit-token-metric>
      </when>
      <!-- If the request is from a Foundry native agent -->
      <otherwise>
        <!-- Throttle token usage based on the projectId and agentId -->
        <llm-token-limit counter-key="@($"{context.Variables.GetValueOrDefault<string>("projectId")}_{context.Variables.GetValueOrDefault<string>("agentId")}")" estimate-prompt-tokens="true" tokens-per-minute="10000" remaining-tokens-header-name="x-apim-remaining-token" tokens-consumed-header-name="x-apim-tokens-consumed" />
        <!-- Emit token metrics to Application Insights -->
        <llm-emit-token-metric namespace="llm-metrics">
          <dimension name="model" value="@(context.Variables.GetValueOrDefault<string>("deploymentName","None"))" />
          <dimension name="client_ip" value="@(context.Request.IpAddress)" />
          <dimension name="agentId" value="@(context.Variables.GetValueOrDefault<string>("agentId","00000000-0000-0000-0000-000000000000"))" />
          <dimension name="projectId" value="@(context.Variables.GetValueOrDefault<string>("projectId","00000000-0000-0000-0000-000000000000"))" />
        </llm-emit-token-metric>
      </otherwise>
    </choose>
    <choose>
      <!-- If the request is from a Foundry native agent -->
      <when condition="@(context.Variables.GetValueOrDefault<string>("agentId") != "none" && context.Variables.GetValueOrDefault<string>("projectId") != "none")">
        <authentication-managed-identity resource="https://cognitiveservices.azure.com/" />
      </when>
    </choose>
    <set-backend-service backend-id="${module.backend_pool_aifoundry_instances_openai_v1.name}" />
  </inbound>
  <backend>
    <forward-request />
  </backend>
  <outbound>
    <base />
  </outbound>
</policies>
XML
}

Summing it up

I was going to go crazy and incorporate the Foundry setup and testing into this post as well but decided against it. There is a point when the brain melts and if mine is already melting, yours may be as well. I’ll walk through those pieces in the next post. You have a few main takeaways. First, let’s review the high level setup of your AI Gateway.

  1. Create your backends that point to the Microsoft Foundry endpoints.
  2. Import the relevant API. If at all possible, go with the v1 API. It will support access to other models besides OpenAI models and additional features.
  3. Add the GET and LIST operations and define the relevant policies if you’re planning on supporting dynamic models vs static. Dynamic seems to make more sense to me, but I haven’t seen enough orgs adopt this yet to form a good opinion.
  4. Craft your custom policies. I highly recommend you regularly review the headers being passed. They could change and even better data may be added to them.

Next, let’s talk about key gotchas.

  1. The certificate used on your AI Gateway MUST be issued from a well-known public CA in order for it to be trusted by the agent running in Foundry compute. If it isn’t, this integration will fail, and it may not fail in a way that makes it obvious the TLS session failure between the agent compute and the AI Gateway is to blame.
  2. If you’re using APIM, think about the Private Endpoint and VNet integration pattern if you’re capable of using v2. If it won’t work for you, or you’re still using the classic SKU, you’ll likely need to incorporate an Application Gateway in front of your AI Gateway if you want to support managed VNet. This means more operational overhead and costs.
  3. While every Foundry Agent (v2) is given an Entra ID Agent Identity created from the Entra ID Agent Blueprint associated to the project, when using the ProjectManagedIdentity authentication type, you’ll see the project’s managed identity in the logs. If you’re able to test with the agent identity authentication type, let me know.
  4. Really noodle on how you can use the project headers for throttling and possibly chargeback. It makes a ton of sense if you’re aligning your Foundry account and project model correctly.

See you next post!

Microsoft Foundry – BYO AI Gateway – Part 2

Hello again! Today I’m going to continue my series on Microsoft Foundry’s new support for the BYO AI Gateway. In my past few posts I’ve walked through the evolution of Foundry and covered at a high level what an AI Gateway is and the problem this feature solves. In this post we’re gonna get down and dirty with the technical details on setting this up within Microsoft Foundry. I’ll do a follow-up post to focus on the APIM (API Management) configuration. Grab your coffee and put on your thinking music (for me that is some Blink and Third Eye Blind. Yeah, I’m old.).

Let’s get to it!

Current State Architecture

My customer base is primarily in regulated industries, so most of my customers are still at the experimentation stage with the Foundry Agent Service. Given these customers have strict security requirements, they are largely using the agent service with the standard agent configuration. In this configuration the outbound traffic (subsets of it, but that is a much larger conversation) can be tunneled through the customer virtual network for centralized logging, mediation, and facilitating access to private resources (again, with limitations today) through what the product group calls VNet injection but I’d say is more closely described as VNet integration via a delegated subnet. Threads (conversations in v2 agents) and agent metadata are stored in a Cosmos DB, vector stores created by an agent from tools such as the File Search tool are stored in AI Search, and files uploaded to the Foundry resource by users are stored in a Storage Account. These resources are all provisioned by the customer into the customer subscription and fully managed by the customer (RBAC, encryption, HA settings, etc). Private Endpoints for each resource are created within the customer’s virtual network and made accessible from the agent delegated subnet. The whole environment looks similar to what you see below.

Foundry Agent Service – Standard Agent Configuration with VNet Injection

As I covered in my last post, as of the date of this post Foundry native agents can only consume models deployed to their own Foundry resource. This creates an issue for customers wanting the governance of the models, visibility into the use of the LLMs, and the improved security posture and operational optimizations an AI Gateway can provide when it sits between the agent and the model. For now, customers are working around this using what I refer to as external agents. External agents run outside of Microsoft Foundry on customer-managed compute like an on-premises Kubernetes cluster or an Azure Function deployed to the customer subscription. The downside of this direction is these external agents live on compute customers have to manage and can’t access many of the tools available to Foundry-native agents. This is the problem the BYO AI Gateway feature is attempting to fix.

No BYO Gateway vs BYO Gateway

Foundry resource architecture

Here is where the new connection type introduced in Foundry comes to the rescue. Before I dive into the details of that, I think it’s helpful to level set a bit on the resource hierarchy within Foundry. At the top is the top-level Azure resource referred to as the Foundry service, which under the hood is a Cognitive Services account. The relevant resources for this discussion are below the account resource and are projects, deployments, and connections. Projects serve a few purposes, with two of them being logical boundaries around connections (at the management plane) and agents (at the data plane) provisioned under the projects. Deployments of models (such as GPT-5) are children of the account and are made available to all projects within the account. The account can also have connection objects which can be shared across projects.

Relevant resource hierarchy

For the purposes of this discussion, I’m going to focus on the connection objects. Connection objects can be created at the account level and project level as discussed above. In the standard agent configuration, you’ll create a number of different connections during setup including connections to Cosmos, AI Search, and Azure Storage. Additional common connections could be to an App Insights instance for tracing or a Grounding With Bing Search resource to use with the Grounding with Bing tool. Connection objects will contain some type of pointer, like a URI and a credential. That credential is usually API Key, some Entra ID-based authentication mechanism, or general OAuth.

Connections are created at the account level when the Foundry account itself needs to access them. This could be for the usage of Content Understanding, to a Key Vault for storing connection secrets (API keys) in a customer subscription, or to an App Insights instance used for tracing. From what I’ve observed, you will create connections at the account level if they need to be shared across all projects OR they’re used by the Foundry resource in general vs some type of project construct. Connections used by projects can also be created at the project level. When you provision a standard agent for example, you’ll create connection objects to the Cosmos DB, Storage Account, and AI Search resources mentioned above. The new category of connections for this post will be created at the project level. I’ve had mixed results with how effectively connection objects at the account level can be used downstream by the projects.

APIM and Model Gateway Connections

The BYO AI Gateway feature uses two new connection categories: ApiManagement and ModelGateway. These objects are the glue that allows the Foundry native agents to route requests for models through an AI Gateway. When you’re connecting to an APIM instance, you should ideally use the ApiManagement category, and when you’re connecting to a third-party gateway you’ll use the ModelGateway category.

As of the date of this blog post, these connection objects have the following schema (relevant properties to this discussion only):

name: The name of the connection (needs to be less than 60 characters in my testing)
properties: {
  category: ApiManagement or ModelGateway
  target: The URI you want the agent to connect to
  authType: For ApiManagement this can be ApiKey or ProjectManagedIdentity
  credentials: This will be populated with the value of the API key if using that authType
  isSharedToAll: true or false if you want this shared across all projects

  # ApiManagement category with static models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
    # Models discussed in detail below
    models: "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
  }

  # ApiManagement category with dynamic discovery
  metadata: {
    deploymentAPIVersion: ARM API version for CognitiveServices/accounts/deployments API calls
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
  }

  # ModelGateway category with static models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
    # Models discussed in detail below
    models: "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
  }

  # ModelGateway category with dynamic models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
    deploymentAPIVersion: ARM API version for CognitiveServices/accounts/deployments API calls
    modelDiscovery: "{\"deploymentProvider\":\"AzureOpenAI\",\"getModelEndpoint\":\"/deployments/{deploymentName}\",\"listModelsEndpoint\":\"/deployments\"}"
  }
}

I’ll walk through each of these properties in as much detail as I’ve been able to glean from them with my testing.

The category property is self-explanatory. You either set this to ApiManagement (if using APIM) or ModelGateway (if using a third-party AI Gateway like Kong or LiteLLM).

The target property is the URI you want the agent to try to connect to. As an example, if I create an API on my APIM instance for the v1 OpenAI API named openai-v1 my target would look like “https://myapim.azure-api.net/openai-v1/v1”. As of the date of this blog post, you MUST use the azure-api.net FQDN for the APIM. If you try to use a custom domain you’ll get an error back telling you that it’s not supported. I have a request into the product group to lift this limitation. I’ll update this if that is done. For a third-party model gateway, this property serves the same purpose but can be any valid domain.

The authType property is going to be either ApiKey or ProjectManagedIdentity for an APIM connection. ProjectManagedIdentity will authenticate to the upstream APIM using the Entra ID managed identity of the agent’s project. When using ProjectManagedIdentity you must also specify the audience property and set it to https://cognitiveservices.azure.com if connecting to a backend Foundry resource hosting models. For a model gateway connection this will either be ApiKey or OAuth. Details on the OAuth setup can be found in the samples GitHub (I haven’t mucked with it yet). If you’re using the authType of ApiKey you additionally need to pass the credentials property, which includes a property of key with the API key similar to what you see below.

authType = "ApiKey"
credentials = {
  key = "MYAPIKEY"
}
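The rules above for which authType goes with which category can be sketched as a small helper. This is illustrative Python, not an official SDK: the function name and argument names are hypothetical, and OAuth is deliberately left unimplemented because its extra properties are not covered here.

```python
# Allowed authType values per connection category, as described in the text
ALLOWED_AUTH = {
    "ApiManagement": {"ApiKey", "ProjectManagedIdentity"},
    "ModelGateway": {"ApiKey", "OAuth"},
}


def build_auth_properties(category, auth_type, api_key=None, audience=None):
    """Build the auth-related slice of a connection's properties."""
    if auth_type not in ALLOWED_AUTH.get(category, set()):
        raise ValueError(f"{auth_type} is not valid for category {category}")
    if auth_type == "ApiKey":
        if not api_key:
            raise ValueError("ApiKey requires credentials.key")
        return {"authType": "ApiKey", "credentials": {"key": api_key}}
    if auth_type == "ProjectManagedIdentity":
        if not audience:
            raise ValueError("ProjectManagedIdentity requires audience")
        return {"authType": "ProjectManagedIdentity", "audience": audience}
    # OAuth needs additional properties documented in the samples repo
    raise NotImplementedError("OAuth setup is not covered in this sketch")


print(build_auth_properties("ApiManagement", "ApiKey", api_key="MYAPIKEY"))
```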

I haven’t messed extensively with the isSharedToAll property as of yet. For my use case I set this to false so each project got its own connection object. You may be able to create this object at the account level and set the isSharedToAll property, but I haven’t tested that yet. If you have, def let me know if that works.

Ok, now on to the property that can bring the most pain. Here we have the metadata property. This property is going to be the main guts that makes this whole thing work. One consideration: if you’re doing this with Terraform or REST (can’t speak to Bicep or ARM), each of the properties I’m going to cover is CASE SENSITIVE. If you get the casing wrong, your connection object will not work. When connecting to an APIM or model gateway you can have Foundry either enumerate the models available (called dynamic discovery) or you can provide the exact models you want to expose (called static models).
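Because a wrongly cased key fails silently, a lint step before deployment can save some pain. Here is a minimal Python sketch, assuming the metadata keys are exactly the ones named in this post; the function name is hypothetical.

```python
# Metadata keys as they appear in this post; treat the list as an assumption
KNOWN_METADATA_KEYS = [
    "deploymentInPath",
    "inferenceAPIVersion",
    "deploymentAPIVersion",
    "modelDiscovery",
    "models",
]


def find_casing_mistakes(metadata: dict) -> dict:
    """Map each wrongly cased key to the casing the service expects."""
    by_lower = {key.lower(): key for key in KNOWN_METADATA_KEYS}
    mistakes = {}
    for key in metadata:
        expected = by_lower.get(key.lower())
        if expected is not None and expected != key:
            mistakes[key] = expected
    return mistakes


print(find_casing_mistakes({"DeploymentInPath": "false", "models": "[]"}))
```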

Let’s first cover static models. Here is an example of me creating a connection to an APIM instance with static models using the authType of ProjectManagedIdentity. One thing to note is that in my backend object in my APIM I’m appending /v1 to the backend path vs doing it in this connection object.

{
  "id": "/subscriptions/X/resourceGroups/X/providers/Microsoft.CognitiveServices/accounts/X/projects/sampleproject1/connections/conn1apimgwstaticopenai-v1",
  "name": "conn1apimgwstaticopenai-v1",
  "properties": {
    "audience": "https://cognitiveservices.azure.com",
    "authType": "ProjectManagedIdentity",
    "category": "ApiManagement",
    "isSharedToAll": false,
    "metadata": {
      "deploymentInPath": "false",
      "inferenceAPIVersion": null,
      "models": "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
    },
    "target": "https://X.azure-api.net/openai-v1"
  }
}

Since I’m using the v1 Azure OpenAI API, I don’t need to specify an inferenceAPIVersion. If I was using the classic API I’d need to specify the version (such as 2025-04-01-preview). Notice also I have set deploymentInPath to false. When set to true the connection will add /deployments/deployment_name to the path. For the v1 API this isn’t required. Finally, you’ve got the models property. With a static model setup I list out the models I’m exposing to the connection. If you’re using Terraform, you MUST wrap the models in the jsonencode function. If you don’t, it will not work. The static model option is pretty helpful if you want to strictly control exactly what models the project is getting access to.
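The reason jsonencode matters is that the models value is a JSON string sitting inside the metadata object, not a nested list. If you are building the payload for a REST call from Python instead of Terraform, json.dumps plays the same role. A minimal sketch using the gpt-4o entry from the example above:

```python
import json

# The static model list as a regular Python structure
models = [
    {
        "name": "gpt-4o",
        "properties": {
            "model": {"format": "OpenAI", "name": "gpt-4o", "version": "2024-08-06"}
        },
    }
]

metadata = {
    "deploymentInPath": "false",
    # Serialize to a string; passing the raw list is the mistake to avoid
    "models": json.dumps(models, separators=(",", ":")),
}

print(type(metadata["models"]).__name__)  # str
print(metadata["models"])
```

The compact separators are only there to match the string shown in the connection object; whether the service cares about whitespace is not something this sketch verifies.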

Let’s now switch over to dynamic discovery. Dynamic discovery requires you to define a few additional operations inside of your API. The details can be found in this GitHub repo, but the basics are that you define an operation for a GET on a specific model and a LIST to find all the models available. These operations are management plane operations at the ARM API to retrieve deployment information. Here is an example of a setup with dynamic discovery using an APIM connection.

{
  "id": "/subscriptions/X/resourceGroups/X/providers/Microsoft.CognitiveServices/accounts/X/projects/sampleproject1/connections/conn1apimgwdynamicopenai-v1",
  "location": null,
  "name": "conn1apimgwdynamicopenai-v1",
  "properties": {
    "audience": "https://cognitiveservices.azure.com",
    "authType": "ProjectManagedIdentity",
    "category": "ApiManagement",
    "group": "AzureAI",
    "isSharedToAll": false,
    "metadata": {
      "deploymentAPIVersion": "2024-10-01",
      "deploymentInPath": "false",
      "inferenceAPIVersion": null
    },
    "target": "https://X.azure-api.net/openai-v1"
  },
  "type": "Microsoft.CognitiveServices/accounts/projects/connections"
}

When doing dynamic discovery, you’ll see the deploymentAPIVersion property set to the API version for the GET and LIST deployment operations of the ARM REST API. I added these operations into the API after I imported the v1 OpenAI spec. You can see an example in Terraform I put together in my lab repo. Dynamic discovery is a great solution when you want the developer to have access to any new deployments you may push to the Foundry resources.
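For completeness, here is a hedged Python sketch of assembling dynamic discovery metadata. It combines the deploymentAPIVersion from the example above with the modelDiscovery string from the schema earlier in the post. Whether modelDiscovery must be supplied alongside deploymentAPIVersion is an assumption here (the example connection does not include it), and the function name is hypothetical.

```python
import json


def dynamic_discovery_metadata(deployment_api_version: str) -> dict:
    """Build metadata for a dynamic discovery connection (illustrative only)."""
    discovery = {
        "deploymentProvider": "AzureOpenAI",
        "getModelEndpoint": "/deployments/{deploymentName}",
        "listModelsEndpoint": "/deployments",
    }
    return {
        "deploymentAPIVersion": deployment_api_version,
        "deploymentInPath": "false",
        # Like models, this value is a JSON string rather than a nested object
        "modelDiscovery": json.dumps(discovery, separators=(",", ":")),
    }


print(dynamic_discovery_metadata("2024-10-01")["modelDiscovery"])
```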

I’m not going to run through the ModelGateway connection categories because they will largely emulate what you see above with some minor differences. The official Foundry samples GitHub repo has the gory details. I also have examples in Terraform available in my own repo (if you dare subject yourself to reading my code).

Ok, so now you understand the basics of setting up the connection and what you need to do on the APIM side. For more details on setting up APIM you can reference this official repo.

Summing It Up

Ok, so now you understand the basic connection object, how to set it up, and how it works. I’m going to cut it here and continue in another post where I’ll dig into the dirty details of how it looks to use this because I don’t want to overload your brain (and mine) with a super long post.

Before I jet I want to provide some critical resources:

  1. My AMAZING peer Piotr Karpala has put together a repository with examples of this pattern (and some 3rd-party integrations) with Bicep. The stuff in there is gold. He was also my late night buddy helping me work through the quirks of this integration. Couldn’t have gotten it done without him (or at least would have broken many keyboards).
  2. The Product Group’s official samples and explanations of the setup are located here. I’d highly recommend referencing them because they will always have more up to date instructions than my blog.
  3. I’ve put together some Terraform samples for my own purposes which you are welcome to reference, loot for your own means, and laugh at my pathetic coding ability. Check out this one for the Foundry portion and this one for the APIM portion.

And here are your tips for this post:

  1. RTFM. Seriously, read the official documentation. Today, this integration is challenging to put in place. If you try to lone wolf it, let me know how many keyboards end up being thrown through your window.
  2. If you’re coding in Terraform or making REST calls to create these connections, remember CASE SENSITIVITY matters. If you get the casing wrong, the resource will still create but it won’t work. You’ll get very frustrated trying to troubleshoot it.
  3. If you’re coding in Terraform don’t forget to use the jsonencode function on the models property. If you skip that, the resource will create but shit will not work.
  4. This is only supported for prompt agents today.
  5. Don’t forget this is public preview. So test it, but expect things to change and don’t throw this into production.

In the next post I’ll walk through how you can test the integration, some of the quirks and considerations for identity and authentication, and some of the neat APIM policy you can craft given some of the new information that is sent in the request.

See you next post!

Network Security Perimeters – NSPs in Action – AI Workload Example

This is part of my series on Network Security Perimeters:

  1. Network Security Perimeters – The Problem They Solve
  2. Network Security Perimeters – NSP Components
  3. Network Security Perimeters – NSPs in Action – Key Vault Example
  4. Network Security Perimeters – NSPs in Action – AI Workload Example
  5. Network Security Perimeters – NSPs for Troubleshooting

UPDATE 2/23/2026 – NSP support for Microsoft Foundry resources is generally available!

Hello again! Today I’ll be covering another NSP (Network Security Perimeters) use case, this time focused on AI (gotta drive traffic, am I right?). This will be the fourth entry in my NSP series. If you haven’t read at least the first and second post, you’ll want to do that before jumping into this one because, unlike my essays back in college, I won’t be padding the page count by repeating myself. Let’s get to it!

Use Case Background

Over the past year I’ve worked with peers helping a number of customers get a quick and simple RAG (retrieval augmented generation) workload into PoC (proof-of-concept). The goal of these PoCs was often to validate that the LLMs (large language models) could provide some level of business value when supplementing them with corporate data through a RAG-based pattern. Common use cases included things like building a chatbot for support staff which was supplemented with support’s KB (knowledge base) or a chatbot for a company’s GRC (governance risk and compliance) team which was supplemented with corporate security policies and controls. You get the gist of it.

In the Azure realm this pattern is often accomplished using three core services. These services include the Azure OpenAI Service (now more typically AI Foundry), AI Search, and Azure Storage. In this pattern AI Search acts as the search index and optional vector database, Azure Storage stores the data in blob storage before it’s chunked and placed inside AI Search, and Azure OpenAI or AI Foundry hosts the LLM. Usage of this pattern requires the data be chunked (think chopped up into smaller parts before it’s stored as a record in a database while still maintaining the important context of the data). There are many options for chunking which are far beyond the scope of this post (and can be better explained by much smarter people), but in Azure there are three services (that I’m aware of anyway) that can help with chunking vs doing it manually. These include:

  1. Azure AI Document Intelligence’s layout model and chunking features
  2. Azure OpenAI / AI Foundry’s chat with your data
  3. Azure AI Search’s skillsets and built-in vectorization

Of these three options, the simplest (and most point-and-click) are options 2 and 3. Since many of these customers had limited Azure experience and very limited time, these options tended to serve for initial PoCs that then graduated to more complex chunking strategies such as the use of option 1.

The customer base that was asking for these PoCs fell into one or more of these categories:

  1. Limited staff, resources, and time
  2. Limited Azure knowledge
  3. Limited Azure presence (no hybrid connectivity, no DNS infrastructure setup for support of Private Endpoints)

All of these customers had a minimum set of security requirements that included basic network security controls.

RAG prior to NSPs

While there are a few different ways to plumb these services together, these PoCs would typically have the services establish network flows as pictured below. There are variations to this pattern where the consumer may be going through some basic ChatBot app, but in many cases consumers would interact directly with the Azure OpenAI / AI Foundry Chat Playground (again, quick and dirty).

Network flows with minimalist RAG pattern

As you can see above, there is a lot of talk between the PaaS services. Let’s tackle that before we get into human access. PaaS communication almost exclusively happens through the Microsoft public backbone (some services have special features as I’ll talk about in a minute). This means control of that inbound traffic is going to be done through the PaaS service firewall and trusted Azure service exception for Azure OpenAI / AI Foundry, AI Search, and Azure Storage (optionally using resource exception for storage). If you’re using the AI Search Standard or above SKU you get access to the Shared Private Access feature. This lets you provision a managed Private Endpoint into the Microsoft-managed virtual network where the AI Search compute runs, which gives AI Search the ability to reach a resource in your subscription over that Private Endpoint. While cool, this is more cost and complexity.

Outbound access controls are limited in this pattern. There are some data exfiltration controls that can be used for Azure OpenAI / AI Foundry which are inherited from the Cognitive Services framework which I describe in detail in this post. AI Search and Azure Storage don’t provide any native outbound network controls that I’m aware of. This lack of outbound network controls was a sore point for customers in these patterns.

For inbound network flows from human actors (or potentially non-human if there is an app between the consumer and the Azure OpenAI / AI Foundry service) you were limited to the service firewall’s IP whitelist feature. Typically, you would whitelist the IP addresses of the forward web proxy in use by the company or another IP address where company traffic would egress to the Internet.

RAG design network controls prior to NSPs

Did this work? Yeah it did, but oh boy, it was never simple to get approved by organizational security teams. While IP whitelisting is pretty straightforward to explain to a new-to-Azure customer, the same can’t be said for the trusted services exception, shared private access, and resource exceptions. The lack of outbound network controls for AI Search and Storage went over like a lead balloon every single time. Lastly, the lack of consistent log schema, the sometimes subpar network-based logging (I’m looking at you AI Search), and the complete lack of outbound network traffic logs made the conversations even more difficult.

Could NSPs make this easier? Most definitely!

RAG with NSPs

NSPs remove every single one of the pain points described above. With an NSP you get:

  1. One tool for controlling both inbound and outbound network controls (kinda)
  2. Standardized log schema for network flows
  3. Logging of outbound network calls

We go from the mess above to the much more simple design pictured below.

The design using NSPs

In this new design we create a Network Security Perimeter with a single profile. In this profile there is an access rule which allows customer egress IP addresses for human users or non-human callers (in case users interact with an app which interacts with the LLM). Each resource is associated to that profile within the NSP, which allows non-human traffic between the PaaS services since it’s all within the same NSP. No additional rules are required, which prevents the PaaS services from accepting or initiating any network flows outside of what the access rule allows and communication with each other within the NSP.
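As a mental model of that design, the inbound decision boils down to two checks: is the caller another resource in the same perimeter, or is the caller’s IP covered by the access rule? The Python sketch below is a simplified illustration of that logic, not how Azure implements it. The class, the resource names, and the 203.0.113.0/24 documentation prefix are all made up for the example.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass
class Perimeter:
    """Toy model of an NSP with one profile and one inbound IP access rule."""
    members: set = field(default_factory=set)
    inbound_prefixes: list = field(default_factory=list)

    def allows_inbound(self, destination, source_resource=None, source_ip=None):
        if destination not in self.members:
            raise ValueError(f"{destination} is not associated with this perimeter")
        # Traffic between resources in the same perimeter needs no extra rule
        if source_resource is not None and source_resource in self.members:
            return True
        # Everything else must match the inbound IP access rule
        if source_ip is not None:
            address = ip_address(source_ip)
            return any(address in ip_network(p) for p in self.inbound_prefixes)
        return False


nsp = Perimeter(
    members={"openai", "search", "storage"},
    inbound_prefixes=["203.0.113.0/24"],  # hypothetical corporate egress range
)
print(nsp.allows_inbound("search", source_resource="openai"))
print(nsp.allows_inbound("openai", source_ip="198.51.100.7"))
```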

In this design you control your inbound IP access with a single access rule and you get a standard manner to manage outbound access. No more worries about whether the product group baked in an outbound network control, every service in the NSP gets one. Logging? Hell yeah we got your logging for both inbound and outbound in a standard schema.

Once it’s set up you can monitor both inbound and outbound network calls using the NSPAccessLogs. It’s a great way to understand under the hood how these patterns work because the NSP logs surface the source resource, destination resource, and the operation being performed as seen below.

NSP logs surfacing operations

One thing to note, at least in East US 2 where I did my testing, outbound calls that are actually allowed since all resources are within the NSP falsely record as hitting the DenyAll rule. Looking back at my notes, this has been an issue since back in March 2025 so maybe that’s just the way it records or the issue hasn’t yet been remediated.

The other thing to note is when I initially set this all up I got an error in both AI Foundry’s chunking/loading method and AI Search’s. The error complains that an additional header of xms_az_nwperimid was passed and the consuming app wouldn’t allow it. Oddly enough, a second attempt didn’t hit the same error. If you run into this error, try again and open a support ticket so whatever feature on the backend is throwing that error can be cleaned up.

Summing it up

So yeah… NSPs make PaaS to PaaS flows like this way easier for all customers. It especially makes implementing basic network security controls far more simple for customers new to Azure that may not have a mature platform landing zone sitting around.

Here are your takeaways for today:

  1. NSPs give you standard inbound/outbound network controls for PaaS and standardized log schema.
  2. NSPs are especially beneficial to new customers who need to execute quickly with basic network security controls.
  3. Take note as of the date of this blog Azure OpenAI Service support for NSPs is in public preview. You will need to enable the preview flag on the subscription before you go mucking with it in a POC environment. Do not use it in production until it’s generally available. Instructions are in the link.
  4. I did basic testing for this post testing ingestion, searching, and submitting prompts that reference the extra data source property. Ensure you do your own more robust testing before you go counting on this working for every one of your scenarios.
  5. If you want to muck around with it yourself, you can use the code in this repo to deploy a similar lab as I’ve built above. Remember to enable the preview flag and wait a good day before attempting to deploy the code.

Well folks, that wraps up this post. In my final post on NSPs, I’ll cover a use case for NSPs to help assist with troubleshooting common connectivity issues.

Thanks!

Azure Authorization – Azure ABAC (Attribute-based Access Control)

This is part of my series on Azure Authorization.

  1. Azure Authorization – The Basics
  2. Azure Authorization – Azure RBAC Basics
  3. Azure Authorization – actions and notActions
  4. Azure Authorization – Resource Locks and Azure Policy denyActions
  5. Azure Authorization – Azure RBAC Delegation
  6. Azure Authorization – Azure ABAC (Attribute-based Access Control)

Welcome back fellow geeks.

I do a lot of learning and educational sessions with my customer base. The volume pretty much demands reusable content which means I gotta build decks and code samples… and worse maintain them. The maintenance piece typically consists of me mentally promising myself to update the content and kicking the can down the road for a few months. Eventually, I get around to updating the content.

This month I was doing some updates to my content around Azure Authorization and decided to spend a bit more time with Azure ABAC (Attribute-based access control). For those of you unfamiliar with Azure ABAC, well it’s no surprise because the use cases are so very limited as of today. Limited as the use cases are, it’s a worthwhile functionality to understand because Microsoft does use it in its products and you may have use cases where it makes sense.

The Dream of ABAC

Let’s first touch briefly on the differences between role-based access control (RBAC) and attribute-based access control (ABAC). Attribute-based access control has been the dream for the security industry for as long as I can remember. RBAC has been the predominant authorization mechanism in a majority of applications over the years. The challenge with RBAC is it has typically translated to basic group membership where an application authorizes a user solely on whether or not the user is in a group. Access to these groups would typically come through some type of request for membership and implementation by a central governance team. Those processes have tended to be not super user friendly and the access has tended to be very coarse-grained.

ABAC meanwhile promised more fine-grained access based upon attributes of the security principal, resource, or whatever your mind can dream up. Sounds awesome right? Well it is, but it largely remained a dream in the mainstream world with a few attempts such as Windows Dynamic Access Control (Before you comment, yeah I get you may have had some cool apps doing this stuff years ago and that is awesome, but let’s stick with the majority). This began to change when cloud came around with the introduction of more modern protocols and standards such as SAML, OIDC, and OAuth. These protocols provide more flexibility with how the identity provider packages attributes about the user in the token delivered to the service provider/resource provider/what have you.

When it came to the Azure cloud, Microsoft went the traditional RBAC path for much of the platform. User or group gets placed in Azure RBAC role and user(s) gets access. I explain Azure RBAC in my other posts on RBAC. There is a bit of flexibility on the Entra ID side for the initial access token via Entra ID Conditional Access, but it’s RBAC in the Azure realm. This was the story for many years of Azure.

In 2021 Microsoft decided something more flexible was needed and introduced Azure ABAC. The world rejoiced… right? Nah, not really. While the introduction of ABAC was awesome, its scope of use was and still is extremely limited. As of the date of this blog, ABAC is only usable for Azure Storage blob and queue operations. All is not lost though, there are some great use cases for this feature so it’s important to understand how it works.

How does ABAC work?

Alright, history lesson and complaining about limited scope aside, let’s now explore how the feature works.

ABAC is facilitated through an additional property on Azure RBAC Role Assignment resources. I’m going to assume you understand the ins and outs of role assignments. If you don’t, check out my prior post on the topic. In its most simple sense, an Azure RBAC role assignment is the association of a role to a security principal granting that principal the permissions defined in the role over a particular scope of resources. As I’ve covered previously, role assignments are an Azure resource that have defined sets of properties. The properties we care about for the scope of this discussion are the conditionVersion and condition properties. The conditionVersion property will always have a value of 2.0 for now. The condition property is where we work our ABAC magic.
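To make that concrete, here is a rough Python sketch of where those two properties sit in a role assignment body. The helper name and the placeholder IDs are hypothetical; the point is only that condition and conditionVersion ride along with the usual role definition and principal.

```python
import json


def role_assignment_body(role_definition_id, principal_id, condition):
    """Assemble a role assignment body that carries an ABAC condition."""
    return {
        "properties": {
            "roleDefinitionId": role_definition_id,
            "principalId": principal_id,
            "condition": condition,
            # Per the text above, 2.0 is the only value in use for now
            "conditionVersion": "2.0",
        }
    }


body = role_assignment_body(
    role_definition_id="<role-definition-resource-id>",  # placeholder
    principal_id="<principal-object-id>",                # placeholder
    condition=(
        "(!(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/"
        "containers/blobs/read'} AND !SubOperationMatches{'Blob.List'}))"
    ),
)
print(json.dumps(body, indent=2))
```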

The condition property is made up of a series of conditions which each consist of an action and one or more expressions. The logic for conditions is kinda weird, so I’m going to walk you through it using some of the examples from documentation as well as a complex condition I threw together. First, let’s look at the general structure.

Structure of conditions used in ABAC

In the above image you can see the basic building blocks of a condition. Looks super confusing and complicated right? I know it did to me at first. Thankfully, the kind souls who write the public documentation broke this down in a more friendly programming-like way.

Far more simple explanation of conditions

In each condition property we first have the action line where the logic looks to see if the action being performed by the security principal doesn’t (note the exclamation point which negates what’s in the parentheses) match the action we’re applying the conditions to. You’ll commonly see a line like:

!(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})

This line is saying if the action isn’t blobs/read (which would be a data plane call to read the contents of the blob) then the line should evaluate to true. If it evaluates to true, then the access is allowed and the expressions are not evaluated any further.
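The double negative in that action line trips people up, so here is the same logic as a small Python sketch. It is a simplification: the real ActionMatches supports more than exact string equality, and the function names are made up for illustration. The expression is passed as a callable so you can see it only runs when the action line evaluates to false.

```python
BLOB_READ = "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"


def action_guard(action, suboperation=None):
    """Mirror of !(ActionMatches{blobs/read} AND !SubOperationMatches{'Blob.List'})."""
    action_matches = action == BLOB_READ
    suboperation_matches = suboperation == "Blob.List"
    return not (action_matches and not suboperation_matches)


def condition_allows(action, suboperation, expression):
    """Allow if the guard is true; otherwise fall through to the expression."""
    return action_guard(action, suboperation) or expression()


write = "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write"
print(action_guard(write))                   # action isn't blobs/read
print(action_guard(BLOB_READ))               # reading blob content
print(action_guard(BLOB_READ, "Blob.List"))  # listing blobs is excluded
```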

After this line we have the expression, which is only evaluated when the first line evaluates to false (which in the example I just covered would mean the security principal is trying to read the content of a blob). The expressions support four categories of what Microsoft refers to as condition features, which are in various states of GA (general availability) and preview (refer to the documentation for those details). These four categories include:

  • Requests
  • Environment
  • Resource
  • Principal (security principal)

These four categories give you a ton of flexibility. Requests covers the details of the request to storage, such as limiting a user to specific blob prefixes based on the prefix within the request. Environment can be used to limit the user to accessing the resource from a specific Private Endpoint or over Private Link in general (think defense-in-depth here). The resource feature exposes properties of the resource being accessed, of which I find blob index tags to be the most flexible. Lastly, we have security principal and this is where you can muck around with custom security attributes in Entra ID (very cool feature if you haven’t touched it).

In a given condition we can have multiple expressions and within the condition property we can string together multiple conditions with AND and OR logic. I’m a big believer in going big or going home, so let’s take a look at a complex condition.

Diving into the Deep End

Let’s say I have a whole bunch of data I need to make available via blobs in an Azure Storage Account. I have a strict requirement to use a single storage account and the blobs I’m going to store have different data classifications denoted by a blob index tag key named access_level. Blobs without this key are accessible by everyone while blobs classified high, medium, or low are only accessible by users with approval for the same or higher access levels (example: user with high access level can access high, medium, low, and data with no access level). Lastly, I have a requirement that data at the high access level can only be accessed during business hours.

I use a custom security attribute in Entra ID called accesslevel under an attribute set named organization to denote a user’s approved access level.

Here is how that policy would break down.

My first condition is built to allow users to read any blobs that don’t have the access_level tag.

# Condition that allows users within scope of the assignment access to documents that do not have an access level tag
(
  (
    # If the action being performed doesn't match blobs/read then result in true and allow access
    !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
  )
  OR 
  (
    # If the blob doesn't have a blob index tag with a key of access_level then allow access
    NOT @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags&$keys$&] ForAnyOfAnyValues:StringEquals {'access_level'}
  )
)

If the blob does have an access tag, I want to start incorporating my logic. The next condition I include allows users with the accesslevel security attribute set to high to read blobs with a blob index tag of access_level equal to low or medium. I also allow them to read blobs tagged with high if it’s between 9AM and 5PM EST.

# Condition that allows users within scope of the assignment to access medium and low tagged data if they have a custom 
# security attribute of accesslevel set to high. High data can also be read within working hours
OR
(
 (
   # If the action being performed doesn't match blobs/read then result in true and allow access
   !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
 )
 OR 
 (
   # If the blob has an index tag of access_level with a value of medium or low allow the user access if they have a custom security
   # attribute of organization_accesslevel set to high
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'medium', 'low'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'high'
 )
 OR
 (
   # If the blob has an index tag of access_level with a value of high allow the user access if they have a custom security
   # attribute of organization_accesslevel set to high and it's within working hours
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'high'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'high'
   AND
   @Environment[UtcNow] DateTimeGreaterThan '2025-06-09T12:00:00.0Z'
   AND
   @Environment[UtcNow] DateTimeLessThan '2045-06-09T21:00:00.0Z'
 )
)

Next up is users with medium access level. These users are granted access to data tagged medium or low.

# Condition that allows users within scope of the assignment to access medium and low tagged data if they have a custom 
# security attribute of accesslevel set to medium
OR
(
  (
    # If the action being performed doesn't match blobs/read then result in true and allow access
    !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
  )
  OR 
  (
    # If the blob has an index tag of access_level with a value of medium or low allow the user access if they have a custom security
    # attribute of organization_accesslevel set to medium
    @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'medium', 'low'}
    AND
    @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'medium'
 )
)

Finally, I allow users with low access level to access data tagged as low.

# Condition that allows users within scope of the assignment to access low tagged data if they have a custom 
# security attribute of accesslevel set to low
OR
(
 (
   # If the action being performed doesn't match blobs/read then result in true and allow access
   !(ActionMatches{'Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read'} AND !SubOperationMatches{'Blob.List'})
 )
 OR 
 (
   # If the blob has an index tag of access_level with a value of low allow the user access if they have a custom security
   # attribute of organization_accesslevel set to low
   @Resource[Microsoft.Storage/storageAccounts/blobServices/containers/blobs/tags:access_level<$key_case_sensitive$>] ForAnyOfAnyValues:StringEquals {'low'}
   AND
   @Principal[Microsoft.Directory/CustomSecurityAttributes/Id:organization_accesslevel] StringEquals 'low'
 )
)

Notice how I separated each condition using OR. If the first condition resolves to false, then the next condition is evaluated until access is granted or all conditions are exhausted. Neat right?
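If the chain of ORs is hard to hold in your head, it can help to restate the intended outcome as plain code. The Python sketch below models the read decision only (non-read actions are always allowed by the action line) and treats the time restriction as a boolean the caller supplies. It is a paraphrase of the policy’s intent for sanity-checking, not a parser for the condition language, and the function name is made up.

```python
def can_read_blob(user_level, blob_level, in_allowed_window):
    """Decide a blob read the way the four OR'd conditions are meant to.

    user_level: value of the organization_accesslevel attribute, or None
    blob_level: value of the access_level blob index tag, or None if untagged
    in_allowed_window: True when the time-based expressions are satisfied
    """
    # Condition 1: untagged blobs are readable by everyone in scope
    if blob_level is None:
        return True
    # Condition 2: high users read medium/low any time, high only in the window
    if user_level == "high":
        if blob_level in ("medium", "low"):
            return True
        return blob_level == "high" and in_allowed_window
    # Condition 3: medium users read medium/low
    if user_level == "medium":
        return blob_level in ("medium", "low")
    # Condition 4: low users read low only
    if user_level == "low":
        return blob_level == "low"
    # No condition matched, so access is not granted
    return False


print(can_read_blob("high", "high", in_allowed_window=False))
print(can_read_blob("medium", "low", in_allowed_window=False))
```

Running a handful of user and tag combinations through a function like this before writing the real condition makes it easier to spot a missing branch.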

Summing it up

So why should you care about this if its use case is so limited? Well, you should care because that is ABAC’s use case today, and it could be expanded in the future. Furthermore, ABAC allows you to be more granular in how you grant access to data in Azure Storage (again, blob or queue only). You likely have use cases where this can provide another layer of security to further constrain a security principal’s access. You’ll also see these conditions used in Microsoft’s products such as AI Foundry.

The other reason it’s helpful to understand the language used for conditions is that conditions are expanding into other services such as Azure RBAC Delegation (which if you aren’t using you should be). While the language can be complex, it does make sense if you muck around with it a bit.

A final bit of guidance here, don’t try to write conditions by hand. Use the visual builder in the Azure Portal as seen below. It will help you get some basic conditions in place that you can further modify directly via the code view.

Azure Portal Condition Builder

Next time you’re locking down an Azure storage account, think about whether or not you can further restrict humans and non-humans alike based on the attributes discussed today. The main places I’ve seen this used are for user profiles, further restricting user access to specific subsets of data (similar to the one I walked through above), or even adding an additional layer of network security baked directly into the role assignment itself.

See you next post!