Blocking API Key Access in Azure OpenAI Service

Blocking API Key Access in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Hello folks! I’m back again with another post on the Azure OpenAI Service. I’ve been working with a number of Microsoft customers in regulated industries helping to get the service up and running in their environments. A question that frequently comes up in this conversations is “How do I prevent usage of the API keys?”. Today, I’m going to cover this topic.

I’ve covered authentication in the AOAI (Azure OpenAI Service) in a past post so read that if you need the gory details. For the purposes of this post, you need to understand that AOIA supports both API keys and AAD (Azure Active Directory) authentication. This dual support is similar to other Azure PaaS (platform-as-a-service) offerings such as Azure Storage, Azure CosmosDB, and Azure Search. When the AOAI instance is created, two API keys are generated which provide full permissions at the data plane. If you’re unfamiliar with the data plane versus management plane, check out my post on authorization.

Azure Portal showing AOAI API Keys

Given the API keys provide full permissions at the data plane monitoring and controlling their access is critical. As seen in my logging post monitoring the usage of these keys is no simple task since the built-in logging is minimal today. You could use a custom APIM (Azure API Management) policy to include a portion of the API key to track its usage if you’re using the advanced logging pattern, but you still don’t have any ability to restrict what the person/application can do within the data plane like you can when using AAD authentication and authorization. You should prefer AAD authentication and authorization where possible and tightly control API key usage.

In my authorization and logging posts I covered how to control and track who gets access to the API keys. I’ve also covered how APIM can be placed in front of an AOAI instance to enforce AAD authentication. If you block network access to the AOAI service to anything but APIM (such as using a Private Endpoint and Network Security Group) you force the usage of APIM which forces the use of AAD authentication preventing API keys from being used.

Azure OpenAI Service and Azure API Management Pattern

The major consideration of the pattern above is it breaks the Azure OpenAI Studio as of today (this may change in the future). The Azure OpenAI Studio is an GUI-based application available within the Azure Portal which allows for simple point-and-click actions within the AOAI data plane. This includes actions such as deploying models and sending prompts to a model through a GUI interface. While all this is available via API calls, you will likely have a user base that wants access a simple GUI to perform these types of actions without having to code to them. To work around this limitation you have to open up network access from the user’s endpoint to the AOAI instance. Opening up these network flows allows the user to bypass APIM which means the user could use an API key to make calls to the AOAI service. So what to do?

In every solution in tech (and life) there is a screwdriver and a hammer. While the screwdriver is the optimal way to go, sometimes you need the hammer. With AOAI the hammer solution is to block usage of API key-based authentication at the AOAI instance level. Since AOAI exists under the Azure Cognitive Services framework, it benefits from a poorly documented property called disableLocalAuth. Setting this property to true blocks the API key-based authentication completely. This property can be set at creation or after the AOAI instance has been deployed. You can set it via PowerShell or via a REST call. Below is code demonstrating how to set it using a call to the Azure REST API.

body=$(cat <<EOF
{
    "properties" : {
        "disableLocalAuth": true
    }
}

az rest --method patch --uri "https://management.azure.com/subscriptions/SUBSCRIPTION_ID/resourceGroups/RESOURCE_GROUP/providers/Microsoft.CognitiveServices/accounts/AOAI_INSTANCE_NAME?api-version=2021-10-01" --body $body              

The AOIA instance will take about 2-5 minutes to update. Once the instance finishes updating, all calls to it using API key-based authentication will receive an error such as seen below when using the OpenAI Python SDK.

You can re-enable the usage of API keys by setting the property back to false. Doing this will update the AOAI resource again (around 2-5 minutes) and the instance will begin accepting API keys. Take note that turning the setting off and then back on again WILL cycle the API keys so don’t go testing this if you have applications in production using API keys today.

Mission accomplished right? The user or application can only access the AOAI instance using AAD authentication which enforced granular Azure RBAC authroization. Heck, there is even an Azure Policy available you can use to audit whether AOAI instances have had this property set.

There is a major consideration with the above method. While you’ve blocked access to the API keys, you’re still created a way to circumvent APIM. This means you lose out on the advanced logging provided by APIM and you’ll have to live with the native logging. You’ll need to determine whether that risk is acceptable to your organization.

My suggestion would be to use this control in combination with strict authorization and network controls. There should be a very limited set of users with permissions directly on the AOAI resource and the direct network access to the resource should be tightly controlled. The network control could be accomplished by creating a shared jump host users that require this access could use. Key thing is you treat access to the Azure OpenAI Studio as an exception versus the norm. I’d imagine Microsoft will evolve the Azure OpenAI Studio deployment options over time and address the gaps in native logging. For today, this provides a reasonable compromise.

I did encounter one “quirk” with this option that is worth noting. The account I used to lab this all out had the Owner role assignment at the subscription level. With this account I was able to do whatever I wanted within the AOAI data layer when disableLocalAuth was set to false. When I set disableLocalAuth to true I was unable to make data plane calls (such as deploying new models). When I granted my user one of the data plane roles (such as Azure Cognitive Service OpenAI Contributor) I was able to perform data plane operations once again. It seems like setting this property to true enforces a rule which requires being granted specific data plane-level permissions. Make sure you understand this before you modify the property.

Well folks that concludes this blog post. Here are your key takeaways:

  1. API Key-based authentication can be blocked at the AOIA instance by setting the disableLocalAuth property to true. This setting can be set at deployment or post deployment and takes 2-5 minutes to take effect. Switching the value of this property from true to false will regenerate the API keys for the instance.
  2. The Azure OpenAI Studio requires the user’s endpoint have direct network access to the AOAI instance. This is because it uses the user’s endpoint to make specific API calls to the data plane. You can look at this yourself using debug mode in your browser or a local proxy like Fiddler. Direct network access to the AOAI instance means you will only have the information located in the native logs for the activities the user performs.
  3. Setting disableLocalAuth to true enforces a requirement to have specific data plane-level permissions. Owner on the subscription or resource group is not sufficient. Ensure you pre-provision your users or applications who require access to the AOAI instance with the built-in Azure RBAC roles such as Azure Cognitive Services OpenAI User or a custom role with equivalent permissions prior to setting the option to true.

Thanks folks and have a great weekend!

APIM and Azure OpenAI Service – Azure AD

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Hello folks!

I’m back with another entry on the Azure OpenAI Service (AOAI). In my previous posts, I’ve focused on the native security features that Microsoft provides to its customers to secure their instance of the service. However, in this post, I’ll be taking a slightly different approach. I’ll be walking you through a pattern that can be used to supplement those native features using Azure API Management (APIM)

For those who are unfamiliar with APIM, it is Azure’s API Gateway PaaS (platform-as-a-service) offering. Like any good API Gateway, it provides an abstraction layer away from backend APIs, which allows you to add additional authentication/authorization controls, throttling, transform requests, and log information from the requests and responses. In this post, I’ll be covering how the authentication/authorization controls can be used to supplement what is provided natively in AOAI. 

I’ve covered authentication in the AOAI in a previous post, refer to that post for the gory details. For the purposes of this post, you need to understand at the data plane it supports both Azure AD authentication/Azure RBAC authorization and authentication with two API keys created when the service is instantiated.

Azure OpenAI Service Authentication and Authorization

To my knowledge, there is no way to disable the usage of API keys. Moreover, as I’ve discussed in my logging post, it is extremely difficult to trace back to what is using the API keys because the source IP address is masked and the calls aren’t associated with specific API keys or Azure AD identities. This makes it critically important to control who has access to the API keys. In my post on authorization within the service, I cover this conversation in more detail, and yes, it can be done with Azure RBAC.

Sample log entry from Azure Open AI Service


Controlling access should be your first priority. However, wouldn’t it be great to restrict access to the service to Azure AD authentication only? This is where APIM comes in. APIM is placed between the application calling the AOAI service and the AOAI service. This establishes a man-in-the-middle scenario where APIM can analyze and modify the request and responses between the application and AOAI service.

APIM and AOAI Data Flow

The image above is an example of this pattern. Here, the calling application is provisioned with either a service principal (running outside of Azure) or a managed identity (running within Azure or integrated with Azure Arc). Instead of pointing the application directly to the Azure OpenAI Service, it is pointed to a custom domain configured on the APIM instance, and the APIM instance is configured to front the Azure OpenAI Service API. My peer Jake Wang put together some wonderful instructions on how to set this piece up in this repository.

Once APIM is set up to pass traffic along to the AOAI service, a custom APIM policy can be introduced to start controlling access. Since the goal is to limit access to the AOAI service to applications using an Azure AD identity, the validate-jwt policy can be used. This policy captures and extracts the JSON Web Token (bearer token) and parses the content within it to verify that the token was issued by the issuer specified in the policy. 

The policy would be structured as shown below. In this policy, any request made to the API must include a JWT issued by the Azure AD tenant (you can find your tenant ID here). Additionally, the policy filters to ensure that the token is intended for the Cognitive Services OAuth scope, which AOAI falls under. If the request doesn’t include the JWT issued by the tenant, the user receives a 403.

<!--
    This sample policy enforces Azure AD authentication and authorization to the Azure OpenAI Service. 
    It limits the authorization tokens issued by the organization's tenant for Cognitive Services.
    The authorization token is passed on to the Azure OpenAI Service ensuring authorization to the actions within
    the service are limited to the permissions defined in Azure RBAC.

    You must provide values for the AZURE_OAI_SERVICE_NAME and TENANT_ID parameters.
-->
<policies>
    <inbound>
        <base />
        <set-backend-service base-url="https://{{AZURE_OAI_SERVICE_NAME}}.openai.azure.com/openai" />
        <validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden">
            <openid-config url="https://login.microsoftonline.com/{{TENANT_ID}}/v2.0/.well-known/openid-configuration" />
            <issuers>
                <issuer>https://sts.windows.net/{{TENANT_ID}}/</issuer>
            </issuers>
            <required-claims>
                <claim name="aud">
                    <value>https://cognitiveservices.azure.com</value>
                </claim>
            </required-claims>
        </validate-jwt>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

If you followed the instructions in the repository I linked above, you can enforce this policy for the API you created as seen below.

APIM Policy In Place

Once the policy is in place, you can test it by attempting to authenticate to the APIM API endpoint and specifying an AOAI API key. In the image below, an attempt is made to call the endpoint with an API key.

APIM Denying Request with API Keys

Success! Even though the API key is valid, APIM is rejecting the request before it ever reaches the AOAI instance, preventing the API keys from being used. 

This pattern also passes the bearer token on to the AOAI service, so the RBAC you configure on your AOAI instance will be enforced. In my post on authorization, I provide some guidance on which built-in RBAC roles make since and which permissions you’ll want to carefully distribute.

What’s even cooler is that now that the application is forced to authenticate using Azure AD, the application ID can be extracted. If there are multiple applications hitting the same AOAI instance, different throttling can be applied on a per-application basis instead of having them share one big pool of request/token allowance at the AOAI service level

This can be achieved with a policy similar to the one shown below. This policy looks for specific app IDs in the bearer token and applies different throttling based on the application.

<!--
    This sample policy enforces Azure AD authentication and authorization to the Azure OpenAI Service. 
    It limits the authorization tokens issued by the organization's tenant for Cognitive Services.
    The authorization token is passed on to the Azure OpenAI Service ensuring authorization to the actions within
    the service are limited to the permissions defined in Azure RBAC.

    The sample policy also sets different throttling limits per application id. This is useful when an organization
    has multiple applications consuming the same instance of the Azure OpenAI Service. This sample shows throttling
    rules for two separate applications.

    You must provide values for the AZURE_OAI_SERVICE_NAME, TENANT_ID, and CLIENT_ID_APP parameters. You can add multiple
    lines for as many applications as you need to throttle.
-->
<policies>
    <inbound>
        <base />
        <set-backend-service base-url="https://{{AZURE_OAI_SERVICE_NAME}}.openai.azure.com/openai" />
        <validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden">
            <openid-config url="https://login.microsoftonline.com/{{TENANT_ID}}/v2.0/.well-known/openid-configuration" />
            <issuers>
                <issuer>https://sts.windows.net/{{TENANT_ID}}/</issuer>
            </issuers>
            <required-claims>
                <claim name="aud">
                    <value>https://cognitiveservices.azure.com</value>
                </claim>
            </required-claims>
        </validate-jwt>
        <choose>
            <when condition="@(context.Request.Headers.GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", string.Empty).Equals("{{CLIENT_ID_APP1}}"))">
                <rate-limit-by-key calls="1" renewal-period="60" counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", string.Empty))" increment-condition="@(context.Response.StatusCode == 200)" />
            </when>
        </choose>
        <choose>
            <when condition="@(context.Request.Headers.GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", string.Empty).Equals("{{CLIENT_ID_APP2}}"))">
                <rate-limit-by-key calls="10" renewal-period="60" counter-key="@(context.Request.Headers.GetValueOrDefault("Authorization","").Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", string.Empty))" increment-condition="@(context.Response.StatusCode == 200)" />
            </when>
        </choose>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

While the above is impressive, it only works if the application is restricted from direct access to the Azure OpenAI Service. To achieve this, I recommend creating a Private Endpoint for the AOAI service and wrapping a Network Security Group around the subnet (NSGs are now supported for private endpoints) to block access to the resources within the subnet to anything but the APIM instance. Keep in mind that the APIM instance needs to be able to access resources within the virtual network, which means that an APIM needs to be deployed in internal mode. The architecture could look similar to the image below.

APIM and Azure OpenAI Service with Private Networking

One thing to note is that if access is blocked as described above, it will break the AOAI studio feature within the Azure Portal. This is because calls to the data plane of the AOAI service are now blocked. A workaround could be to use a jump host or shared server if you need to continue supporting that feature. However, that opens up the risk that someone could write some code while on that machine and use the API keys. 

Let me sum up what we learned today:

  • APIM policies can be used to enforce Azure AD authentication and can block the use of API keys.
  • You must lock down the Azure OpenAI Service to just APIM to make this effective. Remember this will break access to the Studio within the Azure Portal.
  • Since you’re forcing Azure AD authentication, you can use the application id to add custom throttling.

That’s all for this post. The policy samples used in this blog have been uploaded to this repository on GitHub. Feel free to experiment with them and build upon them. If you end up building upon them and doing anything interesting, do reach out and let me know. I’m always interested in geeking out! In my next post, I’ll cover how to use an APIM policy to create custom logging that can be delivered to an Event Hub and consumed by the upstream service of your choice. Have a great week!

Logging in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Updates:

  • 1/18/2024 – Logs now include Entra ID security principal objectid property in RequestResponse log

Welcome back fellow geeks.

Over the past few weeks I’ve done a series of posts on the Azure OpenAI Service covering some of the security features of the service. In my first post I gave an overview of what security controls Microsoft makes available for customers to configure to secure their instance of the service. In the second and third posts I did deep dives into the authentication and authorization capabilities of the service. Tonight I’m going to cover the logging capabilities of the service.

Let’s jump right in!

The Azure OpenAI Service emits both logs and metrics. For the purposes of this post I’ll be covering logs. I’ll cover the metrics and monitoring of the service in another post if there is a community interest. Logs emitted by the service have been integrated with the diagnostic setting feature. For those unfamiliar with the diagnostic settings feature, it provides a very simple way to deliver logs and metrics emitted by an Azure service to an Azure Storage Account, Log Analytics Workspace, or to an Event Hub (common use case for passing on to a SIEM like Splunk). In the image below, you can see I’m sending all of the logs and metrics emitted from the service to a Log Analytics Workspace.

Diagnostic Settings

In the image above you can see that the Azure OpenAI Service emits three types of logs which include audit logs, request and response logs, and trace logs. As of the date of this blog all of these logs are sent to the AzureDiagnostics table if you opt to send this logs to a Log Analytics Workspace, so dust off your Kusto skills.

Let’s first take a look at the audit logs, because I know that’s where your security focused eyes darted to. I want to remind you this is a very new service and lots of improvements are coming. Yeah, I did that. I pulled a sales dude move. Seriously though, the audit logging is very limited and likely not what you’d hope for as of the date of this blog. The only events that seem to be logged to the Audit Log for the service are when a ListKeys operation is performed. The operation means a security principal accessed the API Keys. The API keys are used to authenticate to the data plane of the service and do not allow for granular authorization at that plane. Check out my last two posts on authentication and authorization if that sentence doesn’t make sense. Unfortunately, the identity that accessed the API key isn’t listed in the log entry which makes it pretty useless in its current state. Below is a sample entry.

Azure OpenAI Service Audit Log Entry Example

Making this even more useless, this operation is also logged in the Azure Activity Log. The log entry within the Activity Log does include the security principal that performed the action so you’ll want to watch for that activity there. I imagine over time the audit log will be improved to capture more operations and associate those operations to a security principal.

Activity Log Entry Showing List Keys Operation

Next I’m going to cover the Request and Response Logs. This log set is really interesting because likely your expectation is the same as mine was that these would include details around prompts sent to the models and information on the response such as the number of tokens consumed. While it does operations around requests for things such a completions or summarizations, it also captures a ton of other events that would likely be more suited for the audit log. Additionally, the data it captures about these actions is extremely limited.

Let’s take a look at a log entry where I requested the model complete a sentence for me. In my code I’m calling the API using an Azure AD service principal NOT an API key with the shattered hope that the log entry would capture the service principal I’m using.

1/18/2024 – The Entra ID object id is now included in the RequestResponse log entry! Hooray!

Prompt and Response Log Entry

In the above log entry we don’t get any information to correlate the operation back an entity even when using Azure AD authentication. All we can see is the completion action occurred at a specific time and resulted in a success status code. You’ll also see there is a CallerIPAddress field. This will include the first three octets of the IP address called the service but not the last octet. Kinda weird it’s being masked like this, but I guess that’s better than nothing? (Not really, but hey it’s a new service).

Before you ask, no, the content of the prompts and responses are not logged in any of these logs.

There is one additional field of relevance I couldn’t fit within the above screenshot and that’s properties_s. The only real useful information on this is total response time the service took to return an answer to the user. I hoped this would have had some information around tokens used, but sadly it does not.

properties_s field of a Prompt and Response Log Entry

Besides prompts and responses, this log seems to capture other data plane operations. This includes everything from activities users have performed around uploading files to the service to train fine-tuned models, activities around fine-tuned models (listing, creation, deletion), creation of embeddings, and management of models deployed to the service. Most of these operations should be in the Audit Log in my opinion. I’m not sure why they’re included in this log, but they are. No, none of these operations include details as to who performed these actions beyond the first three octets of the IP address.

Lastly, there is the trace log. I have no idea what’s logged in there because I have yet generate any trace log data. If you know what gets logged in there, let me know in the comments.

So yes folks, there are some serious gaps in the logging for the service today. However, the service is new and the underlining technology is still pretty new as well so we can’t expect perfection out of the gates. My advice to customers has been to build the logging they need into whatever application is fronting the user access and to lock the service down from an authorization perspective so that the only access to the service comes through that application.

My peer Jake Wang has come up with a creative solution to address some of the logging gaps in the service by placing an API Management instance in front of it. With this design anything communicating with the Azure OpenAI Service instance has to go through APIM. Within APIM you can do whatever fancy logging you want to do, toss in some additional throttling to specific user requests, and lots of other cool stuff. It’s a great workaround while the Product Group improves the native logging. Check out my recent post for some of the gotchas of APIM logging for the Azure OpenAI Service.

If you have a different API Gateway like Mulesoft you could use this same pattern with that instead of APIM.

Well folks that wraps things up. I hope you got some value out of this post and I’d encourage you to make your voices heard by submitting feedback to the product group on how you’d like to see the logging improved for the service.

Thanks for reading!

Authorization in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Hello folks!

The fun with the new Azure OpenAI Service continues! I’ve been lucky enough to have been tapped to help a number of Microsoft financial services customers with getting the Azure OpenAI Service in place with the appropriate infrastructure and security controls. In the process, I get to learn from a ton of smart people in the AI space. It’s truly been one of the highlights of my 20-year career.

Over the past few weeks I’ve been posting about what I’ve learned, and today I’m going to continue with that. In my first post on the service I gave a high level overview of the security controls Microsoft makes available to the customer to secure their instance of Azure OpenAI Service. In my second post I dove deep into how the service handles authentication and how Azure Active Directory (Azure AD) can be used to improve over the built-in API key-based authentication. Today I’m going to cover authorization and demonstrate how using Azure AD authentication lets you take advantage of granular authorization with Azure RBAC.

Let’s dig in!

As I covered in my last post, the Azure OpenAI Service has both a management plane and data plane. Each plane supports different types of authentication (process of verifying the identity of a user, process, or device, often as a prerequisite to allowing access to resources in an information system) and authorization (The right or a permission that is granted to a system entity to access a system resource). Operations such as swapping to a customer-managed key, enabling a private endpoint, or assigning a managed identity to the service occur within the management plane. Activities such as uploading training data or issuing a prompt to a model occur at the data plane. Each plane uses a different API endpoint. The image below will help you visualize the different planes.

Azure OpenAI Service Management and Data Planes

As illustrated above, authorization within the management plane is handled using Azure RBAC because authentication to that plane requires Azure AD-based authentication. Here we can limit the operations occurring at the management plane a security principal (user, service principal, managed identity, Azure Active Directory group (local or synchronized from on-premises) can perform by using Azure RBAC. For those of you coming from the AWS world, and where the Azure OpenAI Service may be your first venture into Azure, Azure RBAC is Azure’s authorization solution. It’s similar to an AWS IAM Policy. Let’s take a look at a built-in RBAC role that a customer might grant a data scientist who will be using the Azure OpenAI Service.

{
    "id": "/subscriptions/de90ea7d-a9c3-4957-8c96-XXXXXXXXXXXX/providers/Microsoft.Authorization/roleDefinitions/a97b65f3-24c7-4388-baec-2e87135dc908",
    "properties": {
        "roleName": "Cognitive Services User",
        "description": "Lets you read and list keys of Cognitive Services.",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.CognitiveServices/accounts/listkeys/action",
                    "Microsoft.Insights/alertRules/read",
                    "Microsoft.Insights/diagnosticSettings/read",
                    "Microsoft.Insights/logDefinitions/read",
                    "Microsoft.Insights/metricdefinitions/read",
                    "Microsoft.Insights/metrics/read",
                    "Microsoft.ResourceHealth/availabilityStatuses/read",
                    "Microsoft.Resources/deployments/operations/read",
                    "Microsoft.Resources/subscriptions/operationresults/read",
                    "Microsoft.Resources/subscriptions/read",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.Support/*"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/*"
                ],
                "notDataActions": []
            }
        ]
    }
}

Let’s briefly walkthrough each property. The id property is the unique resource name assigned to this role definition. Next up we have the name property and description properties which need no explanations. The assignableScopes property determines at which scope an RBAC role can be assigned. Typical scopes include management groups, subscriptions, resource groups, and resources. Built-in roles will always have an assignable scope of “/” which denotes the RBAC role can be assigned to any management group, subscription, resource group, or role.

I’ll spend a bit of time on the permissions property. The permissions property contains a few different child properties including actions, notActions, dataActions, and notDataActions. The actions property lists the management plane operations allowed by the role while the dataActions lists the data plane operations allowed by the role. The notActions and notDataActions are interesting in that they are used to strip permissions out of the actions or dataActions. For example, say you granted a user full data plane operations to an Azure Key Vault but didn’t want them to have the ability to delete keys. You could to this by giving the user the dataAction of Microsoft.KeyVaults/* and notDataAction of Microsoft.KeyVaults/keys/purge/action. Take note this is NOT an explicit deny. If the user gets this permission in another way through assignment of a different RBAC role the user will be able to perform the action. At this time, Azure does not have a generally available feature that allows for an explicit deny like AWS IAM and what does exist in preview has an extremely narrow scope such that it isn’t very useful.

When you’re ready to assign a role to a security principal (user, service principal, managed identity, Azure Active Directory group (local or synchronized from on-premises) you create what is called a role assignment. A role assignment associates an Azure RBAC Role Definition to a security principal and scope. For example, in the below image I’ve created an RBAC Role Assignment for the Cognitive Services User Role at the resource group scope for the user Carl Carlson. This grants Carl the permission to perform the operations listed in the role definition above to any resource within the resource group, including the Azure OpenAI Resource.

Azure RBAC Role Assignment

Scroll back and take a look at the role definition, notice any risky permission? If you noticed the permission Microsoft.CognitiveServices/accounts/listkeys/action (remember that the Azure OpenAI Service falls under the Cognitive Services umbrella), grab yourself a cookie. As I’ve covered previously, every instance of the Azure OpenAI Service comes with two API keys. These API keys allow for authentication to the instance at the data plane level, can’t be limited in what they can do, and are very difficult to ever track back to who used them. You will want to very tightly control access to those API keys so be wary of who you give this role out to and may want to instead create a similar custom role but without this permission.

The are two other roles which are specific two the Azure OpenAI Service are the Cognitive Services OpenAI Contributor and Cognitive Services OpenAI User. Let’s look at the contributor role first.

{
    "id": "/providers/Microsoft.Authorization/roleDefinitions/a001fd3d-188f-4b5d-821b-XXXXXXXXXXXX",
    "properties": {
        "roleName": "Cognitive Services OpenAI Contributor",
        "description": "Full access including the ability to fine-tune, deploy and generate text",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.Authorization/roleAssignments/read",
                    "Microsoft.Authorization/roleDefinitions/read"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/accounts/OpenAI/*"
                ],
                "notDataActions": []
            }
        ]
    }
}

The big difference here is this role doesn’t grant much at the management plane. While this role may seem appealing to give to a data scientist because it doesn’t allow access to the API keys, it also doesn’t allow access to the instance metrics. I’ll talk about this more when I do a post on logging and monitoring in the service, but access to the metrics are important for the data scientists. These metrics allow them to see how much volume they’re doing with the service which can help them estimate costs and avoid hitting API limits.

Under the dataActions you can see this role allows all data plane operations. These operations include uploading training data for the creation of fine-tuned models. If you don’t want your users to have this access, then you can either strip the permissions Microsoft.CognitiveServices/accounts/OpenAI/files/import/action or grant the user the next role I’ll talk about.

One interesting thing to note is that while this role grants all data actions, which include data plane permissions around deployments, users with this role cannot deploy models to the instance. An error will be thrown that the user does not have the Microsoft.CognitiveServices/accounts/deployments/write permission. I’m not sure if this by design, but if anyone has a workaround for it, let me know in the comments. It would seem like if you want the user to deploy a model, you’ll need to model a custom role after this role and add that permissions.

The last role I’m going to cover is the Cognitive Services OpenAI User role. Let’s look at the permissions for this one.

{
    "id": "/providers/Microsoft.Authorization/roleDefinitions/5e0bd9bd-7b93-4f28-af87-XXXXXXXXXXXX",
    "properties": {
        "roleName": "Cognitive Services OpenAI User",
        "description": "Ability to view files, models, deployments. Readers can't make any changes They can inference",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.Authorization/roleAssignments/read",
                    "Microsoft.Authorization/roleDefinitions/read"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/accounts/OpenAI/*/read",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/completions/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/search/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/generate/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/completions/write",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/search/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/completions/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/embeddings/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/completions/write"
                ],
                "notDataActions": []
            }
        ]
    }
}

Like the contributor role, this role is very limited with management plane permissions. At the data plane level, this role really allows for issuing prompts and not much else. This role is great a non-human application role assigned via service principal or managed identity. It will allow the application to issue prompts and not much else. You don’t have to worry about a user exploiting this role to access training data you may have uploaded or making any modification to the Azure OpenAI Service instance.

Well folks that wraps this up. Let’s sum up what we’ve learned:

  • The Azure OpenAI Service supports fine-grained authorization through Azure RBAC at both the management plane and data plane when the security principal is authenticated through Azure AD.
  • Avoid using API keys where possible and leverage Azure RBAC for authorization. You can make it much more fine-grained, layer in the controls provided by Azure AD on top of it, and associate the usage of the service back to user (kinda as we’ll see in my post on logging).
  • Tightly control access to the API keys. I’d recommend any role you give to a data scientist or an application that you strip out the listkeys permissions.
  • I’d recommend creating a custom role for human users modeled after the Cognitive Services User role but without the listkeys permission. This will grant the user access to the full data plane and allow access to management plane pieces such as metrics. You can optionally be granular with your dataActions and leave out the files permissions to prevent human users from uploading training data.
  • I’d recommend using the built-in Cognitive Services OpenAI User role for service principals and managed identities assigned to applications. It grants only the permissions these applications are likely going to need and nothing more.
  • I’d avoid using notActions and notDataActions since it’s not an explicit deny and it’s very difficult to determine an effective user’s access in Azure without another tool like Entra Permissions Management.

Well folks, I hope this post has helped you better understand authorization in the service and how you could potentially craft it to align with least privilege.

Next post up will be around logging.

Have a great night!