Granular Chargebacks in Azure OpenAI Service

Granular Chargebacks in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Updates:

Yes folks, it’s time for yet another Azure OpenAI Service post. This time around I’m going to cover a pattern that can help with operationalizing the service by collecting and analyzing logging data for proper internal chargebacks to the many business units you likely have requesting the service. You’ll want to put on your nerd hat today because we’re going to need to dive a bit in the weeds for this one.

Let me first address why this post is even necessary. The capabilities provided by the AOAI (Azure OpenAI Service) have the feel of a core foundational technology, almost feeling as necessary as basic networking and PKI (public key infrastructure). The service has usages in almost every portion of the business and, very likely, every business unit is asking you for AI at this point.

Beyond the business demand, the architecture of the Azure OpenAI Service lends itself well to being centralized. Each instance offers the same set of static models and the data sent to and returned from the models is ephemeral. Unless you are creating fine-tuned models (which should be a very small percentage of customers), there isn’t any data stored by the service. Yes, there is default storage and human review of prompts and completions for abuse, but customers can opt out of this process. Additionally, as of the date of this blog, customers do not have access to those stored prompts and completions anyway so the risk of one compromise of those 30-days of stored prompts and completions due to a failed customer control doesn’t exist. Don’t get me wrong, there are legitimate reasons to create business unit specific instances for the service for edge use cases such as the creation of fine-tuned models. There are also good arguments to be made to create specific instances for compliance boundaries and separating production from non-production. However, you should be looking at consolidating instances where possible and providing it as a centralized core service.

Now if you go down the route I suggested above, you’ll run into a few challenges. Two of most significant challenges are throttling limits per instance and chargebacks. Addressing the throttling problem isn’t terribly difficult if you’re using the APIM (Azure API Management) pattern I mentioned in my last post. You can enforce specific limits on a per application basis when using Azure AD authentication at APIM on the frontend and you can use a very basic round robin-like load balancing APIM policy at the backend to scale across multiple Azure OpenAI Service instances. The chargeback problem is a bit more difficult to solve and that’s what I’ll be covering in the rest of the post.

The AOAI Service uses a consumption model for pricing which means the more you consume, the more you pay. If you opt to centralize the service, you’re going to need a way to know which app is consuming which amount of tokens. As I covered in my logging post, the native logging capabilities of the AOAI service are lacking as of the date of this blog post. The logs don’t include details as to who made a call (beyond an obfuscated IP address) or the number of tokens consumed by a specific call. Without this information you won’t be able to determine chargebacks. You should incorporate some of this logging directly into the application calling the AOAI service, but that logging will likely be application centric where the intention is to trace a specific call back to an individual user. For a centralized service, you’re likely more interested in handling chargebacks at the enterprise level and want to be able to associate specific token consumption back to a specific business unit’s application.

I took some time this week and thought about how this might be able to be done. The architecture below is what I came up with:

Azure OpenAI Service Chargeback Architecture

APIM and APIM custom policies are the key components of this architecture that make chargebacks possible. It is used to accomplish two goals:

  1. Enforce Azure AD Authentication and Authorization to the AOAI endpoint.
  2. Provide detailed logging of the request and response sent to the service.

Enforcing Azure AD authentication and authorization gives me the calling application’s service principal or managed identity identifier which allows me to correlate the application back to a specific business unit. If you want the details on that piece you can check out my last post. I’ve also pushed the custom APIM policy snippet to GitHub if you’d like to try it yourself.

The second goal is again accomplished through a custom APIM policy. Since APIM sits in the middle of the conversation it gets access to both the request from the application to the AOAI service and the response back. Within the response from a Completion or ChatCompletion the API returns the number of prompt, completion, and total tokens consumed by a specific request as can be seen below.

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "This is a test message.",
        "role": "assistant"
      }
    }
  ],
  "created": 1684425128,
  "id": "chatcmpl-7HaDAS0JUZKcAt2ch2GC2tOJhrG2Q",
  "model": "gpt-35-turbo",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 6,
    "prompt_tokens": 14,
    "total_tokens": 20
  }
}

Awesome, the information we need is there but how do we log it? For that, you can use an APIM Logger. An APIM Logger is an integration with a specific instance of Azure Application Insights or Azure Event Hub. The integration allows you to specify the logger in an APIM policy and send specific data to that integrated service. For the purposes of this integration I choose the Azure Event Hub. The reason being I wanted to allow logging large events (the integration supports up to 200KB messages) in case I wanted to capture the prompt or completion and I wanted the flexibility to integrate with an upstream service for ETL (extract, transform, load).

Setting the up the logger isn’t super intuitive if you want to use a managed identity for APIM to authenticate to the Azure Event Hub. Once the logger is created, you can begin calling it in the APIM policy. Below is an example of the APIM policy I used to parse the request and response and extract the information I was interested in.

        <log-to-eventhub logger-id="chargeback" partition-id="1">@{
                var responseBody = context.Response.Body?.As<JObject>(true);
                return new JObject(
                    new JProperty("event-time", DateTime.UtcNow.ToString()),
                    new JProperty("appid", context.Request.Headers.GetValueOrDefault("Authorization",string.Empty).Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", string.Empty)),
                    new JProperty("operation", responseBody["object"].ToString()),
                    new JProperty("model", responseBody["model"].ToString()),
                    new JProperty("modeltime", context.Response.Headers.GetValueOrDefault("Openai-Processing-Ms",string.Empty)),
                    new JProperty("completion_tokens", responseBody["usage"]["completion_tokens"].ToString()),
                    new JProperty("prompt_tokens", responseBody["usage"]["prompt_tokens"].ToString()),
                    new JProperty("total_tokens", responseBody["usage"]["total_tokens"].ToString())
                ).ToString();
        }</log-to-eventhub>

In the policy above I’m extracting the application’s client id from the access token generated by Azure Active Directory for access to the Azure OpenAI Service. Recall that I have the other policy snippet in place I mentioned earlier in this post in place to force the application to authenticate and authorize using Azure AD. I then grab the pieces of information from the response that I would find useful in understanding the costs of the service and each app’s behavior within the AOAI service.

Now that the logs are being streamed to the Event Hub, you need something to pick them up. You have a lot of options in this space. You could use Azure Data Factory, custom function, Logic App, SIEM like Splunk, and many others. What you choose to do here really depends on where you want to put the data and what you want to do with it prior to putting it there. To keep it simple for this proof-of-concept, I chose the built-in Azure Stream Analytics integration with Event Hub.

The integration creates a Stream Analytics Job that connects to the Event Hub, does small amount of transformation in setting the types for specific fields, and loads the data into a PowerBI dataset.

Azure Stream Analytics and Event Hub Integration

Once the integration was setup, the requests and responses I was making to the AOAI services began to populate in the Power BI dataset. I was then able to build some really basic (and very ugly) visuals to demonstrate the insights this pattern provides for chargebacks. Each graphic shows the costs accumulated by individual applications by their application id.

Power BI Report Showing Application by Application Costs

Pretty cool right? Simple, easy to implement, and decent information to work from.

Since this was a POC, I cut some corners on the reporting piece. For example, I hardcoded the model pricing into some custom columns. If I were to do this at the enterprise level, I’d be supplementing this information from data pulled from the Microsoft Graph and the Azure Retail Pricing REST API. From the Microsoft Graph I’d pull additional attributes about the service principal / managed identity such as a more human readable name. From the Azure Retail Pricing REST API I’d pull down the most recent prices on a per model basis. I’d also likely store this data inside something like Cosmos or Azure SQL to provide for more functionality. From a data model perspective, I’d envision a “enterprise-ready” data model version of the pattern looking like the below.

Possible data model

The key challenge I set out to address here was how to get the data necessary to do chargebacks and what could I do with that data once I got it. Mission accomplished!

Well folks, that covers it. I’d love to see someone looking for a side project with more data skills than me (likely any human being breathing air today) build out the more “full featured” solution using a similar data model to what I referenced above. I hope this pattern helps point your organization in the right direction and spurs some ideas as to how you could solve the ETL and analysis part within your implementation of this pattern.

I’m always interested in hearing about cool solutions. If you come up with something neat, please let me know in the comments or each out on LinkedIn.

Have a great week!

Logging in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Updates:

  • 1/18/2024 – Logs now include Entra ID security principal objectid property in RequestResponse log

Welcome back fellow geeks.

Over the past few weeks I’ve done a series of posts on the Azure OpenAI Service covering some of the security features of the service. In my first post I gave an overview of what security controls Microsoft makes available for customers to configure to secure their instance of the service. In the second and third posts I did deep dives into the authentication and authorization capabilities of the service. Tonight I’m going to cover the logging capabilities of the service.

Let’s jump right in!

The Azure OpenAI Service emits both logs and metrics. For the purposes of this post I’ll be covering logs. I’ll cover the metrics and monitoring of the service in another post if there is a community interest. Logs emitted by the service have been integrated with the diagnostic setting feature. For those unfamiliar with the diagnostic settings feature, it provides a very simple way to deliver logs and metrics emitted by an Azure service to an Azure Storage Account, Log Analytics Workspace, or to an Event Hub (common use case for passing on to a SIEM like Splunk). In the image below, you can see I’m sending all of the logs and metrics emitted from the service to a Log Analytics Workspace.

Diagnostic Settings

In the image above you can see that the Azure OpenAI Service emits three types of logs which include audit logs, request and response logs, and trace logs. As of the date of this blog all of these logs are sent to the AzureDiagnostics table if you opt to send this logs to a Log Analytics Workspace, so dust off your Kusto skills.

Let’s first take a look at the audit logs, because I know that’s where your security focused eyes darted to. I want to remind you this is a very new service and lots of improvements are coming. Yeah, I did that. I pulled a sales dude move. Seriously though, the audit logging is very limited and likely not what you’d hope for as of the date of this blog. The only events that seem to be logged to the Audit Log for the service are when a ListKeys operation is performed. The operation means a security principal accessed the API Keys. The API keys are used to authenticate to the data plane of the service and do not allow for granular authorization at that plane. Check out my last two posts on authentication and authorization if that sentence doesn’t make sense. Unfortunately, the identity that accessed the API key isn’t listed in the log entry which makes it pretty useless in its current state. Below is a sample entry.

Azure OpenAI Service Audit Log Entry Example

Making this even more useless, this operation is also logged in the Azure Activity Log. The log entry within the Activity Log does include the security principal that performed the action so you’ll want to watch for that activity there. I imagine over time the audit log will be improved to capture more operations and associate those operations to a security principal.

Activity Log Entry Showing List Keys Operation

Next I’m going to cover the Request and Response Logs. This log set is really interesting because likely your expectation is the same as mine was that these would include details around prompts sent to the models and information on the response such as the number of tokens consumed. While it does operations around requests for things such a completions or summarizations, it also captures a ton of other events that would likely be more suited for the audit log. Additionally, the data it captures about these actions is extremely limited.

Let’s take a look at a log entry where I requested the model complete a sentence for me. In my code I’m calling the API using an Azure AD service principal NOT an API key with the shattered hope that the log entry would capture the service principal I’m using.

1/18/2024 – The Entra ID object id is now included in the RequestResponse log entry! Hooray!

Prompt and Response Log Entry

In the above log entry we don’t get any information to correlate the operation back an entity even when using Azure AD authentication. All we can see is the completion action occurred at a specific time and resulted in a success status code. You’ll also see there is a CallerIPAddress field. This will include the first three octets of the IP address called the service but not the last octet. Kinda weird it’s being masked like this, but I guess that’s better than nothing? (Not really, but hey it’s a new service).

Before you ask, no, the content of the prompts and responses are not logged in any of these logs.

There is one additional field of relevance I couldn’t fit within the above screenshot and that’s properties_s. The only real useful information on this is total response time the service took to return an answer to the user. I hoped this would have had some information around tokens used, but sadly it does not.

properties_s field of a Prompt and Response Log Entry

Besides prompts and responses, this log seems to capture other data plane operations. This includes everything from activities users have performed around uploading files to the service to train fine-tuned models, activities around fine-tuned models (listing, creation, deletion), creation of embeddings, and management of models deployed to the service. Most of these operations should be in the Audit Log in my opinion. I’m not sure why they’re included in this log, but they are. No, none of these operations include details as to who performed these actions beyond the first three octets of the IP address.

Lastly, there is the trace log. I have no idea what’s logged in there because I have yet generate any trace log data. If you know what gets logged in there, let me know in the comments.

So yes folks, there are some serious gaps in the logging for the service today. However, the service is new and the underlining technology is still pretty new as well so we can’t expect perfection out of the gates. My advice to customers has been to build the logging they need into whatever application is fronting the user access and to lock the service down from an authorization perspective so that the only access to the service comes through that application.

My peer Jake Wang has come up with a creative solution to address some of the logging gaps in the service by placing an API Management instance in front of it. With this design anything communicating with the Azure OpenAI Service instance has to go through APIM. Within APIM you can do whatever fancy logging you want to do, toss in some additional throttling to specific user requests, and lots of other cool stuff. It’s a great workaround while the Product Group improves the native logging. Check out my recent post for some of the gotchas of APIM logging for the Azure OpenAI Service.

If you have a different API Gateway like Mulesoft you could use this same pattern with that instead of APIM.

Well folks that wraps things up. I hope you got some value out of this post and I’d encourage you to make your voices heard by submitting feedback to the product group on how you’d like to see the logging improved for the service.

Thanks for reading!

Authorization in Azure OpenAI Service

This is part of my series on GenAI Services in Azure:

  1. Azure OpenAI Service – Infra and Security Stuff
  2. Azure OpenAI Service – Authentication
  3. Azure OpenAI Service – Authorization
  4. Azure OpenAI Service – Logging
  5. Azure OpenAI Service – Azure API Management and Entra ID
  6. Azure OpenAI Service – Granular Chargebacks
  7. Azure OpenAI Service – Load Balancing
  8. Azure OpenAI Service – Blocking API Key Access
  9. Azure OpenAI Service – Securing Azure OpenAI Studio
  10. Azure OpenAI Service – Challenge of Logging Streaming ChatCompletions
  11. Azure OpenAI Service – How To Get Insights By Collecting Logging Data
  12. Azure OpenAI Service – How To Handle Rate Limiting
  13. Azure OpenAI Service – Tracking Token Usage with APIM
  14. Azure AI Studio – Chat Playground and APIM
  15. Azure OpenAI Service – Streaming ChatCompletions and Token Consumption Tracking
  16. Azure OpenAI Service – Load Testing

Hello folks!

The fun with the new Azure OpenAI Service continues! I’ve been lucky enough to have been tapped to help a number of Microsoft financial services customers with getting the Azure OpenAI Service in place with the appropriate infrastructure and security controls. In the process, I get to learn from a ton of smart people in the AI space. It’s truly been one of the highlights of my 20-year career.

Over the past few weeks I’ve been posting about what I’ve learned, and today I’m going to continue with that. In my first post on the service I gave a high level overview of the security controls Microsoft makes available to the customer to secure their instance of Azure OpenAI Service. In my second post I dove deep into how the service handles authentication and how Azure Active Directory (Azure AD) can be used to improve over the built-in API key-based authentication. Today I’m going to cover authorization and demonstrate how using Azure AD authentication lets you take advantage of granular authorization with Azure RBAC.

Let’s dig in!

As I covered in my last post, the Azure OpenAI Service has both a management plane and data plane. Each plane supports different types of authentication (process of verifying the identity of a user, process, or device, often as a prerequisite to allowing access to resources in an information system) and authorization (The right or a permission that is granted to a system entity to access a system resource). Operations such as swapping to a customer-managed key, enabling a private endpoint, or assigning a managed identity to the service occur within the management plane. Activities such as uploading training data or issuing a prompt to a model occur at the data plane. Each plane uses a different API endpoint. The image below will help you visualize the different planes.

Azure OpenAI Service Management and Data Planes

As illustrated above, authorization within the management plane is handled using Azure RBAC because authentication to that plane requires Azure AD-based authentication. Here we can limit the operations occurring at the management plane a security principal (user, service principal, managed identity, Azure Active Directory group (local or synchronized from on-premises) can perform by using Azure RBAC. For those of you coming from the AWS world, and where the Azure OpenAI Service may be your first venture into Azure, Azure RBAC is Azure’s authorization solution. It’s similar to an AWS IAM Policy. Let’s take a look at a built-in RBAC role that a customer might grant a data scientist who will be using the Azure OpenAI Service.

{
    "id": "/subscriptions/de90ea7d-a9c3-4957-8c96-XXXXXXXXXXXX/providers/Microsoft.Authorization/roleDefinitions/a97b65f3-24c7-4388-baec-2e87135dc908",
    "properties": {
        "roleName": "Cognitive Services User",
        "description": "Lets you read and list keys of Cognitive Services.",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.CognitiveServices/accounts/listkeys/action",
                    "Microsoft.Insights/alertRules/read",
                    "Microsoft.Insights/diagnosticSettings/read",
                    "Microsoft.Insights/logDefinitions/read",
                    "Microsoft.Insights/metricdefinitions/read",
                    "Microsoft.Insights/metrics/read",
                    "Microsoft.ResourceHealth/availabilityStatuses/read",
                    "Microsoft.Resources/deployments/operations/read",
                    "Microsoft.Resources/subscriptions/operationresults/read",
                    "Microsoft.Resources/subscriptions/read",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.Support/*"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/*"
                ],
                "notDataActions": []
            }
        ]
    }
}

Let’s briefly walkthrough each property. The id property is the unique resource name assigned to this role definition. Next up we have the name property and description properties which need no explanations. The assignableScopes property determines at which scope an RBAC role can be assigned. Typical scopes include management groups, subscriptions, resource groups, and resources. Built-in roles will always have an assignable scope of “/” which denotes the RBAC role can be assigned to any management group, subscription, resource group, or role.

I’ll spend a bit of time on the permissions property. The permissions property contains a few different child properties including actions, notActions, dataActions, and notDataActions. The actions property lists the management plane operations allowed by the role while the dataActions lists the data plane operations allowed by the role. The notActions and notDataActions are interesting in that they are used to strip permissions out of the actions or dataActions. For example, say you granted a user full data plane operations to an Azure Key Vault but didn’t want them to have the ability to delete keys. You could to this by giving the user the dataAction of Microsoft.KeyVaults/* and notDataAction of Microsoft.KeyVaults/keys/purge/action. Take note this is NOT an explicit deny. If the user gets this permission in another way through assignment of a different RBAC role the user will be able to perform the action. At this time, Azure does not have a generally available feature that allows for an explicit deny like AWS IAM and what does exist in preview has an extremely narrow scope such that it isn’t very useful.

When you’re ready to assign a role to a security principal (user, service principal, managed identity, Azure Active Directory group (local or synchronized from on-premises) you create what is called a role assignment. A role assignment associates an Azure RBAC Role Definition to a security principal and scope. For example, in the below image I’ve created an RBAC Role Assignment for the Cognitive Services User Role at the resource group scope for the user Carl Carlson. This grants Carl the permission to perform the operations listed in the role definition above to any resource within the resource group, including the Azure OpenAI Resource.

Azure RBAC Role Assignment

Scroll back and take a look at the role definition, notice any risky permission? If you noticed the permission Microsoft.CognitiveServices/accounts/listkeys/action (remember that the Azure OpenAI Service falls under the Cognitive Services umbrella), grab yourself a cookie. As I’ve covered previously, every instance of the Azure OpenAI Service comes with two API keys. These API keys allow for authentication to the instance at the data plane level, can’t be limited in what they can do, and are very difficult to ever track back to who used them. You will want to very tightly control access to those API keys so be wary of who you give this role out to and may want to instead create a similar custom role but without this permission.

The are two other roles which are specific two the Azure OpenAI Service are the Cognitive Services OpenAI Contributor and Cognitive Services OpenAI User. Let’s look at the contributor role first.

{
    "id": "/providers/Microsoft.Authorization/roleDefinitions/a001fd3d-188f-4b5d-821b-XXXXXXXXXXXX",
    "properties": {
        "roleName": "Cognitive Services OpenAI Contributor",
        "description": "Full access including the ability to fine-tune, deploy and generate text",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.Authorization/roleAssignments/read",
                    "Microsoft.Authorization/roleDefinitions/read"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/accounts/OpenAI/*"
                ],
                "notDataActions": []
            }
        ]
    }
}

The big difference here is this role doesn’t grant much at the management plane. While this role may seem appealing to give to a data scientist because it doesn’t allow access to the API keys, it also doesn’t allow access to the instance metrics. I’ll talk about this more when I do a post on logging and monitoring in the service, but access to the metrics are important for the data scientists. These metrics allow them to see how much volume they’re doing with the service which can help them estimate costs and avoid hitting API limits.

Under the dataActions you can see this role allows all data plane operations. These operations include uploading training data for the creation of fine-tuned models. If you don’t want your users to have this access, then you can either strip the permissions Microsoft.CognitiveServices/accounts/OpenAI/files/import/action or grant the user the next role I’ll talk about.

One interesting thing to note is that while this role grants all data actions, which include data plane permissions around deployments, users with this role cannot deploy models to the instance. An error will be thrown that the user does not have the Microsoft.CognitiveServices/accounts/deployments/write permission. I’m not sure if this by design, but if anyone has a workaround for it, let me know in the comments. It would seem like if you want the user to deploy a model, you’ll need to model a custom role after this role and add that permissions.

The last role I’m going to cover is the Cognitive Services OpenAI User role. Let’s look at the permissions for this one.

{
    "id": "/providers/Microsoft.Authorization/roleDefinitions/5e0bd9bd-7b93-4f28-af87-XXXXXXXXXXXX",
    "properties": {
        "roleName": "Cognitive Services OpenAI User",
        "description": "Ability to view files, models, deployments. Readers can't make any changes They can inference",
        "assignableScopes": [
            "/"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.CognitiveServices/*/read",
                    "Microsoft.Authorization/roleAssignments/read",
                    "Microsoft.Authorization/roleDefinitions/read"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.CognitiveServices/accounts/OpenAI/*/read",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/completions/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/search/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/generate/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/engines/completions/write",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/search/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/completions/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/embeddings/action",
                    "Microsoft.CognitiveServices/accounts/OpenAI/deployments/completions/write"
                ],
                "notDataActions": []
            }
        ]
    }
}

Like the contributor role, this role is very limited with management plane permissions. At the data plane level, this role really allows for issuing prompts and not much else. This role is great a non-human application role assigned via service principal or managed identity. It will allow the application to issue prompts and not much else. You don’t have to worry about a user exploiting this role to access training data you may have uploaded or making any modification to the Azure OpenAI Service instance.

Well folks that wraps this up. Let’s sum up what we’ve learned:

  • The Azure OpenAI Service supports fine-grained authorization through Azure RBAC at both the management plane and data plane when the security principal is authenticated through Azure AD.
  • Avoid using API keys where possible and leverage Azure RBAC for authorization. You can make it much more fine-grained, layer in the controls provided by Azure AD on top of it, and associate the usage of the service back to user (kinda as we’ll see in my post on logging).
  • Tightly control access to the API keys. I’d recommend any role you give to a data scientist or an application that you strip out the listkeys permissions.
  • I’d recommend creating a custom role for human users modeled after the Cognitive Services User role but without the listkeys permission. This will grant the user access to the full data plane and allow access to management plane pieces such as metrics. You can optionally be granular with your dataActions and leave out the files permissions to prevent human users from uploading training data.
  • I’d recommend using the built-in Cognitive Services OpenAI User role for service principals and managed identities assigned to applications. It grants only the permissions these applications are likely going to need and nothing more.
  • I’d avoid using notActions and notDataActions since it’s not an explicit deny and it’s very difficult to determine an effective user’s access in Azure without another tool like Entra Permissions Management.

Well folks, I hope this post has helped you better understand authorization in the service and how you could potentially craft it to align with least privilege.

Next post up will be around logging.

Have a great night!

Protecting Azure Backups with Resource Guard – Part 2

Welcome to part 2 of my series on Azure Backup and Resource Guard. In my first post, I gave some background on the value proposition Resource Guard provides. In this post I’ll be walking through how to configure it and demonstrating it in action. I’ll be using the lab pictured below which is based off the Azure Backup demo lab I have up on GitHub.

Lab used to demonstrate Resource Guard

For this demonstration, I used my jogcloud.com Azure AD tenant as the primary tenant where the Recovery Service Vaults will be stored. I have a small environment in my at home lab that is configured with a Windows Active Directory forest that is synchronized to that tenant. The geekintheweeds.com Azure AD tenant acted as the secondary tenant containing the Resource Guard. Homer Simpson will act as the owner of the Resource Guard (perhaps he is someone in Information Security) in the geekintheweeds.com tenant and Maggie Simpson will own the subscription containing the workload and Recovery Services Vault (emulating a typical application owner) in the jogcloud.com tenant.

Since both users are sourced from my on-premises Windows Active Directory forest and synchronized to jogcloud.com, I used Azure AD’s B2B feature to invite Homer Simpson into the geekintheweeds.com tenant. Once added through the B2B process, I setup an RBAC assignment granting Homer Simpson Owner of the subscription containing the Resource Guard.

Role Assignment in geekintheweeds.com tenant

Next up I went to create the Resource Guard in the GIW tenant and received the error below when attempting to create the Resource Guard resource in the GIW Tenant.

Error creating Resource Guard

The error message was pretty useless (not atypical of Azure errors). In this case, this error is due to the Microsoft.DataProtection resource provider not being registered in this subscription. Once registering the resource provider in question, the Resource Guard was provisioned without issue. One thing to note is that while the Resource Guard can exist within a different resource group, subscription, or tenant, it must be in the same region as the Recovery Services Vault it is protecting.

Registering Microsoft.DataProtection resource provider

Once the Resource Guard resource was provisioned, I needed to give Maggie Simpson appropriate access over the Recovery Services Vault in the jogcloud.com tenant. For that, I provisioned a role assignment for the Contributor role on the subscription to the System Operators group synchronized from on-premises that Maggie is a member of.

Role Assignment in jogcloud.com tenant

Navigating to the Recovery Services Vault, Maggie Simpson is capable of modifying the soft delete feature (note it’s disabled for the purposes of this lab so the resources can be easily removed).

Prior to Resource Guard, Maggie Simpson can modify Soft Delete

Switching over to Homer Simpson and logging into the jogcloud.com tenant, I attached the Resource Guard to the Recovery Services Vault. The interface allowed me to select the tenant (directory) and the Resource Guard resource.

Enabling Resource Guard

I had configured the Resource Guard to protect all the operations it was able to. This is confirmed with the confirmation in the screenshot below.

Resource Guard configured to protect sensitive operations

Once the Resource Guard is enabled, navigating back to the Security Settings of the Recovery Services Vault displays a message that Resource Guard is now enabled.

Switching back to Maggie Simpson, I then attempted to modify a backup policy which is considered a sensitive operation and is restricted via the association to the Resource Guard. As expected, Maggie Simpson cannot make the modifications to the Recovery Services Vault without authenticating to the GIW tenant and being authorized on the Resource Guard.

Maggie Simpson blocked from using Resource Guard

Success! Here we demonstrated how we can restrict sensitive operations on an Azure Backup Vault even when a user has a high level of permissions on the vault itself. Resource Guard provides a great security mechanism to establish the blast radius that works best for your organization. I could even add Azure AD PIM to the mix in the support just-in-time access of the Resource Guard. The support for cross-tenant specifically is very unique because there are very few (if any) other Azure resources that allow you to leverage a cross-tenant security boundary.

If you’re using Azure Backup, you should be using Resource Guard as an additional security control to add to the controls you are enforcing with Azure RBAC and Azure Policy. With the introduction into preview of the immutable vaults, Microsoft is providing a variety of tools to take a defense-in-depth approach using the technical features that make the most sense to the needs of your organization. WIth that bit of sales speak, I’m out for the weekend.

Thanks for reading!