Microsoft Foundry – Publishing Agents To Teams Deep Dive – Part 2

Posted on May 22, 2026 by mattfeltonma

With Memorial Day weekend coming quickly, I wanted to get the second post to this series out before the knowledge my late nights with Red Eye coffee brought leaks from my brain. In my last post I did a walkthrough of the Publishing Agents To Teams feature of the Foundry Agent Service within Microsoft Foundry. In that post I covered the Portal experience, broke open some of the black box as to my understanding of the workflow that happens underneath when you push the publish button, and talked through the AI Bot Service’s role in the feature. For this post I’m going to cover a possible network architecture to support this feature when security controls are required around inbound and outbound network access (I mentioned a few last post), the network flow for that architecture, and some of the switches and knobs you can turn to add additional security beyond the basic layer 4 network controls. After that, I’m going to walk through a Jupyter Notebook I put together than shows you how to perform the steps behind the publish button programmatically. If you haven’t read my last post, Graeme’s blog post on this topic, and Moim’s blog post on reverse engineering Bot services you should do that before you try to tackle this one.

A Possible Architecture

As I covered in my last post, when we want to make an agent available in Teams we need Teams to be capable of reaching it. In this design, with Teams interacting with the AI Bot Service which relays the information to our agent, this means we need to make the agent’s messaging endpoint available to the Microsoft public backbone (i.e. it needs to be exposed via a public IP address). Graeme provided one architecture to accomplish this which will work for a number of folks. I foresee a few different architectural options:

APIM v2 configured for public inbound and regional vnet integration
APIM classic configured for external mode
App Gateway with a public listener with APIM v2 VNet Injected or PE + regional vnet integration behind it
App Gateway with a public listener with APIM classic VNet injection behind it
Firewall DNAT + APIM v2 VNet Injected or PE + regional vnet integration behind it
Firewall DNAT + APIM classic VNet injection behind it

For this post I’m going to focus on the 3rd option which has Application Gateway sitting in front of an v2 tier API Management. I like this pattern because I get the WAF, SNI, host-based routing, and path-based routing benefits of an App Gw (Application Gateway) and avoid slapping a public IP on my APIM (API Management). There is more complexity to this pattern, but more security and flexibility always comes with more complexity, right?

Generally my traffic will look something like the image you seen down below.

The green line is the incoming message from the Microsoft Teams. We see it is relayed from the Teams Service to the public IP address of the AppGw via the Bot Service. From there, we send it through the APIM and finally on to the Private Endpoint for Foundry which tunnels it on to the Microsoft-managed compute behind the Foundry Agent Service.

The blue line is the response from the agent. You’ll notice there are two blue lines. Based on the logs in my firewall when I tested this, I did not see the response traffic back to the Bot Service (this would be the endpoint in the serviceurl in the JWT received from the Bot Service which should be something like smba.trafficmanager.net). I’m making the assumption that this traffic isn’t egressed through the customer virtual network and instead flows out whatever path Microsoft is providing in the network where the managed compute lives that hosts the agent runtimes. Additionally, you’ll notice a blue line flowing through my virtual network and headed to an FQDN at tenant.api.powerplatform.com. I’m still trying to get clarification on if this flow is truly required and what it’s for.

The first instinct of us old networking farts is to look at this diagram and think this is asymmetric routing. However, in this situation it isn’t because the green and blue flows are separate TCP sessions because the message and response sequence is asynchronous.

Execution of the Architecture

Alright, you now have an understanding of the flow with this architecture. Let’s talk about the cool shit we can do with it. I’ve set the messaging endpoint in my Bot Service resource to https://agent.agw.jogcloud.com/agents/api/projects/sampleproject1/agents/test-manual-publish/endpoint/protocols/activityProtocol?api-version=2025-05-15-preview. What I’ve done is replace my FQDN with my AppGw’s FQDN and I appended /agents after the FQDN to ensure it routes to the proper API on my APIM.

Given we’re starting with AppGw we can use the WAF functionality to validate the source IP address is coming from the Teams service. A simple rule like the below will do that check.

Next, I want to validate the request header of x-ms-tenant-id to validate that the header is present and contains my tenant id.

Next up I have APIM. Here I’ve created an API with an operation named PublishedAgent. The operation is defined as you see below.

Within the operation, I’ve taken Graeme’s policy and made a small tweak to it to validate the serviceurl claim in the JWT and ensure it contains my tenant id.

			
<policies>
    <inbound>
        <base />
        <validate-jwt header-name="Authorization" require-scheme="Bearer">
            <openid-config url="https://login.botframework.com/v1/.well-known/openidconfiguration" />
            <audiences>
                <audience>8fd8ec07-ae24-4038-8771-6d4b85a4b19a</audience>
            </audiences>
            <issuers>
                <issuer>https://api.botframework.com</issuer>
            </issuers>
            <required-claims>
                <claim name="serviceurl" match="all">
                    <value>https://smba.trafficmanager.net/amer/6c80de31-d5e4-4029-93e4-5a2b3c0e1299/</value>
                </claim>
            </required-claims>
        </validate-jwt>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>

		

If we bring it back up out of the weeds and to the high level, here is what we’re doing at each component in the flow.

So there you have it folks, that’s an architecture you could use and some of the details of getting it up and running. Now let’s bounce over and take a look at how to avoid the manual action of “pushing the pretty blue button” and look at how we’d publish a Foundry Agent programmatically.

Programmatic Setup

The kind folks over at the Foundry Agents PG (product group) put together a sample of the steps needed to do this programmatically with PowerShell and Bicep. Since I prefer good ole bash shell, Python, and Terraform I reworked their steps into a Jupyter Notebook which you can find here. There is a sample env file in the repository. You don’t need to populate the client id and client secret unless you want to play around with the commands in the appendix. Those are not required.

The first step in the process is creation of the Bot Service resource in Azure. As I covered in my last post, this resource mainly exists to store metadata about your bot (or agent in this scenario) that the AI Bot Service uses to relay data back and forth between Teams and the agent. You’ll want to create a new Bot Service which will require you have the specific permissions to do that (if you want to go the custom role) or something more generic like Contributor. You’ll also want to make sure the Bot.Service resource provider is registered in your subscription (pretty sure this requires Owner).

I’ve crafted a Terraform template for this step. Before you can create the Bot Service with the template, you need to collect some Entra ID-related information. First, you’ll need to fetch your Entra ID tenant ID. You can do this programmatically by running after logging into az cli using the command below.

az account show --query tenantId -o tsv

Now that you have you’re logged into az cli and you’ve grabbed the tenant id, your next step is to fetch the principal id (or appId) of the Entra ID Agent Identity associated with the Foundry Agent. You’ll associate this identity with the Bot Service resource. Before you do that, you’ll need to get fetch an access token with the appropriate scope.


from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

# Get a token for Foundry scope
credential = DefaultAzureCredential()
scopes = ["https://ai.azure.com"]

user_token = credential.get_token(*scopes)

Next you can use this function to grab the principal_id property.

			
import os
import json
import requests
from dotenv import load_dotenv
# Load environmental variables
load_dotenv(override=True)
# Function that gets the agent object
def get_foundry_agent(account_name: str, project_name: str, agent_name: str, token: str):
    """This function retrieves a Foundry agent by name from a Foundry project
    Args:
        account_name (str): The name of the Foundry account
        project_name (str): The name of the Foundry project
        agent_name (str): The name of the Foundry agent to retrieve
        token (str): The authentication token to use for the API request
    Returns:
        dict: The Foundry agent details if found, otherwise None
    """
    response = requests.get(
        f"https://{account_name}.services.ai.azure.com/api/projects/{project_name}/agents/{agent_name}?api-version=v1",
        headers={
            "Content-Type": "application/json", 
            "Authorization": f"Bearer {token}"
        }
    ) 
    if response.status_code == 200:
        return response.json()
    else:
        logging.error(f"Failed to retrieve agent: {response.status_code} - {response.text}")
        return None
# Grab the principal_id of the Entra ID Agent Identity associated with the Foundry Agent  
foundry_account_name = os.getenv("FOUNDRY_ACCOUNT_NAME")
project_name = os.getenv("FOUNDRY_PROJECT_NAME")
agent_name = os.getenv("FOUNDRY_AGENT_NAME")
agent = get_foundry_agent(foundry_account_name, project_name, agent_name, user_token.token)
agent_principal_id = agent.get("instance_identity", {}).get("principal_id")
print(f"Foundry Agent Principal ID: {agent_principal_id}")
print(json.dumps(agent, indent=2))

		

Once you have the tenant id and principal id of the agent identity associated with your Foundry Agent, you are almost ready to create the Bot Service. The last step is formulating your messaging endpoint. It will look something like this:

https://FOUNDRY_ACCOUNT_NAME.services.ai.azure.com/api/projects/PROJECT_NAME/agents/AGENT_NAME/endpoint/protocols/activityProtocol?api-version=2025-05-15-preview

As I showed earlier, you can modify this to change the FQDN to point to your preferred ingress infrastructure and add pathing to the beginning to ensure proper routing through an API Gateway.

Now that you have everything ready to go you can run a Terraform template like the one located here. This will create the Bot Service and Teams channel child object and configure diagnostic settings with delivery to the specified (Log Analytics Workspace).

Once that is complete, you need enable the activity protocol support for your agent. You can do this using the code below:

			
import os
import json
import requests
from dotenv import load_dotenv
# Load environmental variables
load_dotenv(override=True)
# Function that enables the activity protocol for the agent and configures the required Bot Service authorization scheme
def enable_agent_activity_protocol(account_name: str, project_name: str, agent_name: str, token: str):
    """This function enables the activity protocol for a Foundry agent and configures the required Bot Service authorization scheme
    Args:
        account_name (str): The name of the Foundry account
        project_name (str): The name of the Foundry project
        agent_name (str): The name of the Foundry agent to retrieve
        token (str): The authentication token to use for the API request
    Returns:
        dict: The updated Foundry agent details if the update was successful, otherwise None
    """
    # 
    body = {
        "agent_endpoint": {
            "protocols": [
                "responses",
                "activity"
            ],
            "authorization_schemes": [
                {
                    "type": "Entra",
                    "isolation_key_source": {
                        "kind": "Entra"
                    }
                },
                {
                    "type": "BotServiceRbac"
                }
            ]
        }
    }
    response = requests.patch(
        f"https://{account_name}.services.ai.azure.com/api/projects/{project_name}/agents/{agent_name}?api-version=v1",
        headers={
            "Content-Type": "application/merge-patch+json", 
            "Authorization": f"Bearer {token}",
            "Foundry-Features": "AgentEndpoints=V1Preview"
        },
        json=body
    ) 
    if response.status_code == 200:
        return response.json()
    else:
        logging.error(f"Failed to enable agent activity protocol: {response.status_code} - {response.text}")
        return None
# Grab the principal_id of the Entra ID Agent Identity associated with the Foundry Agent  
foundry_account_name = os.getenv("FOUNDRY_ACCOUNT_NAME")
project_name = os.getenv("FOUNDRY_PROJECT_NAME")
agent_name = os.getenv("FOUNDRY_AGENT_NAME")
enabled_agent = enable_agent_activity_protocol(foundry_account_name, project_name, agent_name, user_token.token)
enabled_agent_guid = enabled_agent.get('versions', {}).get("latest", {}).get("agent_guid", {})
print(f"Enabled Agent GUID: {enabled_agent_guid}")
updated_agent_endpoint = enabled_agent.get('agent_endpoint', {})
print(f"Updated Agent Endpoint: {json.dumps(updated_agent_endpoint, indent=2)}")

		

At this point, you have the Bot Service setup and you’ve activated the activity protocol for the agent so its now listening for requests at the messaging endpoint. The last step in the process is to use the publish operation and you will need the Foundry User role for this (as far as I can tell).

What exactly this does is still a bit of a black box for me, but it seems like it’s creating some type of API object to represent the agent in M365 Agent Registry (soon to be rebranded to Agent 365 I’m sure). Some of the APIs I need to poke around with require an Agents 365 license. Once I get that, I’ll update this section with more detail if I find exactly what it’s doing.

import os
import json
import requests
from dotenv import load_dotenv

# Load environmental variables
load_dotenv(override=True)

def publish_agent_teams(
    subscription_id: str,
    resource_group: str,
    account_name: str, 
    project_name: str, 
    location: str,
    agent_name: str, 
    agent_guid: str,
    bot_id: str,
    app_publish_scope: str,
    publish_as_digital_worker: bool,
    app_version: str,
    short_description: str,
    full_description: str,
    developer_name: str,
    developer_website_url: str,
    privacy_url: str,
    terms_of_use_url: str,
    token: str
    ):
    """This function uses the Foundry API to publish a Foundry agent to Microsoft Teams
    Args:
        subscription_id (str): The Azure subscription ID where the Foundry account is provisioned
        resource_group (str): The name of the resource group where the Foundry account is provisioned
        account_name (str): The name of the Foundry account
        project_name (str): The name of the Foundry project
        location (str): The Azure region where the Foundry account is provisioned
        agent_name (str): The name of the Foundry agent to publish
        agent_guid (str): The GUID of the Foundry agent to publish
        bot_id (str): The Microsoft App ID of the Bot registered in Entra ID for this agent
        app_publish_scope (str): The scope to publish the Teams app to, either "Individual" or "Tenant"
        publish_as_digital_worker (bool): Whether to publish the agent as a Digital Worker in Teams, which surfaces it in the Power Virtual Agents app in addition to allowing it to be installed as a standard Teams app
        app_version (str): The version of the Teams app to publish
        short_description (str): A short description of the agent to display in Teams
        full_description (str): A full description of the agent to display in Teams
        developer_name (str): The name of the developer or organization that created the agent, to display in Teams
        developer_website_url (str): The URL for the developer's website, to display in Teams
        privacy_url (str): The URL for the privacy policy for this agent, to display in Teams
        terms_of_use_url (str): The URL for the terms of use for this agent, to display in Teams
        token (str): The Entra ID access token with the scope of https://ai.azure.com/.default to authenticate the API request
    Returns:
        dict: The response from the Foundry API if the publish was successful, otherwise None
    """

    body = {
        "subscriptionId": subscription_id,
        "agentGuid": agent_guid,
        "agentName": agent_name,
        "appRegistrationId": appRegistrationId,
        "botId": bot_id,
        "appPublishScope": app_publish_scope,
        "publishAsDigitalWorker": publish_as_digital_worker,
        "appVersion": app_version,
        "shortDescription": short_description,
        "fullDescription": full_description,
        "developerName": developer_name,
        "developerWebsiteUrl": developer_website_url,
        "privacyUrl": privacy_url,
        "termsOfUseUrl": terms_of_use_url
    }

    response = requests.post(
        url = f"https://{location}.api.azureml.ms/agent-asset/v2.0/subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.MachineLearningServices/workspaces/{account_name}@{project_name}@AML/microsoft365/publish",
        headers={
            "Content-Type": "application/json", 
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        json=body
    ) 

    if response.status_code == 200:
        print("Agent published successfully! Status code: 200")
    else:
        logging.error(f"Failed to publish agent: {response.status_code} - {response.text}")
        return None

publish_response = publish_agent_teams(
    subscription_id = os.getenv("FOUNDRY_SUBSCRIPTION_ID"),
    resource_group = os.getenv("FOUNDRY_RESOURCE_GROUP"),
    account_name = os.getenv("FOUNDRY_ACCOUNT_NAME"),
    project_name = os.getenv("FOUNDRY_PROJECT_NAME"),
    location = os.getenv("FOUNDRY_LOCATION"),
    agent_name = os.getenv("FOUNDRY_AGENT_NAME"),
    agent_guid = enabled_agent_guid,
    bot_id = enabled_agent_guid,
    app_publish_scope = "Tenant",
    publish_as_digital_worker = False,
    app_version = "1.0.0",
    short_description = "This is a sample agent published from Foundry to Teams",
    full_description = "This agent was created in Foundry and published to Microsoft Teams using the Foundry API.",
    developer_name = "Carl Carlson",
    developer_website_url = "https://www.example.com",
    privacy_url = "https://www.example.com/privacy",
    terms_of_use_url = "https://www.example.com/terms",
    token = user_token.token
)

This step is effectively the last step in the Foundry Portal publishing experience. If you installed it for an individual it will be immediately available for that user. If you publish it to the Teams App Catalog (tenant option) it will be put in a pending state until approved via the M365 Admin Portal.

And like magic, you have a programmatic way to emulate the magical blue button in the Foundry portal. If you’re curious as to what that API call is going to an AML (Azure Machine Learning) endpoint, that is because (today at least) Foundry is built on top of AML.

Summing it up

What I’ve hoped you gathered from here is publishing an agent to Teams isn’t as simple as pushing a button. Requirements needs to be gathered, a design needs to be worked out, services chosen, service properties chosen for security and scale, services load tested, and security controls properly implemented and any risks accepted.

You have a ton of flexibility with this design and my take is there is no optimal design. The optimal design is the one that provides you with the user experience you require aligned with the risks your org is willing to accept. If you’re building an agent that is hitting some public data source, maybe you don’t care about any of this infrastructure. Either way, do not just hit the publish button, group up with your peers across security, networking, operations, collaboration, and AI engineering and put your heads together to come up with a design you’re all happy with.

With that, I’m out for Memorial Day weekend. See you next time!

Microsoft Foundry – Publishing Agents To Teams Deep Dive – Part 1

Posted on May 20, 2026 by mattfeltonma

Welcome back folks! Today we’re gonna delve deep into the weeds to look at the current process for publishing Foundry agents to Microsoft Teams. I say current, because Microsoft Foundry and everything surrounding it lives in a dynamic world. Changes come fast and frequently. What I present today, may not be true in two weeks. I attempt to keep my posts up to date, but always remember to check the date of the post and review public documentation for the “official” word. With that disclaimer in play, let’s jump in.

The Background

Microsoft Foundry is a collection of a crapload of different services. I hesitate to call it a “product” because with how big the feature scope is it’s almost a platform rather than a product at this point. You have models-as-service, Foundry tools (FKA as AI Services FKA as Cognitive Services), Foundry Toolbox (LOVE this feature and will be writing something up about it soon), Content Understanding, Foundry IQ (not really Foundry IMO, more so AI Search but marketing loves the term Foundry), and Foundry Agent Service. I’m sure come Microsoft Build and Microsoft Ignite there will be even more in that umbrella. It’s an interesting journey on how this service came to be. You can take a read through my prior post which walks through the evolution of the service. For this post I’m going to focus specifically on Foundry Agents.

The Foundry Agent feature is probably one of the most dynamic features (or sub-service) of Microsoft Foundry because the technology area it supports changes daily which drives needs for the product to adapt and grow. Some of the benefits that pop into my mind of Foundry Agents vs running an agent on your own compute (in an on-premises Kubernetes cluster, in EKS, AKS, ACA, what have you) is:

Shift the management and scaling of the compute to Microsoft
Versioning out-of-the-box
Crank an agent out since 90% of the work is already done for you (Foundry Hosted Agents are another story)
Get access to all the Foundry Agent Tools out of the box (this benefit will likely be phased out with the introduction of Foundry Toolbox IMO)
Ability to directly publish the agent to Microsoft Teams without having to figure out that integration yourself

The final bullet point will be the focus of the rest of the post. Recently, I found time to play with that feature and decided to dive into it after a great blog post by my peer Graeme Foster. I highly recommend you take a read through Graeme’s post and treat his post as the “official” recommendation vs whatever I blather about here. The part that interested me about his write-up was the call out to Azure AI Bot Services. Bot Services is an Azure service I’ve touched a few times, but never really dove deeply into. Last year my buddy Mike Piskorski and I helped a customer onboard a Teams recording bot into a regulated organization’s network. We dove deeply into the networking side of things to get the traffic flowing, but never really dissected the inner workings of it (limited bandwidth, story of my life). Since exposing an agent direct to a user in Microsoft Teams so users can intact with it is a super common ask, and Foundry Agents provide for this out of the box, I thought it would be as good time as any to dig in.

The Portal Experience

The official documentation around the Portal process is documented here, but I wanted to dig into some the guts of it. Like Microsoft has done much of its existence, it likes to make things a “push of the blue button”. This integration is no different. After logging into the Foundry Portal and creating an agent I get a pretty Publish button in the top right hand corner as seen below.

I assume in my head, sweet, let’s do this! I hover over the button to click it, but oh no, the pop up below surfaces. For this button to be available to you, the user needs to have the Contributor or Owner at the resource group or subscription level. We’ll see why in a few minutes.

The pop up message tells me I can’t publish my Foundry resource if public network access is disabled (NOTE: this will be changing at some point). So what does this message mean? Without derailing this entire post with a deep dive into Foundry Agent networking, I’ll keep the explanation brief. For my Foundry deployment, I’ve chosen to block public inbound traffic to the Foundry resource and am forcing everything through a Private Endpoint. This control blocks whatever orchestration this button does today. One option is to enable public networking, push the button, and then disable public network access (not great). There is another programmatic option for customers that have public network access off and walk through in detail in my next post, but for now let’s enable public network access and step through the flow in the Portal.

Once public network access is re-enabled, I’m able to click the magical button. Pushing the button brings up the first window Publish to Teams and Microsoft 365. Here a few things happen. First, we’re given the option to customize what will ultimately be used to provide information about the agent to the Teams App Catalog or manifest file. I’m no Teams guy and won’t fake being one so my dumb guy explanation of the Teams App Catalog is its the central repository for Teams Apps (and agents) available for consumption by an organization’s Teams users. The manifest file is basically the same information put in json form that can be used to sideload the app (or agent in this use case) which is a way you can load the app into Teams for yourself and is typically used in testing scenarios before pushing up to the Teams App Catalog.

The other thing this step does is auto-provision an Bot Service resource in Azure. This is where the requirement for Contributor or Owner comes in (and is another reason why this GUI-drive process won’t work for most enterprises). It places it in the same resource group as the Foundry resource. This is probably not something you want happening automatically and this behavior is likely to change in the future allowing you to select a pre-created Bot Service resource. Think of the Bot Service resource as a metadata resource of sorts (again, my explanation and probably only 50% right, but I got a head nod on the explanation from my excellent and much smarter peer Shaun Callighan so there is that). It will help the Azure AI Bot Service facilitate the communication between Microsoft Teams and the Foundry Agent. I’ll dig into more details on later in this post.

I then hit the Next: Publish Options button and I’m faced with the Publish options window. Here I can choose to publish the agent to Teams just for my user or to publish it to the Teams App Catalog for all users (which will require a Teams administrator to approve). Optionally, I can download the Teams manifest file and further customize it (add a custom icon or something more fancy that is outside my limited Teams knowledge).

Publishing it to the Teams App Catalog will require an administrator to approve it in the Microsoft 365 Admin portal. The request will immediately appear in the Microsoft 365 Admin Portal in the request section as seen below. From there you’ll be given some options as to how you want to distribute it users across Microsoft Teams. After approval and installation, in my experience it can take a fair amount of time (6+ hours – 1 day) for the agent to be available to Teams users to use.

For the purposes of this walkthrough, I’m going to choose the Just you option and then I’ll hit the Publish button. Once complete, you’ll get a message indicating the publish was successful.

Bouncing over to the Microsoft Teams client, I see the new agent available to install.

Once it’s added and I send my first message to the agent, I’m prompted to sign in to Microsoft Foundry. This is triggered because the agent on the other side needs to know who I am in order to authorize me to access the agent.

If my user isn’t authorized (doesn’t hold the Foundry User (FKA as Azure AI User)) Azure RBAC role over the Foundry resource I’m denied and I can’t interact with the agent.

Understanding this user experience and RBAC requirement is important. While the Publish button can be used to push the agent as a Teams App to your users, the users themselves still need to hold the appropriate Azure RBAC role (Foundry User in this scenario or similar level permissions) to interact with them. While this is a tad annoying, it’s actually a nice belt and suspenders security control to ensure only trusted users get access to the agent.

Excellent, so we pushed a button and a lot of stuff happened. Well, what stuff happened? What if, like any normal enterprise, I don’t want to give my business unit contributor or owner at the Azure subscription level? What if, again like any normal enterprise, I have different groups in charge of Foundry, Azure general, and Teams? What if I want to do this programmatically? These were the questions on the top of my mind. So now let’s dive into the weeds and reverse engineer this process.

What the hell is happening when I push this button?

This is naturally what went through my head. Before I annoyed the awesome people within the Foundry product group (and yes, these are some of the nicest and smartest people at Microsoft I’ve dealt with in my years here) I wanted to see if I could figure it out myself.

My first step was to turn on debug mode in the browser and look at the network capture. My hope was that I’d see calls made to the Microsoft Graph API (for Teams stuff), the ARM API for Azure stuff, and Foundry data-plane API for data plane stuff. Instead of that, I saw calls made to what to the following endpoints:

Press Publish to Teams and Microsoft 365 button
- https://ai.azure.com/nextgen/api/query?listBotServicesResolver -> As best I can tell, this is querying to see if a Bot Service already exist.
- https://ai.azure.com/nextgen/api/createBotService -> This makes the call that triggers the creation of the Bot Service resource in Azure.
Press Next: Publish Options
- https://ai.azure.com/nextgen/api/prepareAgentsForTeams -> Generates the manifest file that you can download?
Press Publish
- https://ai.azure.com/nextgen/api/publishAgentToTeams
- https://ai.azure.com/nextgen/api/getPublishedAppDetails

What this told me was the Product Group has built their own orchestration layer on top of whatever is being done to the Microsoft Graph, ARM, and Foundry data-plane APIs. This didn’t get me any closer to figuring out what was going on. I had theories, but no way to validate them. At that point I went to the product group and one of those wonderful human beings began to peel away those layers of the onion by providing a programmatic way to run through process. I read through her code, converted it from PowerShell/Bicep into Python and Terraform and documented this high-level process. I’ll share and walk through all of this in my next post.

This is very high-level (we’ll look at the code-based implementation next post) but it’s the best I could piece together from the programmatic steps. It’s likely missing some steps because the one step I’m not super clear on is the step labeled Pending approval in M365 Admin Portal. The reason that piece is a bit unclear for me is two fold. One, even programmatically, this is done through a Foundry API hiding what’s done in other APIs from me. Two, I’m fairly certain it’s using new features the Microsoft Entra Agent Registry (now a part of Agent 365) and those APIs are largely locked behind Agent 365 licensing which I’m still waiting on approval for my tenant. My theory at this point is the Foundry API call is creating an agent instance within the Entra ID Agent Registry and/or something with CoPilot packages. I’ll add more detail to this if I get more insight into it once I get access to Agent 365.

Either way, once I had that high level workflow out of my brain and on digital paper, I was ready to take the product groups PowerShell / Bicep and rework it into a Jupyter Notebook which I’ll run through next post. Before I go there, I wanted to spend a bit of time on the Bot Service piece since that has always been a real mystery for me and many of my Azure peers.

What does the Bot Service do?

The Azure AI Bot Service (as it’s now called) has historically been used to build bots that can be exposed to Teams. I’m sure there are many other smarter uses, but that’s where I’ve seen it typically pop up in my time at Microsoft. As I mentioned earlier, my buddy Mike and I worked on helping a customer integrate a Teams Recording Bot that used the Bot Service. Today, the hot usage is exposing agents built within CoPilot Studio or Foundry to Teams as chat bots.

From an infra guy’s view, the Bot Service has always been this thing I knew existed, kinda understood how it worked from a network perspective and what it delivered form a value perspective, but really only focused on getting the traffic from Teams to the Bot Service into the application running the Bot Service Framework. Typically, this required exposing an application deployed to the customer’s private network to the Microsoft public backbone so it the Bot Service Connector (they relay piece of the service, my take) can hit the Bot app. This process would typically require placing the application behind a firewall with DNAT, behind an App Gateway (for l7 load balancing, WAF, and header checks) or some other layer 7 load balancer), behind FrontDoor in combination with PrivateLink, or behind something more complex such as a layer 7 load balancer in combination with an API Gateway (such as API Management) to do additional validation of the JWT as mentioned in Graeme’s blog I linked at the beginning of this post. Great, we got packets flowing, but much of the service was still a black box. I wanted to know more.

In my searching of the web, I came across an absolutely amazing blog post by Moim Hossain. Moim goes into an insane amount of detail as to how the Bot Service words under the hood. I’m not going to repeat everything he says, because I really anyone using Azure that will touch the Bot Service should read Moim’s rundown. It is THAT good.

Based on Moim’s blog (yeah I’m going to force you to read it if you want the details), I put together the high level flow of how I believe the Bot Service works. Likely missing pieces, but I feel like it’s more than what’s out there today.

After reading Moim’s blog and referencing the flow above we can see that the Bot Service is acting as a type of relay between the Microsoft Teams Service and the underling Bot Application. We can make a reasonable assumption (key word assumption) that the Foundry Agent integration is working somewhat similar, but with some differences given the additional RBAC check and nature of their push button integration.

If we crack open a Bot Service resource Foundry creates automatically, we can get some insights into how the Bot Service is being configured.

First we see that the messaging endpoint (the endpoint where the Bot Service Connector delivers messages it receives from Teams to) is the set in a format of https://FOUNDRY_ACCOUNT_NAME.services.ai.azure.com/api/projects/PROJECT_NAME/agents/AGENT_NAME/endpoint/protocols/activityprotocol?api-version=2025-11-15-preview. If we disable public network access and lock our Foundry resource behind a Private Endpoint, this URI would be unreachable by the Bot Service Connector. This creates the requirement I discussed earlier where we have to put some other infrastructure in front of those endpoint to make it available to the public IP world and mitigate the risks to do so (such as Graeme’s suggestions in his blog). Enabling Private Link for the Bot Service will not rid you of this requirement because that feature is centered around the use case of Direct Line which is more so used for custom built bots running in App Services (layman’s view) and only locks down the inbound access to the Bot Service. The integration with Teams requires the Bot Service stay public network access enabled, which means the traffic to the Bot App (agent) is going to come from the Microsoft public backbone driving this additional infrastructure requirement. This is where we’d modify the messaging endpoint to point to some other public IP-facing infra and route it to the Foundry Private Endpoint accordingly.

The next thing worth looking at is the Microsoft App ID. If we query this via the az cli with az ad sp show –id, we’ll see this maps to the Entra ID Agent Identity of the Foundry Agent. Anytime you create a Foundry Agent, an Entra ID Agent Identity Blueprint and Entra ID Agent Identity is provisioned inside of Entra ID. I plan on covering Entra ID Agent Identities in another post in the future, but for now think of them as a subclass of a service principal designed to cater to the security needs and ephemeral nature of agents. One of the best write-ups you can find online on this topic right now is from Christian Post. His series on the topic is worth a read.

By setting the Microsoft App ID to the agent’s Entra ID Agent Identity, we tie the bot service to the agent. We’ll see how this comes below.

Let’s take a look at a message coming from the Bot Service Connector into the agent. I captured this using an App Gateway + APIM pattern (similar to what is in Graeme’s post) and turned on request/response logging and captured all of the headers.

What we get is this:

			
[
  {
    "headers": {
      "Authorization": {
        "type": "Bearer",
        "token": {
          "header": {
            "alg": "RS256",
            "kid": "SERixAMWrs46-gqrTrtMrkfbnuE",
            "x5t": "SERixAMWrs46-gqrTrtMrkfbnuE",
            "typ": "JWT"
          },
          "payload": {
            "serviceurl": "https://smba.trafficmanager.net/amer/6c80de31-d5e4-4029-93e4-5a2b3c0e1299/",
            "nbf": 1778985043,
            "exp": 1778988643,
            "iss": "https://api.botframework.com",
            "aud": "a6790ff6-8752-4654-8ad8-4129842d1042"
          },
          "signature": "hU7bOD-Awszt9zN07bwk7XtQ_E6hT1QGYgBQnDGxz75BF-QvO6gXBjCmo7FTjWizVXccen3mRi5xvSUIWO-vmrydJ9x5nSNaaVvIsHJm8T2agY3iOFDy_0Ii1t3uevJyiRqM_3T8Zi82T8P3umK6x3arkRbBzCWWQHJJs53pYm9m1lKyBax4jddjA3zBWltdcEtZixUEr9L73Qkoj4jU6d-QHyOxKAZnSJCaKzgAhtVOyQDMHU04PnPDNVKEQ5Efb2e5dx4Nqg2HoH1XQraa3zmE5_BGpIx1lIWPXA0oLaDLVnAhDEsS65H4mm48xCsR3l6VKgJc15pLPauTb5SoUw"
        }
      },
      "Content-Length": "1098",
      "Content-Type": "application/json; charset=utf-8",
      "Host": "apim-example5ji.apim.XXXX.com",
      "Max-Forwards": "10",
      "User-Agent": "Microsoft-SkypeBotApi (Microsoft-BotFramework/3.0)",
      "X-AppGW-Trace-Id": "8ca4348bdd71a1c004935143c7cf7cb0",
      "X-ORIGINAL-HOST": "agent.XXXX.com",
      "x-ms-conversation-id": "a:1Xl9msHS_A_eeI1hPHlVR_8OIrMzE90dFdnC8eYmn8UyRlMA4-VaE-Z5omzp-U8cu-PyufpeI08o9sxtVj2S_Wq_beuvR8VGThDKyePyyll8UqG3Wg7ZMmI5OBsVZMWY8",
      "x-ms-tenant-id": "6c80de31-d5e4-4029-93e4-5a2b3c0e1299",
      "MS-CV": "6KZTHFyq5rFY/64p/ywW+A.1.1.1.1.1011601833.1.1",
      "X-FORWARDED-PROTO": "https",
      "X-FORWARDED-PORT": "443",
      "X-Forwarded-For": "52.112.116.120:15428;10.0.12.5",
      "X-Original-URL": "/foundry/api/projects/sampleproject1/agents/published-agent-1/endpoint/protocols/activityprotocol?api-version=2025-11-15-preview",
      "X-ARR-LOG-ID": "633dc336-3a11-4095-9540-d39a0cd99dc4",
      "CLIENT-IP": "[fd40:5f98:1f:9145:6e4f:200:a00:c05]:48814",
      "DISGUISED-HOST": "apim-example5ji.apim.XXXX.com",
      "X-SITE-DEPLOYMENT-ID": "apimwebappF7goUDzbkpZ1MlRL4Klo4uAVLlNNaKbE2UiIMOxN__d61e",
      "WAS-DEFAULT-HOSTNAME": "apimwebappf7goudzbkpz1mlrl4klo4uavllnnakbe2uiimoxn.azurewebsites.net",
      "X-MS-PRIVATELINK-ID": "520132703",
      "X-AppService-Proto": "https",
      "X-ARR-SSL": "4096|256|CN=R12;O=Let's Encrypt;C=US|CN=apim-example5ji.apim.XXXX.com",
      "X-Forwarded-TlsVersion": "1.3",
      "X-WAWS-Unencoded-URL": "/foundry/api/projects/sampleproject1/agents/published-agent-1/endpoint/protocols/activityprotocol?api-version=2025-11-15-preview",
      "X-Azure-JA4-Fingerprint": "t13d311100_e8f1e7e78f70_a11995863d32"
    },
    "severity": "Information",
    "timestamp": "2026-05-17T02:30:43.8931561Z",
    "source": "request-headers"
  }
]

		

In the above I decoded the JWT included in the authorization header. In the JWT we can see that it’s been issued by https://api.botframework.com which jives to Moim’s blog in that the Bot Service has its own STS (security token service) which is used to generate access tokens to authenticate downstream to the Bot. The serviceUrl tells the agent where to send the response it generates for the user’s question. You’ll see that my tenant id is appended to this URL. The audience in this case maps to the published-agent-1’s Entra ID Agent Identity. As Graeme recommends, you can craft a simple policy in APIM to validate this information to assure the access token is coming from a trusted tenant and is intended for the agent its being sent to.

We also have a header called x-ms-tenant-id (thanks to my peer Shaun Callighan for pointing this out to me) which could also be checked at the App Gateway (or similar L7 gateway) to do some degree of validation. Not as good as a JWT validation Graeme suggested, but it’s something if a full fledged API Gateway is too much for you.

Summing it up for now

Okay, my brain is fried and I’m sure yours is too so I’m going to save walking through the programmatic process for tomorrow. For that post I’ll focus less on the whats and whys and more so on how the hows. At this point you should have a reasonable good understanding of what this button actually does and why this button will not be an option for most enterprises. The few callouts I’ll make:

The publish button in the Foundry Portal (today) requires the user to have Contributor or Owner on the resource group or subscription because it automatically deploys a Bot Service resource to the resource group the Foundry resource is in. Likely a no go from the start for most enterprises.
The publish button in Foundry Portal (today) requires the Foundry resource have public network access enabled. Likely a no go from the start for most enterprises.
Once the agent is published to teams, the users interacting with it require the Foundry User role in order to interact with the agent. If they don’t have it, they’ll get an authorization failure when trying to chat with the bot.
When public network access is disabled for the Foundry resource, you’ll need to find a way to make that endpoint accessible using a public IP. You have lots of patterns available to you here. At layer 4, you should be able to lock down inbound traffic to Teams traffic at 52.112.0.0/14 and 52.122.0.0/15. If you’re ingressing via a firewall, it’s a simple firewall rule. If you’re ingressing from an Application Gateway you can use a WAF rule. Wait for this to be officially documented in the Microsoft public documentation.

In combination with the above, you should ideally go the route Graeme suggested and to incorporate some piece of infrastructure, such as APIM, that can do full validation of the JWT. Header validation of the x-ms-tenant-id is an option, but it’s not to the level of mitigation that full JWT validation is. Patterns for this include:
- APIM v2 configured for public inbound and regional vnet integration
- APIM classic configured for external mode
- App Gateway with a public listener with APIM v2 VNet Injected or PE + regional vnet integration behind it (my preference)
- App Gateway with a public listener with APIM classic VNet injection behind it (my preference)
- Firewall DNAT + APIM v2 VNet Injected or PE + regional vnet integration behind it
- Firewall DNAT + APIM classic VNet injection behind it
There is an outbound flow from the Foundry Agent subnet (assuming you’re using Foundry Agents with VNet injection) that is sent to and endpoint at tenant.api.powerplatform.com (mine was il-6c80de31d5e4402993e45a2b3c0e12.99.tenant.api.powerplatform.com) where the 6c…….12.99 was my tenant id). I’m still trying to get clarity as to what this call is for. I’ll update this once I get it. I’d expect to see traffic to the endpoint in the serviceUrl but it looks like that traffic is flowing out the Microsoft side vs being tunneled into the customer virtual network even with Vnet injection (not uncommon for Foundry Agents w/ VNet injection).
There is a programmatic way to do this without having to use the publish button which I’ll cover next post.

See you next post folks!

Microsoft Foundry – BYO AI Gateway – Part 3

Posted on April 19, 2026 by mattfeltonma

Hello once again folks! Today I’m going to add yet another post to my BYO AI Gateway feature of Microsoft Foundry series. In my first post I gave a background on the use case for this feature, in the second post I walked the concepts required to understand the feature, the resources involved in the setup, and the schema of those resource objects. In this post I’m going to walk through the architecture I setup to play with this feature, why I made the choices I did, and dig into some of the actual Terraform code I put together to set this whole thing up. Let’s dive in!

The foundational architecture

When I wanted to experiment with this feature I wanted to test it in an architecture that is typical to my customer base. For this I chose the classic tried and true hub and spoke architecture. I opted out of VWAN and went with a traditional virtual network model because I prefer the visibility and control to that model during experimentation. When the hub becomes a managed VWAN Hub, I get that fancy overlay which makes invisible some of the magic of what is happening underneath. This model enables me to do packet captures at every step and manage routing at a very granular level, which is a must when playing with cutting edge features.

For this setup I have a lab I built out in Terraform which gives me that hub and spoke architecture, centralized DNS resolution, logging, and access to multiple regions. The multiple regions piece of the puzzle is key because feature availability across Foundry features and APIM v2 SKUs are still in flux. The lab also uses three spoke virtual networks. This gives allows me to plop pieces in different spokes to see how things behave and track traffic patterns. It also gives me flexibility when I need to wait for purge operations like when purging a Microsoft Foundry resource configured with a standard agent setup and clearing the lock on the delegated subnet for the VNet injection model. If you’ve mucked around with this you know sometimes it can be 15 minutes and sometimes it can be 2 days.

I drop one of three spokes into one of the “hero” regions. This is a region that gets new features sooner than ours. For example, in this lab I drop it into East US 2 while the hub and other two spokes go in West US 3 (where I’m less likely to run into an quota or capacity issues). East US 2 gives me the option to deploy APIM v2 Standard SKU. In the next section I’ll explain why I’m going with v2 for this experimentation.

AI Gateway Architecture

For an AI Gateway I decided to use APIM. My buddy Piotr Karpala has a great repository of 3rd-party AI Gateway solutions if you want to test this with something outside of APIM. I’m going to plop this into the “hero” region spoke in East US 2 to so I can deploy a v2 Standard SKU. The reason I’m using a v2 SKU is it provides another networking model that the classic SKUs do not, and that is Private Endpoint and VNet integration. In this model I block public traffic to the APIM service, create a Private Endpoint to enable private inbound access, and setup VNet integration to a delegated subnet to keep outbound traffic from any of the APIM instances flowing through my virtual network so I can mediate it and optionally inspect it. While the Private Endpoint is only supported for the Gateway and not the Developer Portal, I don’t care in this instance because I don’t plan on using the Developer Portal on an APIM acting as an AI Gateway. You can also create a private endpoint for a APIM v2 service instance that uses VNet injection, but it requires the Premium SKU and I’m super cheap, so I opted out of that.

**APIM v2 with Private Endpoint and VNet Integration**

The reason I picked this networking model for APIM is it makes it easy for me to inject the service into a Microsoft Foundry account configured with a standard agent and the managed virtual network model. In a future post I’ll dive more into the managed virtual network model. For now, just be aware that is exists, it’s in preview, and it doesn’t have many of the limitations the Foundry Agent Service VNet injection model has. There are considerations no doubt, but my personal take is it’s the better of the two strategically.

On the APIM instance I configured two backend objects, one for each Foundry instance. The backends are organized into a pooled backend so I could load balance across the two Foundry instances to maximize my TPM (tokens per minute). I defined four APIs. Two APIs support the Azure OpenAI inferencing and authoring API, one supports the Azure OpenAI v1 API, and the last is a simple custom Hello World API I use to test connectivity. I use two APIs for the Azure OpenAI inferencing and authoring API because one is designed to support APIM as an AI Gateway uses some custom policy snippets and the other is very generic and is used to test model gateway connections from Foundry purely so I’m familiar with the basics of them.

Foundry Architecture

The Foundry architecture is quite simple. I deployed a single instance of Foundry configured to support standard agents and using a VNet injection model. A subnet is delegated in a different spoke to support the agent vnet injection and supporting Private Endpoints are deployed to a separate subnet in that same virtual network.

The whole setup looks something like the below:

Setting up the AI Gateway

At this point you should have a good understanding of what I’m working with. Let’s talk button pushing. The first thing you’ll need to do is get your AI Gateway setup. To setup the APIM instance I using the Terraform AzureRM and AzApi providers. Like I mentioned above, it was setup as a v2 with the standard SKU public network access disabled, inbound access restricted to private endpoints and outbound access configured for VNet integration. You can find the whole of the code in my lab repository if you’re curious. For the purposes of the post, I’ll only be including the relevant snippets.

One critical thing to take note of is whatever networking model you choose for APIM for this integration, you need to use a certificate issued by a trusted public CA (certificate authority). This is required because at the date of this post, the agent service does not support certificates issued by private CAs. Reason being, you have no ability to inject that root and intermediate certs into the trusted store of the agent compute. For this lab I used the Terraform Acme and Cloudflare providers. It’s actually not bad at all to have a fresh cert provisioned directly as part of the pipeline for labbing and the like, and best part is it’s free for cheap people like myself. There is a sample of that code in the repo.

As I mentioned in my last post, the BYO AI Gateway integration with Foundry supports static or dynamic setup. In the static model you define the models directly in the connection metadata you want to be made available to the connection (see my last post for an example). In the dynamic model the models can be fetched by an API call to the management.azure.com API. This latter option requires additional operations be defined in the API such as what you see below.

			
## Create an operation to support getting a specific deployment by name when using the Foundry APIM connection
##
resource "azurerm_api_management_api_operation" "apim_operation_openai_original_get_deployment_by_name" {
  depends_on = [
    azurerm_api_management_api.openai_original
  ]
  operation_id        = "get-deployment-by-name"
  api_name            = azurerm_api_management_api.openai_original.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  display_name        = "Get Deployment by Name"
  method              = "GET"
  url_template        = "/deployments/{deploymentName}"
  template_parameter {
    name     = "deploymentName"
    required = true
    type     = "string"
  }
}
## Create an operation to support enumerating deployments when using the Foundry APIM connection
##
resource "azurerm_api_management_api_operation" "apim_operation_openai_original_list_deployments_by_name" {
  depends_on = [
    azurerm_api_management_api_operation_policy.apim_policy_openai_original_get_deployment_by_name
  ]
  operation_id        = "list-deployments"
  api_name            = azurerm_api_management_api.openai_original.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  display_name        = "List Deployments"
  method              = "GET"
  url_template        = "/deployments"
}

		

You then define a policy for that operation to configure it to call the correct endpoint via the ARM API like below. Notice I used the authentication-managed-identity policy snippet to use the APIM managed identity to call the Foundry resource to fetch deployment information. If you’re sharing the API across backends, make sure all backends have all the same models deployed. If not, you’ll need to incorporate some additional logic to hit the backend for each pool to ensure you don’t return models that don’t exist in a specific backend. This will require your APIM instance managed identity to have at least the Azure RBAC Reader role over the Foundry resources.

			
## Create an policy for the get deployment by name operation to route to the Foundry APIM connection
##
resource "azurerm_api_management_api_operation_policy" "apim_policy_openai_original_get_deployment_by_name" {
  depends_on = [
    azurerm_api_management_api_operation.apim_operation_openai_original_get_deployment_by_name,
  ]
  api_name            = azurerm_api_management_api.openai_original.name
  operation_id        = azurerm_api_management_api_operation.apim_operation_openai_original_get_deployment_by_name.operation_id
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content = <<XML
    <policies>
      <inbound>
        <authentication-managed-identity resource="https://management.azure.com/" />
        <rewrite-uri template="/deployments/{deploymentName}?api-version=${local.ai_services_arm_api_version}" copy-unmatched-params="false" />
        <!--Specify a Foundry deployment that has the models deployed -->
        <set-backend-service base-url="https://management.azure.com${azurerm_cognitive_account.ai_foundry_accounts[keys(local.ai_foundry_regions)[0]].id}" />
      </inbound>
      <backend>
        <base />
      </backend>
      <outbound>
        <base />
      </outbound>
      <on-error>
        <base />
      </on-error>
    </policies>
  XML
}
## Create an policy for the list deployments operation to route to the Foundry APIM connection
##
resource "azurerm_api_management_api_operation_policy" "apim_policy_openai_original_list_deployments_by_name" {
  depends_on = [
    azurerm_api_management_api_operation.apim_operation_openai_original_list_deployments_by_name
  ]
  api_name            = azurerm_api_management_api.openai_original.name
  operation_id        = azurerm_api_management_api_operation.apim_operation_openai_original_list_deployments_by_name.operation_id
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content = <<XML
    <policies>
      <inbound>
        <authentication-managed-identity resource="https://management.azure.com/" />
        <rewrite-uri template="/deployments?api-version=${local.ai_services_arm_api_version}" copy-unmatched-params="false" />
        <!--Azure Resource Manager-->
        <set-backend-service base-url="https://management.azure.com${azurerm_cognitive_account.ai_foundry_accounts[keys(local.ai_foundry_regions)[0]].id}" />
      </inbound>
      <backend>
        <base />
      </backend>
      <outbound>
        <base />
      </outbound>
      <on-error>
        <base />
      </on-error>
    </policies>
  XML
}

		

In my lab, I defined these two operations for both the classic (OpenAI Inferencing and Authoring API) and v1 API. This allowed me to mess around with both static and dynamic APIM and Model Gateway connections.

Once you get Foundry hooked into APIM using this integration (and I’ll cover the Foundry part in the next post), you get access to some pretty neat information in the headers. As of the date of this post, these will be some of the headers you’ll see. You’ll notice my x-forwarded-for path includes my endpoint’s IP address as well as the IP of the container running in the managed Microsoft-compute environment (notice that is using CGNAT IP space which clears up why CGNAT is unsupported to be used by the customer when using agent with VNet injection). The x-ms-foundry-project-id is the unique project GUID of the project the agent was created under (could be useful for throttling and logging). The x-ms-foundry-agent-id is the unique agent identifier of the specific revision of the agent (again useful for logging and throttling). The x-ms-client-request-id is actually the Foundry project managed identity, not the agent identity which is important to note. If you want to use Entra for the BYO AI Gateway APIM connection, you’re going to be limited to this or API key. There is a connection authentication option to use the agent’s actual Entra ID Agent Identity, but I’ve only used that for the MCP Server feature of Foundry, never for this so I’m not sure if it works or is supported.

			
{
  "Authorization": "Bearer REDACTED",
  "Content-Length": "474",
  "Content-Type": "application/json; charset=utf-8",
  "Host": "apimeusXXXXX.azure-api.net",
  "Max-Forwards": "10",
  "Correlation-Context": "leaf_customer_span_id=173926958944XXXXXX",
  "traceparent": "00-62ff160923b2c1724242c037be40e7cb-4f1b402461aXXXXX-01",
  "X-Request-ID": "96534855-a35a-481a-886d-XXXXXXXXXXXX",
  "x-ms-client-request-id": "76ddf586-260b-4e37-8f4c-XXXXXXXXXXXX",
  "openai-project": "sampleproject1",
  "x-ms-foundry-agent-id": "TestAgent-ai-gateway-static:5",
  "x-ms-foundry-model-id": "conn1apimgwstaticopenai/gpt-4o",
  "x-ms-foundry-project-id": "455cbebf-a0bc-425e-99f6-XXXXXXXXXXX",
  "x-forwarded-for": "100.64.9.87;10.0.9.213:10095",
  "x-envoy-external-address": "100.64.9.87",
  "x-envoy-expected-rq-timeout-ms": "1800000",
  "x-k8se-app-name": "j8820ec0658b4aeXXXXX-dataproxy--vuww7ja",
  "x-k8se-app-namespace": "wonderfulsky-a2fXXXXX",
  "x-k8se-protocol": "http1",
  "x-k8se-app-kind": "web",
  "x-ms-containerapp-name": "j8820ec0658b4aeXXXXX-dataproxy",
  "x-ms-containerapp-revision-name": "j8820ec0658b4aeXXXXX-dataproxy--vuww7ja",
  "x-arr-ssl": "2048|256|CN=Microsoft Azure RSA TLS Issuing CA 04;O=Microsoft Corporation;C=US|CN=*.azure-api.net;O=Microsoft Corporation;L=Redmond;S=WA;C=US",
  "x-forwarded-proto": "https",
  "x-forwarded-path": "/v1/https/apimeusXXXXX.azure-api.net/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
  "X-ARR-LOG-ID": "76ddf586-260b-4e37-8f4c-XXXXXXXXXXXX",
  "CLIENT-IP": "10.0.9.213:10095",
  "DISGUISED-HOST": "apimeusXXXXX.azure-api.net",
  "X-SITE-DEPLOYMENT-ID": "apimwebappXXXXXX6OTVsZqxOcTZLpubQ9iNmzQ8kzMOmkEhw",
  "WAS-DEFAULT-HOSTNAME": "apimwebappXXXXXX6otvszqxoctzlpubq9inmzq8kzmomkehw.apimaseXXXXXXX6otvszqxoctz.appserviceenvironment.net",
  "X-AppService-Proto": "https",
  "X-Forwarded-TlsVersion": "1.3",
  "X-Original-URL": "/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
  "X-WAWS-Unencoded-URL": "/openai/deployments/gpt-4o/chat/completions?api-version=2025-03-01-preview",
  "X-Azure-JA4-Fingerprint": "t13d1113h2_d3731e0d3936_XXXXXXXXXXXX"
}

		

Using the information above, I crafted the policy below. It’s nothing fancy, but shows an example of throttling based on the project id and logging the agent identifier via the token metrics policy to potentially make chargeback more granular. Either way, these additional headers give you more to play with.

			
## Create an API Management policy for the OpenAI v1 API
##
resource "azurerm_api_management_api_policy" "apim_policy_openai_v1" {
  depends_on = [
    azurerm_api_management_api.openai_v1
  ]
  api_name            = azurerm_api_management_api.openai_v1.name
  api_management_name = azurerm_api_management.apim.name
  resource_group_name = azurerm_resource_group.rg_ai_gateway.name
  xml_content = <<XML
    <policies>
      <inbound>
          <base />
          <!-- Evaluate the JWT and ensure it was issued by the right Entra ID tenant -->
          <validate-jwt header-name="Authorization" failed-validation-httpcode="403" failed-validation-error-message="Forbidden">
              <openid-config url="https://login.microsoftonline.com/${var.entra_id_tenant_id}/v2.0/.well-known/openid-configuration" />
              <issuers>
                  <issuer>https://sts.windows.net/${var.entra_id_tenant_id}/</issuer>
              </issuers>
          </validate-jwt>
          <!-- Extract the Entra ID application id from the JWT -->
          <set-variable name="appId" value="@(context.Request.Headers.GetValueOrDefault("Authorization",string.Empty).Split(' ').Last().AsJwt().Claims.GetValueOrDefault("appid", "none"))" />
          <!-- Extract the Agent ID from the x-ms-foundry-agent-id header. This is only relevant for Foundry native agents -->
          <set-variable name="agentId" value="@(context.Request.Headers.GetValueOrDefault("x-ms-foundry-agent-id", "none"))" />
          <!-- Extract the project GUID from the x-ms-foundry-project-id header. This is only relevant for Foundry native agents -->
          <set-variable name="projectId" value="@(context.Request.Headers.GetValueOrDefault("x-ms-foundry-project-id", "none"))" />
          <!-- Extract the Foundry Project name from the "openai-project" header. This is only relevant for Foundry native agents -->
          <set-variable name="projectName" value="@(context.Request.Headers.GetValueOrDefault("openai-project", "none"))" />
          <!-- Extract the deployment name from the uri path -->
          <set-variable name="uriPath" value="@(context.Request.OriginalUrl.Path)" />
          <set-variable name="deploymentName" value="@(System.Text.RegularExpressions.Regex.Match((string)context.Variables["uriPath"], "/deployments/([^/]+)").Groups[1].Value)" />
          <!-- Set the X-Entra-App-ID header to the Entra ID application ID from the JWT -->
          <set-header name="X-Entra-App-ID" exists-action="override">
              <value>@(context.Variables.GetValueOrDefault<string>("appId"))</value>
          </set-header>
          <set-header name="X-Foundry-Agent-ID" exists-action="override">
              <value>@(context.Variables.GetValueOrDefault<string>("agentId"))</value>
          </set-header>
          <set-header name="X-Foundry-Project-Name" exists-action="override">
              <value>@(context.Variables.GetValueOrDefault<string>("projectName"))</value>
          </set-header>
          <set-header name="X-Foundry-Project-ID" exists-action="override">
              <value>@(context.Variables.GetValueOrDefault<string>("projectId"))</value>
          </set-header>
          <choose>
            <!-- If the request isn't from a Foundry native agent and is instead an application or external agent -->
            <when condition="@(context.Variables.GetValueOrDefault<string>("agentId") == "none" && context.Variables.GetValueOrDefault<string>("projectId") == "none")">
              <!-- Throttle token usage based on the appid -->
              <llm-token-limit counter-key="@(context.Variables.GetValueOrDefault<string>("appId","none"))" estimate-prompt-tokens="true" tokens-per-minute="10000" remaining-tokens-header-name="x-apim-remaining-token" tokens-consumed-header-name="x-apim-tokens-consumed" />
              <!-- Emit token metrics to Application Insights -->
              <llm-emit-token-metric namespace="openai-metrics">
                  <dimension name="model" value="@(context.Variables.GetValueOrDefault<string>("deploymentName","None"))" />
                  <dimension name="client_ip" value="@(context.Request.IpAddress)" />
                  <dimension name="appId" value="@(context.Variables.GetValueOrDefault<string>("appId","00000000-0000-0000-0000-000000000000"))" />
              </llm-emit-token-metric>
            </when>
            <!-- If the request is from a Foundry native agent -->
            <otherwise>
              <!-- Throttle token usage based on the agentId -->
              <llm-token-limit counter-key="@($"{context.Variables.GetValueOrDefault<string>("projectId")}_{context.Variables.GetValueOrDefault<string>("agentId")}")" estimate-prompt-tokens="true" tokens-per-minute="10000" remaining-tokens-header-name="x-apim-remaining-token" tokens-consumed-header-name="x-apim-tokens-consumed" />
              <!-- Emit token metrics to Application Insights -->
              <llm-emit-token-metric namespace="llm-metrics">
                  <dimension name="model" value="@(context.Variables.GetValueOrDefault<string>("deploymentName","None"))" />
                  <dimension name="client_ip" value="@(context.Request.IpAddress)" />
                  <dimension name="agentId" value="@(context.Variables.GetValueOrDefault<string>("agentId","00000000-0000-0000-0000-000000000000"))" />
                  <dimension name="projectId" value="@(context.Variables.GetValueOrDefault<string>("projectId","00000000-0000-0000-0000-000000000000"))" />
              </llm-emit-token-metric>
            </otherwise>
          </choose>
          <choose>
            <!-- If the request is from a Foundry native agent -->
            <when condition="@(context.Variables.GetValueOrDefault<string>("agentId") != "none" && context.Variables.GetValueOrDefault<string>("projectId") != "none")">
            <authentication-managed-identity resource="https://cognitiveservices.azure.com/" />
            </when>
          </choose>
          <set-backend-service backend-id="${module.backend_pool_aifoundry_instances_openai_v1.name}" />
      </inbound>
      <backend>
          <forward-request />
      </backend>
      <outbound>
          <base />
      </outbound>
  </policies>
XML
}

		

Summing it up

I was going to go crazy and incorporate the Foundry setup and testing into this post as well but decided against it. There is a point when the brain melts and if mine is already melting, yours may be as well. I’ll walk through those pieces in the next post. You have a few main takeaways. First, let’s review the high level setup of your AI Gateway.

Create your backends that point to the Microsoft Foundry endpoints.
Import the relevant API. If at all possible, go with the v1 API. It will support access to other models besides OpenAI models and additional features.
Add the GET and LIST operations and define the relevant policies if you’re planning on supporting dynamic models vs static. Dynamic seems to make more sense to me, but I haven’t seen enough orgs adopt this yet to form a good opinion.
Craft your custom policies. I highly recommend you regularly review the headers being passed. They could change and even better data may be added to them.

Next, let’s talk about key gotchas.

The certificate used on your AI Gateway MUST be issued from a well-known public CA in order for it to be trusted by the agent running in Foundry comptue. If it isn’t, this integration will fail and may not fail in a way that is obvious the TLS session failure between the agent compute and the AI Gateway is to blame.
If you’re using APIM, think about the Private Endpoint and VNet integration pattern if you’re capable of using v2. If it won’t work for you, or you’re still using the classic SKU, if you want to support managed VNet you’ll need to incorporate an Application Gateway in front of your AI Gateway likely. This means more operational overhead and costs.
While every Foundry Agent (v2) is given an Entra ID Agent Identity created from the Entra ID Agent Blueprint associated to the project, when using the ProjectManagedIdentity authentication type, you’ll see the project’s managed identity in the logs. If you’re able to test with the agent identity authentication type, let me know.
Really noodle on how you can use the project headers for throttling and possibly chargeback. It makes a ton of sense if you’re aligning your Foundry account and project model correctly.

See you next post!

Microsoft Foundry – BYO AI Gateway – Part 2

Posted on March 9, 2026 by mattfeltonma

Hello again! Today I’m going to continue my series on Microsoft Foundry’s new support for the BYO AI Gateway. In my past few posts I’ve walked through the evolution of Foundry and covered at a high level what an AI Gateway is and the problem this feature solves. In this post we’re gonna get down and dirty with the technical details on setting this up within Microsoft Foundry. I’ll do a follow-up post to focus on the APIM (API Management) configuration. Grab your coffee and put on your thinking music (for me that is some Blink and Third Eye Blind. Yeah, I’m old.).

Let’s get to it!

Current State Architecture

My customer base is primarily in the regulated industry so most of my customers are still at the experimentation state with the Foundry Agent Service. Given these customers have strict security requirements they are largely using the agent service with the standard agent configuration. In this configuration the outbound traffic (subsets of it, but that is a much larger conversation) can be tunneled through the customer virtual network for centralized logging, mediation, and facilitating access to private resources (again, with limitations today) through what the product group calls VNet injection but I’d say is more closely described as VNet integration via a delegated subnet. Threads (conversations in v2 agents) and agent metadata are stored in a Cosmos DB, vector stores created by an agent from tools such as the File Search tool are stored in AI Search, and files uploaded to the Foundry resource by users are stored in a Storage Account. These resources are all provisioned by the customer into the customer subscription and fully managed by the customer (RBAC, encryption, HA settings, etc). Private Endpoints for each resource are created within the customer’s virtual network and made accessible from the agent delegated subnet. The whole environment looks similar to what you see below.

**Foundry Agent Service – Standard Agent Configuration with VNet Injection**

As I covered in my last post, as of the date of this post Foundry native agents can only consume models deployed to their own Foundry resource. This creates an issue for customers wanting the governance of the models, visibility into the use of the LLMs, and improvements security posture and operational optimizations an AI Gateway can provide when it sits between the agent and the model. For now, customers are working around doing this using what I refer to as external agents. External agents run outside of Microsoft Foundry on customer-managed compute like an on-premises Kubernetes cluster or an Azure Function deployed to the customer subscription. The downfall of this direction is these external agents live on compute customers have to manage and can’t access many of the tools available to Foundry-native agents. This is the problem the BYO AI Gateway feature is attempting to fix.

Foundry resource architecture

Here is where the new connection type introduced in Foundry comes to the rescue. Before I dive into the details of that, I think it’s helpful to level set a bit on the resource hierarchy within Foundry. At the top is the top-level Azure resource referred to as the Foundry service which under the hood is a Cognitive Services account. The relevant resources for this discussion are below the account resource and are projects, deployments, and connections. Projects serve a few purposes with two of them being logical boundaries around connections (at the management plane) and agents (at the data plane) provisioned under the projects. Deployments of models (such as GPT-5) are children of the account and are made available to all projects within the account. The account can also have connections objects which can be shared across projects.

For the purposes of this discussion, I’m going to focus on the connection objects. Connection objects can be created at the account level and project level as discussed above. In the standard agent configuration, you’ll create a number of different connections during setup including connections to Cosmos, AI Search, and Azure Storage. Additional common connections could be to an App Insights instance for tracing or a Grounding With Bing Search resource to use with the Grounding with Bing tool. Connection objects will contain some type of pointer, like a URI and a credential. That credential is usually API Key, some Entra ID-based authentication mechanism, or general OAuth.

Connections are created at the account level when the Foundry account itself needs to access them. This could be for the usage of Content Understanding, to a Key Vault for storing connection secrets (API keys) in a customer subscription, an an App Insights instance used for tracing. From what I’ve observed, you will create connections at the account level if they need to be shared across all projects OR they’re used by the Foundry resource in general vs some type of project construct. Connections used by projects can also be created at the project level. When you provision a standard agent for example, you’ll create connection objects to the Cosmos DB, Storage Account, and AI Search resources mentioned above. The new category of connections for this post will be created at the project level. I’d had mixed behavior with how effectively connection objects at the account can be used downstream by the projects.

APIM and Model Gateway Connections

The BYO AI Gateway feature uses two new types of connection categories: ApiManagement and ModelGateway. These objects are the glue that allow the Foundry native agents to route requests for models through an AI Gateway. When we’re connecting to an APIM instance, you should ideally use the ApiManagement category and when you’re connecting to a third-party category you’ll use the ModelGateway category.

As of the date of this blog post, these connection objects have the following schema (relevant properties to this discussion only):

			
name: The name of the connection (needs to be less than 60 characters in my testing)
properties: {
  category: ApiManagement or ModelGateway
  target: The URI you want the agent to connect to
  authType: For ApiManagement this can be ApiKey or ProjectManagedIdentity
  credentials: This will be populated with the value of the API key if using that authType
  isSharedToAll: true or false if you want this shared across all projects
  # ApiManagement category with static models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
    # Models discussed in detail below
    models: "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
  }
  # ApiManagement category with dynamic discovery
  metadata: {
    deploymentAPIVersion: ARM API version for CognitiveServices/accounts/deployments API calls
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing  (not used if using OpenAI v1 API)
  }
  # ModelGateway category with static models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing  (not used if using OpenAI v1 API)
    # Models discussed in detail below
    models: "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
  }
  # ModelGateway category with dynamic models
  metadata: {
    deploymentInPath: true or false
    inferenceAPIVersion: API version used for inferencing (not used if using OpenAI v1 API)
    deploymentAPIVersion: ARM API version for CognitiveServices/accounts/deployments API calls
    modelDiscovery: "{\"deploymentProvider\":\"AzureOpenAI\",\"getModelEndpoint\":\"/deployments/{deploymentName}\",\"listModelsEndpoint\":\"/deployments\"}"
}

		

I’ll walk through each of these properties in as much detail as I’ve been able to glean from them with my testing.

The category property is self-explanatory. You either set to this to ApiManagement (if using APIM) or Model Gateway (if using a third-party AI Gateway like a Kong or LiteLLM).

The target property is the URI you want the agent to try to connect to. As an example, if I create an API on my APIM instance for the v1 OpenAPI named openai-v1 my target would look like “https://myapim.azure-api.net/openai-v1/v1”. As of the date of this blog post, you MUST use the azure-api-net FQDN for the APIM. If you try to do a custom domain you’ll get an error back telling you that it’s not supported. I have a request into the product group to lift this limitation. I’ll update this if that is done. For third-party model gateway, this property serves the same purpose but can be any valid domain.

The authType property is going to be either ApiKey or ProjectManagedIdentity for an APIM connection. ProjectManagedIdentity will authenticate to the upstream APIM using the agent’s project’s Entra ID managed identity. When using ProjectManagedIdentity you must also specify the audience property and set it to cognitive services.azure.com if connecting to a backend Foundry resource hosting models. For a model gateway connection this will either be ApiKey or OAuth. Details on the OAuth setup can be found in the samples GitHub (I haven’t mucked with it yet). If you’re using the authType of ApiKey you additional need to pass the credentials property which includes a property of key with the API key similar to what you see below.

			
authType: ApiKey
credentials = {
  key = MYAPIKEY
}

I haven’t messed extensively with the isSharedToAll property as of yet. For my use case I set this to false so each project got its own connection object. You may be able to create this object at the account level and set the isSharedToAll property, but I haven’t tested that yet. If you have, def let me know if that works.

Ok, now on to the property that can bring the most pain. Here we have the metadata property. This property is going to the main guts that makes this whole thing work. A few considerations, if doing this with Terraform or REST (can’t speak to Bicep or ARM), each of the properties I’m going to cover are CASE SENSITIVE. If you do the wrong casing, your connection object will not work. When connecting to an APIM or model gateway you can have Foundry either enumerate the models available (called dynamic discovery) or you can provide the exact models you want to expose (called static models).

Let’s first cover static models. Here is an example of me creating a connection to an APIM instance with static models using the authType or ProjectManagedIdentity. One thing to note is in my backend object in my APIM I’m appending /v1 to the backend path vs doing it in this connection object.

			
{
      "id": "/subscriptions/X/resourceGroups/X/providers/Microsoft.CognitiveServices/accounts/X/projects/sampleproject1/connections/conn1apimgwstaticopenai-v1",
      "name": "conn1apimgwstaticopenai-v1",
      "properties": {
        "audience": "https://cognitiveservices.azure.com",
        "authType": "ProjectManagedIdentity",
        "category": "ApiManagement",
        "isSharedToAll": false,
        "metadata": {
          "deploymentInPath": "false",
          "inferenceAPIVersion": null,
          "models": "[{\"name\":\"gpt-4o\",\"properties\":{\"model\":{\"format\":\"OpenAI\",\"name\":\"gpt-4o\",\"version\":\"2024-08-06\"}}}]"
        },
        "target": "https://X.azure-api.net/openai-v1",
      }

		

Since I’m using the v1 Azure OpenAI API, I don’t need to specify an inferenceAPIVersion. If I was using the classic API I’d need to specify the version (such as 2025-04-01-preview). Notice also I have set deploymentInPath to false. When set to true the connection will add the /deployments/deployment_name to the path. For the v1 API this isn’t required. Finally you got the models property. With a static model setup I list out the models I’m exposing to the connection. If you’re using Terraform, you MUST wrap the models in the jsonecode function. If you don’t, it will not work. The static model option is pretty helpful if you want to strictly control exactly what models the project is getting access to.

Let’s now switch over to dynamic discovery. Dynamic discovery requires you define a few additional operations inside of your API. The details can be found in this GitHub repo, but the basics of is you define an operation for a GET on a specific model and a LIST to find all the models available. These operations are management plane operations at the ARM API to retrieve deployment information. Here is an example of a setup with dynamic discovery using an APIM connection.

			
{
      "id": "/subscriptions/X/resourceGroups/X/providers/Microsoft.CognitiveServices/accounts/X/projects/sampleproject1/connections/conn1apimgwdynamicopenai-v1",
      "location": null,
      "name": "conn1apimgwdynamicopenai-v1",
      "properties": {
        "audience": "https://cognitiveservices.azure.com",
        "authType": "ProjectManagedIdentity",
        "category": "ApiManagement",
        "group": "AzureAI",
        "isSharedToAll": false,
        "metadata": {
          "deploymentAPIVersion": "2024-10-01",
          "deploymentInPath": "false",
          "inferenceAPIVersion": null
        },
        "target": "https://X.azure-api.net/openai-v1",
      },
      "type": "Microsoft.CognitiveServices/accounts/projects/connections"
    }

		

When doing the dynamic discovery, you’ll see the deploymentAPIVersion property set to the API version for the GET and LIST deployment operations of the ARM REST API. I added these operations into the API after I imported the v1 OpenAI spec. You can see an example in Terraform I put together in my lab repo. Dynamic discovery is a great solution when you want to the developer to have access to any new deployments you may push to the Foundry resources.

I’m not going to run through the ModelGateway connection categories because they will largely emulate what you see above with some minor differences. The official Foundry samples GitHub repo has the gory details. I also have examples in Terraform available in my own repo (if you dare subject yourself to reading my code).

Ok, so now you understand the basics of setting up the connection and what you need to do on the APIM side. For more details on setting up APIM you can reference this official repo.

Summing It Up

Ok, so you now you understand the basic connection object, how to set it up, and how it works. I’m going to cut it here and continue in another post where I’ll dig into the dirty details of how it looks to use this because I don’t want to overload your brain (and mine) with a super long post.

Before I jet I will want to provide some critical resources:

My AMAZING peer Piotr Karpala has put together a repository with examples of this pattern (and some 3rd-party integrations) with Bicep. The stuff in there is gold. He was also my late night buddy helping me work through the quirks of this integration late at night. Couldn’t have gotten it done without him (or at least would have broken many keyboards).
The Product Group’s official samples and explanations of the setup are located here. I’d highly recommending referencing them because they will always have more up to date instructions than my blog.
I’ve put together some Terraform samples for my own purposes which are you welcome to reference, loot for your own means, and laugh at my pathetic coding ability. Check out this one for the Foundry portion and this one for the APIM portion.

And here are your tips for this post:

RTFM. Seriously, read the official documentation. Today, this integration is challenging to put in place. If you try to lone wolf it, let me know how many keyboards end up being thrown through your window.
If you’re coding in Terraform or making REST calls to create these connections, remember CASE SENSITIVITY matters. If you do the wrong case sensitivity, the resource will still create but it won’t work. You’ll get very frustrated trying to troubleshoot it.
If you’re coding in Terraform don’t forget to use the jsonencode function on the models property. If you skip that, the resource will create but shit will not work.
This is only supported for prompt agents today.
Don’t forget this is public preview. So test it, but expect things to change and don’t throw this into production.

In the next post I’ll walk through how you can test the integration, some of the quirks and considerations for identity and authentication, and some of the neat APIM policy you can craft given some of the new information that is sent in the request.

See you next post!

Journey Of The Geek

The chronicles of a Bostonian tech geek navigating through life and technology

Tag Archives: security

Microsoft Foundry – Publishing Agents To Teams Deep Dive – Part 1

The Background

The Portal Experience

What the hell is happening when I push this button?

What does the Bot Service do?

Summing it up for now

Microsoft Foundry – BYO AI Gateway – Part 3

The foundational architecture

AI Gateway Architecture

Foundry Architecture

Setting up the AI Gateway

Summing it up

Microsoft Foundry – BYO AI Gateway – Part 2

Current State Architecture

Foundry resource architecture

APIM and Model Gateway Connections

Summing It Up