A Pattern for App Services with Private Endpoints

A Pattern for App Services with Private Endpoints

Hello again fellow geeks.

Last year Microsoft announced the general availability of Private Endpoint support for App Services. This was a feature I was particularly excited about because when combined with regional Vnet integration, it offered an alternative to an ASE (App Service Environment) for customers who wanted to run internally-facing applications or who didn’t feel comfortable exposing their publicly facing application directly via a public IP.

Over the past year I created a few different labs that demonstrated these features including a web app and function app. Each lab was built around a hub and spoke architecture where the application was serving as an internally-facing application. Given my recent post on Azure Firewall Premium, and the fact I still had the lab environment up and running, I thought it would be interesting to switch the virtual machine out and switch App Services in, which resulted in the lab environment pictured below.

Lab environment

I established the following goals for the lab environment:

  1. Perform IDPS on traffic to the web app
  2. Mediate and inspect traffic initiated from the web app to third-party APIs
  3. Expose a web application to the Internet without giving it a direct public IP address

To accomplish these goals I was able to re-use much of the pattern I described in my last post.

The first step in the process was to create a very simple hub and spoke architecture as pictured above. The environment would consist of a transit Vnet (virtual network), shared services Vnet, and spoke Vnet. The transit Vnet would contain the guts of the solution with an Azure Firewall Premium SKU instance and Azure Application Gateway v2 instance. In the shared services Vnet I used a Windows Server VM (virtual machine) running the DNS Server service and providing DNS resolution for the environment. I could have taken a shortcut here and used Azure Firewall’s DNS proxy capability, but I like to leave the options open for conditional forwarding which Azure Firewall and Azure Private DNS do not support at this time. Finally, in the spoke Vnet I created two subnets. One subnet hosted the private endpoint for App Services and the other subnet was delegated to App Services to support regional Vnet integration.

Once all components were in place, I deployed a simple Python web application I wrote. All it does it queries two public APIs, one for the current time and the other for a random Breaking Bad quote (greatest show of all time!). I took the cheating way out and deployed the app directly via Visual Studio using the Azure App Service add-in prior to deploying the private endpoint and regional Vnet integration capabilities. This allowed me to test the app to validate it was still working as intended while also allowing me to deploy the app from my home machine. Deploying a private endpoint for app services locks down not only access to the running application but also to the SCM (source control management) endpoint that is used for deploying code to the app.

The next step was to create the Azure Private DNS Zone for app services which is named privatelink.azurewebsites.net. I then linked this zone to my shared services Vnet and setup the DNS server running in that Vnet with a standard forward to 168.63.129.16. This setup allows DNS queries made to the Private DNS zone to be resolved by the DNS server. I’ve written extensively about how DNS works with Azure Private Link so I won’t go into any more detail on that flow. Last step in the DNS process is to configure each Vnet to use the DNS server IP in its DNS Server settings. This also needs to configured directly on the Azure Firewall.

Now that the infrastructure would be able to resolve queries to the Private DNS Zone used by App Services, it was time to create the private endpoint for the web app. This registered the appropriate DNS records in the Private DNS Zone for my web app. With the private endpoint created, the last step was to enable Vnet integration.

Private DNS Zone with records for App Service instance

At this point I had all the guts of my solution and it was time to configure traffic to flow the way I wanted it to flow. Before I could begin work in Azure I needed to do the work outside of Azure. This involved creating an A record in my DNS hosting provider to point the public DNS name of my web app to the AGW’s (Application Gateway) public IP. This name can be whatever you want as long as you provide a certificate to the AGW such that it will identify itself as that name to the user. The AGW configuration is very straightforward and this tutorial will get you most of the way there. One thing to note is you’ll need to set the backend to point to the name given to the app service. This allows you to take advantage of the certificate Microsoft automatically provisions to the given app service.

App Gateway backend config

Since I wanted to route traffic destined for the web app through the Azure Firewall Premium instance, I needed to ensure the AGW trusted the certificate served up by Azure Firewall. This is done by modifying the HTTP setting used in the AGW rule for the app. Here you can upload the root certificate that issued the Azure Firewall Premium intermediate certificate.

App Gateway HTTP Setting

Now that the AGW is configured, I needed to create a route table for the subnet the AGW has its private IP address in. In this route table I disabled BGP (border gateway protocol) propagation to ensure the default route since this AGW v2 requires a default route pointing directly to the Internet. Now this is where it gets interesting from a routing perspective. Whenever you create a private endpoint for a service, a system route is added to the route table of all the subnets within the private endpoint’s vnet (virtual network) as well as any vnet the private endpoint’s vnet is peered with. If I want traffic from the AGW to route through the Azure Firewall, I need to override that system route with a /32 UDR. As you can imagine, this can become extremely tedious at scale and even risks hitting the max routes per route table depending on the scale we’re talking about. On the positive side, the issues around this are something Microsoft is aware of so hopefully that means this will be addressed at some point. In the meantime you’ll need to use the /32 route in a pattern such as this.

App Gateway route table

Azure Firewall Premium needs to be configured to allow traffic from the AGW to the web app and inspect this traffic. You can check out my last post for instructions on configured Azure Firewall to perform these activities.

Excellent, work is done right? Nope! If we influence routing on one side, we have to ensure the other side routes the same way. Toss a route table on the private endpoint subnet you say? Sorry, that isn’t supported my friend. Instead I needed to enable the Azure Firewall instance to SNAT (source NAT) traffic to the spoke Vnet. This ensures routing is symmetric and will eliminate any potential connection issues by creating only the UDR on the AGW subnet.

Incoming traffic flow

Lastly, I needed to create and apply a route table to the subnet I delegated to App Services for regional Vnet integration. This route table can be very simple and be configured with a single UDR for the default route pointing to the Azure Firewall private IP. This results in the traffic flow below:

Outgoing Internet traffic flow

This one was another fun pattern to solve. I got a chance to mess around more with AGW and also get a pattern I had theorized would work actually working. I find implementing the pretty pictures I draw helps drive home the benefits and considerations of such a pattern. With this pattern specifically, it suffers from similar considerations as the pattern in my last post. These include:

  • Challenges with observability
  • Operational overhead of certificate management
  • Possible latency issues depending on latency requirements and traffic patterns

This pattern tacks on more operational overhead by requiring that /32 route be added to the AGW route table each time a new app service is provisioned with a private endpoint. Observability further suffers when tracing a packet flow end-to-end due to the additional layer of SNAT required via Azure Firewall. One thing to note about this is you may be able to get around the SNAT requirement for web-specific traffic because of the transparent proxy functionality behind Azure Firewall application rules. I want to highlight I have not tested this myself and I typically tend to SNAT for this use case in my lab because many of my customers may use similar patterns but with a 3rd party firewall.

So there you go folks, another pattern to add to your inventory. Hopefully we’ll see the /32 issue issue private endpoints resolved sometime in the near future. Have a great weekend!

What If… Volume 2

Hi there folks!

I’ve been busy lately buried in learning and practicing Kubernetes in preparation for the Certified Kubernetes Administrator exam. Tonight I’m taking a break to bring you another entry into the “What If” series I started a few months back.

Let’s get right to it.

What if I need to access a Private Endpoint in a subscription associated with a different Azure AD tenant and I have an existing Azure Private DNS Zone already?

I’ve been helping a good friend who recently joined Microsoft to support his customer as he gets up to speed on the Azure platform. This customer consists of two very large organizations which have a high degree of independence. Each of these organizations have their own Azure AD tenant and their own Azure footprint. One organization is further along in their cloud journey than the other.

Organization A (new to Azure) needed to consume some data that existed in an Azure SQL database in an Azure subscription associated with Organization B’s tenant. Both organizations have strict security and compliance requirements so they are heavy users of Azure PrivateLink Endpoints. A site-to-site VPN (virtual private network) connection was established between the two organizations to facilitate network communication between the Azure environments.

Customer Environment

The customer environment looked similar to the above where a machine on-premises in Organization A needed to access the Azure SQL database in Organization B. If you look closely, you probably see the problem already. From a DNS perspective, we have two Azure Private DNS Zones for privatelink.database.windows.net. This means we have two authorities for the same zone.

My peer and I went back on forth with a few different solutions. One solution seemed obvious in that organization A would manually create an A record in their Azure Private DNS zone pointing to IP of the PrivateLink Endpoint in Organization B. Since the organizations had connectivity between the two environments, this would technically work. The challenge with this pattern is it would introduce a potential bottleneck depending on the size of the VPN pipe. It could also lead to egress costs for Organization A depending on how the VPN connection was implemented.

The other option we came up with was to create a Private Endpoint in Organization A’s Azure subscription which would be associated with the Azure SQL instance running in Organization B’s Azure subscription. This would avoid any egress costs, we wouldn’t be introducing a potential bottleneck, and we’d avoid the additional operational head of having to manually manage the A record in Organization A’s Azure Private DNS Zone. Neither of us had done this before and while it seemed to be possible based on Microsoft’s documentation, the how was a bit lacking when talking PaaS services.

To test this I used two separate personal tenants I keep to test scenarios that aren’t feasible to test with internal resources. My goal was to build an architecture like the below.

Target architecture

So was it possible? Why yes it was, and an added bonus I’m going to tell you how to do it.

When you create a Private Endpoint through the Azure Portal, there is a Connection Method radio button seen below. If you’re creating the Private Endpoint for a resource within the existing tenant you can choose the Connect to an Azure resource in my directory option and you get a handy guided selection tool. If you want to connect to a resource outside your tenant, you instead have to select the Connect to an Azure resource by resource ID or alias. In this field you would end the full resource ID of the resource you’re creating the Private Endpoint for, which in this case is the Azure SQL server resource id. You’ll be prompted to enter the sub-resource which for Azure SQL is SqlServer. Proceed to create the Private Endpoint.

Private Endpoint Creation

After the Private Endpoint has been created you’ll observe it has a Connection status of Pending. This is part of the approval workflow where someone with control over the resource in the destination tenant needs to approve of the connection to the Azure SQL server.

Private Endpoint in pending status

If you jump over to the other resource in the target tenant and select the Private endpoint connections menu option you’ll see there is a pending connection that needs approval along with a message from the requestor.

Private Endpoint request

Select the endpoint to approve and click the approve button. At that point the Private Endpoint in the requestor tenant and you’ll see it has been approved and is ready for use.

This was a fun little problem to work through. I was always under the assumption this would work, the documentation said it would work, but I’m a trust but verify type of person so I wanted to see and experience it for myself.

I hope you enjoyed the post and learned something new. Now back to practicing Kubernetes labs!

Interesting behaviors with Private Endpoints

Interesting behaviors with Private Endpoints

Hi folks!

Working for and with organizations in highly regulated industries like federal and state governments and commercial banks often necessitates diving REALLY deep into products and technologies. This means peeling back the layers of the onion most people do not. The reason this pops up is because these organizations tend to have extremely complex environments due the length of time the organization has existed and the strict laws and regulations they must abide by. This is probably the reason why I’ve always gravitated towards these industries.

I recently ran into an interesting use case where that willingness to dive deep was needed.

A customer I was working with was wrapping up its Azure landing zone deployment and was beginning to deploy its initial workloads. A number of these workloads used Microsoft Azure PaaS (platform-as-a-service) services such as Azure Storage and Azure Key Vault. The customer had made the wise choice to consume the services through Azure Private Endpoints. I’m not going to go into detail on the basics of Azure Private Endpoints. There is plenty of official Microsoft documentation that can cover the basics and give you the marketing pitch. You can check out my pasts posts on the topic such as my series on Azure Private DNS and Azure Private Endpoints.

This particular customer chose to use them to consume the services over a private connection from both within Azure and on-premises as well as to mitigate the risk of data exfiltration that exists when egressing the traffic to Internet public endpoints or using Azure Service Endpoints. One of the additional requirements the customer had as to mediate the traffic to Azure Private Endpoints using a security appliance. The security appliance was acting as a firewall to control traffic to the Private Endpoints as well to perform deep packet inspection sometime in the future. This is the requirement that drove me down into the weeds of Private Endpoints and lead to a lot of interesting observations about the behaviors of network traffic flowing to and back from Private Endpoints. Those are the observations I’ll be sharing today.

For this lab, I’ll be using a slightly modified version of my simple hub and spoke lab. I’ve modified and added the following items:

  • Virtual machine in hub runs Microsoft Windows DNS and is configured to forward all DNS traffic to Azure DNS (168.63.129.16)
  • Virtual machine in spoke is configured to use virtual machine in hub as a DNS server
  • Removed the route table from the spoke data subnet
  • Azure Private DNS Zone hosting the privatelink.blob.core.windows.net namespace
  • Azure Storage Account named mftesting hosting some sample objects in blob storage
  • Private Endpoint for the mftesting storage account blob storage placed in the spoke data subnet
Lab environment

The first interesting observation I made was that there was a /32 route for the Private Endpoint. While this is documented, I had never noticed it. In fact most of my peers I ran this by were never aware of it either, largely because the only way you would see it is if you enumerated effective routes for a VM and looked closely for it. Below I’ve included a screenshot of the effective routes on the VM in the spoke Virtual Network where the Private Endpoint was provisioned.

Effective routes on spoke VM

Notice the next hop type of InterfaceEndpoint. I was unable to find the next hop type of InterfaceEndpoint documented in public documentation, but it is indeed related to Private Endpoints. The magic behind that next hop type isn’t something that Microsoft documents publicly.

Now this route is interesting for a few reasons. It doesn’t just propagate to all of the route tables of subnets within the Virtual Network, it also propagates to all of the route tables in directly peered Virtual Networks. In the hub and spoke architecture that is recommended for Microsoft Azure, this means that every Private Endpoint you create in a spoke Virtual Network is propagated to as a system route to route tables of each subnet in the hub Virtual Network. Below you can see a screen of the VM running in the hub Virtual Network.

Effective routes on hub VM

This can make things complicated if you have a requirement such as the customer I was working with where the customer wants to control network traffic to the Private Endpoint. The only way to do that completely is to create a /32 UDRs (user defined routes) in every route table in both the hub and spoke. With a limit of 400 UDRs per route table, you can quickly see how this may break down at scale.

There is another interesting thing about this route. Recall from effective routes for the spoke VM, that there is a /32 system route for the Private Endpoint. Since this is the most specific route, all traffic should be routed directly to the Private Endpoint right? Let’s check that out. Here I ran a port scan against the Private Endpoint using nmap using the ICMP, UDP, and TCP protocols. I then opened the Log Analytics Workspace and ran a query across the Azure Firewall logs for any traffic to the Private Endpoint from the VM and lo and behold, there is the ICMP and UDP traffic nmap generated.

Captured UDP and ICMP traffic

Yes folks that /32 route is protocol aware and will only apply to TCP traffic. UDP and ICMP traffic will not be affected. Software defined networking is grand isn’t it? 🙂

You may be asking why the hell I decided to test this particular piece. The reason I followed this breadcrumb was my customer had setup a UDR to route traffic from the VM to an NVA in the hub and attempted to send an ICMP Ping to the Private Endpoint. In reviewing their firewall logs they saw only the ICMP traffic. This finding was what drove me to test all three protocols and make the observation that the route only affects TCP traffic.

Microsoft’s public documentation mentions that Private Endpoints only support TCP at this time, but the documentation does not specify that this system route does not apply to UDP and ICMP traffic. This can result in confusion such as it did for this customer.

So how did we resolve this for my customer? Well in a very odd coincidence, a wonderful person over at Microsoft recently published some patterns on how to approach this problem. You can (and should) read the documentation for the full details, but I’ll cover some of the highlights.

There are four patterns that are offered up. Scenario 3 is not applicable for any enterprise customer given that those customers will be using a hub and spoke pattern. Scenario 1 may work but in my opinion is going to architect you into a corner over the long term so I would avoid it if it were me. That leaves us with Scenario 2 and Scenario 4.

Scenario 2 is one I want to touch on first. Now if you have a significant background in networking, this scenario will leave you scratching your head a bit.

Microsoft Documentation Scenario 2

Notice how a UDR is applied to the subnet with the VM which will route traffic to Azure Firewall however, there is no corresponding UDR applied to the Private Endpoint. Now this makes sense since the Private Endpoint would ignore the UDR anyway since they don’t support UDRs at this time. Now you old networking geeks probably see the problem here. If the packet from the VM has to travel from A (the VM) to B (stateful firewall) to C (the Private Endpoint) the stateful firewall will make a note of that connection in its cache and be expecting packets coming back from the Private Endpoint representing the return traffic. The problem here is the Private Endpoint doesn’t know that it needs to take the C (Private Endpoint) to B (stateful firewall) to A (VM) because it isn’t aware of that route and you’d have an asymmetric routing situation.

If you’re like me, you’d assume you’d need to SNAT in this scenario. Oddly enough, due the magic of software defined routing, you do not. This struck me as very odd because in scenario 3 where everything is in the same Virtual Network you do need to SNAT. I’m not sure why this is, but sometimes accepting magic is part of living in a software defined world.

Finally, we come to scenario 4. This is a common scenario for most customers because who doesn’t want to access Azure PaaS services over an ExpressRoute connection vs an Internet connection? For this scenario, you again need to SNAT. So honestly, I’d just SNAT for both scenario 2 and 4 to make maintain consistency. I have successfully tested scenario 2 with SNAT so it does indeed work as you expect it would.

Well folks I hope you found this information helpful. While much of it is mentioned in public documentation, it lacks the depth that those of us working in complex environments need and those of us who like to geek out a bit want.

See you next post!