Azure VWAN and Private Endpoint Traffic Inspection – Findings

Posted on August 20, 2023 by mattfeltonma

Today I’m taking a break from my AI series to cover an interesting topic that came up at a customer.

My customer base exists within the heavily regulated financial services industry which means a strong security posture. Many of these customers have requirements for inspection of network traffic which includes traffic between devices within their internal network space. This requirement gets interesting when talking inspection of traffic destined for a service behind a Private Endpoint. I’ve posted extensively on Private Endpoints on this blog, including how to perform traffic inspection in a traditional hub and and spoke network architecture. One area I hadn’t yet delved into was how to achieve this using Azure Virtual WAN (VWAN).

VWAN is Microsoft’s attempt to iterate on the hub and spoke networking architecture and make the management and setup of the networking more turnkey. Achieving that goal has been an uphill battle for VWAN with it historically requiring very complex architectures to achieve the network controls regulated industries strive for. There has been solid progress over the past few months with routing intent and support for additional third-party next generation firewalls running in the VWAN hub such as Palo Alto becoming available. These improvements have opened the doors for regulated customers to explore VWAN Secure Hubs as a substitute for a traditional hub and spoke. This brings us to our topic: How do VWAN Secure Hubs work when there is a requirement to inspect traffic destined for a Private Endpoint?

My first inclination when pondering this question was that it would work in the same way a traditional hub and spoke works. In past posts I’ve covered that pattern. You can take a look at this repository I’ve put together which walks through the protocol flows in detail if you’re curious. The short of it is inspection requires enabling network policies for the subnet the Private Endpoints are deployed to and SNATing at the firewall. The SNATing is required at the firewall because Private Endpoints do not obey user-defined routes defined in a route table. Without the SNAT you get asymmetric routing that becomes a nightmare of troubleshooting to identify. Making it even more confusing, some services like Azure Storage will magically keep traffic symmetric as I’ve covered in past posts. Best practice for traditional hub and spoke is SNATing for firewall inspection with Private Endpoints.

My first stop was to read through the Microsoft documentation. I came across this article first which walks through traffic inspection with Azure Firewall with a VWAN Secure Hub. As expected, the article states that SNAT is required (yes I’m aware of the exception for Azure Firewall Application Rules, but that is the exception and not the rule and very few in my customer space use Azure Firewall). Ok great, this aligns with my understanding. But wait, this article about Secure Hub with routing intent does not mention SNAT at all. So is SNAT required or not?

When public documentation isn’t consistent (which of course NEVER happens) it’s time to lab and see what we see. I threw together a single region VWAN Secure Hub with Azure Firewall, enabled routing intent for both Internet and Private traffic, and connected my home lab over a S2S VPN. I created then Private Endpoint for a Key Vault and Azure SQL resource. Per the latter article mentioned above, I enabled Private Endpoint Network Policies for the snet-svc subnet in the spoke virtual network. Finally, I created a single Network Rule allowing traffic for 443 and 1433 from my lab to the spoke virtual network. This ensured I didn’t run into the transparent proxy aspect of Application Rules throwing off my findings.

If you were doing this in the “real world” you’d setup a packet capture on the firewall and validate you see both sides of the conversation. If you’ve used Azure Firewall, you’re well aware it does not yet support packet captures making this impossible. Thankfully, Microsoft has recently introduced Azure Firewall Structure Fir e wall Logs which include a log called Azure Firewall Flow Trace Log. This log will show you the gooey details of the TCP conversation and helps to fill the gap of troubleshooting asymmetric traffic while Microsoft works on offering a packet capture capability (a man can dream, can’t he?).

While the rest of the Azure Firewall Structured Logs need nothing special to be configured, the Flow Trace Logs do (likely because as of 8/20/2023 they’re still in public preview). You need to follow the instructions located within this document. Make sure you give it a solid 30 minutes of completing the steps to enable the feature before you enable the log through the diagnostic settings of the Azure Firewall. Also, do not leave this running. Beyond the performance hit that can occur because of how chatty this log is, you could also be in a world of hurt for a big Log Analytics Workspace bill if you leave it running.

Once I had my lab deployed and the Flow Trace Logs working, I next went ahead testing using the Test-NetConnection PowerShell cmdlet from a Windows machine in my home lab. This is a wonderful cmdlet if you need something built-in to Windows to do a TCP Ping.

**Testing Azure SQL via Private Endpoint**

In the above image you can see that the TCP Ping to port 1433 of an Azure SQL database behind a Private Endpoint was successful. Review of the Azure Firewall Network Logs showed my Network Rule firing which tells me the TCP SYN at least passed through providing proof that Private Endpoint Network Policies were successfully affecting the traffic to the Private Endpoint.

What about return traffic? For that I went to the Flow Trace Logs. Oddly enough, the firewall was also receiving the SYN-ACK back from the Private Endpoint all without SNAT being configured. I repeated the test for a Azure Key Vault behind a Private Endpoint and observed the same behavior (and I’ve confirmed in the Azure Key Vault needs SNAT for return traffic in the past in a standard hub and spoke).

So is SNAT required or not? You’re likely expecting me to answer yes or no. Well today I’m going to surprise you with “I don’t know”. While testing with these two services in this architecture seemed to indicate it was not, I’ve circulated these findings within Microsoft and the recommendation to SNAT to ensure flow symmetry remains. As I’ve documented in prior posts, not all Azure services behave the same way with traffic symmetry and Azure Private Endpoints (Azure Storage for example) and for consistent purposes you should be SNATing. Do not rely on your testing of a few services in a very specific architecture as being gospel. You should be following the practices outlined in the documentation.

I feel like I’m ending this blog Sopranos-style with fade to black, but sometimes even tech has mystery. In this post you got a taste of how Flow Trace Logs can help troubleshoot traffic symmetry issues when using Azure Firewall and you learned that not all things in the cloud work the way you expect them to work. Sometimes that is intentional and sometimes it’s not intentional. When you run into this type of situation where behavior you’re observing doesn’t match documentation, it’s always best to do what is documented (in this case you should be doing SNAT). Maybe it’s something you’re doing wrong (this is me we’re talking about) or maybe you don’t have all the data (I tested 2 of 100+ services). If you go with what you experience, you risk that undocumented behavior being changed or corrected and then being in a heap of trouble in the middle of the night (oh the examples I could give of this across my time at cloud providers over a glass of Titos).

Well folks, that wraps things up. TLDR; SNAT until the documentation says otherwise regardless of what you experience.

Thanks!

Application Gateway and Private Link

Posted on December 19, 2022 by mattfeltonma

Welcome back fellow geeks!

Over the past few years I’ve written a ton on Private Endpoints for PaaS (platform-as-a-service) services Microsoft provides. I haven’t written anything about the Private Link service that powers the Private Endpoints. There is a fair amount of community knowledge and documentation on building a Private Link service behind an Azure Load Balancer, but far less on how to do it behind an Application Gateway (Adam Stuart’s video on it is a wonderful resource). Today, I’m going to make an attempt at furthering that collective community knowledge with a post on the feature and give you access to a deployable lab you can use to replicate what I’ll be writing about in this post. Keep in mind the service is still in public preview, so remember to check the latest documentation to validate the correctness of what I discuss below.

Let’s get to it!

I’ll be using a lab environment that I’ve built which mimics a typical enterprise environment. The lab uses a hub-and-spoke architecture where on-premises connectivity and centralized mediation and optional inspection is provided in a transit virtual network which is peered to all spoke virtual network. A shared services virtual network provides core infrastructure services such as DNS. The other spoke contains the workload which is a simple Python application deployed in Azure App Services.

The App Service has been configured to inject both its ingress and egress traffic into the virtual network using a combination of Private Endpoints and Regional VNet Integration. An Application Gateway has been placed in front of the App Service and has been deployed with both a public listener (listening on 8443) and a private listener (listening on 443). The application is accessible to internal clients (such as the VMs in the shared service virtual network) by issuing an HTTP request to https://www.jogcloud.com. Azure Private DNS provides the necessary DNS resolution for internal clients.

The deployed Python application retrieves the current time from a public API (assuming the API is up) and returns the source IP on the HTTP request as well as the X-Forwarded-For header. I’ll use this application to show some of the caveats of this pattern that are worth knowing if you ever plan to operationalize it.

To maintain visibility and control of traffic coming in either publicly or privately to the application, the route table assigned to the Application Gateway subnet is configured to route traffic through the Azure Firewall instance in the hub before allowing the traffic to the App Service. This pattern allows for democratization of Application Gateway while maintaining the ability to exercise additional IDS/IPS (intrusion detection/intrusion prevention) via the security appliance in the hub.

Imagine this application is serving up confidential data and you need to provide a partner organization with access. Your information security team does not want the partner accessing the application over the Internet due to the sensitivity of the information the partner will be accessing. While direct connectivity with the partner is an option, it would likely result in a significant amount of design to ensure the partner’s network only knows about the application IP space and appropriate firewall rules are in place to limit access to the Application Gateway endpoint. In this scenario, your organization will be the provider and the customer’s organization will be the consumer. I don’t know about you, but I’ve been in this situation a lot of times in my past. Back in the day (yeah I’m old, what of it?) you’d have to go the direct connectivity route and you’d spend months putting together a design and getting it approved by the powers that be. Let’s now look at how the new Private Link feature of Application Gateway can make this whole problem a lot easier to solve.

Assume this partner has a presence in Azure so we don’t have to get into the complexity of alternatives (such as building an isolated virtual network with VPN Gateway the partner connects to). The service could be exposed to the customer using the architecture below. Note that I’ve trimmed down the provider environment to show only the workload virtual network and illustrated a few compute services on the consumer end that are capable of accessing services exposed through Private Endpoints.

In the above image you will notice a new subnet in the provider’s virtual network. This subnet is used for the Private Link configuration. Traffic entering the provider environment will be NATed to an IP within this subnet. You can opt to use an existing subnet, but I’d recommend dedicating a subnet instead vs mixing it within the any of the application tier subnets.

There are considerations when sizing the subnet. Each IP allocated to the subnet can be used to service 64,000 connections and you can have up to eight IP addresses as of today allowing you to escape with a /28 (5 IP addresses reserved by Azure + 8 IPs for PrivateLink configuration). Just remember this is preview so that limit could be changed in the future. For the purposes of this post I used a /24 since I’m terrible at subnetting.

**New subnet for Private Link Configuration**

It’s time to create the Private Link configuration now that the subnet is in place. This can be done in all the usual ways (Portal, CLI, PowerShell, REST). When using the Portal you will need to navigate to the Application Gateway instance you’re using, select the Private Link menu item and select the option to add a new Private Link configuration.

Private Link Configuration Setup

On the next screen you will need to select the subnet you’ll use for the Private Link configuration. You will also pick the listener you want to expose and determine the number of IPs you want to allocate to the service. Note that both the public and private listeners are available. If you’re exposing a service within your virtual network, you’ll likely be creating these with private listeners almost exclusively. A use case for a public listener might be a single client wants a more consistent network experience provided by their ExpressRoute or VPN connectivity into Azure vs going over the Internet.

Private Link configuration

Once completed, you can freely create Private Endpoints for your service within the same tenant. Within the same tenant, your Private Link service will be detected when creating a Private Endpoint as seen below. All that is left for you to do is create a DNS entry that matches the FQDN you are presenting within the certificates loaded on your Application Gateway. At this point you should be saying, “That’s all well and good Matt, but my use case is providing this to a consumer in a DIFFERENT tenant.” Let’s explore that scenario.

**Creating Private Endpoint in same tenant**

I switched to a subscription in a separate Azure AD tenant which would represent the consumer. In this tenant I created a virtual network with a single subnet with the IP space of 10.1.0.0/16 which overlaps with the provider’s network demonstrating that overlapping IP space doesn’t matter with Private Link. In that subnet I placed a VM running Ubuntu that I would use to SSH in. I created this resources in the Australia East region to demonstrate that the service exposed via Private Link can have Private Endpoints created for it in any other Azure region. Connections made through the Private Endpoint will ride the Azure backbone to the destined service.

Once the basics were in place for testing, I then created the Private Endpoint for the provider service within the consumer’s network. This can be done through the Private Link Center blade using the Private Endpoint menu item in the Azure Portal as seen below.

On the resource screen you will need to provide the resource id of the Application Gateway and the listener name. This is additional information you would need to pass to the consumer of any Application Gateway Private Link enabled service.

**Private Endpoint Creation – Resource**

Bouncing back to the provider tenant, I navigated back to the Application Gateway resource and the Private Link menu item under the Private endpoint connections section. Private Endpoint creation for Private Link services across tenant work via request and approval process. Here I was able to approve the association of the consumer’s Private Endpoint with the Private Link service in the provider tenant.

**Approval of Private Endpoint association**

Once approved, I bounced back to the consumer tenant and grabbed the IP address assigned to the Private Endpoint that was created. I then SSH’d into the Ubuntu VM and created a DNS entry in the host file of the VM for the service I was consuming. In this scenario, I had created a listener on the Application Gateway which handles all requests from *.jogcloud.com. Once the DNS record was created, I then used curl to issue a request to the application. Success!

**Successful access of application from consumer**

The application spits back the client IP and X-Forwarded-For header of the HTTP request. Ignore the client IP of 169.254.129.1, that is appearing due to the load balancer component of the App Service. Focus instead on the X-Forwarded-For. Notice that the first value in the header is the NATd IP from the subnet that was dedicated to the Private Link service. The next IP in line is the private IP address of the Azure Firewall instance. As I mentioned earlier, the Application Gateway is configured to send incoming traffic through the Azure Firewall instance for additional inspection before passing on to the App Service instance. The Azure Firewall is configured for NAT to ensure traffic symmetry in this scenario.

What I want you to take away from the above is that the Private Link service is NATing the traffic, so unless the consumer has a forward web proxy on the other end appending to the X-Forwarded-For header (or potentially other headers to aid with identification), troubleshooting a user’s connection will take careful correlation of requests across App Gateway, Azure Firewall, and the underlining application logs. In the below image, you can see I used curl to add a value to the X-Forwarded-For header which was carried on through the request.

**Request with X-Forwarded-For value added by consumer**

What I love about this integration is it’s very simple to setup and it allows a whole bunch of additional security controls to be introduced into the flow such as the Application Gateway WAF or a firewall’s IDS/IPS features.

Here are some key takeaways for you to ponder on over this holiday break:

For HTTP/HTTPS traffic, the Application Gateway Private Link pattern allows for the introduction of additional security controls into the network flow beyond what you’d get with a Private Link service fronted by an Azure Standard Load Balancer
Setup for the consumer is very simple. All you need to do is provide them with the resource id of the application gateway and the listener name. They can then use the native Private Endpoint creation experience to setup access to your service.
Don’t forget the importance of ensuring the customer trusts the certificate the Application Gateway is providing and can reach applicable CRL/OCSP endpoints if you’re using them. Best bet is to use a trusted 3rd party certificate authority.
DNS DNS DNS. The customer will need to manage the relevant DNS records on their end. You will want to ensure they know which FQDNs you are including within your certificate so the records they create match those FQDNs. If there is a mismatch, any secure session setup will fail.

With that said, feel free to give the feature a try. You can use the lab I’ve posted on GitHub and the steps I’ve outlined in this blog to experiment with the service yourself.

Have a happy holiday!

A Pattern for App Services with Private Endpoints

Posted on July 8, 2021 by mattfeltonma

Hello again fellow geeks.

Last year Microsoft announced the general availability of Private Endpoint support for App Services. This was a feature I was particularly excited about because when combined with regional Vnet integration, it offered an alternative to an ASE (App Service Environment) for customers who wanted to run internally-facing applications or who didn’t feel comfortable exposing their publicly facing application directly via a public IP.

Over the past year I created a few different labs that demonstrated these features including a web app and function app. Each lab was built around a hub and spoke architecture where the application was serving as an internally-facing application. Given my recent post on Azure Firewall Premium, and the fact I still had the lab environment up and running, I thought it would be interesting to switch the virtual machine out and switch App Services in, which resulted in the lab environment pictured below.

I established the following goals for the lab environment:

Perform IDPS on traffic to the web app
Mediate and inspect traffic initiated from the web app to third-party APIs
Expose a web application to the Internet without giving it a direct public IP address

To accomplish these goals I was able to re-use much of the pattern I described in my last post.

The first step in the process was to create a very simple hub and spoke architecture as pictured above. The environment would consist of a transit Vnet (virtual network), shared services Vnet, and spoke Vnet. The transit Vnet would contain the guts of the solution with an Azure Firewall Premium SKU instance and Azure Application Gateway v2 instance. In the shared services Vnet I used a Windows Server VM (virtual machine) running the DNS Server service and providing DNS resolution for the environment. I could have taken a shortcut here and used Azure Firewall’s DNS proxy capability, but I like to leave the options open for conditional forwarding which Azure Firewall and Azure Private DNS do not support at this time. Finally, in the spoke Vnet I created two subnets. One subnet hosted the private endpoint for App Services and the other subnet was delegated to App Services to support regional Vnet integration.

Once all components were in place, I deployed a simple Python web application I wrote. All it does it queries two public APIs, one for the current time and the other for a random Breaking Bad quote (greatest show of all time!). I took the cheating way out and deployed the app directly via Visual Studio using the Azure App Service add-in prior to deploying the private endpoint and regional Vnet integration capabilities. This allowed me to test the app to validate it was still working as intended while also allowing me to deploy the app from my home machine. Deploying a private endpoint for app services locks down not only access to the running application but also to the SCM (source control management) endpoint that is used for deploying code to the app.

The next step was to create the Azure Private DNS Zone for app services which is named privatelink.azurewebsites.net. I then linked this zone to my shared services Vnet and setup the DNS server running in that Vnet with a standard forward to 168.63.129.16. This setup allows DNS queries made to the Private DNS zone to be resolved by the DNS server. I’ve written extensively about how DNS works with Azure Private Link so I won’t go into any more detail on that flow. Last step in the DNS process is to configure each Vnet to use the DNS server IP in its DNS Server settings. This also needs to configured directly on the Azure Firewall.

Now that the infrastructure would be able to resolve queries to the Private DNS Zone used by App Services, it was time to create the private endpoint for the web app. This registered the appropriate DNS records in the Private DNS Zone for my web app. With the private endpoint created, the last step was to enable Vnet integration.

Private DNS Zone with records for App Service instance

At this point I had all the guts of my solution and it was time to configure traffic to flow the way I wanted it to flow. Before I could begin work in Azure I needed to do the work outside of Azure. This involved creating an A record in my DNS hosting provider to point the public DNS name of my web app to the AGW’s (Application Gateway) public IP. This name can be whatever you want as long as you provide a certificate to the AGW such that it will identify itself as that name to the user. The AGW configuration is very straightforward and this tutorial will get you most of the way there. One thing to note is you’ll need to set the backend to point to the name given to the app service. This allows you to take advantage of the certificate Microsoft automatically provisions to the given app service.

Since I wanted to route traffic destined for the web app through the Azure Firewall Premium instance, I needed to ensure the AGW trusted the certificate served up by Azure Firewall. This is done by modifying the HTTP setting used in the AGW rule for the app. Here you can upload the root certificate that issued the Azure Firewall Premium intermediate certificate.

Now that the AGW is configured, I needed to create a route table for the subnet the AGW has its private IP address in. In this route table I disabled BGP (border gateway protocol) propagation to ensure the default route since this AGW v2 requires a default route pointing directly to the Internet. Now this is where it gets interesting from a routing perspective. Whenever you create a private endpoint for a service, a system route is added to the route table of all the subnets within the private endpoint’s vnet (virtual network) as well as any vnet the private endpoint’s vnet is peered with. If I want traffic from the AGW to route through the Azure Firewall, I need to override that system route with a /32 UDR. As you can imagine, this can become extremely tedious at scale and even risks hitting the max routes per route table depending on the scale we’re talking about. On the positive side, the issues around this are something Microsoft is aware of so hopefully that means this will be addressed at some point. In the meantime you’ll need to use the /32 route in a pattern such as this.

Azure Firewall Premium needs to be configured to allow traffic from the AGW to the web app and inspect this traffic. You can check out my last post for instructions on configured Azure Firewall to perform these activities.

Excellent, work is done right? Nope! If we influence routing on one side, we have to ensure the other side routes the same way. Toss a route table on the private endpoint subnet you say? Sorry, that isn’t supported my friend. Instead I needed to enable the Azure Firewall instance to SNAT (source NAT) traffic to the spoke Vnet. This ensures routing is symmetric and will eliminate any potential connection issues by creating only the UDR on the AGW subnet.

Lastly, I needed to create and apply a route table to the subnet I delegated to App Services for regional Vnet integration. This route table can be very simple and be configured with a single UDR for the default route pointing to the Azure Firewall private IP. This results in the traffic flow below:

This one was another fun pattern to solve. I got a chance to mess around more with AGW and also get a pattern I had theorized would work actually working. I find implementing the pretty pictures I draw helps drive home the benefits and considerations of such a pattern. With this pattern specifically, it suffers from similar considerations as the pattern in my last post. These include:

Challenges with observability
Operational overhead of certificate management
Possible latency issues depending on latency requirements and traffic patterns

This pattern tacks on more operational overhead by requiring that /32 route be added to the AGW route table each time a new app service is provisioned with a private endpoint. Observability further suffers when tracing a packet flow end-to-end due to the additional layer of SNAT required via Azure Firewall. One thing to note about this is you may be able to get around the SNAT requirement for web-specific traffic because of the transparent proxy functionality behind Azure Firewall application rules. I want to highlight I have not tested this myself and I typically tend to SNAT for this use case in my lab because many of my customers may use similar patterns but with a 3rd party firewall.

So there you go folks, another pattern to add to your inventory. Hopefully we’ll see the /32 issue issue private endpoints resolved sometime in the near future. Have a great weekend!

Azure Firewall and TLS Inspection

Posted on July 5, 2021 by mattfeltonma

Welcome back fellow geeks!

I recently had a customer that was interested in staying as purely cloud native as possible, which included any centralized firewall that would be in use. Microsoft has offered up Azure Firewall for a while now and it is a great solution if you’re looking for a very basic fully-managed firewall. Here are some of the neater features of the solution:

Fully managed by Microsoft with strong SLAs
Central management of north/south and east/west traffic in a hub in spoke architecture vs the operational overhead caused by trying to manage these flows solely with NSGs (network security groups)
Transitive routing device for hub and spoke architectures to allow communication between spoke to on-premises and spoke to spoke
Stateful firewall
Logging of DNS queries via DNS Proxy capability

Unfortunately this basic feature set rarely satisfied the more regulated customer base I tend to work with. Many of these customers went with the full featured security appliances such as those offered by Palo Alto, Fortigate, and the like. One of the largest gaps in Azure Firewall when compared to the 3rd party vendors was the lack DPI (deep packet inspection) and IDS (intrusion detection system) / IPS (intrusion prevention system) capabilities. Microsoft heard the feedback from its customers and back in February of 2021 made the Azure Firewall Premium SKU available in public preview with a collection of features such as TLS (transport layer security) Inspection, IDPS (intrusion detection prevention system), URL filtering, and improved web category filtering. The addition of these capabilities now has made Azure Firewall a much more appealing cloud-native solution.

I had yet to spend any significant time experimenting with the Premium SKU (I make it a habit to not invest a ton of time into preview features). However, this customer gave me the opportunity to dive into the TLS inspection and IDPS capabilities. These capabilities will be the subject of this post and I’ll spend some time describing the architectural pattern I built out and experimented with.

This particular customer had a requirement to perform DPI and IDPS on incoming web-based traffic from the Internet. I asked the customer to provide the control set they needed to satisfy such that we could map those controls to the technical controls available in Azure-native services. The hope was, since this was web-based traffic only and used across multiple regions, we might be able to satisfy all the controls via a WAF (web application firewall) such as Azure Front Door and supplement with layer 7 load balancing with Azure Application Gateway within a given region. The rest of the traffic, non-web, would be delivered to a firewall running in parallel. This pattern is becoming more common place as the WAFs grow in functionality and feature set.

Unfortunately the above pattern was not an option for the customer because they wanted to maintain a centralized funnel for all traffic via a firewall. This is not an uncommon ask. This meant I had to get the traffic coming in from the WAF to funnel through Azure Firewall. For this pattern to work end to end I would need layer 7 load balancing so that meant I needed an Application Gateway as well. The question was do I place the Azure Firewall before or after the Application Gateway? For the answer to this question I went to the Microsoft documentation. Typically the public documentation leaves a lot to be desired when it comes to identifying the benefits and considerations of a particular pattern (oh how I long for the days of Technet-quality documentation), however the documentation around these patterns is stellar.

After quickly reading the benefits and considerations about the two patterns, the decision looked like it was made for me. The pattern where the firewall is placed after the application gateway aligned with my customer’s use case. It specifically covered TLS inspection and IDPS through Azure Firewall Premium. Curious as to why this TLS inspection at Azure Firewall wasn’t mentioned in the other use case where Azure Firewall is placed in front of Application Gateway, I went down a rabbit hole.

My first stop was the public documentation for the Azure Firewall Premium SKU. Since the feature is still public preview there are a fair amount of limitations but none of the limitations that weren’t planned to be fixed by GA (general availability) looked like show stoppers. However, in the section for TLS Inspection, I noticed this blurb, “Azure Firewall Premium terminates outbound and east-west TLS connections.” I reached out to some internal communities within Microsoft and confirmed that “at this time” Azure Firewall isn’t capable of performing TLS Inspection on traffic coming in the public interface. This limitation meant that I had to get the traffic received from the WAF over to the internal interface and the best way to do this was to intake it from the WAF through the Application Gateway. This would be the pattern I’d experiment with.

To keep things simple, I focused on a single region and didn’t include a WAF. Load balancing across regions could be done with the customer’s 3rd party WAF where the WAF would resolve to the appropriate regional Application Gateway v2 instance public IP depending on the load balancing pattern (such as geo-location) the customer was using. Once the traffic is received from the WAF, the Application Gateway terminates the TLS session so that it can inspect the URL and host headers and direct the traffic to the appropriate backend, which in this case was the single web server running IIS (Internet Information Services) in a peered spoke virtual network.

To ensure the traffic leaving the Application Gateway funnels through the Azure Firewall instance, I attached a route table to the Application Gateway. This route table was configured with BGP propagation disabled (to ensure a default route couldn’t be accidentally propagated in) and with a single UDR (user-defined route) which contained the spoke’s virtual network CIDR (Classless Inter-Domain Routing) block with a next hop of the Azure Firewall private IP address. Since UDRs take precedence over system routes, this route would invalidate the system route for the peering. On the web server subnet in the spoke I had a similar route table with a single UDR which contained the transit virtual network CIDR block with a next hop of the Azure Firewall private IP address. This ensured that any communication between the two would flow through the Azure Firewall.

To ensure DNS would resolve, I created an A record in public DNS for sample1.geekintheweeds.com pointing to the public IP address of the Application Gateway. Within Azure, I built a Windows server, installed the DNS service, and created a forward lookup zone for geekintheweeds.com with an A record named sample1 pointing to the web server. Within each virtual network I configured the Windows Server IP address in the DNS Server settings (note that if you do this after you provision Application Gateway, you’ll need to stop and start it). Within the Azure Firewall, I set it to use the server as its DNS server.

Now that the necessary plumbing was in place I needed to put in the appropriate certificates for the Application Gateway, Azure Firewall, and the Web Server. This is where this setup can get ugly. Since this was a lab, I generated all of my certificates from a private CA (certificate authority) I have running in my home lab. Since these CAs are only used for testing the CA issues all certificates without a CDP (certificate revocation distribution point) to keep things simple by avoiding requiring any of those network flows for revocation check lookups. In a production environment you’d want to issue the certificates for the Application Gateway from a trusted public CA so you don’t have to worry about CDPs/OCSP (online certificate status protocol) exposing the endpoints for these flows. For Azure Firewall and the web server you’d be fine using certificates issued by a private CA as long as you ensured appropriate validation endpoints were available.

The Application Gateway and web server will use your standard web server certificate. The Azure Firewall is a different story. To support TLS Interception you’ll need to provide it with an intermediary certificate. This type of certificate allows Azure Firewall to generate certificates on the fly to impersonate the services it’s intercepting traffic for. This link explains the finer details of the requirements. Also note that the certificate needs to be imported to an instance of Key Vault which Azure Firewall accesses using a user-assigned managed identity.

Once the Application Gateway was configured in a similar manner as outlined here, I was good to test. Accessing the server from a machine running in my home lab successfully displayed the standard IIS website as hoped! When I viewed the Azure Firewall logs the full URL being accessed by the user was visible proving TLS inspection was working as expected. Success!

Log entry from Azure Firewall showing full URL

The patterns works but there are a number of considerations.

The biggest consideration being Azure Firewall Premium is still in public preview. Regardless of what you may hear from those within Microsoft or outside of Microsoft, DO NOT USE PUBLIC PREVIEW FEATURES IN PRODUCTION. I’d go as far as cautioning against even using them in planned designs. By the time these new products or features make it to GA, they can and often do change, sometimes for the better and sometimes for the worst. If you choose to use these products or features in an upcoming design, make sure to have a plan B if the product doesn’t hit GA or hits GA without the features you need. Remember that Microsoft’s targeted release dates are often moving targets that accelerate or decelerate depending on the feedback from public preview. Unless you have a contractual agreement with a vendor to deliver upon a specific date and there are real penalties to the vendor for not doing that, you should have a fully vetted and tested plan B ready to go.

Outside of my ranting about usage of public preview products and features, here are some other considerations with this pattern:

Challenges with observability
Operational overhead of certificate management
Possible latency issues depending on latency requirements and traffic patterns

In this design the Application Gateway will be SNATing the traffic it receives from the Internet users. To understand the session end to end and correlate the logs from the WAF to the Application Gateway, through the Firewall, to the web server, you’ll need to ensure you’re capturing the x-forwarded-for header and using it to identify the user’s original source IP. This will definitely add complexity to the observability of the environment. Tack on the many mediation points, and identifying where traffic is getting rejected (WAF, App Gateway, firewall, NSG, local machine firewall) will require a strong logging and correlation system.

This pattern is going to require at least 3 separate certificates which will likely be a mix of certificates issued by both public CAs and private CAs. Certificate lifecycle management is a significantly challenging operational task and is often the cause of service outages. If you opt to use a pattern such as this, you’ll need to ensure your operational monitoring and alerting processes around certificate lifecycle management are solid. In addition, you will also need to manage revocation network flows. In a past life, these were the flows I observed that would often bite organizations.

Lastly, this pattern involves a lot of hops where the traffic is being decrypted and re-encrypted. This requires compute time which could add to latency. These latency issues could be impactful depending on the latency requirements of the application and what type and volume of data is flowing between the user and web server.

I have to admit I enjoyed labbing this on out. These days I usually spend a majority of my time in governance conversations focusing on people and process. Getting back into the technology and spending some time playing with the Azure Firewall Premium SKU and Application Gateway was a great learning experience. It will be interesting to see over time how well it will compete against the behemoths of the industry such as Palo Alto.

Over the next week I’ll be making some small tweaks to this design to see whether I can stick the Key Vault behind a Private Endpoint (documentation is unclear as to whether or not this is supported) and messing with the logs provided by both the Azure Firewall and Application Gateway to see how challenging correlation of sessions is.

I hope you have a great long weekend and see you next post!

Journey Of The Geek

The chronicles of a Bostonian tech geek navigating through life and technology

Tag Archives: azure firewall

Azure VWAN and Private Endpoint Traffic Inspection – Findings

Application Gateway and Private Link

A Pattern for App Services with Private Endpoints

Azure Firewall and TLS Inspection