Interesting behaviors with Private Endpoints

Hi folks!

Working for and with organizations in highly regulated industries like federal and state governments and commercial banks often necessitates diving REALLY deep into products and technologies. It means peeling back layers of the onion most people never need to touch. These organizations tend to have extremely complex environments, both because of how long they have existed and because of the strict laws and regulations they must abide by. That's probably why I've always gravitated toward these industries.

I recently ran into an interesting use case where that willingness to dive deep was needed.

A customer I was working with was wrapping up its Azure landing zone deployment and was beginning to deploy its initial workloads. A number of these workloads used Microsoft Azure PaaS (platform-as-a-service) services such as Azure Storage and Azure Key Vault. The customer had made the wise choice to consume the services through Azure Private Endpoints. I'm not going to go into detail on the basics of Azure Private Endpoints; there is plenty of official Microsoft documentation that covers the basics and gives you the marketing pitch, or you can check out my past posts on the topic, such as my series on Azure Private DNS and Azure Private Endpoints.

This particular customer chose to use them to consume the services over a private connection from both within Azure and on-premises, as well as to mitigate the risk of data exfiltration that exists when egressing the traffic to Internet public endpoints or using Azure Service Endpoints. An additional requirement the customer had was to mediate the traffic to Azure Private Endpoints with a security appliance. The security appliance was acting as a firewall to control traffic to the Private Endpoints and, sometime in the future, to perform deep packet inspection. This is the requirement that drove me down into the weeds of Private Endpoints and led to a lot of interesting observations about the behaviors of network traffic flowing to and from Private Endpoints. Those are the observations I'll be sharing today.

For this lab, I’ll be using a slightly modified version of my simple hub and spoke lab. I’ve modified and added the following items:

  • Virtual machine in hub runs Microsoft Windows DNS and is configured to forward all DNS traffic to Azure DNS (168.63.129.16)
  • Virtual machine in spoke is configured to use virtual machine in hub as a DNS server
  • Removed the route table from the spoke data subnet
  • Azure Private DNS Zone hosting the privatelink.blob.core.windows.net namespace
  • Azure Storage Account named mftesting hosting some sample objects in blob storage
  • Private Endpoint for the mftesting storage account blob storage placed in the spoke data subnet
Lab environment

The first interesting observation I made was that there was a /32 route for the Private Endpoint. While this is documented, I had never noticed it. In fact, most of the peers I ran this by weren't aware of it either, largely because the only way you would see it is if you enumerated the effective routes for a VM and looked closely. Below I've included a screenshot of the effective routes on the VM in the spoke Virtual Network where the Private Endpoint was provisioned.

Effective routes on spoke VM

Notice the next hop type of InterfaceEndpoint. I was unable to find this next hop type in Microsoft's public documentation, but it is indeed the one associated with Private Endpoints; whatever magic sits behind it isn't something Microsoft documents publicly.

Now this route is interesting for a few reasons. It doesn't just propagate to the route tables of every subnet within the Virtual Network; it also propagates to the route tables of every subnet in directly peered Virtual Networks. In the hub and spoke architecture that is recommended for Microsoft Azure, this means that every Private Endpoint you create in a spoke Virtual Network is propagated as a system route to the route table of each subnet in the hub Virtual Network. Below you can see a screenshot from the VM running in the hub Virtual Network.

Effective routes on hub VM
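If you want to hunt for these routes without clicking through the Portal, the sketch below pulls a NIC's effective routes with the azure-mgmt-network SDK and surfaces any InterfaceEndpoint next hops. The subscription ID, resource group, and NIC name are hypothetical placeholders for your own values.

```python
# Sketch: enumerate effective routes on a NIC and flag Private Endpoint
# routes (next hop type InterfaceEndpoint). Names below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Effective routes are computed on demand, so this is a long-running operation
poller = client.network_interfaces.begin_get_effective_route_table(
    "rg-spoke", "vm-spoke-nic"
)
for route in poller.result().value:
    # Private Endpoints surface as /32 prefixes with this next hop type
    if route.next_hop_type == "InterfaceEndpoint":
        print(route.address_prefix, route.next_hop_type)
```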

This can make things complicated if you have a requirement like my customer's, where the customer wants to control network traffic to the Private Endpoint. The only way to do that completely is to create /32 UDRs (user defined routes) in every route table in both the hub and spoke. With a limit of 400 UDRs per route table, you can quickly see how this may break down at scale.
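To make that concrete, here is a minimal sketch of adding one such /32 UDR to an existing route table so the Private Endpoint traffic hairpins through an NVA. All names and IP addresses are hypothetical, and you would need one of these routes per Private Endpoint, per route table.

```python
# Sketch: add a /32 UDR for a single Private Endpoint to one route table.
# Multiply this by every Private Endpoint and every route table to see
# how quickly the 400-route limit becomes a problem.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

client.routes.begin_create_or_update(
    "rg-hub",                  # resource group holding the route table
    "rt-hub-workload",         # an existing route table
    "pe-mftesting-blob",       # name for the new route
    {
        "address_prefix": "10.1.2.4/32",    # the Private Endpoint's IP
        "next_hop_type": "VirtualAppliance",
        "next_hop_ip_address": "10.0.1.4",  # the security appliance
    },
).result()
```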

There is another interesting thing about this route. Recall from the effective routes of the spoke VM that there is a /32 system route for the Private Endpoint. Since this is the most specific route, all traffic should be routed directly to the Private Endpoint, right? Let's check that out. Here I ran a port scan against the Private Endpoint with nmap across the ICMP, UDP, and TCP protocols. I then opened the Log Analytics Workspace and ran a query across the Azure Firewall logs for any traffic to the Private Endpoint from the VM, and lo and behold, there was the ICMP and UDP traffic nmap generated.

Captured UDP and ICMP traffic
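For the curious, the scans I ran looked roughly like the sketch below, driven from Python for repeatability. It assumes nmap is installed (the ICMP and UDP scans need root), and 10.1.2.4 stands in for the Private Endpoint's IP.

```python
# Sketch: hit the Private Endpoint with ICMP, UDP, and TCP probes via nmap.
# Only the TCP traffic should honor the /32 InterfaceEndpoint route.
import subprocess

target = "10.1.2.4"  # hypothetical Private Endpoint IP
scans = {
    "icmp": ["nmap", "-sn", "-PE", target],           # ICMP echo ping only
    "udp":  ["nmap", "-sU", "-p", "53,123", target],  # sample UDP ports
    "tcp":  ["nmap", "-sS", "-p", "443,445", target], # TCP SYN scan
}
for name, cmd in scans.items():
    print(f"--- {name} scan ---")
    subprocess.run(cmd, check=False)
```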

Yes folks, that /32 route is protocol aware and applies only to TCP traffic; UDP and ICMP traffic are unaffected. Software defined networking is grand, isn't it? 🙂

You may be asking why the hell I decided to test this particular piece. The reason I followed this breadcrumb was that my customer had set up a UDR to route traffic from the VM to an NVA in the hub and attempted to send an ICMP ping to the Private Endpoint. In reviewing their firewall logs, they saw only the ICMP traffic. This finding is what drove me to test all three protocols and observe that the route only affects TCP traffic.

Microsoft's public documentation mentions that Private Endpoints only support TCP at this time, but it does not call out that this system route does not apply to UDP and ICMP traffic. That omission can cause confusion, as it did for this customer.

So how did we resolve this for my customer? Well in a very odd coincidence, a wonderful person over at Microsoft recently published some patterns on how to approach this problem. You can (and should) read the documentation for the full details, but I’ll cover some of the highlights.

There are four patterns that are offered up. Scenario 3 is not applicable for any enterprise customer, given that those customers will be using a hub and spoke pattern. Scenario 1 may work, but in my opinion it will architect you into a corner over the long term, so I'd avoid it. That leaves us with Scenario 2 and Scenario 4.

Scenario 2 is one I want to touch on first. Now if you have a significant background in networking, this scenario will leave you scratching your head a bit.

Microsoft Documentation Scenario 2

Notice how a UDR is applied to the subnet with the VM to route its traffic to Azure Firewall, yet there is no corresponding UDR applied to the Private Endpoint. This makes sense, since the Private Endpoint would ignore the UDR anyway; Private Endpoints don't support UDRs at this time. Now, you old networking geeks probably see the problem here. If a packet has to travel from A (the VM) to B (the stateful firewall) to C (the Private Endpoint), the stateful firewall will note that connection in its cache and expect the return packets from the Private Endpoint to come back through it. The problem is the Private Endpoint doesn't know it needs to take the C (Private Endpoint) to B (stateful firewall) to A (VM) path back, because it isn't aware of that route, so you'd expect an asymmetric routing situation.

If you're like me, you'd assume you'd need to SNAT in this scenario. Oddly enough, due to the magic of software defined routing, you do not. This struck me as very odd because in Scenario 3, where everything is in the same Virtual Network, you do need to SNAT. I'm not sure why this is, but sometimes accepting magic is part of living in a software defined world.

Finally, we come to Scenario 4. This is a common scenario for most customers, because who doesn't want to access Azure PaaS services over an ExpressRoute connection versus an Internet connection? For this scenario, you again need to SNAT. So honestly, I'd just SNAT for both Scenario 2 and Scenario 4 to maintain consistency. I have successfully tested Scenario 2 with SNAT, and it does indeed work as you'd expect.

Well folks, I hope you found this information helpful. While much of it is mentioned in public documentation, that documentation lacks the depth that those of us working in complex environments need, and that those of us who like to geek out a bit want.

See you next post!

Force Tunneling Azure Firewall to pfSense – Part 2

Welcome back to my series on force tunneling Azure Firewall using pfSense.  In my last post I covered the background of the problem I wanted to solve, the lab makeup I'm using, and the process to set up the S2S (site-to-site) VPN with pfSense and the exchange of routes over BGP.  Take a few minutes to read through that post before jumping into this one.

At this point you should have a working S2S VPN from your Azure VNet to your pfSense router, and the two should be exchanging a few routes over BGP.  If you didn't complete all the steps in the first post, go back and do them now.

Now that connectivity is established, it's time to incorporate Azure Firewall.  Azure Firewall was introduced back in 2018 as a managed stateful firewall that can act as an alternative to rolling your own NVAs (network virtual appliances) like a Palo Alto or Check Point firewall.  Now I'm not going to lie to you and tell you it has all the bells and whistles that a 3rd-party NVA has, but it can provide a reasonable alternative depending on what your needs are.  The major benefits are that it's a managed service, so Microsoft owns the responsibility of managing the health of the service and its high availability and failover; it's closely integrated with the Azure platform; and it's more than likely cheaper than what you'd pay for a 3rd-party NVA license.

Recently, Microsoft introduced support for forced tunneling into public preview.  This provides you with the ability to send all of the traffic received by Azure Firewall on to another security stack that may exist within Azure, on-premises, or in another cloud.  It helps to address some of the capability gaps, such as the lack of support for deep packet inspection (DPI) of Internet-bound traffic.  You can leverage Azure Firewall to transitively route and mediate traffic between on-premises and Azure, hub to spoke, and spoke to spoke, while passing Internet-bound traffic on to another security stack with DPI capabilities.

With that out of the way, let’s continue with the lab.

The first thing you'll want to do is deploy an instance of Azure Firewall.  To support forced tunneling, you'll need to toggle the option to enabled.  You then need to provide another public IP address.  What's happening here is the nodes are being created with two NICs (network interface cards).  One NIC will live in the AzureFirewallSubnet and one will live in the AzureFirewallManagementSubnet.  Traffic dedicated to Microsoft's management of the nodes will go out to the Internet (but remains on Microsoft's backbone) through the NIC in the AzureFirewallManagementSubnet.  Traffic from your VMs will exit through the NIC in the AzureFirewallSubnet.  This split also means you can now attach a UDR (user defined route) to the AzureFirewallSubnet to route that traffic to your own security stack.

azfwsetup
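If you'd rather script the deployment than click through the Portal, a rough sketch with the azure-mgmt-network SDK is below.  The interesting part is the separate management IP configuration backed by the AzureFirewallManagementSubnet; all names, IDs, and the region are hypothetical placeholders.

```python
# Sketch: deploy an Azure Firewall with a dedicated management NIC, which
# is what enables forced tunneling. Resource names/IDs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

base = ("/subscriptions/<subscription-id>/resourceGroups/rg-hub"
        "/providers/Microsoft.Network")
client.azure_firewalls.begin_create_or_update(
    "rg-hub",
    "azfw-hub",
    {
        "location": "eastus2",
        # Data-plane NIC: your VM traffic exits here (AzureFirewallSubnet)
        "ip_configurations": [{
            "name": "data",
            "subnet": {"id": f"{base}/virtualNetworks/vnet-hub/subnets/AzureFirewallSubnet"},
            "public_ip_address": {"id": f"{base}/publicIPAddresses/pip-azfw-data"},
        }],
        # Management-plane NIC: Microsoft's node management traffic
        "management_ip_configuration": {
            "name": "mgmt",
            "subnet": {"id": f"{base}/virtualNetworks/vnet-hub/subnets/AzureFirewallManagementSubnet"},
            "public_ip_address": {"id": f"{base}/publicIPAddresses/pip-azfw-mgmt"},
        },
    },
).result()
```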

The Azure Firewall instance will take about 10-20 minutes to provision.  While you’re waiting you need to prepare the Virtual Network Gateway for forced tunneling.

Now if you go Googling, you're going to come across this Microsoft article, which describes setting a GatewayDefaultSite for the VPN Gateway.  While you can do it this way, if you opt for an active/active configuration both on-premises and for the VPN Gateway, you'll need to flip this setting to the other local network gateway (your other router) in the event of a failover.

As an alternative solution, you can propagate a default route via BGP from your on-premises router into Azure.  ECMP will be used by default and will spread the traffic across all available tunnels.  If one of your on-premises routers goes down, traffic will still be able to flow back on-premises without requiring you to fail anything over on the Azure end.  Note that if you want to make one of your routers preferred, you'll have to try your luck with AS path prepending.

For this lab scenario, I opted to advertise a default route via BGP.  My OpenBGPD config file is pictured below.  Notice I've added a default route to be propagated.

openbgpd-config
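Since screenshots are hard to copy from, a bgpd.conf along these rough lines would do the same thing.  The ASN, router ID, and peer IP are hypothetical lab values; 65515 is the default ASN Azure assigns to the VPN Gateway.

```
# /etc/bgpd.conf -- sketch only; ASNs and IPs are hypothetical
AS 65500
router-id 172.16.1.1

# originate a default route toward Azure
network 0.0.0.0/0

neighbor 10.100.255.4 {
        remote-as 65515        # Azure VPN Gateway's default ASN
        descr "azure-vpn-gw"
}

# newer OpenBGPD releases deny eBGP announcements by default,
# so an explicit allow rule may be required:
allow to ebgp prefix 0.0.0.0/0
```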

Hopping over to Azure and enumerating the effective routes shows the new routes being propagated into the VNet via the VPN Gateway.

vnetroutes

With this configuration, all traffic without a more specific route (like all our Internet traffic) will be routed back to the VPN Gateway.  Since this lab calls for this traffic to be sent to Azure Firewall first, you'll need to configure a UDR (user defined route).  As described in this link, when multiple routes exist for the same prefix, Azure picks the UDR first, then the BGP route, and finally the system route.

For this you're going to need to set up three route tables.  I'll walk through each below and include a consolidated sketch after the third.

One route table will be applied to the primary subnet the VM is living in.  This will contain a UDR for the default route (0.0.0.0/0) with a next hop type of Virtual appliance and a next hop address of the Azure Firewall instance's NIC in the AzureFirewallSubnet.  By order of precedence, this UDR will win out over the default route being propagated via BGP.

udrprimary

The second route table will be applied to the AzureFirewallSubnet.  This will contain a UDR for the default route with a next hop of the Virtual network gateway.  This forces Azure Firewall to pipe all the VM traffic bound for networks outside the VNet to the Virtual Network Gateway, which will then send it through the VPN tunnel.

routefirewall

Last but not least, there is an optional route table you can add.  This route table will be applied to the AzureFirewallManagementSubnet and will be configured with Virtual Network Gateway route propagation disabled.  It will have a single UDR with a default route and a next hop type of Internet.  The reason I like adding this route table is it removes the risk of someone propagating a default route from on-premises into this subnet.  If that route were propagated to the AzureFirewallManagementSubnet, the management plane could lose connectivity to Microsoft and the service may deallocate the instance.

routemgmt
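Here's the consolidated sketch of all three route tables I promised, using the azure-mgmt-network SDK.  Resource names, the region, and the firewall's private IP are hypothetical; the knob worth noticing is disable_bgp_route_propagation on the management subnet's table.

```python
# Sketch: the three route tables described above. Names/IPs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg, loc = "rg-lab", "eastus2"

tables = {
    # primary subnet: default route to Azure Firewall's data-plane NIC
    "rt-primary":   (False, "VirtualAppliance", "10.0.1.4"),
    # AzureFirewallSubnet: default route back to the VPN gateway
    "rt-azfw":      (False, "VirtualNetworkGateway", None),
    # AzureFirewallManagementSubnet: straight to Internet, BGP routes blocked
    "rt-azfw-mgmt": (True, "Internet", None),
}
for name, (no_bgp, hop_type, hop_ip) in tables.items():
    route = {"name": "default", "address_prefix": "0.0.0.0/0",
             "next_hop_type": hop_type}
    if hop_ip:
        route["next_hop_ip_address"] = hop_ip
    client.route_tables.begin_create_or_update(
        rg, name,
        {"location": loc,
         "disable_bgp_route_propagation": no_bgp,
         "routes": [route]},
    ).result()
```

Each table still needs to be associated with its subnet afterward, which I've left out of the sketch.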

The last thing you need to do in Azure is create a rule in Azure Firewall to allow traffic to the web.  For this I created a very simple application rule allowing all HTTP and HTTPS traffic to any domain.

azfirewallrule
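If you'd rather define that rule in code, the sketch below expresses it as a classic application rule collection directly on the firewall resource (rather than a Firewall Policy).  It fetches the existing firewall and writes the collection back; names are placeholders, and in production you'd want something far tighter than a wildcard.

```python
# Sketch: a wide-open HTTP/HTTPS application rule, classic rules model.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

fw = client.azure_firewalls.get("rg-hub", "azfw-hub")
fw.application_rule_collections = [{
    "name": "allow-web",
    "priority": 100,
    "action": {"type": "Allow"},
    "rules": [{
        "name": "any-http-https",
        "source_addresses": ["*"],
        "target_fqdns": ["*"],   # lab only -- scope this down in real life
        "protocols": [{"protocol_type": "Http", "port": 80},
                      {"protocol_type": "Https", "port": 443}],
    }],
}]
client.azure_firewalls.begin_create_or_update("rg-hub", "azfw-hub", fw).result()
```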

At this point the Azure end of the configuration is complete.  We now need to hop over to pfSense and finish that configuration.

Remember back in the last post when I had you configure the phase 2 entry with a local network of 0.0.0.0/0?  That was the traffic selector that allows traffic destined for any network to flow from the VNet through our VPN tunnel.

Next you need to NAT the traffic from the VNet out the WAN interface on the pfSense box.  For that, navigate to the Firewall drop-down menu and choose the NAT menu item.  From there, navigate to the Outbound option and ensure your Outbound NAT Mode is set to Hybrid Outbound NAT rule generation, since we'll continue to leverage the automatic rules pfSense creates in addition to this new custom rule.

Add a new mapping by clicking the Add button.  For this you’ll want to configure it as seen in the screenshot below.  Once complete save the new rule and new mappings.

nat

Last but not least, we need to open flows within the pfSense firewall to allow the traffic to go out to the Internet over HTTP and HTTPS as seen below.

pfsensefw

You’re done!  Now time to test the configuration.  For this you’ll want to RDP into your VM, open up a web browser, and try to hit a website.

google

Excellent, so you made it out to the web, but how do you know you were force tunneled through?  Simple!  Just hit a website like https://whatismyipaddress.com and validate the IP returned is the IP associated with your pfSense WAN interface.
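If you want to script that sanity check from the VM, a tiny sketch is below.  The WAN address is a hypothetical placeholder, and api.ipify.org is just one example of an IP echo service.

```python
# Sketch: confirm egress traffic is actually hairpinning through pfSense
# by comparing the Internet-visible IP to the pfSense WAN IP.
import urllib.request

PFSENSE_WAN_IP = "203.0.113.10"  # hypothetical WAN address

seen = urllib.request.urlopen("https://api.ipify.org", timeout=10).read().decode()
print(f"egress IP seen by the Internet: {seen}")
print("force tunneled!" if seen == PFSENSE_WAN_IP else "check your routes")
```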

One thing to note is that if you deallocate and reallocate your Azure Firewall or delete and recreate your Azure Firewall after everything is in place, you may run into an issue where forced tunneling doesn’t seem to work.  All you need to do is bring down the VPN tunnel and bring it back up again.  There is some type of dependency there, but what that is, I don’t know.

Well that’s it folks.  Hope you enjoyed the series and got some value out of it.  Azure Firewall is a solid alternative to a self-managed NVA.  Sure you don’t get all the bells and whistles, but you get key capabilities such as transitive routing and features that build on NSGs such as filtering traffic via FQDN, centralized rule management, and centralized logging of what’s being allowed and denied through your network.  As an added bonus, you can always leverage the forced tunneling feature you learned about today to tunnel traffic to a security stack which can perform features Azure Firewall can’t such as deep packet inspection.

Stay healthy!