A Deep Dive Into Azure Route Server

Hello again fellow geeks.

I recently had a customer reach out to me with an interest in learning more about ARS (Azure Route Server). The customer hoped it might ease some of the burden of managing routing across a large Azure implementation. I had yet to mess around with ARS since it only recently went GA (generally available), so I figured this would be a great opportunity to build out a lab and give it a whirl. Let’s get to it!

ARS is one of Microsoft’s new networking offerings in Azure offering a managed routing service that is hooked into Azure’s SDN (software defined network). The way I like to think of the service is a couple of VMs (virtual machines) that are managed by Microsoft, running a BGP (Border Gateway Protocol) service, and which have the ability to program routes directly into VNets (virtual networks). In addition to introducing some pretty cool new networking patterns such as an SD WAN and ExpressRoute pattern and dual-homed network, the feature that most interested my customer was its capability of BGP peering with a customer’s NVA (network virtual appliance) such as a Palo Alto or Cisco appliance and injecting those learned dynamic routes into the VNet. This is the feature which I’ll cover in this blog post.

I did some thinking about how I wanted to lab this out and what I wanted to use as an NVA. Since the behavior I wanted to test was primarily BGP, I figured I’d keep it simple and run a Linux VM which would host a lightweight BGP service. I found a wonderful post from Adam Stuart (seriously a great read and really cool Azure networking pattern in his post) which mentioned Exabgp. After doing a bit of research on it, it looked relatively easy to setup and use so the choice was made (Thanks Adam!) In addition to the NVA, I decided to build out a hub-and-spoke architecture and make use of my home pfSense appliance for S2S VPN (site-to-site virtual private network) connectivity and to replicate an on-premises environment. The result was the lab pictured below (image 1).

Image 1 – Azure Route Server Lab

One thing you may notice in the above image is ARS has a public IP address associated to it. Remember when I mentioned this is a couple of Microsoft-managed VMs? Well this public IP facilitates Microsoft management of the VMs similar to other managed services such as Azure SQL MI (Managed Instance). Before you ask, no you can’t associate an NSG (Network Security Group) to the subnet ARS is provisioned into, and the subnet has to be named RouteServerSubnet.

I setup a S2S VPN connection and BGP peering between my pfSense appliance and Azure VPN Gateway and advertised the set of routes documented in the lab diagram (image 1). I then provisioned a couple of Ubuntu VMs with two in the hub, one in the first spoke, and one in the second spoke.

Exabgp was a bit painful to setup because the documentation out there is a bit sparse and I had to piece it together from multiple sources. Given that, I’m going to spend a few minutes walking through the setup.

I first setup my ARS using the instructions in this link. Ensure you use a valid ASN for your NVA (autonomous system number).

Once ARS is setup, you can begin setting up Exabgp on the VM using the commands below.

sudo apt update
sudo apt install exabgp

You’ll then need to modify the service file located in /lib/systemd/system/exabgp.service to uncomment the lines below.

...
[Service]
#User=exabgp
#Group=exabgp
Environment=exabgp_daemon_daemonize=false
PermissionsStartOnly=true
ExecStartPre=-mkfifo /run/exabgp.in
...

Now you’ll need to create a configuration file for exabgp and save it to /etc/exabgp/exabgp.conf. Below is the configuration file I used for the lab.

neighbor 10.0.2.4 {
        router-id 10.0.0.4;
        local-address 10.0.0.4;
        local-as 65010;
        peer-as 65515;

        static {
                route 192.168.10.0/24 next-hop 10.0.0.4;
        #       route 192.168.1.0/24 next-hop 10.0.0.4 as-path [65010 65010] community [65010:2];
                route 192.168.1.0/24 next-hop 10.0.0.4 community [65010:2];
                route 10.1.0.0/16 next-hop 10.0.0.4;
                route 10.10.0.0/16 next-hop 10.0.0.4;
                route 0.0.0.0/0 next-hop 10.0.0.4;
        }
}
neighbor 10.0.2.5 {
        router-id 10.0.0.4;
        local-address 10.0.0.4;
        local-as 65010;
        peer-as 65515;

        static {
                route 192.168.10.0/24 next-hop 10.0.0.4;
        #       route 192.168.1.0/24 next-hop 10.0.0.4 as-path [65010 65010] community [65010:2];
                route 192.168.1.0/24 next-hop 10.0.0.4 community [65010:2];
                route 10.1.0.0/16 next-hop 10.0.0.4;
                route 10.10.0.0/16 next-hop 10.0.0.4;
                route 0.0.0.0/0 next-hop 10.0.0.4;
        }
}

Each instance of ARS is configured to be highly available and is deployed across availability zones if the region supports it. It comes with two BGP peering IPs you’ll need to peer with and advertise your routes to. If you only peer one with or advertise different routes, you’ll get funky behaviors. In the configuration file above, each JSON object represents one of these peers. You’ll notice I’m advertising a number of routes which will demonstrate different behaviors I’ll walk through with this post.

Once you’ve done the above you can start the service and validate that it successfully started.

sudo systemctl start exabgp
sudo systemctl status exabgp

Give it a minute and then you can run the following command to output the routes the exabgp service has learned from ARS.

sudo exabgpcli show adj-rib in extensive

In the output below (image 2) you can see ARS advertising the address space of the hub and spoke1 Vnet.

Image 2 – Exabgp output

So why isn’t spoke 2 being advertised? Well ARS requires the peerings between the VNets to be configured with UseRemoteGateway and AllowGatewayTransit properties (referred to in the Portal as Use the remote virtual network’s gateway or Route Server and Use this virtual network’s gateway or Route Server) in order for ARS to propagate routes between the Vnets. As you’ll note from the lab image, I enabled these settings for spoke 1 but not spoke 2.

This requirement does present a challenge in that when it’s enabled ARS propagates routes to the peered Vnets, but so does your ExpressRoute or VPN Gateway. My customer base typically requires all traffic leaving the spoke to flow through a security appliance. If you do not enable this setting the spoke only knows about itself and the hub VNet. To direct the traffic through an appliance in a hub you only need two UDRs (one for 0.0.0.0/0 and one for the hub VNet address space). This makes it easy to stamp out spokes from a networking perspective and optionally audit and enforce with Azure Policy.

Looking at the effective routes of the VM in spoke 1 (image 3), you can see routes highlighted in red are routes being propagated into the spoke by my VPN Gateway which is receiving them from my pfSense appliance. To ensure traffic between on-premises and Azure flows through a security appliance you’d need to define all of the routes you’re propagating over your ExpressRoute/VPN in your NVA. This way ARS propagates them to the peered VNets and overrides the ones coming in from the gateway.

Image 3 – Spoke 1 Effective Routes

Recall from my lab image (image 1), I’m sending 192.168.0.1/24, 192.168.2.0/24, 192.168.3.0/24, and 192.168.4.0/24 over BGP to Azure. In the image above (image 3) you’ll notice three out of four of these routes are coming from my VPN Gateway (10.0.3.5), but 192.168.0.1/24 is coming from Exabgp VM (10.0.0.4). The documentation says that when two routes for the same address space but different AS PATH lengths are received from different NVAs, ARS will only program the route with the shorter AS PATH. Apparently this isn’t just a trait of ARS and is a trait of the Azure SDN.

To validate this is the case, I edited my Exabgp config file and appended a longer AS PATH length to the route as seen below.

Exabgp config with modified AS PATH

After about a minute, the route table on my spoke 1 VM now displays the route for 192.168.0.1 coming from the VPN Gateway because the route coming from my Exabgp now has a longer AS PATH. This indicates that when two routes for the same address space are advertised, the route with the shortest AS PATH gets programmed to the VNet while the route with the longer AS PATH is seemingly discarded. I would have expected to see the 192.168.0.1/24 coming from on-premises with an Active value of Invalid, but instead it’s discarded.

Image 4 – VPN Gateway with shorter AS PATH

The next route I want to look at it is 10.1.0.0/16 which is the address space of spoke 1. I configured the Exabgp VM to advertise this route to ARS. Looking at the effective routes for the VM in the hub (VM-DEMO) the only route for this address space is the system route for the peering. This is because system routes for the VNet, VNet peering, or service endpoints are the preferred routes, even if BGP routes are more specific. This means you can’t use ARS to push routes that would force traffic from a spoke to flow through an NVA in the hub because of the VNet peering system route. For that use case it looks like you will need to continue to use UDRs (user defined routes)

I have the default route of 0.0.0.0/0 which I am advertising from the Exabgp VM. You can see in the above images is propagating to both the hub and spoke. Take note that if you advertise this route, you’re going to want to ensure you place a route table and UDR on the subnet your NVAs are in. Otherwise you’ll run into a scenario where you’ll have a routing loop because the default route would be received by the NVAs subnet.

Let’s pull the VPN Gateway into the mix. ARS does support BGP peering with an ExpressRoute or VPN Gateway. This allows you to propagate the routes ARS is learning from the NVA back on-premises. You enable this functionality by enabling the Branch-to-branch feature of ARS. In my lab environment, I commented out the default route in my Exabgp conf file because I didn’t want to send that back on-premises because I don’t know how to filter out routes in FRRoute (the BGP service running on pfSense). I then ran an az network vnet-gateway list-learned-routes and confirmed my VNG is now receiving routes from ARS as seen in the image below (image 5).

Image 5 – VPN Gateway learned routes

On the pfSense side I was now able to see the routes coming from my NVA coming over the VPN Gateway and back my on-premises appliance. These are the routes with an AS PATH of 65510 65515 65010 in the image below (image 6).

Image 6 – pfSense learned routes

The above shows that the routes coming from the NVA are being propagated back on-premises. What about the routes coming from on-premises? Are they being propagated all the way to the NVA? Let’s check it out!

To verify this I printed the routes ARS is advertising to the NVA about using az network routeserver peering list-advertised-routes. The routes in bold are the routes coming from the pfSense appliance showing that ARS is receiving the routes and advertising them to the Exabgp VM.

RouteServiceRole_IN_0:
- asPath: '65515'
  localAddress: 10.0.2.4
  network: 10.0.0.0/16
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: '65515'
  localAddress: 10.0.2.4
  network: 10.1.0.0/16
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.4.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.2.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0
- asPath: 65515-65510-65501
  localAddress: 10.0.2.4
  network: 192.168.3.0/24
  nextHop: 10.0.2.4
  origin: Igp
  weight: 0

Lastly, I wanted to see the routes and details on the Exabpg VM. For that I ran the command below on the VM.

sudo exabgpcli show adj-rib in extensive
Image 7 – Exabgp learned routes

In the image above (image 7) the on-premises routes for the 192.168.0 address spaces have been learned and have the AS PATH leading back on-premises. Note the community value of 65517:65517 being attached to the routes. That is not coming from my pfSense appliance but rather the VPN Gateway. The community tag is used to identify these routes as coming from a VPN Gateway and are used by Microsoft for filtering.

Well folks that about wraps it up. The key takeaways of my time with ARS were the following:

  • ARS requires the UseRemoteGateway and AllowGatewayTransit properties be configured on VNet peering. This makes managing traffic flow between an spoke and on-premises a bit more complicated. Instead of defining a 0.0.0.0/0 route and calling it a day, you would need to define all the routes in your NVA and propagate them using ARS.

    This isn’t necessarily a bad thing, you’re just shifting management of routing outside the Azure control plane and into the NVA data plane. That may be your preference.
  • When multiple routes for the same address space and different length AS PATHs are propagated to a VNet, the Azure SDN will only program the route the shortest AS PATH. The other route with the longer AS PATH is discarded.
  • You can’t use an NVA and ARS to propagate the hub and spoke VNet address spaces back on-premises because system routes for the VNet, VNet peering, and Service Endpoints supersede routes propagated via BGP. Instead, you could use a larger summarized route encompassing the entire address space for the VNets in your region and propagate that back on-premises using the NVA and ARS.
  • You can use an NVA and ARS to propagate routes from Azure back to on-premises. A use case might be that you want all Internet-bound traffic to egress out of Azure because you get better performance than you’re getting from your ISP on-premises. I’ve never seen it, but nevertheless. 🙂

Hopefully some of the above helps you become more familiar with Azure Route Server and what the benefits and considerations are.

See you next post!

Force Tunneling Azure Firewall to pfSense – Part 2

Force Tunneling Azure Firewall to pfSense – Part 2

Welcome back to my series on forced tunneling Azure Firewall using pfSense.  In my last post I covered the background of the problem I wanted to solve, the lab makeup I’m using, and the process to setup the S2S (site-to-site) VPN with pfSense and exchange of routes over BGP.  Take a few read through that post before jumping into this one.

At this point you should a working S2S VPN from your Azure VNet to your pfSense router and the two should be exchanging a few routes over BGP.  If you didn’t complete all the steps in the first post, go back and do them now.

Now that connectivity is established, it’s time to incorporate Azure Firewall.  Azure Firewall was introduced back in 2018 as a managed stateful firewall that can act as an alternative to rolling your own NVAs (network virtual appliances) like a Palo Alto or Checkpoint firewall.  Now I’m not going to lie to you and tell you it has all the bells and whistles that a 3rd party NVA has, but it can provide a reasonable alternative depending on what your needs are.  The major benefit is it’s a managed service to Microsoft owns the responsibility of managing the health of the service, its high availability and failover,  it’s closely integrated with the Azure platform, more than likely cheaper than what you’d pay for a 3rd-party NVA license.

Recently, Microsoft has introduced support for forced tunneling into public preview.  This provides you with the ability to send all of the traffic received by Azure Firewall on to another security stack that may exist within Azure, on-premises, or in another cloud. It helps to address some of the capability gaps such as lack of support for (DPI) deep packet inspection for Internet-bound traffic.  You can leverage Azure Firewall to transitively route and mediate traffic between on-premises and Azure, hub-spoke, and spoke to spoke while passing Internet bound traffic on to another security stack with DPI capabilities.

With that out of the way, let’s continue with the lab.

The first thing you’ll want to do is to deploy an instance of Azure Firewall.  To support forced tunneling, you’ll need to toggle the option to enabled.  You then need to provide another public IP address.  What’s happening here is the nodes are being created with two NICs (network interface cards).  One NIC will live in the AzureFirewallSubnet and one will live in the AzureFirewallManagementSubnet.  Traffic dedicated to Microsoft’s management of the nodes will go out to the Internet (but remains on Microsoft’s backbone) through the NIC in the AzureFirewallManagementSubnet.  Traffic from your VMs will exist the NIC in the AzureFirewallSubnet.  This split also means you can now attach a UDR (user defined route) to the AzureFirewallSubnet to route that traffic to your own security stack.

azfwsetup

The Azure Firewall instance will take about 10-20 minutes to provision.  While you’re waiting you need to prepare the Virtual Network Gateway for forced tunneling.

Now if you go Googling, you’re going to come across this Microsoft article which describes setting a GatewayDefaultSite for the VPN Gateway.  While you can do it this way and you opt for an active/active both on-premises and for the VPN Gateway configuration, you’ll need to need to flip this setting to the other local network gateway (your other router) in the event of a failover.

As an alternative solution you can propagate a default route via BGP from your on-premises router into Azure.  ECMP will be used by default and will spread the traffic across all available tunnels.  If one of your on-premises routers goes down, traffic will still be able to flow back on-premises without requiring you to fail anything over on the Azure end.  Note that if you want make one of your routers preferred, you’ll have to try your luck with AS Path Prepending.

For this lab scenario, I opted to broadcast a default route via BGP.  My OpenBGPD config file is pictured below.  Notice I’ve added a default route to be propagated.

openbgpd-config

Hopping over to Azure and enumerating the effective routes shows the new routes being propagated into the VNet via the VPN Gateway.

vnetroutes

With this configuration, all traffic without a more specific route (like all our Internet traffic) will be routed back to the VPN Gateway.  Since this lab calls for this traffic to be sent to Azure Firewall first, you’ll need to configure a UDR (user defined route).  As described in this link, when multiple routes exist for the same prefix, Azure picks from UDRs first, then BGP, and finally system routes.

For this you’re going to need to set up three route tables.

One routing table will be applied to the primary subnet the VM is living in.  This will contain a UDR for the default route (0.0.0.0/0) with a next hop type of Virtual appliance and next hop address of the Azure Firewall instance’s NIC in the AzureFirewallSubnet.  By order of

udrprimary

The second routing table will be applied to the AzureFirewallSubnet.  This will contain a UDR for the default route with a next hop of the Virtual network gateway.  This forces Azure Firewall to pipe all the VM traffic bound for the networks outside the VNet to the Virtual Network Gateway which will then tunnel it through the VPN tunnel.

routefirewall

Last but not least, you have an optional route table you can add.  This route table will be applied to the AzureFirewallManagementSubnet and will be configured with Virtual Network Gateway route propagation disabled.  It will have a single UDR with a default route and next hop type of Internet.  The reason I like adding this route table is it avoids the risk of someone propagating a default route from on-premises.  If this route were to be propagated to the AzureFirewallManagementSubnet, the management plane would see it down and may deallocate the instance.

routemgmt

The last thing you need to do in Azure is create a rule in Azure Firewall to allow traffic to the web.  For this I created a very simple application rule allowing all HTTP and HTTPS traffic to any domain.

azfirewallrule

 

At this point the Azure end of the configuration is complete.  We now need to hop over to pfSense and finish that configuration.

Remember back in the last post when I had you configure the phase 2 entry with a local network of 0.0.0.0/0?  That was the traffic selector which allows traffic destined for any network from the VNet to flow through our VPN tunnel.

Now you have a requirement to NAT traffic from the VNet out the WAN interface on the pfSense box.  For that you have to navigate to the Firewall drop-down menu and choose the NAT menu item.  From there you’ll navigate to the Outbound option and ensure your Outbound NAT Mode is set to Hybrid Outbound NAT rule generation since we’ll continue to leverage the automatic rules pfSense creates as well as this new custom rule.

Add a new mapping by clicking the Add button.  For this you’ll want to configure it as seen in the screenshot below.  Once complete save the new rule and new mappings.

nat

Last but not least, we need to open flows within the pfSense firewall to allow the traffic to go out to the Internet over HTTP and HTTPS as seen below.

pfsensefw

You’re done!  Now time to test the configuration.  For this you’ll want to RDP into your VM, open up a web browser, and try to hit a website.

google

Excellent, so you made it out to the web, but how do you know you were force tunneled through?  Simple!  Just hit a website like https://whatismyipaddress.com and validate the IP returned is the IP associated with your pfSense WAN interface.

One thing to note is that if you deallocate and reallocate your Azure Firewall or delete and recreate your Azure Firewall after everything is in place, you may run into an issue where forced tunneling doesn’t seem to work.  All you need to do is bring down the VPN tunnel and bring it back up again.  There is some type of dependency there, but what that is, I don’t know.

Well that’s it folks.  Hope you enjoyed the series and got some value out of it.  Azure Firewall is a solid alternative to a self-managed NVA.  Sure you don’t get all the bells and whistles, but you get key capabilities such as transitive routing and features that build on NSGs such as filtering traffic via FQDN, centralized rule management, and centralized logging of what’s being allowed and denied through your network.  As an added bonus, you can always leverage the forced tunneling feature you learned about today to tunnel traffic to a security stack which can perform features Azure Firewall can’t such as deep packet inspection.

Stay healthy!

 

 

Force Tunneling Azure Firewall to pfSense – Part 1

Force Tunneling Azure Firewall to pfSense – Part 1

The Problem

Welcome back fellow geeks!  I hope you all are staying healthy and not going too stir crazy being stuck at home.  I’m here tonight to help break the monotony and walk you through a fun lab I recently put together.

I recently had a customer building out a sandbox environment for experimentation in Microsoft Azure.  For this environment the customer opted to setup a S2S VPN (site-to-site virtual private network) to establish connectivity between their on-premises data center and Azure.  The customer had requirements to use BGP (border gateway protocol) to exchange routes between on-premises and Azure.  Additionally, their security team required all Internet-bound traffic be piped back on-premises (force tunneling) through a set of security appliances before being egressed out to the Internet from their data center.

While I’ve setup connectivity with Azure in the past using an S2S VPN, it was with a policy-based VPN vs a route-based VPN that utilized BGP.  I’ve also worked with a lot of customers that had requirements for forced tunneling, but never got involved much in the implementation.  My customers typically use Microsoft ExpressRoute for connectivity with on-premises and a third-party NVA (network virtual appliance) like a Palo Alto or Imperva.  Since I’m not cool enough to have a lab with ExpressRoute and I’m too cheap to pay for an NVA, I’ve never had a chance to do the implementation myself.   This has meant relying on documentation and other folks within Microsoft that have had that experience.

Beyond the implementation gap in that pattern, I also have gaps in my BGP skill set.  While I’ve been lucky enough to play with a lot different technologies over the course of my career, enterprise routing was one area I never got to dive deep in.  Over my time at Microsoft and AWS, I’ve had to learn the concepts of the protocol and how to use it within the public cloud, but still have lacked any practical implementation experience.

If you know me, you know I hate not being able to implement the technologies I speak with customers about.  Hence, this blog post was born.  I’ll be walking you through the lab I built to address the gaps in my BGP and get some practical experience force tunneling traffic.  Enough with my blabbing, let’s get into it.

Lab Environment

Lab Environment

The complete lab setup I used is illustrated above.  In my home lab I’m using the 192.168.100.0/24 address range and have assigned the .1 address to the pfSense interface.  Another interface on the device has been configured for DHCP to receive a public IP address from my ISP.  Within Azure I’ve setup a single VNet (Virtual Network) assigned the address block of 10.0.0.0/16.  Within the VNet I’ve create five subnets each using a /24 block of address space (I’m terrible at subnetting).

Inside the GatewaySubnet I’ve provisioned a VPN VNG (Virtual Network Gateway) with the VpnGw2 SKU to support BGP.  The subnet named primary contains a single Windows Server 2016  VM (Virtual Machine) that I’ll be using to test the setup.  Azure Bastion sits in the Azure Bastion subnet providing me with remote access into the VM.

Finally, an Azure Firewall instance has been provisioned using the new forced tunneling feature in preview.  To support this feature, I’ve provisioned two subnets, one named AzureFirewallSubnet and one named AzureFirewallManagementSubnet  as well as two public IPs.  To route the traffic as needed, I’ve created three route tables with some user defined routes.

For this post I’m going to walk through the setup of the S2S VPN tunnel.  Anytime I can refer you to official documentation for a step-by-step process, I’ll include a hyperlink.  The steps that aren’t documented in a single place or documented at all will be the steps I’ll cover in detail.

The first thing you need to do is provision a VNet (Virtual Network).  The VNet must at least include a subnet named GatewaySubnet.  Microsoft requires this name for the subnet in order to deploy a VNG (Virtual Network Gateway).  You’ll additionally want to provision another subnet named whatever you want to hold the VM (virtual machine) to test connectivity with.  If you want to use Azure Bastion for remote access to the VM, you’ll need a third subnet which must be named AzureBastionSubnet.

While you’re twiddling your thumbs for 20 minutes waiting for the VNG, optional Bastion, and VM, you can create the local network gateway.  The local network gateway is a logical resource in Azure which represents your on-premises VPN appliance. To set this resource up you’ll need a few different items:

  • The public IP address in use by your VPN appliance
  • The BGP peer address you’ll be peering with Azure
  • The ASN (autonomous system number) you’re using on-premises

For this lab you’ll want to use a private ASN between 64512-65514 or 65521-65534.

Below is a screenshot of my configuration.  I included the entire address space I’m going to advertise, but if you’re using BGP you only need to include the addresses you’ll be using as BGP peer.

localgateway

Now that Azure is provisioning all your necessary resources, it’s a good time to bounce over to pfSense.  Note that pfSense doesn’t provide BGP support.  For that you’ll need to add the OpenBGPD package.  To do that you’ll navigate to the System drop down menu and choose Package Manger.  Search for BGP and install the OpenBGP package.  Once complete you’ll see it as an installed package as seen below.packagemanager

Once the VPN Gateway has been provisioned you can begin configuration of the connection.  The connection is also represented in Azure as a logical resource.  There isn’t much to configure when you create the connection through the Portal.  If you configure it through PowerShell, CLI, or an ARM template, you’ll have the flexibility to tweak the configuration of the tunnel.  This includes the ability to limit the encryption ciphers and hashing algorithms supported on the Azure end.  Once the connection is provisioned, open up the resource blade for it, go to the Configuration menu item in the Settings section and toggle BGP to Enabled.

connection

Before you bounce over to pfSense and configure that end, you’ll need a few pieces of information from the VPN Gateway.  Within the Portal open up the VNG resource blade.  Note the public IP address that has been assigned to the VNG.  You’ll need this for the pfSense setup.Next click the Configuration menu item in the Settings section.  Here you’ll want to check off the Configure BGP ASN check box and note the ASN (by default 65515) and the BGP peer IP address because you’ll need them later.  Click Save once you complete.  This change will take around 5 minutes.

bgp

It’s now time to hop over to pfSense.  From the main menu navigate to the VPN drop down menu and choose the IPsec option.  You’ll first need to create a IKE Phase 1 entry to establish the authentication for the tunnel.In the General Information section ensure the Key Exchange Version box is populated with IKEv2 and the Remote Gateway is populated with the public IP address of the VNG.  In the Phase 1 Proposal (Authentication) section, choose to the Mutual PSK (Pre-Shared Key) option, the My identifier is set to My IP Address and Peer identifier set to Peer IP address.  Plus in the PSK you setup in Azure.In the Phase 1 Proposal (Encryption Algorithm) section pick your preferred encryption algorithm, key length, hashing algorithm, and Diffie-Hellman Group.The Azure end supports a number of cryptographic combinations just be aware you’ll need to configure a custom IPSec Policy using the CLI, PowerShell, or ARM template if you pick a combination that isn’t offered by default.  I’m not sure what it supports by default because I couldn’t find any documentation on it.  It seems like you’ll be forced to use DHGroup2 if you create through the Azure Portal, which you really shouldn’t be using due the small key length.  If you want to nerd out a bit, take a read through this document.  I wanted to bump this up to DHGroup24, so I opted to create the custom IPSec policy with the configuration below.

ipsecpol = New-AzIpsecPolicy -IkeEncryption AES256 -IkeIntegrity SHA256 -DhGroup Dhgroup24 -IpsecEncryption GCMAES256 -IpsecIntegrity GMAES256 -PfsGroup None -SALifeTimeSeconds 28800  

Next up you need to configure a Phase 2 entry which will control how traffic is carried across the tunnel.  Expand the Phase 1 entry you created and click the Add P2 button to add a phase 2 entry.  In the General Information section you’ll want to set the Local Network option to Network with an address of 0.0.0.0/0.  This will allow us to tunnel traffic to any address through the VPN tunnel which will support our use case for the forced tunneling we’ll create later on.  In the Remote Network section, set it to the CIDR block of the VNet.In the Phase 2 proposal configure the settings to support whatever encryption setup you’re using.  For my configuration, I set it up as seen in the screenshot below.

phase2
Once the phase 2 entry is configured, navigate to the Status drop-down menu and choose IPsec.  Click the Connect button and assuming you configured everything correctly, the status shift from Disconnected, to Connecting, and will end on Established as seen below.
ipsecstatus

Hurray, you have an established VPN tunnel.  Now it’s time to configure BGP.

Since you’ve already toggled the appropriate options in Azure to support BGP, it’s now time to configure it in pfSense.  You will first need to create a firewall rule to allow the BGP traffic to flow between Azure and the pfSense box.  To do this you’ll select the Firewall drop-down menu and choose the Rules option.  Create a new rule to allow TCP port 179 from the source of the Azure BGP peer IP you noted earlier to the pfSense interface IP for the network you’re connecting to Azure.

firewallrule1

Next you have to open the Services drop-down menu and choose OpenBGPD.In this section you have a few menu options, one which allows you to modify the raw config.  Like the idiot I am, I ignored the comment at the beginning of the raw config that says not to edit it.  After editing it, I was unable to configure using the menu options.  If you’re not an idiot like me, you should be able to configure it using the menus.  My working config is illustrated below.

bgpconfigOnce you have your Config set, save it and give it a minute.  The navigate to the Status section of the OpenBGPD service.  Scroll to the bottom and check out the OpenBGPD Neighbors section.  If you’ve misconfigured anything you’ll receive an error that the log file can’t be written (useful right?)

bgpstatus

Additionally when I check the effective routes for the network interface of the VM in Azure I can see the routes propagating into the VM’s subnet.

routes

You can validate your connectivity at this point in any number of ways.  I went the lazy route and used pfSense’s Test Port capability located in the Diagnostics drop-down menu.  Make sure that you open the appropriate rules in any NSGs between you and the VM.  Also consider the VM’s host firewall if you opt to use a non-standard port or protocol like ICMP.  If you opt to test from Azure back on-premises, make sure to open the appropriate firewall rules in the pfSense firewall for the IPSec interface.

connectiontest

With that you have a working S2S VPN complete with BGP exchange of routes.  That will wrap up this post.  In the next post I’ll walk through the configuration of forced tunneling with Azure Firewall.

Continue the journey in the second post.

pfsense + squid + Kerberos

pfsense + squid + Kerberos

Hi everyone,

I hope all of you had an enjoyable holiday. I spent my week off from work spending time with the family and catching up on some reading. One area I decided to spend some time reading up on is Microsoft’s Cloud App Security. For those unfamiliar with the solution, it’s Microsoft’s entry into the cloud access security broker (CASB) (or Cloud Security Gateway (CSG) if you’re a Forrester reader) market. If you haven’t heard of “CASB” or “CSG”, don’t worry too much. While the terminology is new, many of the collection of technologies encompassing a typical CASB or CSG are not new, simply used together in new and creative ways. For a quick intro, take a read through this article and follow up with some Forrester and Gartner research for a deeper dive.

Since I haven’t had much experience with a product specifically marketing itself as a CASB, I thought it would be a great opportunity to play around with Microsoft’s solution. A good first step for any organization to grasp the value of a CASB is to explore what’s happening within the organization outside the view of IT, or as the marketers love to call it, shadow IT. The ease of consuming cloud technologies such as software as a service (SaaS) applications has been both a blessing and a curse. The new technology has been wonderful in cutting IT costs, bringing the technology closer to the business, providing for shorter time to market for new features, and providing simpler integration paths for different applications and services. On the negative side, the ease of use of these solutions means an average employee is using far more of them than is officially sanctioned by IT. This can lead to issues like loss of critical data, non-compliance with policy, or multiple business groups within an organization subscribing to the same service resulting in redundant licensing costs.

Wouldn’t it be great to get visibility into that shadow IT? Since a majority of cloud solutions work over standard HTTP(S) the services are readily accessible to the user without the user having to request additional ports be opened on the firewall. This means it’s much more challenging to track who is using what and what they’re doing with those services. Many organizations attempt to control these types of solutions with a traditional forward web proxy. However, too much focus is put on blocking the “bad” sites instead of analyzing the overall patterns of usage of services. Microsoft’s Azure AD Cloud Discovery is a feature of Azure Active Directory that can be used in conjunction with Cloud App Security’s catalog of app to provide visibility into what’s being accessed as well as providing information as to the risks the services being accessed present to the organization.

To simulate a typical medium to large organization and get some good testing done with Cloud App Discovery, I’m going to add a forward web proxy to my home lab. As I’ve mentioned in previous blog entries I have a small form factor computer running pfsense which I use as my lab networking security appliance. Out of the box, it supports a base install of Squid which can be added and configured to act as a forward web proxy with minimal effort. It gets a bit more challenging when you want to add authentication to the proxy because the built-in options for the pfsense implementation are limited to local, LDAP, and RADIUS authentication. I want authentication so I can identify users connecting to the proxy and associate the web connections with specific users but I want to use Kerberos so I get that seamless single sign on experience.

Like many open source products, the documentation on how to setup Squid running on pfsense and using Kerberos authentication is pretty terrible. Searching the all-powerful Google presents lots of forum posts with people asking how to do it, pieces of answers that don’t make much sense, and some Wikis on how to configure Squid to use Kerberos on a standard server. Given the lack of good documentation, I thought it would be fun to work my way through it and compile a walkthrough. I’m issuing the standard disclaimer that this is intended for lab purposes only. If you’re trying to deploy pfsense and Squid in a production environment, do more reading and spend time doing it safely and securely.

I won’t be covering the basic setup of pfsense as there are plenty of guides out there and the process is simple for anyone with any experience in the network appliance realm. For this demonstration I’ll be running a box with pfsense 2.4.2 installed.

On to the walkthrough!

The first step in the process is to add the Squid package through the pfsense package manager UI.

squid1

On the Package Manager screen, select the Available Packages section and install the Squid package.  After the installation is complete, you’ll see Squid shown in the Installed Packages section.

squid2

Notice the package installed is a branch of the 3.5 release while the latest release available directly via Squid is 4.0.  It’s always fun to have the latest and greatest, but pfsense is an all-in-one solution so it comes with some sacrifices.  Let’s get some of the basic configuration settings done with.  Go to the Services menu, select the Squid Proxy Server menu item, and go the General section.  First up choose the interface you want Squid to be available for and specify a port for it to listen on.

squid3.png

Now check off the Allow Users on Interface unless you have a reason to limit it to certain subnets attached to the interface.  Additionally I’d recommend checking the Resolve DNS IPv4 First option.  I banged my head against the wall with a ton of issues with Squid when I turned on authentication and this option wasn’t set.  You can thank me for saving you hours of Google and trying other options.

squid4

Setup basic logging with the settings below.

squid5.png

Basic settings are complete and it’s a good time to test the proxy from a client machine to verify its base functionality.  You can do this by directing one of your client machines to use the proxy and attempting to access a website.

After you have verified functionality you’ll need to add support for SSH to the pfsense box since we’ll need to make some changes via the command shell.  For that you’ll want to navigate the System menu, select the Advanced menu item, and go to the Admin Access section.  Scroll down from there to the Secure Shell section and click the checkbox for Enable Secure Shell and set a the SSH port to the port of your choice.  I chose 50,000.

squid6

The Secure Shell Server is active, but the firewall blocks access to it across all interfaces by default.  You now need to create the appropriate firewall rule to allow access from devices behind the interface you wish to use to SSH to the box.  For me this is the interface that my lab devices connect to.  For this you’ll select Firewall from the top menu, select the Rules menu item, and select the appropriate interface from the menu items.  Once there, click the Add button to create a firewall rule allowing devices within the subnet to hit the router interface over the port you configured earlier as seen below.  The SSH listener will now be running and will be accessible from the designated interface.

squid7.png

Now you must configure DNS such that the pfsense box can resolve the Active Directory DNS namespace to perform Kerberos related activities.  You can go the easy route and make the Active Directory domain controller the primary DNS server for pfsense via the GUI.  However, I use pfsense as the primary DNS resolver for the lab environment and forward queries to Google’s DNS servers at 8.8.8.8.

In order to continue using with my preferred configuration, I needed to take a few additional steps.  First I needed to add a Domain Override to the DNS Resolver service on pfsense to ensure it doesn’t pass the query along to the external DNS server.  I did this by selecting Services from the main menu, selecting the DNS Resolver menu item, and going to the General Settings section.  I then scrolled down to the Domain Overrides section and added the appropriate override for my Active Directory DNS namespace as seen below.  Take note that you can’t go modifying the resolv.conf as you would in a normal Linux distro since pfsense will scrub any changes you make to the file each time it restarts its services.  Get used to this behavior, we’re going to see it a number of times through this blog entry and we’ll have to learn to work around that limitation (feature?).

squid8.png

Next up you’ll want to verify name resolution is working as intended and it can be tested by running a query from the pfsense box.  Go to the Diagnostics on the main menu, select the DNS Lookup item, and type in the hostname representing the Active Directory DNS namespace.  It should resolve to the entries representing domain controllers in your Active Directory domain.  Successful testing makes the DNS configuration complete.

squid9.png

On to the guts of the configuration.  Pfsense comes with the krb5 package installed so all you need to do is configure it.  For that you are going to need to access the command shell.  Open up your favorite SSH client and connect to the pfsense box as an administrative user.  Upon successful login you’ll see the menu below.

squid10.pngYou want to hit the command shell so choose option 8 and you will be dropped into the shell.

squid11.png

The first step is to configure the krb5 package to integrate with the Active Directory domain.  For that you’ll need to create a krb5.conf file.  Create a new a krb5.conf file in the /etc/ directory and populate it with the appropriate information.  I’ve included the content of my krb5.conf file as an example.

[libdefaults]
default_realm = JOURNEYOFTHEGEEK.LOCAL
dns_lookup_realm = false
dns_lookup_kdc = true
default_tgs_enctypes = aes128-cts-hmac-sha1-96
default_tkt_enctypes = aes128-cts-hmac-sha1-96
permitted_enctypes = aes128-cts-hmac-sha1-96

[realms]
JOURNEYOFTHEGEEK.LOCAL = {
kdc = jog-dc.journeyofthegeek.local
}

[domain_realm]
.journeyofthegeek.local = JOURNEYOFTHEGEEK.LOCAL
journeyofthegeek.local = JOURNEYOFTHEGEEK.LOCAL

[logging]
kdc = FILE:/var/log/kdc.log
Default = FILE:/var/log/krb5lib.log

Check out the MIT documentation on the options available to you in the krb5.conf. I made the choice to limit the encryption algorithms to AES128 for simplicity purposes, feel free to use something else if you wish. Once the settings are populated the file can be saved.

It’s time to test the Kerberos configuration.  You do that by using running kinit and authenticating as a valid user in the Active Directory domain.  If the configuration is correct klist will display the a valid Kerberos ticket granting ticket (TGT) for the user.

squid12.png

The system is now configured to interact with the Active Directory domain using Kerberos.  You now need to create a security principal in Active Directory to represent the Squid service.  Create a new user in Active Directory and name it whatever you wish, I used svc_squid for this lab.  Since I chose to use AES128, I had to select the account control option on the user account in Active Directory Users and Computers (ADUC) that the service supported AES128.  You can ignore that step if you chose not to force an encryption level.  Now a service principal name (SPN) for the service is needed to identify the service when a user attempts to authenticate to it.  For that you’ll need to open an elevated command prompt and use the setspn command.

squid13.png

Wonderful you have a security principal created and it includes the appropriate identifier.  You now need to create a keytab that the service can use to authenticate to Active Directory.  In comes ktpass.  From the same elevated command prompt run the command as seen below.

squid14.png

Pay attention to case sensitivity because it matters when we’re talking MIT Kerberos, which is Kerberos implementation pfsense is using.  The link I included above will explain the options.  I set the crypto option to AES128 to ensure the keytab aligns with the other options I’ve configured around encryption.

Next up you need to transfer the keytab to the pfsense box.  I used WinSCP to transfer the keytab to the pfsense box to the /usr/local/etc/squid/ directory.   The keytab is on the pfsense box but you need to tell Squid where the keytab is.  In a typical Squid implementation you’d define variable in the Squid startup script which would be consumed by the authentication helper.  However, this is another case where pfsense will overwrite any changes you make to the startup script.

In addition to being unable to modify the startup script to set, pfsense also overwrites any changes you make directly to the squid.conf file.  To get around this you’ll need to add the configuration options to the config file through the pfsense GUI.  From within the GUI go to the Services section of the main menu, select the Squid Proxy Server menu item, go to the General section, scroll down and hit the Advanced Options button and scroll to the Advanced Features section.  In the Custom Options (Before Auth) field, you’ll want to add the lines below.

squid15.png

The first four lines I’ve added here are called directives in Squid.  The first directive instructs Squid to use the negotiate_kerberos_auth authentication helper.  The options I’ve added to the helper set a few different configuration options for the helper.  The -k option allows me to direct Squid to the keytab file I added to the server which I couldn’t do with a variable in the startup script.  The -d option writes debug information for the helper to Squid cache.log and the -t option shuts off the replay cache for MIT Kerberos.   The second directive sets the child authentication processes to 1,000.  You’ll want to do some research on this directive if you’re moving this into a production environment.  I simply choose 1000 so I wouldn’t run any risk of getting my authentication requests queued for the purposes of this lab.  The third directive is set to on by default and should only be set to off if you run into issues with PUT/POST requests.

The fourth directive starts enforcing access controls within Squid.  Access controls within Squid are a bit weird.  The Squid wiki does a decent job of explaining how they work.  The short of what I’ve done in the fourth directive is create an access list called auth which will contain all users who successfully authenticate against Squid.  The next line denies users access to the http_access list if the user doesn’t below to the auth access line (blocking non-authenticated users).  The final line allows users who are in the auth list into the http_access list (allows authenticated users).

With that last amount of configuration, you’ve gotten pfsense and Squid configured for Kerberos authentication.  I’ll quickly demonstrate the what a successful implementation looks like.  For that I’m going to bounce over to a Windows 10 domain-joined machine with Chrome installed and configured to use the proxy server.  Navigating to Amazon displays the webpage with no authentication prompts and running a klist from a command prompt shows I have a Kerberos ticket for the proxy.

squid16.png

Going back into the pfsense GUI, going to the Services menu, selecting the Squid Proxy Server menu item and navigating to the Real Time section shows the access log displaying Rick Sanchez accessing Amazon and successful consumption of the Kerberos ticket in the Cache Log section.

squid17.png

In a future post I’ll dig a bit deeper into Azure AD Cloud Discovery and setup automatic forwarding of logs using the Microsoft collector.

Have a happy New Year!