Force Tunneling Azure Firewall to pfSense – Part 2


Welcome back to my series on forced tunneling Azure Firewall using pfSense.  In my last post I covered the background of the problem I wanted to solve, the lab makeup I’m using, and the process to set up the S2S (site-to-site) VPN with pfSense and the exchange of routes over BGP.  Take a few minutes to read through that post before jumping into this one.

At this point you should have a working S2S VPN from your Azure VNet to your pfSense router and the two should be exchanging a few routes over BGP.  If you didn’t complete all the steps in the first post, go back and do them now.

Now that connectivity is established, it’s time to incorporate Azure Firewall.  Azure Firewall was introduced back in 2018 as a managed stateful firewall that can act as an alternative to rolling your own NVAs (network virtual appliances) like a Palo Alto or Check Point firewall.  Now I’m not going to lie to you and tell you it has all the bells and whistles that a 3rd-party NVA has, but it can provide a reasonable alternative depending on what your needs are.  The major benefits are that it’s a managed service, so Microsoft owns the responsibility of managing the health of the service and its high availability and failover, it’s closely integrated with the Azure platform, and it’s more than likely cheaper than what you’d pay for a 3rd-party NVA license.

Recently, Microsoft introduced support for forced tunneling in public preview.  This provides you with the ability to send all of the traffic received by Azure Firewall on to another security stack that may exist within Azure, on-premises, or in another cloud.  It helps to address some of the capability gaps such as the lack of support for deep packet inspection (DPI) for Internet-bound traffic.  You can leverage Azure Firewall to transitively route and mediate traffic between on-premises and Azure, hub and spoke, and spoke to spoke while passing Internet-bound traffic on to another security stack with DPI capabilities.

With that out of the way, let’s continue with the lab.

The first thing you’ll want to do is to deploy an instance of Azure Firewall.  To support forced tunneling, you’ll need to toggle the option to enabled.  You then need to provide another public IP address.  What’s happening here is the nodes are being created with two NICs (network interface cards).  One NIC will live in the AzureFirewallSubnet and one will live in the AzureFirewallManagementSubnet.  Traffic dedicated to Microsoft’s management of the nodes will go out to the Internet (but remains on Microsoft’s backbone) through the NIC in the AzureFirewallManagementSubnet.  Traffic from your VMs will exit the NIC in the AzureFirewallSubnet.  This split also means you can now attach a UDR (user defined route) to the AzureFirewallSubnet to route that traffic to your own security stack.

azfwsetup

The Azure Firewall instance will take about 10-20 minutes to provision.  While you’re waiting you need to prepare the Virtual Network Gateway for forced tunneling.

Now if you go Googling, you’re going to come across this Microsoft article which describes setting a GatewayDefaultSite for the VPN Gateway.  You can do it this way, but if you opt for an active/active configuration both on-premises and for the VPN Gateway, you’ll need to flip this setting to the other local network gateway (your other router) in the event of a failover.

As an alternative solution you can propagate a default route via BGP from your on-premises router into Azure.  ECMP will be used by default and will spread the traffic across all available tunnels.  If one of your on-premises routers goes down, traffic will still be able to flow back on-premises without requiring you to fail anything over on the Azure end.  Note that if you want to make one of your routers preferred, you’ll have to try your luck with AS Path Prepending.
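
To make the ECMP and AS Path Prepending idea a bit more concrete, here is a minimal Python sketch.  It models only one step of BGP best-path selection (shortest AS path wins, ties form the ECMP set) and ignores every other attribute a real implementation considers.  The second router address (192.168.100.2) and the ASN 65001 are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    next_hop: str    # BGP peer that advertised the route
    as_path: tuple   # AS numbers on the path, nearest first


def best_paths(routes):
    """Return every route tied for the shortest AS path (the ECMP set)."""
    shortest = min(len(route.as_path) for route in routes)
    return [route for route in routes if len(route.as_path) == shortest]


# Two hypothetical on-premises routers advertise the same default route.
router_a = BgpRoute("0.0.0.0/0", "192.168.100.1", (65001,))
router_b = BgpRoute("0.0.0.0/0", "192.168.100.2", (65001,))
ecmp_set = best_paths([router_a, router_b])
print([route.next_hop for route in ecmp_set])

# Router B prepends its own ASN twice, so its path is longer and A is preferred.
router_b_prepended = BgpRoute("0.0.0.0/0", "192.168.100.2", (65001, 65001, 65001))
preferred = best_paths([router_a, router_b_prepended])
print([route.next_hop for route in preferred])
```

With equal AS paths both routers stay in the set, which is the situation where traffic gets spread across tunnels.  Once one router prepends, only the other remains.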

For this lab scenario, I opted to advertise a default route via BGP.  My OpenBGPD config file is pictured below.  Notice I’ve added a default route to be propagated.

openbgpd-config

Hopping over to Azure and enumerating the effective routes shows the new routes being propagated into the VNet via the VPN Gateway.

vnetroutes

With this configuration, all traffic without a more specific route (like all our Internet traffic) will be routed back to the VPN Gateway.  Since this lab calls for this traffic to be sent to Azure Firewall first, you’ll need to configure a UDR (user defined route).  As described in this link, when multiple routes exist for the same prefix, Azure picks from UDRs first, then BGP, and finally system routes.
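
The selection order described above can be sketched in a few lines of Python.  This is a simplified model, not Azure’s implementation: it applies longest prefix match first and then breaks ties by route source (UDR, then BGP, then system).  The firewall address 10.0.2.4 and the route list are assumptions for illustration.

```python
import ipaddress

# Lower number wins when two routes share the same prefix.
SOURCE_PRIORITY = {"user": 0, "bgp": 1, "system": 2}


def select_route(destination, routes):
    """Pick the route for a destination IP.

    routes is a list of (prefix, source, next_hop) tuples.
    Longest prefix match comes first; ties go to UDR, then BGP, then system.
    """
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), source, next_hop)
        for prefix, source, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (r[0].prefixlen, -SOURCE_PRIORITY[r[1]]))
    return best[1], best[2]


primary_subnet_routes = [
    ("10.0.0.0/16", "system", "VNet"),
    ("0.0.0.0/0", "system", "Internet"),
    ("0.0.0.0/0", "bgp", "VirtualNetworkGateway"),
    ("192.168.100.0/24", "bgp", "VirtualNetworkGateway"),
    ("0.0.0.0/0", "user", "VirtualAppliance 10.0.2.4"),  # hypothetical firewall IP
]

print(select_route("8.8.8.8", primary_subnet_routes))         # Internet-bound traffic
print(select_route("10.0.1.5", primary_subnet_routes))        # traffic inside the VNet
print(select_route("192.168.100.10", primary_subnet_routes))  # traffic to the home lab
```

Three routes match 0.0.0.0/0 for Internet-bound traffic, and the user defined route is the one that wins, which is why the UDR is needed to steer traffic to Azure Firewall ahead of the BGP-learned default route.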

For this you’re going to need to set up three route tables.

One routing table will be applied to the primary subnet the VM is living in.  This will contain a UDR for the default route (0.0.0.0/0) with a next hop type of Virtual appliance and next hop address of the Azure Firewall instance’s NIC in the AzureFirewallSubnet.  By order of precedence, this UDR wins over the default route learned via BGP.

udrprimary

The second routing table will be applied to the AzureFirewallSubnet.  This will contain a UDR for the default route with a next hop of the Virtual network gateway.  This forces Azure Firewall to pipe all the VM traffic bound for the networks outside the VNet to the Virtual Network Gateway which will then tunnel it through the VPN tunnel.

routefirewall

Last but not least, you have an optional route table you can add.  This route table will be applied to the AzureFirewallManagementSubnet and will be configured with Virtual Network Gateway route propagation disabled.  It will have a single UDR with a default route and next hop type of Internet.  The reason I like adding this route table is it avoids the risk of someone propagating a default route from on-premises.  If this route were to be propagated to the AzureFirewallManagementSubnet, the management plane would see it down and may deallocate the instance.

routemgmt

The last thing you need to do in Azure is create a rule in Azure Firewall to allow traffic to the web.  For this I created a very simple application rule allowing all HTTP and HTTPS traffic to any domain.

azfirewallrule

 

At this point the Azure end of the configuration is complete.  We now need to hop over to pfSense and finish that configuration.

Remember back in the last post when I had you configure the phase 2 entry with a local network of 0.0.0.0/0?  That was the traffic selector which allows traffic destined for any network from the VNet to flow through our VPN tunnel.
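
If the traffic selector idea is still fuzzy, here is a rough Python sketch of the check it represents.  It is a simplification of what IPsec actually does: a packet from the VNet fits the phase 2 entry when its source sits in the remote network (the VNet CIDR) and its destination sits in the local network.  The function name and the sample addresses are made up for illustration.

```python
import ipaddress


def matches_selector(vnet_source, destination,
                     local_network="0.0.0.0/0", remote_network="10.0.0.0/16"):
    """True when a packet from the VNet fits the phase 2 traffic selector."""
    src_ok = ipaddress.ip_address(vnet_source) in ipaddress.ip_network(remote_network)
    dst_ok = ipaddress.ip_address(destination) in ipaddress.ip_network(local_network)
    return src_ok and dst_ok


# With a local network of 0.0.0.0/0, Internet-bound traffic fits the selector.
print(matches_selector("10.0.1.4", "8.8.8.8"))

# With only the home lab range as the local network, it would not.
print(matches_selector("10.0.1.4", "8.8.8.8", local_network="192.168.100.0/24"))
```

This is why the 0.0.0.0/0 local network matters for forced tunneling: a narrower selector would only carry traffic destined for the lab network.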

Now you have a requirement to NAT traffic from the VNet out the WAN interface on the pfSense box.  For that you have to navigate to the Firewall drop-down menu and choose the NAT menu item.  From there you’ll navigate to the Outbound option and ensure your Outbound NAT Mode is set to Hybrid Outbound NAT rule generation since we’ll continue to leverage the automatic rules pfSense creates as well as this new custom rule.

Add a new mapping by clicking the Add button.  For this you’ll want to configure it as seen in the screenshot below.  Once complete save the new rule and new mappings.

nat

Last but not least, we need to open flows within the pfSense firewall to allow the traffic to go out to the Internet over HTTP and HTTPS as seen below.

pfsensefw

You’re done!  Now time to test the configuration.  For this you’ll want to RDP into your VM, open up a web browser, and try to hit a website.

google

Excellent, so you made it out to the web, but how do you know you were force tunneled through?  Simple!  Just hit a website like https://whatismyipaddress.com and validate the IP returned is the IP associated with your pfSense WAN interface.
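
If you’d rather script that check than eyeball it in a browser, a minimal Python sketch could look like the one below.  The lookup URL (api.ipify.org, a public service that returns your IP as plain text) is my assumption and not something used in the lab, and the addresses in the example are documentation-range placeholders rather than real lab values.

```python
import ipaddress
import urllib.request


def fetch_public_ip(url="https://api.ipify.org"):
    """Ask an external service which public IP our traffic appears to come from."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode().strip()


def is_force_tunneled(observed_ip, pfsense_wan_ip):
    """True when the egress IP seen by the Internet is the pfSense WAN IP."""
    return ipaddress.ip_address(observed_ip) == ipaddress.ip_address(pfsense_wan_ip)


# Placeholder addresses: the first pair matches, the second does not.
print(is_force_tunneled("203.0.113.10", "203.0.113.10"))
print(is_force_tunneled("198.51.100.7", "203.0.113.10"))
```

On the VM you would call is_force_tunneled(fetch_public_ip(), "your pfSense WAN IP").  Keep in mind the Azure Firewall application rule has to allow the lookup domain for the fetch to succeed.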

One thing to note is that if you deallocate and reallocate your Azure Firewall or delete and recreate your Azure Firewall after everything is in place, you may run into an issue where forced tunneling doesn’t seem to work.  All you need to do is bring down the VPN tunnel and bring it back up again.  There is some type of dependency there, but what that is, I don’t know.

Well that’s it folks.  Hope you enjoyed the series and got some value out of it.  Azure Firewall is a solid alternative to a self-managed NVA.  Sure you don’t get all the bells and whistles, but you get key capabilities such as transitive routing and features that build on NSGs such as filtering traffic via FQDN, centralized rule management, and centralized logging of what’s being allowed and denied through your network.  As an added bonus, you can always leverage the forced tunneling feature you learned about today to tunnel traffic to a security stack which can perform features Azure Firewall can’t such as deep packet inspection.

Stay healthy!

 

 

Force Tunneling Azure Firewall to pfSense – Part 1


The Problem

Welcome back fellow geeks!  I hope you all are staying healthy and not going too stir crazy being stuck at home.  I’m here tonight to help break the monotony and walk you through a fun lab I recently put together.

I recently had a customer building out a sandbox environment for experimentation in Microsoft Azure.  For this environment the customer opted to set up an S2S VPN (site-to-site virtual private network) to establish connectivity between their on-premises data center and Azure.  The customer had requirements to use BGP (border gateway protocol) to exchange routes between on-premises and Azure.  Additionally, their security team required all Internet-bound traffic be piped back on-premises (force tunneling) through a set of security appliances before being egressed out to the Internet from their data center.

While I’ve setup connectivity with Azure in the past using an S2S VPN, it was with a policy-based VPN vs a route-based VPN that utilized BGP.  I’ve also worked with a lot of customers that had requirements for forced tunneling, but never got involved much in the implementation.  My customers typically use Microsoft ExpressRoute for connectivity with on-premises and a third-party NVA (network virtual appliance) like a Palo Alto or Imperva.  Since I’m not cool enough to have a lab with ExpressRoute and I’m too cheap to pay for an NVA, I’ve never had a chance to do the implementation myself.   This has meant relying on documentation and other folks within Microsoft that have had that experience.

Beyond the implementation gap in that pattern, I also have gaps in my BGP skill set.  While I’ve been lucky enough to play with a lot of different technologies over the course of my career, enterprise routing was one area I never got to dive deep into.  Over my time at Microsoft and AWS, I’ve had to learn the concepts of the protocol and how to use it within the public cloud, but I still lacked any practical implementation experience.

If you know me, you know I hate not being able to implement the technologies I speak with customers about.  Hence, this blog post was born.  I’ll be walking you through the lab I built to address the gaps in my BGP and get some practical experience force tunneling traffic.  Enough with my blabbing, let’s get into it.

Lab Environment

Lab Environment

The complete lab setup I used is illustrated above.  In my home lab I’m using the 192.168.100.0/24 address range and have assigned the .1 address to the pfSense interface.  Another interface on the device has been configured for DHCP to receive a public IP address from my ISP.  Within Azure I’ve set up a single VNet (Virtual Network) assigned the address block of 10.0.0.0/16.  Within the VNet I’ve created five subnets each using a /24 block of address space (I’m terrible at subnetting).

Inside the GatewaySubnet I’ve provisioned a VPN VNG (Virtual Network Gateway) with the VpnGw2 SKU to support BGP.  The subnet named primary contains a single Windows Server 2016  VM (Virtual Machine) that I’ll be using to test the setup.  Azure Bastion sits in the Azure Bastion subnet providing me with remote access into the VM.

Finally, an Azure Firewall instance has been provisioned using the new forced tunneling feature in preview.  To support this feature, I’ve provisioned two subnets, one named AzureFirewallSubnet and one named AzureFirewallManagementSubnet  as well as two public IPs.  To route the traffic as needed, I’ve created three route tables with some user defined routes.
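
For anyone else who is terrible at subnetting, the address plan described above can be sketched with Python’s ipaddress module.  The subnet names come from the lab, but which /24 goes to which subnet is my assumption for illustration and may not match the diagram.

```python
import ipaddress

vnet = ipaddress.ip_network("10.0.0.0/16")

subnet_names = [
    "GatewaySubnet",
    "primary",
    "AzureBastionSubnet",
    "AzureFirewallSubnet",
    "AzureFirewallManagementSubnet",
]

# Pair each name with the next free /24 carved out of the VNet address space.
# zip() stops after five names, so only the first five /24 blocks are used.
plan = dict(zip(subnet_names, vnet.subnets(new_prefix=24)))

for name, block in plan.items():
    print(f"{name}: {block}")
```

A /16 holds 256 /24 blocks, so five subnets leave plenty of room for adding spokes or more workloads later.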

For this post I’m going to walk through the setup of the S2S VPN tunnel.  Anytime I can refer you to official documentation for a step-by-step process, I’ll include a hyperlink.  The steps that aren’t documented in a single place or documented at all will be the steps I’ll cover in detail.

The first thing you need to do is provision a VNet (Virtual Network).  The VNet must at least include a subnet named GatewaySubnet.  Microsoft requires this name for the subnet in order to deploy a VNG (Virtual Network Gateway).  You’ll additionally want to provision another subnet named whatever you want to hold the VM (virtual machine) to test connectivity with.  If you want to use Azure Bastion for remote access to the VM, you’ll need a third subnet which must be named AzureBastionSubnet.

While you’re twiddling your thumbs for 20 minutes waiting for the VNG, optional Bastion, and VM, you can create the local network gateway.  The local network gateway is a logical resource in Azure which represents your on-premises VPN appliance. To set this resource up you’ll need a few different items:

  • The public IP address in use by your VPN appliance
  • The BGP peer address you’ll be peering with Azure
  • The ASN (autonomous system number) you’re using on-premises

For this lab you’ll want to use a private ASN between 64512-65514 or 65521-65534.
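
A tiny Python helper makes the allowed ranges above easy to check before you commit to an ASN.  The function name is made up, and the ranges are simply the ones stated in this post.

```python
# Private ASN ranges this post suggests for the on-premises side.
ALLOWED_PRIVATE_ASN_RANGES = [(64512, 65514), (65521, 65534)]


def is_usable_private_asn(asn):
    """True when the ASN falls inside one of the ranges listed above."""
    return any(low <= asn <= high for low, high in ALLOWED_PRIVATE_ASN_RANGES)


print(is_usable_private_asn(65001))  # inside the first range
print(is_usable_private_asn(65515))  # the default ASN on the Azure side, so not usable
```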

Below is a screenshot of my configuration.  I included the entire address space I’m going to advertise, but if you’re using BGP you only need to include the addresses you’ll be using as BGP peers.

localgateway

Now that Azure is provisioning all your necessary resources, it’s a good time to bounce over to pfSense.  Note that pfSense doesn’t provide BGP support out of the box.  For that you’ll need to add the OpenBGPD package.  To do that you’ll navigate to the System drop-down menu and choose Package Manager.  Search for BGP and install the OpenBGPD package.  Once complete you’ll see it as an installed package as seen below.

packagemanager

Once the VPN Gateway has been provisioned you can begin configuration of the connection.  The connection is also represented in Azure as a logical resource.  There isn’t much to configure when you create the connection through the Portal.  If you configure it through PowerShell, CLI, or an ARM template, you’ll have the flexibility to tweak the configuration of the tunnel.  This includes the ability to limit the encryption ciphers and hashing algorithms supported on the Azure end.  Once the connection is provisioned, open up the resource blade for it, go to the Configuration menu item in the Settings section and toggle BGP to Enabled.

connection

Before you bounce over to pfSense and configure that end, you’ll need a few pieces of information from the VPN Gateway.  Within the Portal open up the VNG resource blade.  Note the public IP address that has been assigned to the VNG.  You’ll need this for the pfSense setup.

Next click the Configuration menu item in the Settings section.  Here you’ll want to check off the Configure BGP ASN check box and note the ASN (by default 65515) and the BGP peer IP address because you’ll need them later.  Click Save once you’re done.  This change will take around 5 minutes.

bgp

It’s now time to hop over to pfSense.  From the main menu navigate to the VPN drop-down menu and choose the IPsec option.  You’ll first need to create an IKE Phase 1 entry to establish the authentication for the tunnel.

In the General Information section ensure the Key Exchange Version box is populated with IKEv2 and the Remote Gateway is populated with the public IP address of the VNG.  In the Phase 1 Proposal (Authentication) section, choose the Mutual PSK (Pre-Shared Key) option, set My identifier to My IP Address, and set Peer identifier to Peer IP address.  Plug in the PSK you set up in Azure.

In the Phase 1 Proposal (Encryption Algorithm) section pick your preferred encryption algorithm, key length, hashing algorithm, and Diffie-Hellman Group.

The Azure end supports a number of cryptographic combinations; just be aware you’ll need to configure a custom IPsec policy using the CLI, PowerShell, or an ARM template if you pick a combination that isn’t offered by default.  I’m not sure what it supports by default because I couldn’t find any documentation on it.  It seems like you’ll be forced to use DHGroup2 if you create the connection through the Azure Portal, which you really shouldn’t be using due to the small key length.  If you want to nerd out a bit, take a read through this document.  I wanted to bump this up to DHGroup24, so I opted to create the custom IPsec policy with the configuration below.

$ipsecpol = New-AzIpsecPolicy -IkeEncryption AES256 -IkeIntegrity SHA256 -DhGroup DHGroup24 -IpsecEncryption GCMAES256 -IpsecIntegrity GCMAES256 -PfsGroup None -SALifeTimeSeconds 28800

Next up you need to configure a Phase 2 entry which will control how traffic is carried across the tunnel.  Expand the Phase 1 entry you created and click the Add P2 button to add a phase 2 entry.  In the General Information section you’ll want to set the Local Network option to Network with an address of 0.0.0.0/0.  This will allow us to tunnel traffic to any address through the VPN tunnel, which will support our use case for the forced tunneling we’ll create later on.  In the Remote Network section, set it to the CIDR block of the VNet.

In the Phase 2 proposal configure the settings to support whatever encryption setup you’re using.  For my configuration, I set it up as seen in the screenshot below.

phase2

Once the phase 2 entry is configured, navigate to the Status drop-down menu and choose IPsec.  Click the Connect button and, assuming you configured everything correctly, the status will shift from Disconnected to Connecting and will end on Established as seen below.

ipsecstatus

Hurray, you have an established VPN tunnel.  Now it’s time to configure BGP.

Since you’ve already toggled the appropriate options in Azure to support BGP, it’s now time to configure it in pfSense.  You will first need to create a firewall rule to allow the BGP traffic to flow between Azure and the pfSense box.  To do this you’ll select the Firewall drop-down menu and choose the Rules option.  Create a new rule to allow TCP port 179 from the source of the Azure BGP peer IP you noted earlier to the pfSense interface IP for the network you’re connecting to Azure.

firewallrule1

Next you have to open the Services drop-down menu and choose OpenBGPD.

In this section you have a few menu options, one of which allows you to modify the raw config.  Like the idiot I am, I ignored the comment at the beginning of the raw config that says not to edit it.  After editing it, I was unable to configure using the menu options.  If you’re not an idiot like me, you should be able to configure it using the menus.  My working config is illustrated below.

bgpconfig

Once you have your config set, save it and give it a minute.  Then navigate to the Status section of the OpenBGPD service.  Scroll to the bottom and check out the OpenBGPD Neighbors section.  If you’ve misconfigured anything you’ll receive an error that the log file can’t be written (useful, right?).

bgpstatus

Additionally when I check the effective routes for the network interface of the VM in Azure I can see the routes propagating into the VM’s subnet.

routes

You can validate your connectivity at this point in any number of ways.  I went the lazy route and used pfSense’s Test Port capability located in the Diagnostics drop-down menu.  Make sure that you open the appropriate rules in any NSGs between you and the VM.  Also consider the VM’s host firewall if you opt to use a non-standard port or protocol like ICMP.  If you opt to test from Azure back on-premises, make sure to open the appropriate firewall rules in the pfSense firewall for the IPSec interface.

connectiontest
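
If you want to run the same kind of check from a script, a TCP port test boils down to attempting a connection and seeing whether it completes.  Below is a minimal Python sketch of that idea.  It is not what pfSense runs under the hood, and the function name is made up.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Try a TCP connection and report whether the handshake completed."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("10.0.1.4", 3389) would check RDP on a VM at a hypothetical address in the primary subnet.  A False result only tells you the connection didn’t complete, so you still need to work out whether an NSG, the host firewall, or routing is to blame.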

With that you have a working S2S VPN complete with BGP exchange of routes.  That will wrap up this post.  In the next post I’ll walk through the configuration of forced tunneling with Azure Firewall.

Continue the journey in the second post.

Azure Private Link and DNS – Part 2


Hello again!

In this post I’ll be continuing my series on Azure Private Link and DNS.  In my last post I gave some background into Private Link, how it came to be, and what it offers.  For this post I’ll be diving into some DNS patterns you can use to support name resolution with Private Link Endpoints for Azure services.  I’ll be covering the six scenarios below:

  1. Default DNS pattern without Private Link Endpoint
  2. Azure Private DNS pattern with a single virtual network
  3. BYODNS (Bring your own DNS) in a hub and spoke architecture
  4. BYODNS with a custom DNS forwarder in a hub and spoke architecture
  5. BYODNS with the use of root hints in a hub and spoke architecture
  6. BYODNS with the use of a custom DNS zone hosted in the BYODNS in a hub and spoke architecture

Before I jump into the scenarios, I want to cover some basic (and not so basic) DNS concepts.  If you know nothing about DNS, I’d highly suggest you stop reading here and take a quick few minutes to read through this DNS 101 by RedHat.  If you’ve operated a DNS service in a large enterprise, you can skip this section and jump into the scenarios.  If you only know the basics, read through the below or else you may not get much out of this post.

  • A record – Translates a hostname to an IP address such as http://www.journeyofthegeek.com to 5.5.5.5
  • CNAME record – Alias record where you can point one FQDN (fully qualified domain name) to another, either to give a service a name a human can remember or to support high availability configurations
  •  Recursive Name Resolution – A DNS query where the client asks for a definitive answer to a query and relies on the upstream DNS server to resolve it to completion.  Forwarders such as Google DNS function as recursive resolvers.
  • Iterative Name Resolution – A DNS query where a client receives back a referral to another server in place of resolving the query to completion.  Querying root hints often involves iterative name resolution.
  • DNS Forwarder – Forward all queries the DNS service can’t resolve to an upstream DNS service.  These upstream servers are typically configured to perform recursive name resolution, but depending on your DNS service (such as Infoblox), you can configure it to request iterative name resolution.
  • Conditional Forwarder – Forward queries for a specific DNS namespace to an upstream DNS service for resolution.
  • Split-brain / Split Horizon DNS – A DNS configuration where a DNS namespace exists authoritatively across two or more DNS implementations.  A common use case is to have a single DNS namespace defined on Internet-resolvable public facing DNS servers and also on Intranet private facing DNS servers.  This allows trusted clients to reach the service via a private IP address and untrusted clients to reach the service via a public IP address.

If you can grasp the topics above, you’ll be in good shape for the rest of this post.

Scenario 1 – Default DNS Pattern Without Private Link Endpoint

Scenario 1


Before we jump into how DNS for Azure services works when Private Link Endpoint is introduced, let’s first look at how it works without it.  For this example, let’s look at a scenario where I’m using a VM (virtual machine) running in a VNet (virtual network) and am attempting to connect to an Azure SQL instance named db1.database.windows.net.  No Private Link Endpoint has been configured for the Azure SQL instance and the VNet is configured to use Azure-provided DNS and thus sends its DNS queries out the 168.63.129.16 virtual IP.  I explain how Azure-provided DNS works with the virtual IP in a prior blog post.  When I open SQL Server Management Studio and try to connect to db1.database.windows.net, my VM first needs to determine the IP address of the resource it needs to establish a TCP connection with.  For this it issues a DNS query to the Azure DNS service.

The FQDN (fully-qualified domain name) for your specific instance of an Azure service will more than likely have two or more CNAME records associated with it.  I don’t have any super secret information as to the official reasons behind these CNAMEs and can only theorize that they are used to orchestrate high availability of the service.  By using the CNAMEs Microsoft is able to provide you with a DNS record you can customize to your requirements and place in code.  Any failures in the backend require a simple modification of the alias the CNAME is pointing to without requiring changes to your code such as modifications to the connection string.

Since Azure DNS is a recursive DNS resolver, it handles resolving each of these records for you and returns the public IP address of your Azure SQL instance.  Your VM will then use this public IP address to setup a TCP connection and establish a connection to your database.

Scenario 2 – Azure Private DNS pattern with a single virtual network

Scenario 2


Now let’s cover how things change when we add a Private Link Endpoint and configure it to integrate with Azure Private DNS.  If you’re unfamiliar with how Azure Private DNS works take a read from my prior post on the topic.

In this scenario I’ve added a Private Link Endpoint for my Azure SQL instance.  I’ve configured the Endpoint to integrate with an Azure Private DNS zone named privatelink.database.windows.net and have linked the VNet to the Azure Private DNS zone.

Notice the changes to the records in Azure Public DNS.  The hostname for my Azure SQL instance now has a CNAME record with an alias defined for db1.privatelink.database.windows.net.  There is also a new CNAME record for db1.privatelink.database.windows.net which points to the same dataslice4.eastus2.database.windows.net record as we saw in the last scenario.  This is done for two reasons.  The first reason is it allows clients accessing the instance through a public IP to continue to do so because Microsoft has established a split-brain DNS configuration for the privatelink.database.windows.net zone.  The second reason is it allows Microsoft to work some magic in the backend (I have no idea how they’re doing it) that redirects queries originating from an Azure VNet that is linked to the Azure Private DNS zone to be resolved against the record in the Azure Private DNS zone.

This means that clients outside the linked Azure VNet will receive back the public IP address of the Azure SQL instance and clients within the Azure VNet linked to the Azure Private DNS zone will receive back the private IP address of Private Link Endpoint.
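
Here is a toy Python model of that split-brain behavior.  It is only a sketch of the outcome, since I don’t know how Microsoft implements it on the backend.  The hostnames come from this scenario, while both IP addresses are placeholders and the final record is simplified to a single A record.

```python
# Records as seen in Azure Public DNS (public IP is a placeholder).
PUBLIC_ZONE = {
    "db1.database.windows.net": ("CNAME", "db1.privatelink.database.windows.net"),
    "db1.privatelink.database.windows.net": ("CNAME", "dataslice4.eastus2.database.windows.net"),
    "dataslice4.eastus2.database.windows.net": ("A", "203.0.113.20"),
}

# Record in the Azure Private DNS zone (endpoint IP is a placeholder).
PRIVATE_ZONE = {
    "db1.privatelink.database.windows.net": ("A", "10.0.1.5"),
}


def resolve(name, vnet_linked_to_private_zone, max_hops=10):
    """Follow the CNAME chain, preferring the private zone for linked VNets."""
    for _ in range(max_hops):
        if vnet_linked_to_private_zone and name in PRIVATE_ZONE:
            record_type, value = PRIVATE_ZONE[name]
        else:
            record_type, value = PUBLIC_ZONE[name]
        if record_type == "A":
            return value
        name = value  # CNAME: keep chasing the alias
    raise RuntimeError("CNAME chain too long")


print(resolve("db1.database.windows.net", vnet_linked_to_private_zone=True))
print(resolve("db1.database.windows.net", vnet_linked_to_private_zone=False))
```

The same query name produces the private endpoint address for a client in the linked VNet and the public address for everyone else, and the client never has to change the name it connects to.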

Scenario 3 – BYODNS in a Hub and Spoke Architecture

Scenario 3


Scenarios 1 and 2 are important to understand, but the reality is very few organizations have such a simple DNS pattern for their Azure footprints.  Most enterprises using Azure will be using a hub and spoke architecture.  Shared services such as a DNS service (Windows DNS, Infoblox, BIND, whatever) are placed in the hub VNet and are shared among spoke VNets containing various workloads.  This DNS service will typically provide advanced features not provided by Azure Private DNS (at this time) such as conditional forwarders and DNS query logging.  You can check out my prior post on this pattern if you want to understand the details.

In the scenario below I’ve provisioned a DNS service in the hub VNet and configured it to forward all queries it can’t resolve to the 168.63.129.16 virtual IP.  Notice that I’ve now linked the Azure Private DNS zone to the hub VNet instead of the spoke VNet.  This is to ensure the DNS service can resolve the queries to this Azure Private DNS zone.  It also lets me take advantage of the advanced features of the DNS service such as those I discussed above.

The resolution with Azure-provided DNS occurs in the same manner as scenario 2 with the exception being that the DNS service performs the query and returns the results to the VM running in the spoke.

Scenario 4 – BYODNS With a Custom DNS Forwarder in a Hub and Spoke Architecture

Scenario 4


Next up we have a scenario similar to the above where we have a hub and spoke architecture and have the DNS service in the hub configured to forward all queries it can’t resolve to an upstream forwarder.  Maybe it’s to some on-premises DNS server, a 3rd party threat service, or simply Google’s DNS service.   Whatever the case, this scenario means we now have to care about recursive resolution and conditional forwarders.

If the upstream DNS service you’re using supports recursive name resolution and the DNS service you’re using in your hub is configured to send recursive queries to it, then any queries for db1.database.windows.net will resolve to the public IP address of the service.  The reason for this is with recursion you’re asking the upstream DNS service to chase down the answer for you and that upstream DNS service only knows about the public privatelink.database.windows.net DNS zone and does not have access to the Azure Private DNS zone.

To handle this scenario you’ll want to create a conditional forwarder for database.windows.net (or the recommended zone for the service you’re using) and point it to Azure-provided DNS via the 168.63.129.16 virtual IP.  This enables you to let the Azure platform handle the split-brain DNS challenge as it has been engineered to do.
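
The decision the hub DNS service makes for each query can be sketched in Python as follows.  This is a simplified model of conditional forwarding rather than any particular product’s behavior: the most specific matching zone wins, and everything else goes to the default upstream.  The function name is made up and 8.8.8.8 (Google DNS) stands in for whatever upstream forwarder you use.

```python
AZURE_DNS_VIP = "168.63.129.16"


def pick_upstream(qname, conditional_forwarders, default_forwarder):
    """Return the upstream for a query; the most specific matching zone wins."""
    labels = qname.rstrip(".").lower().split(".")
    # Walk from the full name down to the top-level domain.
    for i in range(len(labels)):
        zone = ".".join(labels[i:])
        if zone in conditional_forwarders:
            return conditional_forwarders[zone]
    return default_forwarder


forwarders = {"database.windows.net": AZURE_DNS_VIP}

print(pick_upstream("db1.database.windows.net", forwarders, "8.8.8.8"))
print(pick_upstream("www.journeyofthegeek.com", forwarders, "8.8.8.8"))
```

Only the Azure SQL query is steered to the virtual IP, so Azure-provided DNS gets the chance to answer from the Azure Private DNS zone while all other queries keep flowing to the upstream forwarder.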

Scenario 5 – BYODNS With The Use of Root Hints in a Hub and Spoke Architecture

Scenario 5


In scenario 5 we again have the same architecture as the prior scenarios with a few differences.  First off we are now sending iterative queries to the DNS Root Hints instead of an upstream forwarder.  This means our DNS service will chase the entirety of the resolution, requesting referrals back from each DNS server in the path to resolve the FQDN.  The usage of iterative queries gives us the option of creating a conditional forwarder (our second difference) to the 168.63.129.16 virtual IP for the privatelink.database.windows.net zone or optionally sending that query to some other DNS service we’re running in an on-premises data center or another cloud.

The key takeaway of this configuration is that using root hints puts a bigger burden on your DNS service because you are resolving a whole bunch more queries vs using an upstream DNS service like Azure DNS.   Additionally, if you opt to maintain your own DNS zone, it’s on you to figure out how to manage the whole lifecycle of the DNS records for the Private Link Endpoints.

Scenario 6 – BYODNS With The Use of a Custom DNS zone Hosted in The BYODNS In a Hub and Spoke Architecture

Scenario 6


The last scenario I’ll cover is the use of a custom DNS zone named something outside of the Microsoft recommended zones (more required than recommended) that is hosted in your BYODNS service.  Let me save you any pain and suffering by telling you this will not work.  You’re probably asking why it won’t work.  The answer to that question requires understanding how data is secured in transit to Azure services.

Since you surely don’t want your data flowing through a network in clear text, most Azure services will either require or support encryption of data in transit using TLS (Transport Layer Security).  If you’re not familiar with the TLS flow, you can get a reasonably good overview here.  The key thing you want to understand is that the TLS session is often established by using the certificate being served up by the Azure service.  In addition to confidentiality, it also authenticates the service to your client.

The authentication piece is what we care about here.  Without going too deep into the weeds, the certificate contains a property called the SAN (subject alternative name) which lists the identities of the services the certificate should be used for.  These identities are typically DNS names such as db1.database.windows.net.  If you create a custom DNS zone and attempt to access the Azure service through that name, you’ll run into a certificate mismatch error.  The error occurs because the DNS name you typed into your browser, or that was called by your library, doesn’t match any of the identities listed in the certificate.
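
The mismatch can be illustrated with a simplified version of the name check a TLS client performs.  This is only a sketch: the SAN list and the custom zone name are hypothetical, and real clients follow stricter rules than this function does.

```python
from typing import Iterable


def name_matches_san(hostname: str, san_entries: Iterable[str]) -> bool:
    """Simplified check of a hostname against a certificate's DNS SAN entries.

    Supports exact matches and a single leading wildcard label
    (for example *.database.windows.net), which covers exactly one label.
    """
    host = hostname.rstrip(".").lower()
    for entry in san_entries:
        pattern = entry.rstrip(".").lower()
        if pattern == host:
            return True
        if pattern.startswith("*."):
            # The wildcard stands in for exactly one left-most label.
            _, _, host_rest = host.partition(".")
            if host_rest and host_rest == pattern[2:]:
                return True
    return False


# Hypothetical SAN list, for illustration only.
san = ["*.database.windows.net"]
```

With that list, db1.database.windows.net passes the check while a name in a custom zone such as db1.mycustomzone.com does not, which is the mismatch described above.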

cert

Yes I know there are ways to get around this by ignoring certificate mismatches (terrible security decision) or doing something funky like overriding database.windows.net (this is against Microsoft recommendations) with your own zone.  Don’t do this.  If you want the service to support this type of functionality, submit a feedback request.

Now if anyone is aware of a way to get around this limitation that is supported and not insane, I’d definitely be interested in hearing about it.

Before I conclude this series I want to provide one more gotcha.  Take note that while Private Link Endpoints can be integrated with Azure Private DNS and the records can be automatically created, they do not share the full lifecycle.  This means that if you delete a private link endpoint and create a new one for the same resource, the NIC (network interface) associated with the endpoint may get a new IP.  This will cause your queries to fail because they will resolve to the prior IP.  You will need to manually clean up the A record hosted in the Azure Private DNS zone before creating the new endpoint.
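
If you want to catch this before it bites you, the check boils down to comparing what the zone holds against what the endpoints actually use.  The sketch below assumes you have already exported both sets of data into plain dictionaries; the function name and data shapes are made up and it does not call any Azure API.

```python
from typing import Dict, List


def find_stale_records(zone_records: Dict[str, str],
                       endpoint_ips: Dict[str, str]) -> List[str]:
    """Return record names whose A record no longer matches a live endpoint.

    zone_records: record name -> IP currently held in the private DNS zone.
    endpoint_ips: record name -> IP of the endpoint NIC that exists today.
    """
    stale = []
    for name, ip in sorted(zone_records.items()):
        # A missing endpoint or a different IP both mean the record is stale.
        if endpoint_ips.get(name) != ip:
            stale.append(name)
    return stale
```

Any name this returns is a candidate for the manual A record cleanup.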

Well folks that wraps it up.  Hopefully you found this information helpful and it cleared up some of the mystery of DNS patterns with Private Link Endpoints.  I want to plug a stellar write-up by Dan Mauser, who is one of the networking all stars over at Microsoft.  He wrote up an incredibly detailed post on this topic which covers the topic more exhaustively than I did above.

Thanks!

Azure Private Link and DNS – Part 1

Hi there fellow geeks!

Azure Private Link is becoming a frequent topic of discussion among peers and my customers.  One of the often discussed topics is how to handle DNS with Private Link Endpoints.  I spent the past few days deep diving into the documentation and doing some labbing to better understand what the patterns and gotchas were.  There seemed to be enough value to the findings to share it with you all.

Before I dive into the guts of Private Link Endpoints, I want to spend a post walking through how Private Link came to be.

Last September Microsoft released the Azure Private Link service.  One of the primary drivers behind the introduction of the service was to address the customer demand for secure and private connectivity to Azure services such as Azure SQL and Azure Storage as well as third-party services.  Azure PaaS services used to be accessible only via public IP addresses, which required a path out to the Internet.  From a network security perspective, your only option was to use the firewall feature built into many of the services to filter the IPs allowed to communicate with the service.  While technically feasible, there had to be something better.

The first attempt at something better was Service Endpoints, which started to be introduced into general availability in February 2018.  For you AWS folk, the Service Endpoints are probably closest to VPC Gateway Endpoints.  Service Endpoints attempted to improve the experience of accessing the services from a VNet (virtual network) by providing a direct route for resources in the VNet to Azure services in order to optimize routing.  To mitigate the risk of the service being accessible over a public IP, Service Endpoints also added an identity to the VNet.  This allowed customers to expand the context of the filtering being done by the service firewall beyond IP to the identity of the VNet containing resources that need to access the relevant service.

Service Endpoints

While Service Endpoints made some great improvements there was more work to be done.  Service Endpoints did nothing to mitigate the risk of data exfiltration.  If an attacker was able to compromise a VM (virtual machine) in your VNet, that attacker could use that optimized route to their advantage piping whatever data they were able to get access to out to an attacker controlled instance of the resource such as an Azure Storage Account.  Service Endpoint policies were then introduced to help address this risk.

Well that’s great and all, but Service Endpoints did nothing to address accessing Azure services from outside the VNet such as from an on-premises data center or another public cloud.  Customers were still stuck accessing the services over the Internet or using an ExpressRoute with Microsoft Peering.  Wouldn’t it be great if there was a service with all of those features?

In comes Azure Private Link to the rescue.  Azure Private Link includes the concept of an Azure Private Link Service and Private Link Endpoint.  Those of you coming from AWS, yeah, I’ll let you guess which AWS service this is like :-).  I won’t be covering Private Link Services in this series beyond saying it’s a way to build your own third-party services and make them directly accessible from a customer VNet.  Instead we’ll keep our focus on Private Link Endpoints, specifically in the context of Microsoft-provided services.

The Private Link service introduces three new features that seek to address the gaps Service Endpoints left open while keeping the features from Service Endpoints that were beneficial.  These features are:

  • Private access to services running on the Azure platform through the provisioning of a virtual network interface within the customer VNet that is assigned one of the VNet IP addresses from the RFC1918 address space.
  • Makes the services accessible over private IP space to resources running outside of Azure such as machines running in an on-premises data center or virtual machines running in other clouds.
  • Protects against data exfiltration by the endpoint providing access to only a specific instance of a PaaS service.

Azure Private Link

As you can see from the above, the service solves a lot of problems and is going to be a necessary component of any Azure footprint.  Now when it comes to design and implementation, there are some options as to how you use DNS to resolve the name of the service resource being exposed by the endpoint to the private IP address of the Private Link Endpoint.  This is what I’ll be focusing on for this series.

In the next post I’ll walk you through what happens within Azure DNS when you create a Private Link endpoint, some patterns you can use for DNS resolution, and some of the gotchas.

The series is continued in my second post.

Deep Dive into Azure AD and AWS SSO Integration – Part 5

I’m back yet again with the fifth entry into my series on integrating Azure AD and AWS SSO.  It’s been a journey and the series has covered a lot of ground.  It started with outlining the challenge with the initial integration of Azure AD and AWS using the AWS app in the Azure Marketplace.  From there it took a deep dive into the components of the solution and how it compares to a standard integration using your SAML provider of choice.  It continued with the steps necessary to configure Azure AD and AWS SSO to support the federated trust to enable single sign-on.  The fourth post explored the benefits of SCIM and went step by step on how to configure SCIM between the two services.  For this final post I’m going to cover a few different scenarios to demonstrate what’s possible with this new integration.

Before I jump into the scenarios, there is one final task that needs to be completed now that the federated trust and SCIM have been set up.  That task is setting up the permission sets in AWS SSO.  Permission sets are simply IAM policies (either AWS-managed or custom policies you create).  For those of you from the Microsoft Azure world, an IAM policy is a collection of permissions which define what a security principal (such as a user or role) is authorized to do.  They are most similar to an Azure RBAC role definition but more flexible and granular due to advanced features such as condition keys.  Permission sets are projected into the AWS accounts they are assigned to as AWS IAM roles.  These are the IAM roles the security principal assumes.

As I mentioned above, AWS SSO supports both AWS-managed IAM policies and custom IAM policies for permission sets.  If you go into the AWS Accounts menu option of AWS SSO you’ll see the accounts associated with the AWS Organization and which permission sets have been associated to the AWS accounts thus resulting in AWS IAM Roles being created within the AWS account.  In the image below you can see that I’ve provisioned two permission sets to account1 and account2.

accountassignments.png

The permission sets tab displays the permission sets I’ve created and whether or not they’ve been provisioned to any accounts.  In the screenshot below you’ll see I’ve added four AWS-managed policies for Billing, SecurityAudit, AdministratorAccess, and NetworkAdministrator.  Additionally, I created a new permission set named SystemsAdmin which uses a custom IAM policy that restricts the principal assuming the role to EC2, CloudWatch, and ELB activities.

permissionsets.png
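
For a sense of what a custom policy behind a permission set like SystemsAdmin could look like, here is a rough sketch.  It is not the policy I used.  The service prefixes are real IAM namespaces, but the wildcard actions, the Sid, and the variable names are assumptions for illustration.

```python
import json

# Hypothetical custom policy for a SystemsAdmin-style permission set.
# Wildcard actions are an assumption; scope them down for real use.
systems_admin_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOpsServices",
            "Effect": "Allow",
            "Action": [
                "ec2:*",
                "cloudwatch:*",
                "elasticloadbalancing:*",
            ],
            "Resource": "*",
        }
    ],
}

# Serialize to the JSON document form that IAM policies use.
policy_json = json.dumps(systems_admin_policy, indent=2)
print(policy_json)
```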

Back on the AWS organization tab, if you click on an account you can see the AWS SSO Users or Groups that have been assigned to a permission set.  In the image below, you can see that I’ve assigned both the B2B Security Admins group and the Security Admins group to the AdministratorAccess permission set and the System Operators group to the SystemsAdmin permission set.

assignments.png

With permission sets out of the way, let’s jump into the scenarios.

Scenario 1 – Windows AD User, AD FS, Azure AD, AWS SSO

scenario1.PNG

In this scenario the user is Bart Simpson who is a member of the System Operators group on-premises and exists authoritatively in a Windows AD forest.  A federated trust has been established with Azure AD using an instance of AD FS running on-premises. Azure AD has been integrated with AWS SSO for both SSO (via SAML) and provisioning (via SCIM).

Once Bart was logged into a domain-joined machine, I popped open a browser and navigated to the My Apps portal at https://myapps.microsoft.com.  This redirected me to the Azure AD login screen.  Here I entered Bart’s user name.

bartazuread.PNG

Azure AD performed its home realm discovery process, identified that the domain jogcloud.com is configured for federated authentication, and redirected me to AD FS.  Take note I purposely broke IWA (Integrated Windows Authentication) here to show you each step.  In a correctly configured browser, you wouldn’t see this screen.

bartadfs.PNG

After I successfully authenticated to AD FS, I was bounced back over to Azure AD where the assertion was delivered.  Azure AD then whipped up a SAML assertion for AWS SSO, returned it to the browser, and redirected the browser to the AWS SSO assertion consumer URL.  AWS SSO consumed the assertion and authenticated Bart into AWS SSO displaying the AWS IAM Role selection page with the relevant roles he has permission to access.

bartawssso.PNG

Scenario 2 – Windows AD User, AD FS with Certificate MFA, Azure AD with Conditional Access, AWS SSO

scenario2.PNG

Scenario 1 is pretty simple, so let’s get fancy and layer on some security.  Here I added an access control policy into AD FS requiring certificate-based authentication for members of the Security Admins group.  Additionally, I added a conditional access policy in Azure AD requiring MFA for any user that is a member of that same group.

Since Homer Simpson regularly runs a nuclear reactor, he’s also the Security Admin for JOGCLOUD.  He has been made a member of the Windows AD Security Admins group.

As a first step I again popped open a browser and navigated to the My Apps portal.  After Homer’s username was plugged in, Azure AD redirected me to the AD FS server.  I again broke IWA to capture each step in the process.

signin2

After the password challenge was satisfied, I was prompted to provide the appropriate user certificate.

signin3.PNG

From there I was authenticated to Azure AD and served up the My Apps portal.

myapps.PNG

Wondering why I wasn’t prompted for Azure MFA?  No, I didn’t misconfigure it (at least this time).  A not well documented feature (at least in my opinion) of Azure AD is that you can pass a claim asserting a user has satisfied the MFA requirement thus making for a better user experience because the user isn’t required to authenticate multiple times.  Yes folks, this means you can layer your traditional certificate-based authentication on top of Azure AD and AWS. 

mfaonprem.png

After selecting the AWS SSO app, I was signed into AWS SSO and presented with the role selection screen.

awsssosignin1.PNG

I then selected one of the roles and was signed into the relevant AWS account assuming the AdministratorAccess IAM Role.

awsssosignin2

Scenario 3 – Azure AD B2B User, AWS SSO

scenario3.PNG

What if you have a multi-tenant situation due to an acquisition or merger, or perhaps you farm out operations to a managed service provider?  No worries there, B2B is also supported with this pattern.  In this scenario I’m using a user who is sourced from another tenant and has been invited via Azure AD’s B2B.  The user has been added to the B2B Security Admins group which exists authoritatively in the inviting tenant (jogcloud.com) and was synchronized to AWS SSO via SCIM.

Opening a browser and navigating to the My Apps portal kicks off Azure AD authentication and drops the user into their source tenant.  Once there I can change my tenant by selecting the profile icon and selecting the jogcloud tenant.

myappsmultiple.png

I’m then presented with the apps that I’m authorized to use in the jogcloud tenant, which includes the AWS SSO app.

guestmyapp.PNG

Azure AD kicks off the federated authentication and I’m presented with the AWS role selection page where I can choose to assume the AdministratorAccess role in two of the AWS accounts.

guestawsso.png

Scenario 4 – AWS CLI

I know what you’re saying now, “But what about CLI?”  Well folks, for that you can leverage the AWS CLI v2.  It’s still in preview right now, but I did test it using the user from scenario 2 and it worked flawlessly.  The experience is pretty anticlimactic so I’m not going to dive into it.  The user experience is similar to using the Azure PowerShell cmdlets in that a web browser instance is opened and guides you through the authentication process.

That will sum up this series.

Few technologies get me excited enough to write five posts, but this integration is really amazing.  With AWS hooking into Azure AD as effectively as they have (especially love the CLI integration), it reduces operational overhead and improves security which is a combination you rarely see together.  Most importantly, it puts the customer first by optimizing the user experience.  If you weren’t convinced on Azure AD’s capabilities as an IDaaS, hopefully this series has helped educate you as to the value of the platform.

With that I’ll sign off.  A big thanks to the AWS product team that worked on this integration.  You did an amazing job that will greatly benefit our mutual customers.

To the rest of you, I wish you happy holidays!

Deep Dive into Azure AD and AWS SSO Integration – Part 4

Today we continue exploring the new integration between Microsoft’s Azure AD (Azure Active Directory) and AWS (Amazon Web Services) SSO (Single Sign-On).  Over the past three posts I’ve covered the high level concepts of both platforms, the challenges the integration seeks to solve, and how to enable the federated trust which facilitates the single sign-on experience.  If you haven’t read through those posts, I recommend you do so before you dive into this one.  In this post I’ll be covering the neatest feature of the new integration, which is the support for automated provisioning.

If you’ve ever worked in the identity realm before, you know the pains that come with managing the life cycle of an identity, from initial provisioning, to changes to the identity such as department and position changes, to the often forgotten stage of de-provisioning.  On-premises these problems were usually solved by cobbled-together scripts or complex identity management solutions such as SailPoint Identity IQ or Microsoft Identity Manager.  While these tools were challenging to implement and operate, they did their job in the world of Windows Active Directory, LDAP, SQL databases and the like.

Then came cloud, and all bets were off.  Identity data stores skyrocketed from fewer than a hundred to hundreds and sometimes thousands (B2C has exploded far beyond even that).  Each new cloud service introduced into the enterprise introduced yet another identity management challenge.  While some of these offerings have APIs that support identity management operations, most do not, and those that do are proprietary in nature.  Writing custom code to each of the APIs is a huge challenge that most enterprises can’t keep up with.  The result is often manual management of an identity life cycle, through uploading exported CSV files or some poor soul pointing and clicking a thousand times in a vendor portal.

Wouldn’t it be great if there was some mythical standard out there that would help to solve this problem, use a standard API through REST, and support the JSON format?  Turns out there is and that standard is SCIM (System for Cross-domain Identity Management).  You may be surprised to know the standard has been around for a while now (technically 2011).  I recall hearing about it at a Gartner conference many many years ago.  Unfortunately, it’s taken a long time to catch on but support is steadily increasing.

Thankfully for us, Microsoft has baked support into Azure AD and AWS recognized the value and took advantage of the feature.  By doing this, the identity life cycle challenges of managing an Azure AD and AWS integration have been heavily remediated and our lives made easier.

Azure AD Provisioning - Example

Let’s take a look at how to set it up, shall we?

The first place you’ll need to go is into the AWS account which is the master for the organization and into the AWS SSO Settings.  In Settings you’ll see the provisioning option which is initially set as manual.  Select to enable automatic provisioning.

AWS SSO Settings - Provisioning

Once complete, a SCIM endpoint will be created.  This is the endpoint in AWS (referred to as the SCIM service provider in the SCIM standard) that the SCIM service on Azure AD (referred to as the client in the SCIM standard) will interact with to search for, create, modify, and delete AWS users and groups.  To interact with this endpoint, Azure AD must authenticate to it, which it does with a bearer access token that is issued by AWS SSO.  Be aware that the access token has a one-year life span, so ensure you set some type of reminder.  A quick search through the boto3 API doesn’t show a way to query for issued access tokens (yes, you can issue more than one at a time), so you won’t be able to automate the process as of yet.
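
To give a feel for what the client side of that conversation looks like, here is a minimal sketch that builds (but never sends) a SCIM user lookup with a bearer token.  The endpoint URL and token are placeholders and the helper name is made up.  The filter syntax and the application/scim+json media type come from the SCIM protocol specification; whether AWS SSO requires that Accept header is an assumption on my part.

```python
import urllib.parse
import urllib.request

# Placeholders: substitute the SCIM endpoint URL and access token shown in
# the AWS SSO console. Both values here are made up.
SCIM_ENDPOINT = "https://scim.example.invalid/tenant-id/scim/v2"
ACCESS_TOKEN = "replace-with-token-from-aws-sso"


def build_user_lookup(user_name: str) -> urllib.request.Request:
    """Build (but do not send) a SCIM query for a user by userName."""
    query = urllib.parse.urlencode({"filter": f'userName eq "{user_name}"'})
    return urllib.request.Request(
        url=f"{SCIM_ENDPOINT}/Users?{query}",
        method="GET",
        headers={
            # The bearer token authenticates the SCIM client to the endpoint.
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Accept": "application/scim+json",
        },
    )


request = build_user_lookup("bart.simpson@jogcloud.com")
```

Nothing leaves your machine here; the point is only to show where the URL and the token end up in the request.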

awssso-scimendpoint.png

After SCIM is enabled, AWS SSO Settings for provisioning now reports SCIM in use.

awssso-scimenabled.png

Next you’ll need to bounce over to Azure AD and go into the enterprise app you created (refer to my third post for this process).   There you’ll navigate to the Provisioning blade and select Automatic as the provisioning method.

azuread-scimprov.png

You’ll then need to configure the URL and access token you collected from AWS and test the connection.  This will cause Azure AD to test querying the endpoint for a random user and group to validate functionality.

azuread-scimtest.png

If your test is successful you can then save the settings.

azuread-scimtestsucccess.PNG

You’re not done yet.  Next you have to configure the mappings which map attributes in Azure AD to the resources and attributes in the SCIM schema.  Yes folks, SCIM does have a schema for attributes and resources (like users and groups).  You can extend it as needed, but in this integration it looks to be using the default user and group resources.

azuread-scimmappings

Let’s take a look at what the group mappings look like.

azuread-scimgroupmappings.PNG

The attribute names on the left are the names of the attributes in Azure AD and the attributes on the right are the names of the attributes Azure AD will write the values of the attributes to in AWS SSO.  Nothing too surprising here.

How about the user mappings?

azuread-scimusermappings1

azuread-scimusermappings2

Lots more attributes in the user mappings by default.  Now I’m not sure how many of these attributes AWS SSO supports.  According to the SCIM standard, a client can attempt to write whatever it wants and any attributes the service provider doesn’t understand are simply discarded.  The best list of attributes I could find was located here, and it’s nowhere near this number.  I can’t speak to what the minimum required attributes are to make AWS work, because their official instructions on this integration don’t say.  I know some of the product team sometimes reads the blog, so maybe we’ll luck out and someone will respond with that answer.

The one tweak you’ll need to make here is to delete the mailNickName mapping and replace it with a mapping of objectId to externalId.  After you make the change, click the save icon.
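
Conceptually, the mapping table is just a projection from Azure AD attribute names to SCIM attribute names.  The sketch below uses a made-up three-entry subset of the mappings to show the effect of the externalId tweak.  The schema URN is the SCIM core user schema; everything else, including the function name, is illustrative and says nothing about how Azure AD implements it internally.

```python
from typing import Any, Dict

# Hypothetical subset of the mappings: Azure AD attribute -> SCIM attribute.
USER_MAPPINGS: Dict[str, str] = {
    "userPrincipalName": "userName",
    "displayName": "displayName",
    "objectId": "externalId",  # replaces the default mailNickName mapping
}


def to_scim_user(aad_user: Dict[str, Any]) -> Dict[str, Any]:
    """Project an Azure AD user record onto a SCIM core User resource."""
    resource: Dict[str, Any] = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    }
    for source_attr, target_attr in USER_MAPPINGS.items():
        # Only mapped attributes are written; unmapped ones are ignored.
        if source_attr in aad_user:
            resource[target_attr] = aad_user[source_attr]
    return resource
```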

I don’t know why AWS requires this so I can only theorize.  Maybe they’re using this attribute as a primary key in the back end database or perhaps they’re using it to map the users to the groups?  I’m not sure how Azure AD is writing the members attribute over to AWS.  Maybe in the future I’ll throw together a basic app to visualize what the service provider end looks like.

newmapping.PNG

Now you need to decide what users and groups you want to sync to AWS SSO.  Towards the bottom of the provisioning blade, you’ll see the option to toggle the provisioning status.  The scope drop down box has an option to sync all users and groups or to sync only assigned users and groups.  Best practice here is basic security: only sync what you need to sync, so leave the option on sync only assigned users and groups.

The assigned users and groups refers to users that have been assigned to the enterprise application in Azure AD.  This is configured on the Users and Groups blade for the enterprise app.  I tested a few different scenarios using an Azure AD dynamic group, standard group, and a group synchronized from Windows AD.  All worked successfully and synchronized the relevant users over.

Once you’re happy with your settings, toggle the provisioning status and save the changes.  It may take some time depending on how much you’re syncing.

syncsuccess.PNG

If the sync is successful, you’ll be able to hop back over to AWS SSO and you’ll see your users and groups.

awssyncedusers

awssyncedgroup

Microsoft’s official documentation does a great job explaining the end to end cycle.  The short of it is there’s an initial cycle which grabs all users and groups from Azure AD, then filters the list down to the users and groups assigned to the application.  From there it queries the target system to match the user with the matching attribute and if it isn’t found creates it, and if found and needs updating, updates it.
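
As a mental model of that cycle, here is a simplified sketch of the filter, match, create, and update logic.  It is my own approximation for illustration, not Microsoft’s implementation, and the function and field names are made up.

```python
from typing import Dict, List, Set, Tuple

User = Dict[str, str]


def plan_sync(source_users: List[User],
              assigned_ids: Set[str],
              target_users: Dict[str, User]) -> List[Tuple[str, str]]:
    """Return (action, id) pairs for one simplified provisioning cycle.

    source_users: every user read from the source directory.
    assigned_ids: ids of users assigned to the application (the scope filter).
    target_users: users already in the target system, keyed by matching id.
    """
    plan = []
    for user in source_users:
        uid = user["id"]
        if uid not in assigned_ids:
            continue  # filtered out: not assigned to the app
        existing = target_users.get(uid)
        if existing is None:
            plan.append(("create", uid))
        elif existing != user:
            plan.append(("update", uid))
        # identical records need no action
    return plan
```

A user that is out of scope is skipped, a user missing from the target is created, and a user whose attributes differ is updated.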

Incremental cycles are run from that point forward every 40 minutes.  I couldn’t find any documentation on how to adjust the synchronization frequency.  Be aware of that 40 minute sync and consider the end to end synchronization if you’re sourcing from Windows Active Directory.  In that case changes made in Windows AD could take just over an hour (assuming you’re using the 30 minute sync interval in Azure AD Connect) to fully synchronize.
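
The “just over an hour” figure is simply the two intervals stacked on top of each other in the worst case, where a change lands right after each cycle has started.  A quick sketch of the arithmetic, ignoring the time each cycle takes to run:

```python
from datetime import timedelta

# Intervals quoted in the post; both are worst-case waits.
AAD_CONNECT_INTERVAL = timedelta(minutes=30)   # Windows AD -> Azure AD
SCIM_INTERVAL = timedelta(minutes=40)          # Azure AD -> AWS SSO

worst_case = AAD_CONNECT_INTERVAL + SCIM_INTERVAL
print(worst_case)  # prints 1:10:00
```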

awsssotime.PNG

As I described in my third post, I have a lab environment setup where a Windows Active Directory domain is syncing to Azure AD.  I used that environment to play out a few scenarios.

In the first scenario I disabled Marge Simpson’s account.  After waiting some time for changes to synchronize across both platforms, I saw in AWS SSO that Marge Simpson was now disabled.

margedisabled.PNG

For another scenario, I removed Barney Gumble from the Network Operators Active Directory group.  After waiting for the sync to complete, the Network Operators group was empty, reflecting Barney’s removal from the group.

networkoperators.PNG

Recall that I assigned four groups to the app in Azure AD, Network Operators, Security Admins, Security Auditors, and Systems Operators.  These are the four groups syncing to AWS SSO.  Barney Gumble was only a member of the Network Operators group, which means removing him put him out of scope for the app assignment.  In AWS SSO, he now reports as being disabled.

barneydisabled.PNG

For our final scenario, let’s look at what happens when I deleted Barney Gumble from Windows Active Directory.  After waiting the required replication time, Barney Gumble’s user account was still present in AWS SSO, but set as disabled.  While Barney wouldn’t be able to login to AWS SSO, there would still be cleanup that would need to happen on the AWS SSO directory to remove the stale identity records.

barneydisabled.PNG

The last thing I want to cover is the logging capabilities of the SCIM service in Azure AD.  There are two separate logs you can reference.  The first are the Provisioning Logs which are currently in preview.  These logs are going to be your go-to for troubleshooting issues with the provisioning process.  They’re available with an Azure AD P1 or above license and are kept for 30 days.  Supposedly they’re kept for free for 7 days, but the documentation isn’t clear whether or not you have the ability to consume them.  I also couldn’t find any documentation on whether it’s possible to pull the logs from an API for longer term retention or analysis in Log Analytics or a 3rd party logging solution.

If you’ve ever used Azure AD, you’ll be familiar with the second source of logs.  In the Azure AD Audit logs, you get additional information, which while useful, is more catered to tracking the process vs troubleshooting the process like the provisioning logs.

Before I wrap up, let’s cover a few key findings:

  • The access token used to access the SCIM endpoint in AWS SSO has a one year lifetime.  There doesn’t seem to be a way to query what tokens have been issued by AWS SSO at this time, so you’ll need to manage the life cycle in another manner until the capability is introduced.
  • Users that are removed from the scope of the sync, either by unassigning them from the app or deleting their user object, become disabled in AWS SSO.  The records will need to be cleaned up via another process.
  • If synchronizing changes from a Windows AD the end to end synchronization process can take over an hour (30 minutes from Windows AD to Azure AD and 40 minutes from Azure AD to AWS SSO).

That will wrap up this post.  In my opinion the SCIM service available in Azure AD is extremely under utilized.  SCIM is a great specification that needs more love.  While there is a growing adoption from large enterprise software vendors, there is a real opportunity for your organization to take advantage of the features it offers in the same way AWS has.  It can greatly ease the pain your customers and enterprise users experience having to manage the life cycle of an identity and makes for a nice belt and suspenders to modern identity capabilities in an application.

In the last post of my series I’ll demonstrate a few scenarios showing how simple the end to end experience is for users.  I’ll include some examples of how you can incorporate some of the advanced security features of Azure AD to help protect your multi-cloud experience.

See you next post!

Deep Dive into Azure AD and AWS SSO Integration – Part 3

Back for more are you?

Over the past few posts I’ve been covering the new integration between Azure AD and AWS SSO.  The first post covered high level concepts of both platforms and some of the problems with the initial integration which used the AWS app in the Azure Marketplace.  In the second post I provided a deep dive into the traditional integration with AWS using a non-Azure AD security token service like AD FS (Active Directory Federation Services), what the challenges were, how the new integration between Azure AD and AWS SSO addresses those challenges, and the components that make up both the traditional and the new solution.  If you haven’t read the prior posts, I highly recommend you at least read through the second post.

Azure AD and AWS SSO Integration

New Azure AD and AWS SSO Integration

In this post I’m going to get my hands dirty and step through the implementation steps to establish the SAML trust between the two platforms.  I’ve set up a fairly simple lab environment in Azure.  The lab environment consists of a single VNet (virtual network) with four virtual machines with the following functions:

  • dc1 – Windows Active Directory domain controller for jogcloud.com domain
  • adcs – Active Directory Certificate Services
  • aadc1 – Azure Active Directory Connect (AADC)
  • adfs1 – Active Directory Federation Services

AADC has been configured to synchronize to the jogcloud.com Azure Active Directory tenant.  I’ve configured federated authentication in Azure AD with the AD FS server acting as an identity provider and Windows Active Directory as the credential services provider.

visio of lab environment

Lab Environment

On the AWS side I have three AWS accounts setup associated with an AWS Organization.  AWS SSO has not yet been setup in the master account.

Let’s set it up, shall we?

The first thing you’ll need to do is log into the AWS Organization master account with an account with appropriate permissions to enable AWS SSO for the organization.  If you’ve never enabled AWS SSO before, you’ll be greeted by the following screen.

1.png

Click the Enable AWS SSO button and let the magic happen in the background.  That magic is provisioning of a service-linked role for AWS SSO in each AWS account in the organization.  This role has a set of permissions which include the permission to write to the AWS IAM instance in the child account.  This is used to push the permission sets configured in AWS SSO to IAM roles in the accounts.

Screenshot of AWS SSO IAM Role

AWS SSO Service-Linked IAM Role

After about a minute (this could differ depending on how many AWS accounts you have associated with your organization), AWS SSO is enabled and you’re redirected to the page below.

Screenshot of AWS SSO successfully enabled page

AWS SSO Successfully Enabled

Now that AWS SSO has been configured, it’s time to hop over to the Azure Portal.  You’ll need to log into the portal as a user with sufficient permissions to register new enterprise applications.  Once logged in, go into the Azure Active Directory blade and select the Enterprise Applications option.

Register new Enterprise Application

Once the new blade opens select the New Application option.

Register new application

Choose the Non-gallery application option since we don’t want to use the AWS app in the Azure Marketplace due to the issues I covered in the first post.

Choose Non-gallery application

Name the application whatever you want; I went with AWS SSO to keep it simple.  The registration process will take a minute or two.

Registering application

Once the process is complete, you’ll want to open the new application, go to the Single sign-on menu item, and select the SAML option.  This is the menu where you will configure the federated trust between your Azure AD tenant and AWS SSO on the Azure AD end.

SAML Configuration Menu

At this point you need to collect the federation metadata containing all the information necessary to register Azure AD with AWS SSO.  To make it easy, Azure AD provides you with a link to directly download the metadata.

Download federation metadata
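
If you want to see what is actually inside the identity provider metadata before handing it to AWS SSO, a few lines of Python will do it.  This is only a minimal sketch: the sample XML below is a trimmed, hypothetical stand-in for the real download (the `idp.example.com` URLs, the all-zero tenant ID, and the certificate text are placeholders), and `summarize_idp_metadata` is a helper name made up for this illustration.  It pulls out the three things a service provider typically needs from the IdP: the entity ID, the single sign-on endpoint, and the signing certificate.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata and XML Signature namespaces.
NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Hypothetical, trimmed-down IdP metadata. All values are placeholders.
SAMPLE_IDP_METADATA = """\
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
                  entityID="https://idp.example.com/00000000-0000-0000-0000-000000000000/">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <KeyDescriptor use="signing">
      <KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
        <X509Data><X509Certificate>PLACEHOLDER-BASE64-CERT</X509Certificate></X509Data>
      </KeyInfo>
    </KeyDescriptor>
    <SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
        Location="https://idp.example.com/00000000-0000-0000-0000-000000000000/saml2"/>
  </IDPSSODescriptor>
</EntityDescriptor>
"""


def summarize_idp_metadata(xml_text):
    """Return the entity ID, SSO endpoints, and signing cert count from IdP metadata."""
    root = ET.fromstring(xml_text)
    idp = root.find("md:IDPSSODescriptor", NS)
    if idp is None:
        raise ValueError("metadata has no IDPSSODescriptor element")

    # Collect certificates only from key descriptors marked for signing.
    signing_certs = []
    for key_descriptor in idp.findall("md:KeyDescriptor", NS):
        if key_descriptor.get("use") == "signing":
            signing_certs.extend(key_descriptor.findall(".//ds:X509Certificate", NS))

    endpoints = [
        (el.get("Binding"), el.get("Location"))
        for el in idp.findall("md:SingleSignOnService", NS)
    ]
    return {
        "entity_id": root.get("entityID"),
        "sso_endpoints": endpoints,
        "signing_cert_count": len(signing_certs),
    }


info = summarize_idp_metadata(SAMPLE_IDP_METADATA)
print(info["entity_id"])
for binding, location in info["sso_endpoints"]:
    print(binding, "->", location)
print("signing certificates:", info["signing_cert_count"])
```

To try it against the real file, read the downloaded XML into a string and pass that to the function instead of the sample.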

Now that the new application is registered in Azure AD and you’ve gotten a copy of the federation metadata, you need to hop back over to AWS SSO.  Here you’ll need to go to Settings.  In the settings menu you can adjust the identity source, authentication, and provisioning methods for AWS SSO.  By default AWS SSO is set to use its own local directory as an identity source and itself for the other two options.

AWS SSO Settings

Next up, select the Change option next to the identity source.  As seen in the screenshot below, AWS SSO can use its own local directory, an instance of Managed AD or BYOAD using the AD Connector, or an external identity provider (the new option).  Selecting the External Identity Provider option opens up the option to configure a SAML trust with AWS SSO.

Like any good authentication expert, you know that you need to configure the federated trust on both the identity provider and service provider.  To do this we need the federation metadata from AWS SSO, which AWS has been lovely enough to provide via a simple download link.  Use it to grab a copy of the metadata, which we’ll import into Azure AD later.

Now you’ll need to upload the federation metadata you downloaded from Azure AD in the Identity provider metadata section.  This establishes the trust in AWS SSO for assertions created from Azure AD.  Click the Next: Review button and complete the process.

AWS SSO Identity Sources

Configure SAML trust

You’ll be asked to confirm changing the identity source.  There are a few key points I want to call out in the confirmation page.

  • AWS SSO will preserve your existing users and assignments -> If you have created AWS SSO users in the local directory and permission sets to go along with them, they will remain even after you switch to the external identity provider, but those users will no longer be able to log in.
  • All existing MFA configurations will be deleted when customer switches from AWS SSO to IdP.  MFA policy controls will be managed on IdP -> Yes folks, you’ll now need to handle MFA.  Thankfully you’re using Azure AD so you have plenty of options there.
  • All items about provisioning – You have the option to manually provision identities into AWS SSO or use the SCIM endpoint to automatically provision accounts.  I won’t be covering it, but I tested manual provisioning and the single sign-on aspect worked flawlessly.  Know it’s an option if you opt to use another IdP that isn’t as fully featured as Azure AD.

Confirmation prompt

Because I had to, I popped open the federation metadata to see what AWS requires in the way of claims in the SAML assertion.  In the screenshot below we see it is requesting the single claim of nameid-format:emailaddress.  The value of this claim will be used to map the user to the relevant identity in AWS SSO.

AWS SSO Metadata
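
The same check can be done without reading raw XML by hand.  The sketch below is an illustration under assumptions, not the real AWS file: the sample metadata is a hypothetical, cut-down service provider document with placeholder `sp.example.com` URLs, and `summarize_sp_metadata` is a helper name invented here.  It lists the NameID formats the service provider asks for along with its assertion consumer service (ACS) URLs.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata namespace.
NS = {"md": "urn:oasis:names:tc:SAML:2.0:metadata"}

# Hypothetical, trimmed-down SP metadata. entityID and Location are placeholders.
SAMPLE_SP_METADATA = """\
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
                     entityID="https://sp.example.com/saml/metadata">
  <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:NameIDFormat>urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress</md:NameIDFormat>
    <md:AssertionConsumerService index="0" isDefault="true"
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://sp.example.com/saml/acs"/>
  </md:SPSSODescriptor>
</md:EntityDescriptor>
"""


def summarize_sp_metadata(xml_text):
    """Return the entity ID, requested NameID formats, and ACS URLs from SP metadata."""
    root = ET.fromstring(xml_text)
    sp = root.find("md:SPSSODescriptor", NS)
    if sp is None:
        raise ValueError("metadata has no SPSSODescriptor element")
    return {
        "entity_id": root.get("entityID"),
        "name_id_formats": [
            el.text.strip() for el in sp.findall("md:NameIDFormat", NS) if el.text
        ],
        "acs_urls": [
            el.get("Location")
            for el in sp.findall("md:AssertionConsumerService", NS)
        ],
    }


summary = summarize_sp_metadata(SAMPLE_SP_METADATA)
print("NameID formats:", summary["name_id_formats"])
print("ACS URLs:", summary["acs_urls"])
```

A single emailAddress entry in the NameID format list corresponds to the single claim called out above.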

Back to the Azure Portal once again, where you’ll want to hop back to the Single sign-on blade of the application you registered.  Here you’ll click the Upload metadata file button and upload the AWS metadata.

Uploading AWS federation metadata

After the upload is successful you’ll receive a confirmation screen.  You can simply hit the Save button here and move on.

Confirming SAML

At this stage you’ve now registered your Azure AD tenant as an identity provider to AWS SSO.  If you were using a non-Azure AD security token service, you could now manually provision your users in AWS SSO, create the necessary groups and permission sets, and administer away.

I’ll wrap up there and cover the SCIM provisioning in the next post.  To sum it up, in this post we configured AWS SSO in the AWS Organization and established the SAML federated trust between the Azure AD tenant and AWS SSO.

See you next post!