Azure Private Link and DNS – Part 2

Hello again!

In this post I’ll be continuing my series on Azure Private Link and DNS.  In my last post I gave some background into Private Link, how it came to be, and what it offers.  For this post I’ll be diving into some DNS patterns you can use to support name resolution with Private Link Endpoints for Azure services.  I’ll be covering the six scenarios below:

  1. Default DNS pattern without Private Link Endpoint
  2. Azure Private DNS pattern with a single virtual network
  3. BYODNS (Bring your own DNS) in a hub and spoke architecture
  4. BYODNS with a custom DNS forwarder in a hub and spoke architecture
  5. BYODNS with the use of root hints in a hub and spoke architecture
  6. BYODNS with the use of a custom DNS zone hosted in the BYODNS in a hub and spoke architecture

Before I jump into the scenarios, I want to cover some basic (and not so basic) DNS concepts.  If you know nothing about DNS, I’d highly suggest you stop reading here and take a quick few minutes to read through this DNS 101 by RedHat.  If you’ve operated a DNS service in a large enterprise, you can skip this section and jump into the scenarios.  If you only know the basics, read through the below or else you may not get much out of this post.

  • A record – Translates a hostname to an IP address, such as www.journeyofthegeek.com to 5.5.5.5
  • CNAME record – Alias record that points one FQDN (fully qualified domain name) to another, commonly used to give a service a name humans can remember and to support high availability configurations
  • Recursive Name Resolution – A DNS query where the client asks for a definitive answer and relies on the upstream DNS server to resolve the query to completion.  Public resolvers such as Google DNS perform recursive name resolution.
  • Iterative Name Resolution – A DNS query where a client receives back a referral to another server in place of resolving the query to completion.  Querying root hints typically involves iterative name resolution.
  • DNS Forwarder – Forwards all queries the DNS service can’t resolve to an upstream DNS service.  These upstream servers are typically configured to perform recursive name resolution, but depending on your DNS service (such as Infoblox), you can configure it to request iterative name resolution.
  • Conditional Forwarder – Forwards queries for a specific DNS namespace to an upstream DNS service for resolution.
  • Split-brain / Split-horizon DNS – A DNS configuration where the same DNS namespace exists authoritatively across two or more DNS implementations.  A common use case is to define a single DNS namespace both on Internet-resolvable public-facing DNS servers and on intranet private-facing DNS servers.  This allows trusted clients to reach the service via a private IP address and untrusted clients to reach the service via a public IP address.

If you can grasp the topics above, you’ll be in good shape for the rest of this post.

Scenario 1 – Default DNS Pattern Without Private Link Endpoint

Scenario 1

Before we jump into how DNS for Azure services works when Private Link Endpoint is introduced, let’s first look at how it works without it.  For this example, let’s look at a scenario where I’m using a VM (virtual machine) running in a VNet (virtual network) and am attempting to connect to an Azure SQL instance named db1.database.windows.net.  No Private Link Endpoint has been configured for the Azure SQL instance and the VNet is configured to use Azure-provided DNS and thus sends its DNS queries out the 168.63.129.16 virtual IP.  I explain how Azure-provided DNS works with the virtual IP in a prior blog post.  When I open SQL Server Management Studio and try to connect to db1.database.windows.net, my VM first needs to determine the IP address of the resource it needs to establish a TCP connection with.  For this it issues a DNS query to the Azure DNS service.

The FQDN (fully-qualified domain name) for your specific instance of an Azure service will more than likely have two or more CNAME records associated with it.  I don’t have any super secret information as to the official reasons behind these CNAMEs and can only theorize that they are used to orchestrate high availability of the service.  By using the CNAMEs Microsoft is able to provide you with a DNS record you can customize to your requirements and place in code.  Any failures in the backend require a simple modification of the alias the CNAME is pointing to without requiring changes to your code such as modifications to the connection string.

Since Azure DNS is a recursive DNS resolver, it handles resolving each of these records for you and returns the public IP address of your Azure SQL instance.  Your VM will then use this public IP address to set up a TCP connection and establish a connection to your database.
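If you want to see this CNAME chain for yourself, the sketch below uses the dnspython library to run the query through a recursive resolver and print each record in the answer.  This is a minimal example rather than any official Azure tooling, and db1.database.windows.net is just the illustrative instance name from this scenario.

```python
# Minimal sketch using dnspython (pip install dnspython) to show the CNAME
# chain a recursive resolver walks before returning the final A record.
# The FQDN below is the illustrative Azure SQL instance from this scenario.
import dns.resolver

def show_resolution(fqdn: str) -> None:
    answer = dns.resolver.resolve(fqdn, "A")
    # The answer section contains each CNAME hop followed by the final A record.
    for rrset in answer.response.answer:
        print(rrset)

show_resolution("db1.database.windows.net")
```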

Scenario 2 – Azure Private DNS pattern with a single virtual network

Scenario 2

Now let’s cover how things change when we add a Private Link Endpoint and configure it to integrate with Azure Private DNS.  If you’re unfamiliar with how Azure Private DNS works take a read from my prior post on the topic.

In this scenario I’ve added a Private Link Endpoint for my Azure SQL instance.  I’ve configured the Endpoint to integrate with an Azure Private DNS zone named privatelink.database.windows.net and have linked the VNet to the Azure Private DNS zone.

Notice the changes to the records in Azure Public DNS.  The hostname for my Azure SQL instance now has a CNAME record with an alias defined for db1.privatelink.database.windows.net.  There is also a new CNAME record for db1.privatelink.database.windows.net which points to the same dataslice4.eastus2.database.windows.net record as we saw in the last scenario.  This is done for two reasons.  The first reason is it allows clients accessing the instance through a public IP to continue to do so because Microsoft has established a split-brain DNS configuration for the privatelink.database.windows.net zone.  The second reason is it allows Microsoft to work some magic in the backend (I have no idea how they’re doing it) that redirects queries originating from an Azure VNet that is linked to the Azure Private DNS zone to be resolved against the record in the Azure Private DNS zone.

This means that clients outside the linked Azure VNet will receive back the public IP address of the Azure SQL instance, and clients within the Azure VNet linked to the Azure Private DNS zone will receive back the private IP address of the Private Link Endpoint.
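To make the split-horizon behavior concrete, here is a rough sketch (again using dnspython) that asks the same question of two different resolvers.  Run from a VM inside the linked VNet, the query against the 168.63.129.16 virtual IP should return the endpoint’s private IP, while a public resolver returns the public IP.  The FQDN is the illustrative instance used throughout this post.

```python
# Rough sketch (dnspython): compare the answer from Azure-provided DNS
# (168.63.129.16, reachable only from inside an Azure VNet) with the answer
# from a public resolver. Inside a VNet linked to the Private DNS zone you
# should see the Private Link Endpoint's private IP; elsewhere, the public IP.
import dns.resolver

def query_via(nameserver: str, fqdn: str) -> list[str]:
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    return [rr.to_text() for rr in resolver.resolve(fqdn, "A")]

fqdn = "db1.database.windows.net"
print("Azure-provided DNS:", query_via("168.63.129.16", fqdn))  # private IP inside the linked VNet
print("Public resolver:   ", query_via("8.8.8.8", fqdn))        # public IP
```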

Scenario 3 – BYODNS in a Hub and Spoke Architecture

Scenario 3

Scenarios 1 and 2 are important to understand, but the reality is very few organizations have such a simple DNS pattern for their Azure footprints.  Most enterprises using Azure will be using a hub and spoke architecture.  Shared services such as a DNS service (Windows DNS, InfoBlox, BIND, whatever) are placed in the hub VNet and are shared among spoke VNets containing various workloads.  This DNS service will typically provide advanced features not provided by Azure Private DNS (at this time) such as conditional forwarders and DNS query logging.  You can check out my prior post on this pattern if you want to understand the details.

In the scenario below I’ve provisioned a DNS service in the hub VNet and configured it to forward all queries it can’t resolve to the 168.63.129.16 virtual IP.  Notice that I’ve now linked the Azure Private DNS zone to the hub VNet instead of the spoke VNet.  This is to ensure the DNS service can resolve the queries to this Azure Private DNS zone.  It also lets me take advantage of the advanced features of the DNS service such as those I discussed above.

The resolution with Azure-provided DNS occurs in the same manner as scenario 2 with the exception being that the DNS service performs the query and returns the results to the VM running in the spoke.

Scenario 4 – BYODNS With a Custom DNS Forwarder in a Hub and Spoke Architecture

Scenario 4

Next up we have a scenario similar to the above where we have a hub and spoke architecture and have the DNS service in the hub configured to forward all queries it can’t resolve to an upstream forwarder.  Maybe it’s an on-premises DNS server, a third-party threat-filtering DNS service, or simply Google’s DNS service.  Whatever the case, this scenario means we now have to care about recursive resolution and conditional forwarders.

If the upstream DNS service you’re using supports recursive name resolution and the DNS service you’re using in your hub is configured to send recursive queries to it, then any queries for db1.database.windows.net will resolve to the public IP address of the service.  The reason for this is with recursion you’re asking the upstream DNS service to chase down the answer for you and that upstream DNS service only knows about the public privatelink.database.windows.net DNS zone and does not have access to the Azure Private DNS zone.

To handle this scenario you want to create a conditional forwarder for database.windows.net (or the recommended zone for the service you’re using) and point it to Azure-provided DNS via the 168.63.129.16 virtual IP.  This enables you to let the Azure platform handle the split-brain DNS challenge as it has been engineered to do.
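Once the conditional forwarder is in place, a quick way to sanity check it from a VM in a spoke is to compare what your hub DNS service returns against what Azure-provided DNS returns for the same name.  The sketch below is just that check; the hub DNS IP is a hypothetical placeholder.

```python
# Validation sketch (dnspython): the hub DNS service should hand back the same
# private IP that Azure-provided DNS returns for the Private Link zone. If the
# hub returns the public IP instead, the conditional forwarder for
# database.windows.net is likely missing or misconfigured.
# The 10.0.0.4 hub DNS address is a hypothetical placeholder.
import dns.resolver

def a_records(nameserver: str, fqdn: str) -> set[str]:
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    return {rr.to_text() for rr in resolver.resolve(fqdn, "A")}

fqdn = "db1.database.windows.net"
hub_answer = a_records("10.0.0.4", fqdn)          # BYODNS service in the hub
azure_answer = a_records("168.63.129.16", fqdn)   # Azure-provided DNS
print("Conditional forwarder working:", hub_answer == azure_answer)
```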

Scenario 5 – BYODNS With The Use of Root Hints in a Hub and Spoke Architecture

Scenario 5

In scenario 5 we again have the same architecture as the prior scenarios with a few differences.  First off, we are now sending iterative queries using the DNS root hints instead of forwarding to an upstream resolver.  This means our DNS service chases the entirety of the resolution itself, requesting referrals from each DNS server in the path until it resolves the FQDN.  The use of iterative queries gives us the option of creating a conditional forwarder (our second difference) to 168.63.129.16 for privatelink.database.windows.net, or optionally sending that query to some other DNS service we’re running in an on-premises data center or another cloud.

The key takeaway of this configuration is that using root hints puts a bigger burden on your DNS service because you are resolving a whole bunch more queries vs using an upstream DNS service like Azure DNS.   Additionally, if you opt to maintain your own DNS zone, it’s on you to figure out how to manage the whole lifecycle of the DNS records for the Private Link Endpoints.
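If you’ve never watched iterative resolution happen, the sketch below shows a single step of it: a non-recursive query sent to one of the root servers, which answers with a referral rather than a final answer.  A DNS service configured with root hints simply repeats this referral-chasing down the tree.  This is a standalone illustration, not something your DNS service needs.

```python
# Minimal sketch of one iterative step with dnspython: clear the Recursion
# Desired flag and ask a root server directly. Instead of an answer, the root
# returns a referral (NS records) pointing at the .net name servers; a resolver
# using root hints keeps following referrals until it reaches the servers
# authoritative for the FQDN.
import dns.flags
import dns.message
import dns.query
import dns.rdatatype

query = dns.message.make_query("db1.database.windows.net", dns.rdatatype.A)
query.flags &= ~dns.flags.RD  # non-recursive query

response = dns.query.udp(query, "198.41.0.4", timeout=5)  # a.root-servers.net
for rrset in response.authority:
    print(rrset)  # referral to the .net name servers
```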

Scenario 6 – BYODNS With The Use of a Custom DNS zone Hosted in The BYODNS In a Hub and Spoke Architecture

Scenario 6

The last scenario I’ll cover is the use of a custom DNS zone named something outside of the Microsoft recommended zones (more required than recommended) that is hosted in your BYODNS service.  Let me save you any pain and suffering by telling you this will not work.  You’re probably asking why it won’t work.  The answer to that question requires understanding how data is secured in transit to Azure services.

Since you surely don’t want your data flowing through a network in clear text, most Azure services will either require or support encryption of data in transit using TLS (Transport Layer Security).  If you’re not familiar with the TLS flow, you can get a reasonably good overview here.  The key thing you want to understand is that the TLS session is often established by using the certificate being served up by the Azure service.  In addition to confidentiality, it also authenticates the service to your client.

The authentication piece is what we care about here.  Without going too deep into the weeds, the certificate contains a property called the SAN (subject alternative name) which lists the identities of the services the certificate should be used for.  These identities are typically DNS names such as db1.database.windows.net.  If you go ahead and create a custom DNS zone and attempt to access the Azure service through that name, you’ll run into a certificate mismatch error.  This is due to the DNS name of the service you typed into your browser, or that was called by your library, not matching the identities listed in the certificate.
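You can see the mismatch for yourself with a few lines of Python’s standard ssl module.  The sketch below attempts a TLS handshake against an IP while asserting two different hostnames; the made-up custom-zone name (db1.mycustomzone.com, a hypothetical example) fails hostname verification, while a name actually present in the certificate’s SAN succeeds.  The IP and port are placeholders, so adapt them to an endpoint that actually terminates TLS.

```python
# Rough illustration of hostname verification: the TLS client checks the name
# it connected with against the SAN entries in the server's certificate.
# A custom zone name that isn't in the SAN triggers a verification error.
import socket
import ssl

def check_tls(ip: str, hostname: str, port: int = 443) -> None:
    context = ssl.create_default_context()
    try:
        with socket.create_connection((ip, port), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                print(f"{hostname}: handshake OK, name matches the SAN")
    except ssl.SSLCertVerificationError as err:
        print(f"{hostname}: certificate mismatch -> {err}")

# Placeholders for illustration only:
# check_tls("203.0.113.10", "db1.database.windows.net")  # name present in the SAN
# check_tls("203.0.113.10", "db1.mycustomzone.com")      # hypothetical custom zone, fails verification
```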


Yes I know there are ways to get around this by ignoring certificate mismatches (terrible security decision) or doing something funky like overriding database.windows.net (this is against Microsoft recommendations) with your own zone.  Don’t do this.  If you want the service to support this type of functionality, submit a feedback request.

Now if anyone is aware of a way to get around this limitation that is supported and not insane, I’d definitely be interested in hearing about it.

Before I conclude this series I want to provide one more gotcha.  Take note that while Private Link Endpoints can be integrated with Azure Private DNS and the records can be automatically created, they do not share the full lifecycle.  This means that if you delete a Private Link Endpoint and create a new one for the same resource, the NIC (network interface) associated with the endpoint may get a new IP.  This will cause connections to fail because queries will resolve to the prior IP.  You will need to manually clean up the A record hosted in the Azure Private DNS zone before creating the new endpoint.
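For completeness, here’s a hedged sketch of what that cleanup could look like with the azure-mgmt-privatedns Python SDK.  The resource group, zone, and record names are placeholders, and you should confirm the method signature against the SDK version you have installed rather than treating this as the official procedure.

```python
# Hedged sketch: remove the stale A record left in the Private DNS zone before
# recreating the Private Link Endpoint. All names are placeholders; verify the
# signatures against your installed version of azure-mgmt-privatedns.
from azure.identity import DefaultAzureCredential
from azure.mgmt.privatedns import PrivateDnsManagementClient

client = PrivateDnsManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Delete the record set (e.g. "db1" in privatelink.database.windows.net) so the
# new endpoint can register a fresh A record with its new private IP.
client.record_sets.delete(
    resource_group_name="rg-dns",
    private_zone_name="privatelink.database.windows.net",
    record_type="A",
    relative_record_set_name="db1",
)
```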

Well folks that wraps it up.  Hopefully you found this information helpful and it cleared up some of the mystery of DNS patterns with Private Link Endpoints.

Thanks!

Azure Private Link and DNS – Part 1

Hi there fellow geeks!

Azure Private Link is becoming a frequent topic of discussion among peers and my customers.  One of the often discussed topics is how to handle DNS with Private Link Endpoints.  I spent the past few days deep diving into the documentation and doing some labbing to better understand what the patterns and gotchas were.  There seemed to be enough value to the findings to share it with you all.

Before I dive into the guts of Private Link Endpoints, I want to spend a post walking through how Private Link came to be.

Last September Microsoft released the Azure Private Link service.  One of the primary drivers behind the introduction of the service was to address the customer demand for secure and private connectivity to Azure services such as Azure SQL and Azure Storage as well as third-party services.  Azure PaaS services used to be accessible only via public IP addresses, which required a path out to the Internet.  From a network security perspective, your only option was to use the firewall feature built into many of the services to filter the IPs allowed to communicate with the service.  While technically feasible, there had to be something better.

The first attempt at something better was Service Endpoints, which started to be introduced into general availability in February 2018.  For you AWS folk, Service Endpoints are probably closest to VPC Gateway Endpoints.  Service Endpoints attempted to improve the experience of accessing the services from a VNet (virtual network) by providing a direct route for resources in a VNet to Azure services in order to optimize routing.  To mitigate the risk of the service being accessible over a public IP, Service Endpoints also added an identity to the VNet.  This allowed customers to expand the context of the filtering being done by the service firewall beyond IP to the identity of the VNet containing resources that need to access the relevant service.

Service Endpoints

While Service Endpoints made some great improvements, there was more work to be done.  Service Endpoints did nothing to mitigate the risk of data exfiltration.  If an attacker was able to compromise a VM (virtual machine) in your VNet, that attacker could use that optimized route to their advantage, piping whatever data they were able to get access to out to an attacker-controlled instance of the resource such as an Azure Storage Account.  Service Endpoint policies were then introduced to help address this risk.

Well that’s great and all, but Service Endpoints did nothing to address accessing Azure services from outside the VNet such as from an on-premises data center or another public cloud.  Customers were still stuck accessing the services over the Internet or using ExpressRoute with Microsoft peering.  Wouldn’t it be great if there was a service with all of those features?

In comes Azure Private Link to the rescue.  Azure Private Link includes the concept of an Azure Private Link Service and a Private Link Endpoint.  Those of you coming from AWS, yeah, I’ll let you guess which AWS service this is like :-).  I won’t be covering Private Link Services in this series beyond saying it’s a way to build your own third-party services and make them directly accessible from a customer VNet.  Instead we’ll keep our focus on Private Link Endpoints, specifically in the context of Microsoft-provided services.

The Private Link service introduces new features that seek to address the gaps Service Endpoints did not, while keeping the features from Service Endpoints that were beneficial.  These features are:

  • Private access to services running on the Azure platform through the provisioning of a virtual network interface within the customer VNet that is assigned one of the VNet IP addresses from the RFC1918 address space.
  • Makes the services accessible over private IP space to resources running outside of Azure such as machines running in an on-premises data center or virtual machines running in other clouds.
  • Protects against data exfiltration by the endpoint providing access to only a specific instance of a PaaS service.

Azure Private Link

As you can see from the above, the service solves a lot of problems and is going to be a necessary component of any Azure footprint.  Now when it comes to design and implementation, there are some options as to how you use DNS to resolve the name of the service resource being exposed by the endpoint to the private IP address of the Private Link Endpoint.  This is what I’ll be focusing on for this series.

In the next post I’ll walk you through what happens within Azure DNS when you create a Private Link endpoint, some patterns you can use for DNS resolution, and some of the gotchas.

The series is continued in my second post.

Deep Dive into Azure AD and AWS SSO Integration – Part 4

Today we continue exploring the new integration between Microsoft’s Azure AD (Azure Active Directory) and AWS (Amazon Web Services) SSO (Single Sign-On).  Over the past three posts I’ve covered the high level concepts of both platforms, the challenges the integration seeks to solve, and how to enable the federated trust which facilitates the single sign-on experience.  If you haven’t read through those posts, I recommend you do so before you dive into this one.  In this post I’ll be covering the neatest feature of the new integration, which is the support for automated provisioning.

If you’ve ever worked in the identity realm before, you know the pains that come with managing the life cycle of an identity from initial provisioning, through changes to the identity such as department and position changes, to the often forgotten stage of de-provisioning.  On-premises these problems were solved by cobbled-together scripts or complex identity management solutions such as SailPoint IdentityIQ or Microsoft Identity Manager.  While these tools were challenging to implement and operate, they did their job in the world of Windows Active Directory, LDAP, SQL databases and the like.

Then came cloud, and all bets were off.  Identity data stores skyrocketed from less than a hundred to hundreds and sometimes thousands (B2C has exploded far beyond even that).  Each new cloud service introduced into the enterprise introduced yet another identity management challenge.  While some of these offerings have APIs that support identity management operations, most do not, and those that do are proprietary in nature.  Writing custom code against each of the APIs is a huge challenge that most enterprises can’t keep up with.  The result is often manual management of an identity life cycle, through uploading exported CSV files or some poor soul pointing and clicking a thousand times in a vendor portal.

Wouldn’t it be great if there was some mythical standard out there that would help to solve this problem, use a standard REST API, and support the JSON format?  Turns out there is, and that standard is SCIM (System for Cross-domain Identity Management).  You may be surprised to know the standard has been around for a while now (technically 2011).  I recall hearing about it at a Gartner conference many many years ago.  Unfortunately, it’s taken a long time to catch on, but support is steadily increasing.

Thankfully for us, Microsoft has baked support into Azure AD, and AWS recognized the value and took advantage of the feature.  By doing this, the identity life cycle challenges of managing an Azure AD and AWS integration have been heavily remediated and our lives made easier.

Azure AD Provisioning – Example

Let’s take a look at how to set it up, shall we?

The first place you’ll need to go is into the AWS account which is the master for the organization and into the AWS SSO Settings.  In Settings you’ll see the provisioning option which is initially set as manual.  Select to enable automatic provisioning.

AWS SSO Settings – Provisioning

Once complete, a SCIM endpoint will be created.  This is the endpoint in AWS (referred to as the SCIM service provider in the SCIM standard) that the SCIM service in Azure AD (referred to as the client in the SCIM standard) will interact with to search for, create, modify, and delete AWS users and groups.  To interact with this endpoint, Azure AD must authenticate to it, which it does with a bearer access token that is issued by AWS SSO.  Be aware that the access token has a one year life span, so ensure you set some type of reminder.  A quick search through the boto3 API doesn’t show a way to query for issued access tokens (yes you can issue more than one at a time) so you won’t be able to automate the process as of yet.
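If you’re curious what an interaction against that endpoint looks like, here’s a rough sketch using Python’s requests library and the standard SCIM filter syntax.  The endpoint URL and bearer token are placeholders you copy from the AWS SSO provisioning settings, and the user name is just an example from my lab; don’t treat this as documented AWS tooling.

```python
# Rough SCIM sketch: query the AWS SSO SCIM service provider for a user by
# userName using the standard SCIM filter syntax. The endpoint URL, token, and
# user name below are placeholders.
import requests

SCIM_ENDPOINT = "<scim-endpoint-from-aws-sso-settings>"   # typically ends in /scim/v2
ACCESS_TOKEN = "<bearer-token-from-aws-sso-settings>"

response = requests.get(
    f"{SCIM_ENDPOINT}/Users",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    params={"filter": 'userName eq "marge.simpson@jogcloud.com"'},
    timeout=10,
)
response.raise_for_status()
print(response.json().get("Resources", []))
```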


After SCIM is enabled, AWS SSO Settings for provisioning now reports SCIM in use.


Next you’ll need to bounce over to Azure AD and go into the enterprise app you created (refer to my third post for this process).   There you’ll navigate to the Provisioning blade and select Automatic as the provisioning method.


You’ll then need to configure the URL and access token you collected from AWS and test the connection.  This will cause Azure AD to test querying the endpoint for a random user and group to validate functionality.


If your test is successful you can then save the settings.


You’re not done yet.  Next you have to configure mappings which map attributes in Azure AD to the resources and attributes in the SCIM schema.  Yes folks, SCIM does have a schema for attributes and resources (like users and groups).  You can extend it as needed, but in this integration it looks to be using the default user and group resources.


Let’s take a look at what the group mappings look like.


The attribute names on the left are the names of the attributes in Azure AD and the attributes on the right are the names of the attributes Azure AD will write the values of the attributes to in AWS SSO.  Nothing too surprising here.

How about the user mappings?


Lots more attributes in the user mappings by default.  Now I’m not sure how many of these attributes AWS SSO supports.  According to the SCIM standard, a client can attempt to write whatever it wants, and any attributes the service provider doesn’t understand are simply discarded.  The best list of attributes I could find was located here, and it’s nowhere near this number.  I can’t speak to what the minimum required attributes are to make AWS work, because the official instructions on this integration don’t say.  I know some of the product team sometimes reads the blog, so maybe we’ll luck out and someone will respond with that answer.

The one tweak you’ll need to make here is to delete the mailNickName mapping and replace it with a mapping of objectId to externalId.  After you make the change, click the save icon.

I don’t know why AWS requires this so I can only theorize.  Maybe they’re using this attribute as a primary key in the back end database or perhaps they’re using it to map the users to the groups?  I’m not sure how Azure AD is writing the members attribute over to AWS.  Maybe in the future I’ll throw together a basic app to visualize what the service provider end looks like.


Now you need to decide which users and groups you want to sync to AWS SSO.  Towards the bottom of the provisioning blade, you’ll see the option to toggle the provisioning status.  The scope drop down box has an option to sync all users and groups or to sync only assigned users and groups.  Best practice here is basic security: only sync what you need to sync, so leave the option on sync only assigned users and groups.

The assigned users and groups refers to users that have been assigned to the enterprise application in Azure AD.  This is configured on the Users and Groups blade for the enterprise app.  I tested a few different scenarios using an Azure AD dynamic group, standard group, and a group synchronized from Windows AD.  All worked successfully and synchronized the relevant users over.

Once you’re happy with your settings, toggle the provisioning status and save the changes.  It may take some time depending on how much you’re syncing.


If the sync is successful, you’ll be able to hop back over to AWS SSO and you’ll see your users and groups.


Microsoft’s official documentation does a great job explaining the end to end cycle.  The short of it is there’s an initial cycle which grabs all users and groups from Azure AD, then filters the list down to the users and groups assigned to the application.  From there it queries the target system for each user using the matching attribute; if the user isn’t found it is created, and if it is found and needs updating, it is updated.

Incremental cycles are done from that point forward every 40 minutes.  I couldn’t find any documentation on how to adjust the synchronization frequency.  Be aware of that 40 minute sync and consider the end to end synchronization time if you’re sourcing from Windows Active Directory.  In that case making changes in Windows AD could take just over an hour (assuming you’re using the 30 minute sync interval in Azure AD Connect) to fully synchronize.


As I described in my third post, I have a lab environment setup where a Windows Active Directory domain is syncing to Azure AD.  I used that environment to play out a few scenarios.

In the first scenario I disabled Marge Simpson’s account.  After waiting some time for changes to synchronize across both platforms, I saw in AWS SSO that Marge Simpson was now disabled.


For another scenario, I removed Barney Gumble from the Network Operators Active Directory group.  After waiting for the sync to complete, the Network Operators group is now empty, reflecting Barney’s removal from the group.


Recall that I assigned four groups to the app in Azure AD, Network Operators, Security Admins, Security Auditors, and Systems Operators.  These are the four groups syncing to AWS SSO.  Barney Gumble was only a member of the Network Operators group, which means removing him put him out of scope for the app assignment.  In AWS SSO, he now reports as being disabled.


For our final scenario, let’s look at what happened when I deleted Barney Gumble from Windows Active Directory.  After waiting the required replication time, Barney Gumble’s user account was still present in AWS SSO, but set as disabled.  While Barney wouldn’t be able to log in to AWS SSO, there would still be cleanup that would need to happen on the AWS SSO directory to remove the stale identity records.


The last thing I want to cover is the logging capabilities of the SCIM service in Azure AD.  There are two separate logs you can reference.  The first are the Provisioning Logs which are currently in preview.  These logs are going to be your go-to for troubleshooting issues with the provisioning process.  They’re available with an Azure AD P1 or above license and are kept for 30 days.  Supposedly they’re kept for free for 7 days, but the documentation isn’t clear on whether you have the ability to consume them.  I also couldn’t find any documentation on whether it’s possible to pull the logs from an API for longer term retention or analysis in Log Analytics or a 3rd party logging solution.

If you’ve ever used Azure AD, you’ll be familiar with the second source of logs.  In the Azure AD Audit logs, you get additional information, which while useful, is more catered to tracking the process vs troubleshooting the process like the provisioning logs.

Before I wrap up, let’s cover a few key findings:

  • The access token used to access the SCIM endpoint in AWS SSO has a one year lifetime.  There doesn’t seem to be a way to query what tokens have been issued by AWS SSO at this time, so you’ll need to manage the life cycle in another manner until the capability is introduced.
  • Users that are removed from the scope of the sync, either by unassigning them from the app or deleting their user object, become disabled in AWS SSO.  The records will need to be cleaned up via another process.
  • If synchronizing changes from a Windows AD the end to end synchronization process can take over an hour (30 minutes from Windows AD to Azure AD and 40 minutes from Azure AD to AWS SSO).

That will wrap up this post.  In my opinion the SCIM service available in Azure AD is extremely underutilized.  SCIM is a great specification that needs more love.  While there is growing adoption from large enterprise software vendors, there is a real opportunity for your organization to take advantage of the features it offers in the same way AWS has.  It can greatly ease the pain your customers and enterprise users experience having to manage the life cycle of an identity, and it makes for a nice belt and suspenders to modern identity capabilities in an application.

In the last post of my series I’ll demonstrate a few scenarios showing how simple the end to end experience is for users.  I’ll include some examples of how you can incorporate some of the advanced security features of Azure AD to help protect your multi-cloud experience.

See you next post!

 

Deep Dive into Azure AD and AWS SSO Integration – Part 3

Back for more are you?

Over the past few posts I’ve been covering the new integration between Azure AD and AWS SSO.  The first post covered high level concepts of both platforms and some of the problems with the initial integration which used the AWS app in the Azure Marketplace.  In the second post I provided a deep dive into the traditional integration with AWS using a non-Azure AD security token service like AD FS (Active Directory Federation Services), what the challenges were, how the new integration between Azure AD and AWS SSO addresses those challenges, and the components that make up both the traditional and the new solution.  If you haven’t read the prior posts, I highly recommend you at least read through the second post.

New Azure AD and AWS SSO Integration

In this post I’m going to get my hands dirty and step through the implementation steps to establish the SAML trust between the two platforms.  I’ve setup a fairly simple lab environment in Azure.  The lab environment consists of a single VNet (virtual network) with four virtual machines with the following functions:

  • dc1 – Windows Active Directory domain controller for jogcloud.com domain
  • adcs – Active Directory Certificate Services
  • aadc1 – Azure Active Directory Connect (AADC)
  • adfs1 – Active Directory Federation Services

AADC has been configured to synchronize to the jogcloud.com Azure Active Directory tenant.  I’ve configured federated authentication in Azure AD with the AD FS server acting as an identity provider and Windows Active Directory as the credential services provider.

Lab Environment

On the AWS side I have three AWS accounts setup associated with an AWS Organization.  AWS SSO has not yet been setup in the master account.

Let’s set it up, shall we?

The first thing you’ll need to do is log into the AWS Organization master account with an account with appropriate permissions to enable AWS SSO for the organization.  If you’ve never enabled AWS SSO before, you’ll be greeted by the following screen.


Click the Enable AWS SSO button and let the magic happen in the background.  That magic is provisioning of a service-linked role for AWS SSO in each AWS account in the organization.  This role has a set of permissions which include the permission to write to the AWS IAM instance in the child account.  This is used to push the permission sets configured in AWS SSO to IAM roles in the accounts.

AWS SSO Service-Linked IAM Role

After about a minute (this could differ depending on how many AWS accounts you have associated with your organization), AWS SSO is enabled and you’re redirected to the page below.

AWS SSO Successfully Enabled

Now that AWS SSO has been configured, it’s time to hop over to the Azure Portal.  You’ll need to log into the portal as a user with sufficient permissions to register new enterprise applications.  Once logged in, go into the Azure Active Directory blade and select the Enterprise Applications option.

Register new Enterprise Application

Once the new blade opens select the New Application option.

Register new application

Choose the Non-gallery application option since we don’t want to use the AWS app in the Azure Marketplace due to the issues I covered in the first post.

Choose Non-gallery application

Name the application whatever you want, I went with AWS SSO to keep it simple.  The registration process will take a minute or two.

Registering application

Once the process is complete, you’ll want to open the new application, go to the Single sign-on menu item, and select the SAML option.  This is the menu where you will configure the federated trust between your Azure AD tenant and AWS SSO on the Azure AD end.

SAML Configuration Menu

At this point you need to collect the federation metadata containing all the information necessary to register Azure AD with AWS SSO.  To make it easy, Azure AD provides you with a link to directly download the metadata.

Download federation metadata

Now that the new application is registered in Azure AD and you’ve gotten a copy of the federation metadata, you need to hop back over to AWS SSO.  Here you’ll need to go to Settings.  In the settings menu you can adjust the identity source, authentication, and provisioning methods for AWS SSO.  By default AWS SSO is set to use its own local directory as an identity source and itself for the other two options.

AWS SSO Settings

Next up, you select the Change option next to the identity source.  As seen in the screenshot below, AWS SSO can use its own local directory, an instance of Managed AD or BYOAD using the AD Connector, or an external identity provider (the new option).  Selecting the External Identity Provider option opens up the option to configure a SAML trust with AWS SSO.

Like any good authentication expert, you know that you need to configure the federated trust on both the identity provider and the service provider.  To do this we need the federation metadata from AWS SSO, which AWS has been lovely enough to provide via a simple download link.  Use it to grab a copy of the metadata we’ll later import into Azure AD.

Now you’ll need to upload the federation metadata you downloaded from Azure AD in the Identity provider metadata section.  This establishes the trust in AWS SSO for assertions created from Azure AD.  Click the Next: Review button and complete the process.

Configure SAML trust

You’ll be asked to confirm changing the identity source.  There are a few key points I want to call out in the confirmation page.

  • AWS SSO will preserve your existing users and assignments -> If you have created existing AWS SSO users in the local directory and permission sets to go along with them, they will remain even after you enable the new identity source, but those users will no longer be able to log in.
  • All existing MFA configurations will be deleted when customer switches from AWS SSO to IdP.  MFA policy controls will be managed on IdP -> Yes folks, you’ll now need to handle MFA.  Thankfully you’re using Azure AD, so you have plenty of options there.
  • All items about provisioning – You have the option to manually provision identities into AWS SSO or use the SCIM endpoint to automatically provision accounts.  I won’t be covering it, but I tested manual provisioning and the single sign-on aspect worked flawlessly.  Know it’s an option if you opt to use another IdP that isn’t as fully featured as Azure AD.

Confirmation prompt

Because I had to, I popped open the federation metadata to see what AWS was requiring in the way of claims in the SAML assertion.  In the screenshot below we see it is requesting the single claim of nameid-format:emailaddress.  The value of this claim will be used to map the user to the relevant identity in AWS SSO.

AWS SSO Metadata

Back to the Azure Portal once again where you’ll want to hop back to Single sign-on blade of the application you registered.  Here you’ll click the Upload metadata file button and upload the AWS metadata.

Uploading AWS federation metadata

After the upload is successful you’ll receive a confirmation screen.  You can simply hit the Save button here and move on.

Confirming SAML

At this stage you’ve now registered your Azure AD tenant as an identity provider to AWS SSO.  If you were using a non-Azure AD security token service, you could now manually provision your users into AWS SSO, create the necessary groups and permission sets, and administer away.

I’ll wrap up there and cover the SCIM provisioning in the next post.  To sum it up, in this post we configured AWS SSO in the AWS Organization and established the SAML federated trust between the Azure AD tenant and AWS SSO.

See you next post!

Deep Dive into Azure AD and AWS SSO Integration – Part 2

Welcome back folks.

Today I’ll be continuing my series on the new integration between Azure AD and AWS SSO.  In my last post I covered the challenges with the prior integration between the two platforms, core AWS concepts needed to understand the new integration, and how the new integration addresses the challenges of the prior integration.

In this post I’m going to give some more context to the challenges covered in the first post and then provide an overview of what the old and new patterns look like.  This will help clarify the value proposition of the integration for those of you who may still not be convinced.

The two challenges I want to focus on are:

  1. The AWS app was designed to synchronize identity data between AWS and Azure AD for a single AWS account
  2. The SAML trust between Azure AD and an AWS account had to be established separately for each AWS account.

Challenge 1 was unique to the Azure Marketplace AWS app because it was attempting to solve the identity lifecycle management problem.  Your security token service (STS) needs to pass a SAML assertion which includes the AWS IAM roles you are asserting for the user.  Those roles need to be mapped to the user somewhere for your STS to tap into them.  This is a problem you’re going to feel no matter what STS you use, so I give the team that put the AWS app together credit for trying.

The folks over at AWS came up with an elegant solution requiring some transformation of the claims passed in the SAML token and another solution to store the roles in commonly unused attributes in Active Directory.  However, both solutions suffered the same problem in that you’re forced to work around that mapping, which becomes considerably more difficult as you begin to scale to hundreds of AWS accounts.

Challenge 2 plagues all STSs because the SAML trust needs to be created for each and every AWS account.  Again, something that begins to get challenging as you scale.

AWS Past Integration

In the image above, we see an example of how some enterprises addressed these problems.  We see that there is some STS in use acting as an identity provider (idP) (could be Azure AD, Okta, Ping, AD FS, whatever) that has a SAML trust with each AWS account.  The user to AWS IAM role mappings are included in an attribute of the user’s Active Directory user account.  When the user attempts to access AWS, the STS queries Active Directory for the information.  There is a custom process (manual or automated) that queries each AWS account for a list of AWS IAM Roles that are associated with the IdP in the AWS account.  These roles are then populated in the attribute for each relevant user account.  Lastly, CloudFormation is used to push IAM Roles to each AWS account.  This could be pushed through a manual process or a CI/CD pipeline.
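To give a sense of what that custom role-discovery process might look like, here’s a hedged boto3 sketch that enumerates the IAM roles in a single account whose trust policy references the SAML identity provider.  The profile name and provider ARN are placeholders, and a real implementation would loop over every account and write the results back to the directory attribute.

```python
# Hedged sketch of the "custom process": list IAM roles in one account whose
# trust policy references the SAML identity provider, so they can later be
# written into the user-to-role mapping attribute. Profile and ARN are
# placeholders.
import json

import boto3

session = boto3.Session(profile_name="account-123456789012")  # one profile per AWS account
iam = session.client("iam")
saml_provider_arn = "arn:aws:iam::123456789012:saml-provider/MyIdP"

idp_roles = []
for page in iam.get_paginator("list_roles").paginate():
    for role in page["Roles"]:
        trust_policy = json.dumps(role["AssumeRolePolicyDocument"])
        if saml_provider_arn in trust_policy:
            idp_roles.append(role["Arn"])

print(idp_roles)  # the rest of the pipeline would push these into the AD attribute
```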

Yeah this works, but who wants all that overhead?  Let’s look at the new method.

Azure AD and AWS SSO Integration

In the new integration where we use Azure AD and AWS SSO together, we now only need to establish a single SAML trust with AWS SSO.  Since AWS SSO is integrated with AWS Organizations it can be used as a centralized identity source for all AWS accounts within the organization.  Additionally, we can now leverage Azure AD to manage the synchronization of identity data (users and groups) from Azure AD to AWS SSO.  We then map our users or groups to permission sets (collections of IAM policies) in AWS SSO which are then provisioned as IAM roles in the relevant AWS accounts.  If we want to add a user to a role in AWS IAM, we can add that user to the relevant group in Azure AD and wait for the synchronization process to occur.  Once it’s complete, that user will have access to that IAM role in the relevant accounts.  A lot less work, right?

Let’s sum up what changes here:

  • We can use existing processes already in place to move users in and out of groups either on-premises in Windows AD (that is syncing to Azure AD with Azure AD Connect) or directly in Azure AD (if we’re not syncing from Windows AD).
  • Group to role mappings are now controlled in AWS SSO
  • Permission sets (or IAM policies for the IAM roles) are now centralized in AWS SSO
  • We no longer have to provision the IAM roles individually into each AWS account, we can centrally control it in AWS SSO

Cool right?

In my next few posts I’ll begin walking through the integration and demonstrating the solution.

Thanks!

DNS in Microsoft Azure – Part 3

Today I’ll be continuing my series on DNS in Microsoft Azure.  In my first post I covered fundamental concepts of DNS resolution in Azure such as the 168.63.129.16 virtual IP and Azure-provided DNS.  In the second post I went over the Azure Private DNS service, its benefits, limitations, and available patterns when you use Azure Private DNS alone.  In this post I’ll be exploring how, when combined with bring your own DNS (BYODNS), Azure Private DNS begins to really shine and introduces opportunities for some very cool self-service/delegation models.

If an enterprise has any degree of technical footprint, it will have a DNS infrastructure providing DNS resolution to intranet and Internet resources.  These existing services are often very mature and deeply embedded into the technology stack.  This means ditching your existing DNS service for a cloud-based DNS service isn’t likely to happen out of the gates (if at all).  This leaves you with the question of extending your existing DNS infrastructure into Azure as is, or hooking it into cloud native DNS services such as Azure Private DNS.  I’m not going to give you the typical sales pitch stating how easy it is to do the latter, because it can be challenging depending on how complex your DNS infrastructure is and what your internal policies and operations models are.  Instead I’m going to show you how you can make these two services coexist and complement each other.

As I covered in my first post, you can configure the VMs to use either the Azure DNS servers or your own DNS servers.  This configuration is available at both the VNet level and the VM network interface level.  Avoid setting the DNS server settings directly on the VM’s network interface if possible because it will introduce more management overhead.  There are always exceptions to the rule, but make sure to establish what those exceptions are and have a way of tracking them.

So you’ve decided you’re going to BYODNS.  Common reasons for doing this are:

  1. Hybrid workloads that require access to on-premises services
  2. Advanced capabilities of existing DNS services
  3. Requirements for Windows Active Directory for centralized identity, authentication, and optionally configuration management services
  4. Maintaining a singular management plane for all DNS services across an organization

Since the requirement for Windows Active Directory services is the most common reason in my experience, I’m going to cover that use case.  Keep in mind that you could easily sub in your favorite DNS infrastructure service for the DNS patterns I demonstrate in this post.  Yes, this means you could toss in a BIND server or InfoBlox NVA.

With that settled, let’s cover the basics.

In the BYODNS scenario, you’ll want to configure your own DNS servers as seen in the screenshot below (note that you should include at least two DNS servers for redundancy):


When configured to use a specific set of DNS servers, a few things happen at the VM.  The screenshot below shows the results of an ipconfig /all on a domain-joined Windows Server 2016 VM.  First you’ll notice that the DNS server being pushed to the VM is the 10.100.4.10 address, which is the DNS server setting I’m pushing at the VNet.  The other thing to take note of is that the Connection-specific DNS suffix being pushed by the Azure DHCP service is no longer the Azure-provided one (xxx.xxx.internal.cloudapp.net).  It’s now reddog.microsoft.com, which is a non-functioning placeholder.  This is pushed to avoid interfering with DNS resolution through BYODNS, such as the domain-joined scenario I’m demonstrating.

The lab environment I’m using for this post looks like the below.


It has three VNets in a hub and spoke architecture where the shared VNet is peered to both the app1 and app2 VNet.  The shared VNet contains a single VM named dc1 acting as a domain controller for a Windows Active Directory forest named journeyofthegeek.com.  Each spoke VNet is configured to push the IP of dc1 (10.100.4.10) to the VMs within the VNet as the DNS server.  The VMs in each spoke are domain-joined.  I’ve also created multiple Azure Private Zones as seen in the table in the diagram.  The shared VNet has been linked to all the zones for resolution.  Each spoke VNet is linked to a zone for registration and resolution.

The DNS Server service running on dc1 has been configured to forward all traffic outside of its domain to Google’s public DNS servers.  It also has multiple conditional forwarders configured to send traffic for any of the Azure Private DNS zones to the 168.63.129.16 virtual IP.  I’ve created a single A record named www in the app1zone.com zone and assigned it the IP of the app1 server (10.102.0.10) in the app1 VNet.

If you take a look below at each of the Azure Private DNS zones assigned to the spokes, you can see that the VMs in each spoke have automatically registered an A record for themselves with their associated zone.  Take note that this happened even though each VM is configured to use dc1 as a DNS server.  This is the magic of the cloud platform where the platform itself took care of registration of the records.

app1zone.com Private DNS Zone

app2zone.com Private DNS Zone

When a VM needs to perform DNS resolution, it sends its DNS query to dc1.  dc1 then sends a query to the Azure DNS service via the 168.63.129.16 virtual IP for resolution of the Azure Private DNS zones it has been linked to (red line).  Queries for records in other domains are sent out to the Internet (blue).  The traffic flow is illustrated in the diagram below:

Standard DNS resolution traffic flow
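If you want to verify this flow from one of the lab VMs, the short sketch below sends both types of query through dc1 and prints the answers.  The names and IPs come from this lab (dc1 at 10.100.4.10, www.app1zone.com at 10.102.0.10); the external name is just an example.

```python
# Quick check (dnspython) from a lab VM: both queries go to dc1 (10.100.4.10).
# dc1 forwards the Azure Private DNS zone query to 168.63.129.16 via its
# conditional forwarder and everything else to its Internet forwarder.
import dns.resolver

resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ["10.100.4.10"]  # dc1, the BYODNS service in the shared VNet

for name in ("www.app1zone.com", "www.journeyofthegeek.com"):
    answers = resolver.resolve(name, "A")
    print(name, [rr.to_text() for rr in answers])
```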

There are a few benefits this pattern introduces.  One benefit is it addresses a few of the gaps in Azure Private DNS, namely no conditional forwarding and no query logging.

With no support for conditional forwarding, any VMs you set to use the Azure DNS servers through the 168.63.129.16 virtual IP will only be able to resolve namespaces Azure DNS is aware of.  Since Azure DNS has no awareness of DNS zones running on the domain controller, we’d be out of luck if we needed to use any domain services.  This problem extends to any DNS zone you’re running on DNS equipment that isn’t resolvable from the Internet.  Yep, this means no hybrid workloads over your private connection back to your on-prem or colo datacenter.  The conditional forwarder capability of the BYODNS service allows us to resolve this problem and additionally get the queries to Azure DNS when it’s called for.

The other limitation is DNS query logging.  As I’ve mentioned before, DNS query logs are excellent inputs to any organization’s behavior analytics to help detect threats in the environment.  That log data is that much more important when you move into the cloud, because it helps mitigate the risks of the additional freedoms you’ll be giving application owners and developers to spin up their own resources.  By introducing a BYODNS service, we capture that log data.

I fully expect both of these features to eventually make their way into the service.  Until that time, the BYODNS pattern demonstrated above can help address the gaps.

You may be asking yourself, “If I have to BYODNS, what does Azure Private DNS get me?” Excellent question.  The answer is it can provide self-service, agility, reduce overhead, and mitigate risk.  How does it do these things, let me count the ways:

  1. In most organizations, DNS is managed by a central IT group.  This means application owners and developers have to submit requests and wait for those requests to be completed.  Wouldn’t it be great to let them perform the updates themselves on a zone they own?
  2. Azure Private DNS is available over a modern REST API.  Yes yes, I know you are a scripting ninja and have a hundred PowerShell and Bash scripts available at your fingertips, but show me a developer in 2019 who wants to write anything in those languages when a REST option is available.
  3. Managing multiple DNS zones and associated records on BYODNS equipment can require significant overhead in both staff and hardware.  This sometimes drives organizations to support fewer zones, which increases the risk of changes to a zone affecting applications.  By incorporating Azure Private DNS into the mix, you can reduce the overhead of BYODNS (think of how much more when logging and conditional forwarders are introduced) by letting each business unit own a zone (i.e. marketing.journeyofthegeek.com, hr.journeyofthegeek.com, etc).
  4. Show me someone who has been in operations that hasn’t had a major outage caused by what should have been a simple DNS change.  No?  I didn’t think so.  By giving each BU its own Azure Private DNS zone, you limit the blast radius so a bad change to BU1 doesn’t affect BU2.  Since each zone is a different resource in Azure, you can additionally wrap an authorization boundary around that resource limiting employees to only the zones they need to administer.

Once you have the above pattern in place, you can easily expand upon it to provide DNS resolution from on-premises VMs to Azure and vice versa.  Set up the appropriate connection between Azure and your on-premises environment (S2S VPN, ExpressRoute), put the appropriate conditional forwarders in place on both ends, and you’re good to go!  Again, expect this to be easier as the service matures if conditional forwarders and a Private Link endpoint for the service are introduced.

Well folks, that will wrap up the series.  The key thing I want you to take away from this is that Azure Private DNS isn’t in a state where it can replace a mature DNS implementation (I fully expect that to change over time).  Instead, you will want to use it to supplement your existing DNS implementation to reduce overhead, increase the agility of application owners and developers, and yes, even mitigate a bit of risk in the process.

For those of you who will be stuffing themselves with turkey, stuffing, and mashed potatoes this week, have a wonderful Thanksgiving!

 

DNS in Microsoft Azure – Part 2

Welcome back fellow geek to part two of my series on DNS in Azure.  In the first post I covered some core concepts behind the DNS offerings in Azure.  One of the core concepts was the 168.63.129.16 virtual IP address, which acts as the communication point when Azure services within a VNet need to talk to the Azure DNS resolver.  If you’re unfamiliar with it, circle back and read that portion of the post.  I also covered the basic DNS offering, Azure-provided DNS.  For this post I’m going to cover the newly minted Azure Private DNS service.

As I covered in my last post, Azure-provided DNS is a decent service if you’re doing some very basic proof-of-concept testing, but not much use beyond that.  The limited capabilities around record types, scale challenges when requiring resolution across multiple VNets, and no reverse DNS support have typically required an enterprise BYODNS solution.  This meant organizations were stuck purchasing expensive NVAs or rolling VMs running BIND or Windows DNS Server.  Beyond the overhead of having to manage all aspects of the VM we’re all familiar with, it also brings along legacy request and change management processes.  In most enterprises application owners have to submit requests to central IT to have DNS entries created or modified.  This is counter to the goal of empowering application owners to be more agile.

Thankfully, Microsoft heard the cries of application owners and central IT and introduced Azure Private DNS into public preview back in early 2018.  After a few iterations and improvements, the service officially went general availability just last month.  The service addresses many of the gaps Azure-provided DNS has such as:

  • Support for custom DNS namespaces
  • Support for all common DNS record types such as A, MX, CNAME, PTR
  • Support for reverse DNS
  • Automatic lifecycle management of VM DNS records
  • Resolution across multiple VNets

Before we jump into the weeds, we’ll first want to cover the basic concepts of the service.  Azure Private DNS zones are an Azure resource under the namespace of /providers/Microsoft.Network/privateDnsZones/.  Each DNS zone you want to create is represented as a separate resource.  Zones created in one subscription can be consumed in another subscription as long as they’re within the same Azure AD tenant.  VNets can resolve and register DNS records with the zones you create after you “link” the VNet to the zone.  Each zone can be linked to multiple VNets for registration and resolution.  On the other hand, VNets can be linked to multiple zones for resolution but only one zone for registration.  Once a zone is linked to the VNet, VMs within the VNet resolve and/or register DNS records for those zones through the 168.63.129.16 virtual IP.
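Since the zone and the link are just Azure resources, you can create them programmatically.  The sketch below uses the azure-mgmt-privatedns Python SDK to create a zone and a registration-enabled VNet link; the names and IDs are placeholders and the method signatures should be confirmed against the SDK version you’re running, so treat it as a rough sketch rather than the official procedure.

```python
# Hedged sketch (azure-mgmt-privatedns): create a Private DNS zone and link a
# VNet to it with registration enabled. All names/IDs are placeholders; verify
# the signatures against your installed SDK version.
from azure.identity import DefaultAzureCredential
from azure.mgmt.privatedns import PrivateDnsManagementClient

client = PrivateDnsManagementClient(DefaultAzureCredential(), "<subscription-id>")

# The zone itself is a "global" resource under Microsoft.Network/privateDnsZones.
client.private_zones.begin_create_or_update(
    "rg-dns", "app1zone.com", {"location": "global"}
).result()

# Link VNet1 to the zone; registration_enabled=True turns on auto-registration
# of VM A records (remember a VNet can register to only one zone).
client.virtual_network_links.begin_create_or_update(
    "rg-dns", "app1zone.com", "vnet1-link",
    {
        "location": "global",
        "virtual_network": {"id": "<vnet1-resource-id>"},
        "registration_enabled": True,
    },
).result()
```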

I’ll quickly cover the reverse lookup zone capability that comes along with using the service.  When a VNet is linked to a zone for registration, a reverse lookup zone is created for the VNet.  VMs created in subnets within that VNet will register a PTR record for their FQDN in the private zone as well as a PTR record for their FQDN in the internal.cloudapp.net zone.  Take note that records in the reverse lookup zone will only be resolvable by VMs within that VNet when queries are sent through the 168.63.129.16 virtual IP.

In the image below VNet1 is linked to an Azure Private DNS zone for both resolution and registration.  VNet2 is linked to a different Azure Private DNS zone for both resolution and registration.  Both VNets are configured to use the Azure DNS servers.  In this scenario, Server1 will be able to perform a reverse lookup for the IP address of Server2 because it is within the same VNet.  However, Server3 will not be able to perform a reverse lookup for Server2 because it is in a different VNet.

Reverse DNS Lookups with Azure Private DNS

In addition to PTR records, the VMs also register A records for the private zone and the Azure-provided DNS zone.  There are a few things to note about the A records automatically created in the private zone:

  • Each record has a property called isAutoRegistered which has a boolean value of true for any records created through the auto-registration process.
  • Auto-registered records have an extremely short TTL of 10 seconds.  If you have plans to perform DNS scavenging, take note of this; also note that these records are automatically deleted when the VM is deleted.

Portal View of Private DNS Zone

The Azure-provided DNS zone dynamically created for the VNet is still created even when linking an Azure Private DNS zone to a VNet.  Additionally, if you try to resolve the IP address using a single label hostname, you’ll get back the A record for the Azure-provided DNS zone.  This is by design and allows you to control the DNS suffix automatically appended by your VMs.  It also means you need to use the FQDN in any application configuration to ensure the record is resolved correctly.


Let’s now look at resolution between two VNets.  In this scenario we again have VNet1 and VNet2.  VNet1 is linked for both registration and resolution to the Azure Private DNS Zone of app1zone.com.  VNet2 is linked for just resolution to the app1zone.com.  VMs in VNet2 are able to resolve queries for the fully-qualified domain name of VMs in VNet1 as illustrated in the diagram below.

DNS Resolution between two VNets

Beyond the auto-registration of records, you can also manually create a variety of record types as I mentioned above. There isn’t anything special or different in the way Azure is handling these records.  The only thing worth noting is the records have a standard 1 hour TTL.

There are two significant limitations in the service right now.  One of those limitations is no support for query logging.  Given how important DNS query logging data can be as data points to identifying threats in the environment, your organization may require this.  If so, you’ll need to insert some BYODNS into the mix (I’ll cover that pattern next post).  The other bigger and more critical limitation is the lack of support for conditional forwarding.  As of today, you can’t create conditional forwarders for the service which will prevent you from forwarding queries from the 168.63.129.16 virtual IP to other DNS services you may have running for resolution of other resources such as on-premises resources.  Again, the workaround here is BYODNS.  Expect both of these limitations to be addressed in time as the service matures.

Azure Private DNS alone is a great service if your organization is completely in the cloud and has basic DNS resolution needs.  Some patterns you could leverage here:

  • Separate private DNS zone for each application –  In this scenario you could grant your application owners full control of the zone letting them manage the records as they see fit.  This would improve the application team’s agility while reducing operational burden on central IT.
  • Separate private DNS zones for each environment (Dev/QA/Prod) – In this scenario you could avoid having to do any BYODNS if there are no dependencies on on-premises infrastructure.  You also get full lifecycle management of VM records cutting back on operational overhead.

Summing up the service:

  • Positives
    • Managed service where you don’t have to worry about managing the underlying infrastructure
    • Scalability and availability are baked into the service
    • Use of custom DNS namespaces
    • VMs spread across multiple VNets can resolve each other’s addresses
    • Reverse DNS is supported within a VNet
    • The lifecycle of VM DNS records is automatically managed by the platform
    • Applications could be assigned their own DNS zones and application owners delegated full control over that zone
  • Negatives
    • No support for conditional forwarders at this time
    • No support for DNS query logging at this time
    • No support for WINS or NETBIOS (although I call this a positive 🙂 )

In my next post I’ll cover how the service works with BYODNS and will discuss some neat patterns that are available when you take advantage of the service.