Integrating Azure AD and AWS – Part 1

Update: In November 2019 AWS introduced support for integration between Azure AD and AWS SSO.  The integration offers a ton more features, including out of the box support for multiple AWS accounts.  I highly recommend you go that route if you’re looking to integrate the two platforms.  Check out my series on the new integration here.

Hi everyone.  After being slammed with work from the real job over the past few months, I’m back with a new deep dive into the integration between Azure Active Directory (AAD) and Amazon Web Services (AWS).  I enjoyed the heck out of this one because I finally got some playtime in AWS and got to integrate two of the big cloud providers together to make some cool stuff happen.  There are a lot of blogs and articles out there (including Microsoft’s and Amazon’s *cough cough*) which provide the steps to accomplish this integration, but the steps are either incomplete, outdated, or wrong, and none of them gives a great explanation of the why or the how.  I can’t complain though, what else would I blog about?

Before we jump into the technologies that power the integration, let me first answer the question as to why we’d want to do the integration in the first place.

Let’s face it, managing digital identities is hard and it’s only getting harder with the introduction of cloud technology to the mix.  The tens of thousands of identities you’re managing for your users in your on-premises data center can quickly grow into the millions when SaaS comes into the mix.  The operational overhead of supporting those millions of identities can eat up a large part of your IT budget and make your user experience miserable.  Beyond the cost issue, saddling your users with hundreds of credentials means users are going to re-use passwords and store them in whatever ways are convenient for them (under the keyboard anyone?), which introduces the risk of the credentials being compromised and sensitive data getting leaked.

Now more than ever you need to put a strong focus on centralized identity management and modern authentication and authorization.  Historically this was very challenging to do because of the lack of application programming interfaces (APIs) that allowed for create, read, update, and delete (CRUD) operations against the individual user records represented in an application database such as a SQL backend.  Beyond the lack of good APIs, you were also stuck using complicated and limiting legacy authentication and authorization protocols such as Kerberos, NTLM, LDAP, and the like.

Thankfully the industry has made a dramatic shift towards providing robust web-based APIs and support for modern authentication and authorization such as SAML, WS-Fed, OpenID Connect, and OAuth.  This presents a unique opportunity for organizations to shift towards a centralized identity management model where one authoritative store drives the lifecycle of an identity across all applications.  With the introduction of the modern protocols, users aren’t required to maintain thousands of credentials and can instead rely upon a single trusted credential service provider (CSP) to act as the primary authentication point, allowing users to then assert their identities to applications.  This frees the application from storing and managing user credentials and improves the user experience, not to mention that using these modern protocols is far simpler for your average developer.

Integrating AAD and AWS allows you to take advantage of centralized identity and modern authentication and authorization.  AAD specifically allows you to leverage all the cool features of a modern Identity-as-a-Service (IDaaS) offering such as behavioral analytics, multifactor authentication, adaptive authentication, and contextual-based authorization.  The short of it is you get a rock solid IDaaS to back the industry leading PaaS and IaaS offerings of AWS.  The best of both worlds, right?

Now that you understand why you’d want to integrate the two solutions, let’s look at the technology powering the solution. In this integration the vendors are leveraging the concepts of modern APIs and modern authentication and authorization I touched upon above.  First up is authentication.

aws-signon

In this integration SAML, specifically the identity provider-initiated single sign-on POST binding, is being used to assert the user’s identity to the service provider (SP) after the user successfully authenticates with the identity provider (IdP).  Azure AD plays the role of IdP and AWS plays the role of SP.  The sequence of events plays out as follows:

  1. The user navigates to AAD and authenticates using either a credential or an asserted identity from a federated identity store.  The user then selects AWS from the listing of applications exposed through a method like the MyApps portal.  AAD generates an assertion containing a claim of the user’s identity and the AWS Identity and Access Management (IAM) role(s) the user is authorized to use and redirects the user to an endpoint at AWS.
  2. The user’s browser posts the assertion to the endpoint at AWS.
  3. The assertion is passed to the AWS security token service (STS) which checks the assertion to ensure it is from an identity provider that has been configured to be trusted for the AWS account, verifies the roles can be granted to a federated user, and completes the authentication process granting the user access to the AWS management console.  (Don’t worry, we’ll dig into this process much more deeply using Fiddler in the next post.)
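
To make the assertion in step 3 a bit more concrete, here is a minimal Python sketch of pulling the role information out of a SAML response.  It assumes the roles travel in the https://aws.amazon.com/SAML/Attributes/Role attribute as a comma-separated pair of a role ARN and a SAML provider ARN.  The account number, role name, and provider name in the sample are made up, and the sketch does none of the signature or trust checking that STS performs.

```python
import base64
import xml.etree.ElementTree as ET

SAML_ASSERTION_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
AWS_ROLE_ATTRIBUTE = "https://aws.amazon.com/SAML/Attributes/Role"

# Hypothetical, unsigned response used only for illustration.
SAMPLE_RESPONSE = (
    '<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">'
    "<saml:Assertion><saml:AttributeStatement>"
    '<saml:Attribute Name="https://aws.amazon.com/SAML/Attributes/Role">'
    "<saml:AttributeValue>"
    "arn:aws:iam::123456789012:role/AdminRole,"
    "arn:aws:iam::123456789012:saml-provider/AzureAD"
    "</saml:AttributeValue>"
    "</saml:Attribute>"
    "</saml:AttributeStatement></saml:Assertion>"
    "</samlp:Response>"
)


def roles_from_saml_response(saml_response_b64: str) -> list[tuple[str, str]]:
    """Return (role_arn, provider_arn) pairs found in a base64 SAMLResponse."""
    root = ET.fromstring(base64.b64decode(saml_response_b64))
    pairs = []
    for attribute in root.iter(f"{{{SAML_ASSERTION_NS}}}Attribute"):
        if attribute.get("Name") != AWS_ROLE_ATTRIBUTE:
            continue
        for value in attribute.findall(f"{{{SAML_ASSERTION_NS}}}AttributeValue"):
            arns = [part.strip() for part in (value.text or "").split(",")]
            # Identify each ARN by its resource type rather than its position.
            role = next(arn for arn in arns if ":role/" in arn)
            provider = next(arn for arn in arns if ":saml-provider/" in arn)
            pairs.append((role, provider))
    return pairs


if __name__ == "__main__":
    # The browser posts the response base64-encoded, so encode the sample first.
    encoded = base64.b64encode(SAMPLE_RESPONSE.encode("utf-8")).decode("ascii")
    for role_arn, provider_arn in roles_from_saml_response(encoded):
        print(role_arn, "via", provider_arn)
```

Decoding a captured SAMLResponse this way is a quick sanity check that the roles you expect are actually being asserted.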

For provisioning, the AWS API is used.  AAD queries the AWS API using credentials for an AWS security principal that is associated with a role that has the IAMReadOnlyAccess permissions policy or greater.  It queries for the IAM roles configured for the account and synchronizes those roles back to AAD.  When the synchronization is complete, AAD users can then be added to the relevant roles from within AAD creating a one stop shop for doing your identity lifecycle management, authentication, and authorization.  Nice right?
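
As a rough illustration of the role synchronization idea, the Python sketch below filters a list of IAM roles down to the ones a federated user could assume through SAML.  How AAD actually decides which roles to import isn’t covered here, so treat the filter as my own assumption.  The input is shaped the way I’d expect a role listing to look once each trust policy has been decoded into a dictionary, and every ARN in the sample is made up.

```python
from typing import Any

# Hypothetical role listing; the trust policies are already decoded to dicts.
SAMPLE_ROLES: list[dict[str, Any]] = [
    {
        "RoleName": "AdminRole",
        "Arn": "arn:aws:iam::123456789012:role/AdminRole",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::123456789012:saml-provider/AzureAD"
                    },
                    "Action": "sts:AssumeRoleWithSAML",
                }
            ],
        },
    },
    {
        "RoleName": "Ec2ServiceRole",
        "Arn": "arn:aws:iam::123456789012:role/Ec2ServiceRole",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {"Service": "ec2.amazonaws.com"},
                    "Action": "sts:AssumeRole",
                }
            ],
        },
    },
]


def saml_assumable_roles(roles: list[dict[str, Any]]) -> list[str]:
    """Return ARNs of roles whose trust policy allows sts:AssumeRoleWithSAML."""
    matches = []
    for role in roles:
        statements = role["AssumeRolePolicyDocument"].get("Statement", [])
        if isinstance(statements, dict):  # a lone statement may not be in a list
            statements = [statements]
        for statement in statements:
            actions = statement.get("Action", [])
            if isinstance(actions, str):
                actions = [actions]
            principal = statement.get("Principal", {})
            is_federated = isinstance(principal, dict) and "Federated" in principal
            if (
                statement.get("Effect") == "Allow"
                and is_federated
                and "sts:AssumeRoleWithSAML" in actions
            ):
                matches.append(role["Arn"])
                break  # one matching statement is enough for this role
    return matches


if __name__ == "__main__":
    print(saml_assumable_roles(SAMPLE_ROLES))
```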

At a high level that is the why and the what. In my upcoming posts in this series I’ll be digging deep into the how.  This will include how to do the integration, the pitfalls of the Microsoft tutorial, and of course Fiddler captures showing the conversations between the web browser, AAD, and AWS.

The journey continues in my second entry.

Deep dive into AD FS and MS WAP – Overview

Hi everyone,

If you’ve followed my blog at all, you will notice I spend a fair amount of my time writing about the products and technologies powering the integration of on-premises and cloud solutions.  The industry refers to that integration using a variety of buzzwords from hybrid cloud to software defined data center/storage/networking/etc.  I prefer a more simple definition of legacy solutions versus modern solutions.

So what do I mean by a modern solution?  I’m speaking of solutions with most, if not all, of the following characteristics:

  • Customer maintains only the layers of the technology that directly present business value
  • Short time to market for new features and features are introduced in a “toggle on and toggle off” manner
  • Supports modern authentication, authorization, and identity management standards and specifications such as OpenID Connect, OAuth, SAML, and SCIM
  • On-demand scaling
  • Provides a robust web-based API
  • Customer data can exist on-premises or off-premises

Since I love the identity realm, I’m going to focus on the bullet regarding modern authentication, authorization, and identity management.  For this series of posts I’m going to look at how Microsoft’s Active Directory Federation Services (AD FS) and Microsoft’s Web Application Proxy (WAP) can be used to help facilitate the use of modern authentication and authorization.

So where do AD FS and the WAP come in?  AD FS provides us with a security token service producing the logical security tokens used in SAML, OAuth, and OpenID Connect.  Why do we care about the MS WAP?  The WAP acts as a reverse proxy giving us the ability to securely expose AD FS to untrusted networks (like the Internet) so that devices outside our traditional firewalled security boundary can leverage our modern authentication and authorization solution.

Some real life business cases that can be solved with this solution are:

  1. Single sign-on (SSO) experience to a SaaS application such as SharePoint Online from either an Active Directory domain-joined endpoint or a non-domain joined endpoint such as a mobile phone.
  2. Limit the number of passwords a user needs to remember to access both internal and cloud applications.
  3. Provide authentication or authorization for modernized internal applications for endpoints outside the traditional firewalled security boundary.
  4. Authentication and authorization of devices prior to accessing an internal or cloud application.

As we can see from the above, there are some great benefits around SSO, limiting user credentials to improve security and user experience, and taking our authorization to the next step by doing contextual-based authorization (device information, user location, etc.) versus relying upon just Active Directory group membership.

Microsoft does a relatively decent job describing how to design and implement your AD FS and WAP rollout, so I’m not going to cover much of that in this series.  Instead I’m going to focus on the “behind the scenes” conversations that occur with endpoints, WAP, AD FS, AD DS, and Azure AD. Before I begin delving into the weeds of the product, I’m going to spend this post giving an overview of what my lab looks like.

I recently put together a more permanent lab consisting of a mixture of on-premises VMs running on Hyper-V and Azure resources.  I manage to stay well within my $150.00 MSDN balance by keeping a majority of the VMs deallocated.  The layout of the lab is diagrammed below.

HomeLab

 

On-premises I am running a small collection of Windows Server 2016 machines within Hyper-V running on top of Windows Server 2016.  I’m using a standard setup of an AD DS, AD CS, AADC, AD FS, and IIS/MS SQL server.  Running in Azure I have a single VNet with three subnets each separated by a network security group.  My core infrastructure of an AD DS, IIS/MS SQL, and AD FS server exists in my Intranet subnet with my DMZ subnet containing a single WAP.

The Active Directory configuration consists of a single Active Directory forest with an FQDN of journeyofthegeek.local.  The domain has been configured with an explicit UPN suffix of journeyofthegeek.com, which is assigned to all users synchronized to Azure Active Directory.  The domain is running at the Windows Server 2016 domain and forest functional levels.  The on-premises domain controller holds all FSMO roles and acts as the DC for the Active Directory site representing the on-premises physical location.  The domain controller in Azure acts as the sole DC for the Active Directory site representing Azure.  Both DCs host the split-brain DNS zone for journeyofthegeek.com.

The on-premises domain controller also runs Active Directory Certificate Services.  The CA is an enterprise CA that is used to distribute certificates to security principals in the environment.  I’ve removed the CDP from the certificate templates issued by the CA to eliminate complications with the CRL revocation checking.

The AD FS servers are members of an AD FS farm named sts.journeyofthegeek.com and use a MS SQL Server 2016 backend for storage of configuration information.  The SQL Server on-premises hosts the SQL instance that the AD FS servers use to store that configuration information.

Azure Active Directory Connect is co-located on the AD FS server and uses the same SQL server that AD FS uses.  It has been integrated with a lab Azure Active Directory tenant I use which has a few licenses of Office 365 Business Essentials.  The objectGUID attribute is used as the immutable ID and the Azure Active Directory tenant has the DNS namespaces of journeyofthegeek.onmicrosoft.com and journeyofthegeek.com associated with it.
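
If you’re wondering what using objectGUID as the immutable ID looks like in practice, the minimal Python sketch below shows the conversion as I understand it: the immutable ID is the base64 encoding of the GUID’s raw bytes in the mixed-endian layout Windows uses.  The function names are hypothetical and the all-zero GUID is just a placeholder value.

```python
import base64
import uuid


def object_guid_to_immutable_id(object_guid: str) -> str:
    """Base64-encode a GUID's bytes in the little-endian layout Windows uses."""
    return base64.b64encode(uuid.UUID(object_guid).bytes_le).decode("ascii")


def immutable_id_to_object_guid(immutable_id: str) -> str:
    """Reverse the conversion to recover the GUID in its usual string form."""
    return str(uuid.UUID(bytes_le=base64.b64decode(immutable_id)))


if __name__ == "__main__":
    placeholder_guid = "00000000-0000-0000-0000-000000000000"
    encoded = object_guid_to_immutable_id(placeholder_guid)
    print(encoded)
    print(immutable_id_to_object_guid(encoded))
```

Being able to go in both directions is handy when you need to match a cloud user back to its on-premises object.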

The IIS server running in Azure runs a simple .NET application (https://blogs.technet.microsoft.com/tangent_thoughts/2015/02/20/install-and-configure-a-simple-net-4-5-sample-federated-application-samapp/) that is used for claims-based authentication.  I’ll be using that application for demonstrations with the Web Application Proxy and have used it in the past to demonstrate functionality of the Azure Application Proxy.

For the demonstrations throughout this series I’ll be using the following tools:

In my next post I’ll do a deep dive into what happens behind the scenes during the registration of the Web Application Proxy with an AD FS farm.  See you then!

 

Helpful hints for resolving AD FS problems – Part 1

Hi everyone.

Over the past week I’ve been building a lab for an upcoming deep dive into Microsoft’s Web Application Proxy.  During the course of building the lab I ran into a few interesting issues with AD FS and the Web Application Proxy that I wanted to cover.  Some were similar to issues I’ve run into in production environments and some were new to me.

These issues are interesting in that there aren’t any obvious indicators of the problem in any of the typical logs.  Two out of three required some trial and error to determine root cause, while the third drove me quite insane for a good two weeks before getting an answer from an “official” source.  Over the course of this series of blogs I’ll cover each issue in detail with the hopes that it will help others troubleshoot these issues in the future.

Issue 1: AD FS Certificate authentication fails

I’m going to start with the problem that took me the longest to resolve and eventually required getting the answer directly from an official source.

For those of you that are unfamiliar, AD FS provides the capability to offer multi-factor authentication methods both native and third-party.  Out of the box, it supports certificate-based authentication as an option for a multi-factor or “step-up” authentication mechanism.

A few months back I wanted to take advantage of the certificate authentication feature to provide a two-factor authentication solution for applications integrated with AD FS.  Like a good engineer I did my Googling, read the Microsoft articles and various blogs out there to understand how the feature worked and what the requirements were.  I built a lab in Azure, set up an AD FS server, and ensured port 49443 was open in addition to the typical ports required by AD FS.  I created my instance of AD CS, issued a user certificate containing the user’s UPN in the subject alternative name field, set up a sample SAML app, and configured it to require certificate authentication.

How easy it all sounds right?  I navigated to the sample application and got the screen below…

Screen Shot 2017-06-04 at 9.29.35 PM

and I waited….  and waited…. and waited…  Ummm, what went wrong?  Well surely the AD FS log will tell me what happened.

Screen Shot 2017-06-04 at 9.34.03 PM.png

Well isn’t that odd.  No errors or warnings in the AD FS Admin log.  A quick check of the Application and System logs showed no errors either.  Maybe the AD FS Debug log would show me something?  I flipped on the log and attempted another authentication.

Screen Shot 2017-06-04 at 9.38.07 PM

Nothing as well?  Maybe the server can’t query the revocation lists designated in the certificate’s CDP?  Nope, not that either; the server can successfully contact the CDP endpoints.  At this point I began to get quite frustrated and attempted packet captures, Fiddler captures, and anything and everything I could think of.  Nothing I tried revealed the answer.

I finally gave in (which I can tell you is incredibly challenging for me) and reached out to an “official” source.  We chatted back and forth and went through many of the same steps as outlined above to ensure I didn’t miss anything.  However, we ran into another dead end.  He then reached out to some other engineers he knew and eventually we got a hit.  We were told to check to see if there were any intermediary certificates stored within the trusted root certificate authorities store.  Sounds like an odd circumstance, but sure why not.

Upon opening up the certificates MMC, opening the machine store, and exploring the trusted root certificate authorities store, lo and behold I see an intermediary certificate within the store.  I deleted the certificate, restarted the AD FS server, attempted another login to the sample claim application, and hit the screen below.

Screen Shot 2017-06-04 at 9.50.16 PM

Boom, I’m finally receiving the certificate prompt.  Clicking the OK button brings about the successful login below.

Screen Shot 2017-06-04 at 9.51.23 PM

So what was the issue?  Apparently AD FS certificate authentication fails without generating an error in any logical location (maybe nowhere at all?) if there is an intermediary certificate in the trusted root certificate authority machine store.  I’ve verified this is an issue in both AD FS 2012 R2 and AD FS 2016.  Now why this occurs is unknown to me.  It could be the underlying HTTP.SYS driver that pukes and doesn’t report any errors to the event logs.  I didn’t get a straight answer as to why this occurs, just that it does, due to some type of integrity check on the machine certificate store.  Odd right?
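
If you want to automate the check, the idea boils down to this: a true root certificate is self-signed, so its issuer matches its subject, and anything in the trusted root store where the two differ deserves a second look.  Here’s a minimal Python sketch of that comparison.  It works on records you’ve already exported from the store rather than reading the store itself, the CertRecord type and all the names and thumbprints are hypothetical, and comparing the names as strings is a heuristic rather than a full chain validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CertRecord:
    """Hypothetical record describing one certificate exported from a store."""

    subject: str
    issuer: str
    thumbprint: str


def misplaced_intermediates(root_store: list[CertRecord]) -> list[CertRecord]:
    """Return certificates whose issuer differs from their subject."""
    return [
        cert
        for cert in root_store
        if cert.subject.strip().lower() != cert.issuer.strip().lower()
    ]


# Made-up store contents for illustration only.
SAMPLE_ROOT_STORE = [
    CertRecord("CN=Lab Root CA", "CN=Lab Root CA", "AA11"),
    CertRecord("CN=Lab Issuing CA", "CN=Lab Root CA", "BB22"),
]

if __name__ == "__main__":
    for cert in misplaced_intermediates(SAMPLE_ROOT_STORE):
        print("Not self-signed:", cert.subject, cert.thumbprint)
```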

That completes the rundown of the first of three problems I’ll be outlining in this series of blogs.  Hopefully this helps save someone else some time and aggravation.

See you next post!

 

 

Active Directory Federation Services – SQL Attribute Store

Hi everyone,

I recently had a use case come across my desk where I needed to do a SAML integration with a SaaS provider.  The provider required a number of pieces of information about the user beyond the standard unique identifier.  The additional information would be used to direct the user to the appropriate instance of the SaaS application.

In the past fifty or so SAML integrations I’ve done, I’ve been able to source my data directly from the Active Directory store.  This was because Active Directory was authoritative for the data or there was a reliable data synchronization process in place such that the data was being sourced from an authoritative source.  In this scenario, neither option was available.  Thankfully the data source I needed to hit to get the missing data exposed a subset of its data through a Microsoft SQL view.

I have done a lot in AD FS over the past few years from design to operational support of the service, but I had never sourced information from a data source hosted via MS SQL Server.  I reviewed the Microsoft documentation available via TechNet and found it to be lacking.  Further searches across MS blogs and third-party blogs provided a number of “bits” of information but no real end to end guide.  Given the lack of solid content, I decided it would be fun to put one together so off to Azure I went.

For the lab environment, I built the following:

  • Active Directory forest name – geekintheweeds.com
  • Server 1 – SERVERDC (Windows Server 2016)
    • Active Directory Domain Services
    • Active Directory-integrated DNS
    • Active Directory Certificate Services
  • Server 2 – SERVER-ADFS (Windows Server 2016)
    • Active Directory Federation Services
    • Microsoft SQL Server Express 2016
  • Server 3 – SERVER-WEB (Windows Server 2016)
    • Microsoft IIS

On SERVER-WEB I installed the sample claims application referenced here.  Make sure to follow the instructions in the blog to save yourself some headaches.  There are plenty of blogs out there that discuss building a lab consisting of the services outlined above, so I won’t cover those details.

On SERVER-ADFS I created a database named hrdb within the same instance as the AD FS databases.  Within the database I created a table named dbo.EmployeeInfo with five columns named givenName, surName, email, userName, and role, all of data type nvarchar(MAX).  The userName column contained the unique values I used to relate a user object in Active Directory back to a record in the SQL database.

Screen Shot 2017-05-28 at 9.18.37 PM

Once the database was created and populated with some sample data and the appropriate Active Directory user objects were created, it was time to begin to configure the connectivity between AD FS and MS SQL.  Before we go creating the new attribute store, the AD FS service account needs appropriate permissions to access the SQL database.  I went the easy route and gave the service account the db_datareader role on the database, although the CONNECT and SELECT permissions would have probably been sufficient.

Screen Shot 2017-05-28 at 9.23.49 PM

After the service account was given appropriate permissions the next step was to configure the database as an attribute store in AD FS.  To do that I opened the AD FS management console, expanded the Service node, right-clicked on Attribute Stores, and selected the Add Attribute Store option.  I used mysql as the store name and selected the SQL option from the drop-down box.  My SQL was a bit rusty so the connection string took a few tries to get right.

Screen Shot 2017-05-28 at 9.28.35 PM
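
For reference, a connection string for this kind of setup generally takes the Server/Database/Integrated Security form, with Integrated Security meaning the AD FS service account connects using its own Windows identity.  The small Python helper below just assembles that string so the pieces are easy to see.  The helper itself is hypothetical, and the SQLEXPRESS instance name in the example is an assumption on my part; use whatever instance actually hosts your database.

```python
from typing import Optional


def build_connection_string(
    server: str, database: str, instance: Optional[str] = None
) -> str:
    """Assemble a SQL Server connection string that uses Windows authentication."""
    # A named instance is written as SERVER\INSTANCE; a default instance is just SERVER.
    data_source = f"{server}\\{instance}" if instance else server
    return f"Server={data_source};Database={database};Integrated Security=True"


if __name__ == "__main__":
    # SQLEXPRESS is an assumed instance name, not something confirmed above.
    print(build_connection_string("SERVER-ADFS", "hrdb", instance="SQLEXPRESS"))
```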

I then created a new claim description to hold the role information I was pulling from the SQL database.

Screen Shot 2017-05-28 at 9.33.12 PM.png

The last step in the process was to create some claim rules to pull data from the SQL database.  Pulling data from a MS SQL datastore requires the use of custom claim rules.  If you’re unfamiliar with the custom claim language, the following two links are two of the best I’ve found on the net:

The first claim rule I created was a rule to query Active Directory via LDAP for the SAM-Account-Name attribute.  This is the attribute I would be using to query the SQL database for the user’s unique record.

Screen Shot 2017-05-28 at 9.42.05 PM.png

Next up was my first custom claim rule, where I queried the SQL database for the record whose userName column matched the SAM-Account-Name value I pulled in the earlier step and requested back the value in that record’s email column.  Since I wanted to do some transforming of the information in a later step, I added the claim to the incoming claim set.

Screen Shot 2017-05-28 at 9.42.39 PM

I then issued another query for the value in the role column.

Screen Shot 2017-05-28 at 9.48.14 PM
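
The two lookups are easy to try outside of AD FS.  The Python sketch below uses an in-memory SQLite database as a stand-in for MS SQL (so the columns are TEXT rather than nvarchar(MAX)) and runs the same kind of parameterized query: match on userName, return one column.  The sample row and the function names are made up for illustration.

```python
import sqlite3
from typing import Optional


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the EmployeeInfo table with one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EmployeeInfo ("
        "givenName TEXT, surName TEXT, email TEXT, userName TEXT, role TEXT)"
    )
    # Hypothetical employee record.
    conn.execute(
        "INSERT INTO EmployeeInfo VALUES (?, ?, ?, ?, ?)",
        ("Jane", "Doe", "jane.doe@geekintheweeds.com", "jdoe", "Manager"),
    )
    return conn


def lookup_column(
    conn: sqlite3.Connection, column: str, user_name: str
) -> Optional[str]:
    """Return one column of the record whose userName matches, or None."""
    # Column names cannot be bound as parameters, so restrict them to a known set.
    if column not in {"email", "role"}:
        raise ValueError(f"unsupported column: {column}")
    row = conn.execute(
        f"SELECT {column} FROM EmployeeInfo WHERE userName = ?", (user_name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = build_sample_db()
    print(lookup_column(db, "email", "jdoe"))
    print(lookup_column(db, "role", "jdoe"))
```

Testing the query on its own like this helps separate SQL mistakes from claim rule mistakes.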

Finally, I performed some transforms to verify I was getting the appropriate data that I wanted.  I converted the email address claim type to the Common Name type and the custom claim definition role I referenced above to the out of the box role claim definition.  I then hit the endpoint for the sample claim app and… VICTORY!

Screen Shot 2017-05-28 at 9.52.29 PM
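
Conceptually the transform step is a mapping from one claim type to another that carries the value across unchanged.  Here is a minimal Python sketch of that idea, not of the claim rule language itself.  The email address, common name, and role URIs are the standard claim types as far as I know, while the custom role URI is hypothetical since the one I created isn’t shown above.

```python
EMAIL = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"
COMMON_NAME = "http://schemas.xmlsoap.org/claims/CommonName"
ROLE = "http://schemas.microsoft.com/ws/2008/06/identity/claims/role"
WINDOWS_ACCOUNT_NAME = (
    "http://schemas.microsoft.com/ws/2008/06/identity/claims/windowsaccountname"
)
CUSTOM_ROLE = "http://geekintheweeds.com/claims/sqlrole"  # hypothetical claim type

# Incoming claim type -> claim type that is actually issued to the application.
TRANSFORMS = {EMAIL: COMMON_NAME, CUSTOM_ROLE: ROLE}


def issue_claims(incoming: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Re-type the incoming claims that have a transform and drop the rest."""
    return [
        (TRANSFORMS[claim_type], value)
        for claim_type, value in incoming
        if claim_type in TRANSFORMS
    ]


if __name__ == "__main__":
    # Hypothetical incoming claim set built up by the earlier rules.
    incoming_claims = [
        (WINDOWS_ACCOUNT_NAME, "jdoe"),
        (EMAIL, "jane.doe@geekintheweeds.com"),
        (CUSTOM_ROLE, "Manager"),
    ]
    for claim_type, value in issue_claims(incoming_claims):
        print(claim_type, "=", value)
```

Claims that were only added to the incoming set, like the account name here, never reach the application unless a rule issues them.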

Simple right?  Well it would be if this information had been documented within a single link.  Either way, I had some good lessons learned that I will share with you now:

  • Do NOT copy and paste claim rules.  I chased a number of red herrings trying to figure out why my claim rule was being rejected.  More than likely the copy/paste added an invalid character I was unable to see.
  • Brush up on your MS SQL before you attempt this.  My SQL was super rusty and it caused me to go down a number of paths which wasted time.  Thankfully, my coworker Jeff Lee was there to add some brain power and help work through the issues.

Before I sign off, I want to thank my coworker Jeff Lee for helping out on this one.  It was a great learning experience for both of us.

Thanks and have a wonderful Memorial Day!