Deep Dive into Azure AD and AWS SSO Integration – Part 2

Deep Dive into Azure AD and AWS SSO Integration – Part 2

Welcome back folks.

Today I’ll be continuing my series on the new integration between Azure AD and AWS SSO.  In my last post I covered the challenges with the prior integration between the two platforms, core AWS concepts needed to understand the new integration, and how the new integration addresses the challenges of the prior integration.

In this post I’m going to give some more context to the challenges covered in the first post and then provide an overview of the what the old and new patterns look like.  This will help clarify the value proposition of the integration for those of you who may still not be convinced.

The two challenges I want to focus on are:

  1. The AWS app was designed to synchronize identity data between AWS and Azure AD for a single AWS account
  2. The SAML trust between Azure AD and an AWS account had to be established separately for each AWS account.

Challenge 1 was unique to the Azure Marketplace AWS app because they were attempting to solve the identity lifecycle management problem.  Your security token service (STS) needs to pass a SAML assertion which includes the AWS IAM roles you are asserting for the user.  Those roles need to be mapped to the user somewhere for your STS to tap into them.  This is a problem you’re going to feel no matter what STS you use, so I give the team that put together the AWS app together credit for trying.

The folks over at AWS came up with an elegant solution requiring some transformation in the claims passed in the SAML token and another solution to store the roles in commonly unused attributes in Active Directory.  However, both solutions suffered the same problem in that you’re forced to workaround that mapping, which becomes considerably difficult as you began to scale to hundreds of AWS accounts.

Challenge 2 plagues all STSs because the SAML trust needs to be created for each and every AWS account.  Again, something that begins to get challenging as you scale.

AWS Past Integration

AWS Past Integration

In the image above, we see an example of how some enterprises addressed these problems.  We see that there is some STS in use acting as an identity provider (idP) (could be Azure AD, Okta, Ping, AD FS, whatever) that has a SAML trust with each AWS account.  The user to AWS IAM role mappings are included in an attribute of the user’s Active Directory user account.  When the user attempts to access AWS, the STS queries Active Directory for the information.  There is a custom process (manual or automated) that queries each AWS account for a list of AWS IAM Roles that are associated with the IdP in the AWS account.  These roles are then populated in the attribute for each relevant user account.  Lastly, CloudFormation is used to push IAM Roles to each AWS account.  This could be pushed through a manual process or a CI/CD pipeline.

Yeah this works, but who wants all that overhead?  Let’s look at the new method.

Azure AD and AWS SSO Integration

Azure AD and AWS SSO Integration

In the new integration where we use Azure AD and AWS SSO together, we now only need to establish a single SAML trust with AWS SSO.  Since AWS SSO is integrated with AWS Organizations it can be used as a centralized identity source for all AWS accounts within the organization.  Additionally, we can now leverage Azure AD to manage the synchronization of identity data (users and groups) from Azure AD to AWS SSO.  We then map our users or groups to permission sets (collections of IAM policies) in AWS SSO which are then provisioned as IAM roles in the relevant AWS accounts.  If we want to add a user to a role in AWS IAM, we can add that user to the relevant group in Azure AD and wait for the synchronization process to occur.  Once it’s complete, that user will have access to that IAM role in the relevant accounts.  A lot less work, right?

Let’s sum up what changes here:

  • We can use existing processes already in place to move users in and out of groups either on-premises in Windows AD (that is syncing to Azure AD with Azure AD Connect) or directly in Azure AD (if we’re not syncing from Windows AD).
  • Group to role mappings are now controlled in AWS SSO
  • Permission sets (or IAM policies for the IAM roles) are now centralized in AWS SSO
  • We no longer have to provision the IAM roles individually into each AWS account, we can centrally control it in AWS SSO

Cool right?

In my few posts I’ll begin walking through the integration an demonstrating some the solution.

Thanks!

Deep Dive into Azure AD and AWS SSO Integration – Part 1

Deep Dive into Azure AD and AWS SSO Integration – Part 1

Hello fellow geeks!

Back in 2017 I did a series of posts on how to integrate Azure AD using the AWS app available in the Azure Marketplace with AWS IAM in order to use Azure AD as an identity provider for an AWS account.  The series has remained quite popular over the past two years, largely because the integration has remained the same without much improvement.  All of this changed last week when AWS released support for integration between Azure AD and AWS SSO.

The past integration between the two platforms functioned, but suffered from three primary challenges:

  1. The AWS app was designed to synchronize identity data between AWS and Azure AD for a single AWS account
  2. The SAML trust between Azure AD and an AWS account had to be established separately for each AWS account.
  3. The application manifest file used by the AWS app to establish a mapping of roles between Azure AD and synchronized AWS IAM roles had a limitation of 1200 which didn’t scale for organizations with a large AWS footprint.

To understand these challenges, I’m going to cover some very basic AWS concepts.

The most basic component an AWS presence is an AWS account.  Like an Azure subscription, it represents a billing relationship, establishes limitations for services, and acts as an authorization boundary.  Where it differs from an Azure subscription is that each AWS account has a separate identity and authentication boundary.

While multiple Azure subscriptions can be associated with a single instance of Azure AD to centralize identity and authentication, the same is not true for AWS.  Each AWS account has its own instance of AWS IAM with its own security principals and no implicit trust with any other account.

Azure Subscription Identity vs AWS Account Identity

Azure Subscription Identity vs AWS Account Identity

Since there is no implicit trust between accounts, that trust needs to be manually established by the customer.  For example, if a customer wants bring their own identities using SAML, they need to establish a SAML trust with each AWS account.

SAML Trusts with each AWS Account

SAML Trusts with each AWS Account

This is nice from a security perspective because you have a very clear security boundary that you can use effectively to manage blast radius.  This is paramount in the cloud from a security standpoint.  In fact, AWS best practice calls for separate accounts to mitigate risks to workloads of different risk profiles.  A common pattern to align with this best practice is demonstrated in the AWS Landing Zone documentation.  If you’re interested in a real life example of what happens when you don’t establish a good radius, I encourage you to read the cautionary tale of Code Spaces.

AWS Landing Zone

AWS Landing Zone

However, it doesn’t come without costs because each AWS IAM instance needs to be managed separately.  Prior to the introduction of AWS SSO (which we’ll cover later), you as the customer would be on the hook for orchestrating the provisioning of security principals (IAM Users, groups, roles, and identity providers) in every account.  Definitely doable, but organizations skilled at identity management are few and far between.

Now that you understand the importance of having multiple AWS accounts and that each AWS account has a separate instance of AWS IAM, we can circle back to the challenges of the past integration.  The AWS App available in the Azure Marketplace has a few significant gaps

The app is designed to simplify the integration with AWS by providing the typical “wizard” type experience Microsoft so loves to provide.  Plug in a few pieces of information and the SAML trust between Azure AD and your AWS account is established on the Azure AD end to support an identity provider initiated SAML flow.  This process is explained in detail in my past blog series.

In addition to easing the SAML integration, it also provides a feature to synchronize AWS IAM roles from an AWS account to the application manifest file used by the AWS app.  The challenges here are two-fold: one is the application manifest file has a relatively small limit of entries; the other is the synchronization process only supports a single AWS account.  These two gaps make it unusable by most enterprises.

Azure AWS Application Sync Process

Azure Marketplace AWS Application Sync Process

Both Microsoft and AWS have put out workarounds to address the gaps.  However, the workarounds require the customer to either develop or run custom code and additional processes and neither addresses the limitation of the application manifest.  This lead to many organizations to stick with their on-premises security token service (AD FS, Ping, etc) or going with another 3rd party IDaaS (Okta, Centrify, etc).  This caused them to miss out on the advanced features of Azure AD, some of which they were more than likely already paying for via the use of Office 365.  These features include adaptive authentication, contextual authorization, and modern multi-factor authentication.

AWS recognized the challenge organizations were having managing AWS accounts at scale and began introducing services to help enterprises manage the ever growing AWS footprint.  The first service was AWS Organizations.  This service allowed enterprises to centralize some management operations, consolidate billing, and group accounts together for billing or security and compliance.  For those of you from the Azure world, the concept is similar to the benefits of using Azure Management Groups and Azure Policy.  This was a great start, but the platform still lacked a native solution for centralized identity management.

AWS Organization

AWS Organization

At the end of 2017, AWS SSO was introduced.  Through integration with AWS Organizations, AWS SSO has the ability to enumerate all of the AWS accounts associated with an Organization and act as a centralized identity, authentication, and authorization plane.

While the product had potential, at the time of its release it only supported scenarios where users and groups were created directly in the AWS SSO directory or were sourced from an AWS Managed AD or customer-managed AD using the LDAP connector.  It lacked support for acting as a SAML service provider to a third-party identity provider.  Since the service lacks the features of most major on-premises security token services and IDaaS providers, many organizations kept to the standard pattern of managing identity across their AWS accounts using their own solutions and processes.

Fast forward to last week and AWS announced two new features for AWS SSO.  The first feature is that it can now act as a SAML service provider to Azure AD (YAY!).  By federating directly with AWS SSO, there is no longer a requirement to federate Azure AD which each individual AWS account.

The second feature got me really excited and that was support for the System for Cross-domain Identity Management (SCIM) specification through the addition of SCIM endpoints.  If you’re unfamiliar SCIM, it addresses a significant gap in IAM in the cloud world, and that is identity management.  If you’ve ever integrated with any type of cloud service, you are more than likely aware of the pains of having to upload CSVs or install custom vendor connectors in order to provision security principals into a cloud identity store.  SCIM seeks to solve that problem by providing a specification for a REST API that allows for management of the lifecycle of security principals.

Support for this feature, along with Azure AD’s longtime support for SCIM, allows Azure AD to handle the identity lifecycle management of the shadow identities in AWS SSO which represent Azure AD Users and Groups.  This is an absolutely awesome feature of Azure AD and I’m thrilled to see that AWS is taking advantage of it.

Well folks, that will close out this entry in the series.  Over the next few posts I’ll walk through what the integration and look behind the curtains a bit with my go to tool Fiddler.

See you next post!

 

Capturing and Visualizing Office 365 Security Logs – Part 2

Capturing and Visualizing Office 365 Security Logs – Part 2

Hello again my fellow geeks.

Welcome to part two of my series on visualizing Office 365 security logs.  In my last post I walked through the process of getting the sign-in and security logs and provided a link to some Lambda’s I put together to automate pulling them down from Microsoft Graph.  Recall that the Lambda stores the files in raw format (with a small bit of transformation on the time stamps) into Amazon S3 (Simple Storage Service).  For this demonstration I modified the parameters for the Lambda to download the 30 days of the sign-in logs and to store them in an S3 bucket I use for blog demos.

When the logs are pulled from  Microsoft Graph they come down in JSON (JavaScript Object Notation) format.  Love JSON or hate it is the common standard for exchanging information these days.  The schema for the JSON representation of the sign-in logs is fairly complex and very nested because there is a ton of great information in there.  Thankfully Microsoft has done a wonderful job of documenting the schema.  Now that we have the logs and the schema we can start working with the data.

When I first started this effort I had put together a Python function which transformed the files into a CSV using pipe delimiters.  As soon as I finished the function I wondered if there was an alternative way to handle it.  In comes Amazon Athena to the rescue with its Openx-JsonSerDe library.  After reading through a few blogs (great AWS blog here), StackOverflow posts, and the official AWS documentation I was ready to put something together myself.  After some trial and error I put together a working DDL (Data Definition Language) statement for the data structure.  I’ve made the DDLs available on Github.

Once I had the schema defined, I created the table in Athena.  The official AWS documentation does a fine job explaining the few clicks that are provided to create a table, so I won’t re-create that here.  The DDLs I’ve provided you above will make it a quick and painless process for you.

Let’s review what we’ve done so far.  We’ve setup a reoccurring job that is pulling the sign-in and audit logs via the API and is dumping all that juicy data into cheap object storage which we can further enforce lifecycle policies for.  We’ve then defined the schema for the data and have made it available via standard SQL queries.  All without provisioning a server and for pennies on the dollar.  Not to shabby!

At this point you can use your analytics tool of choice whether it be QuickSight, Tableau, PowerBi, or the many other tools that have flooded the market over the past few years.  Since I don’t make any revenue from these blog posts, I like to go the cheap and easy route of using Amazon QuickSight.

After completing the initial setup of QuickSight I was ready to go.  The next step was to create a new data set.  For that I clicked the Manage Data button and selected New Data Set.

Screen Shot 2019-01-31 at 8.57.15 PM.png

On the Create a Data Set screen I selected the Athena option and created a name for the data source.

screenshot2019-01-31at9.01.48pm

From there I selected the database in Athena which for me was named azuread.  The tables within the database are then populated and I chose the tbl_signin_demo which points to the test S3 bucket I mentioned previously.

Screen Shot 2019-01-31 at 9.04.22 PM.png

Due to the complexity of the data structure I opted to use a custom SQL query.  There is no reason why you couldn’t create the table I’m about to create in Athena and then connect to that table instead to make it more consumable for a wider array of users.  It’s really up to you and I honestly don’t know what the appropriate “big data” way of doing it is.  Either way, those of you with real SQL skills may want to look away from this query lest you experience a Raiders of The Lost Ark moment.

indianjones

You were warned.

SELECT records.id, records.createddatetime, records.userprincipalname, records.userDisplayName, records.userid, records.appid, records.appdisplayname, records.ipaddress, records.clientappused, records.mfadetail.authdetail AS mfadetail_authdetail, records.mfadetail.authmethod AS mfadetail_authmethod, records.correlationid, records.conditionalaccessstatus, records.appliedconditionalaccesspolicy.displayname AS cap_displayname, array_join(records.appliedconditionalaccesspolicy.enforcedgrantcontrols,' ') AS cap_enforcedgrantcontrols, array_join(records.appliedconditionalaccesspolicy.enforcedsessioncontrols,' ') AS cap_enforcedsessioncontrols, records.appliedconditionalaccesspolicy.id AS cap_id, records.appliedconditionalaccesspolicy.result AS cap_result, records.originalrequestid, records.isinteractive, records.tokenissuername, records.tokenissuertype, records.devicedetail.browser AS device_browser, records.devicedetail.deviceid AS device_id, records.devicedetail.iscompliant AS device_iscompliant, records.devicedetail.ismanaged AS device_ismanaged, records.devicedetail.operatingsystem AS device_os, records.devicedetail.trusttype AS device_trusttype,records.location.city AS location_city, records.location.countryorregion AS location_countryorregion, records.location.geocoordinates.altitude, records.location.geocoordinates.latitude, records.location.geocoordinates.longitude,records.location.state AS location_state, records.riskdetail, records.risklevelaggregated, records.risklevelduringsignin, records.riskstate, records.riskeventtypes, records.resourcedisplayname, records.resourceid, records.authenticationmethodsused, records.status.additionaldetails, records.status.errorcode, records.status.failurereason  FROM "azuread"."tbl_signin_demo" CROSS JOIN (UNNEST(value) as t(records))

This query will de-nest the data and give you a detailed (possibly extremely large depending on how much data you are storing) parsed table. I was now ready to create some data visualizations.

The first visual I made was a geospatial visual using the location data included in the logs filtered to failed logins. Not surprisingly our friends in China have shown a real interest in my and my wife’s Office 365 accounts.

screenshot2019-01-31at9.26.24pm

Next up I was interested in seeing if there were any patterns in the frequency of the failed logins.  For that I created a simple line chart showing the number of failed logins per user account in my tenant.  Interestingly enough the new year meant back to work for more than just you and me.

screenshot2019-01-31at9.28.45pm

Like I mentioned earlier Microsoft provides a ton of great detail in the sign-in logs.  Beyond just location, they also provide reasons for login failures.  I next created a stacked bar chat to show the different reasons for failed logs by user.  I found the blocked sign-ins by malicious IPs interesting.  It’s nice to know that is being tracked and taken care of.

screenshot2019-01-31at9.31.24pm

Failed logins are great, but the other thing I was interested in is successful logins and user behavior.  For this I created a vertical stacked bar chart that displayed the successful logins by user by device operating system (yet more great data captured in the logs).  You can tell from the bar on the right my wife is a fan of her Mac!

screenshot2019-01-31at9.38.02pm

As I gather more data I plan on creating some more visuals, but this was great to start.  The geo-spatial one is my favorite.  If you have access to a larger data set with a diverse set of users your data should prove fascinating.  Definitely share any graphs or interesting data points you end up putting together if you opt to do some of this analysis yourself.  I’d love some new ideas!

That will wrap up this series.  As you’ve seen the modern tool sets available to you now can do some amazing things for cheap without forcing you to maintain the infrastructure behind it.  Vendors are also doing a wonderful job providing a metric ton of data in their logs.  If you take the initiative to understand the product and the data, you can glean some powerful information that has both security and business value.  Even better, you can create some simple visuals to communicate that data to a wide variety of audiences making it that much more valuable.

Have a great weekend!

 

Capturing and Visualizing Office 365 Security Logs – Part 1

Welcome back again my fellow geeks!

I’ve been busy over the past month nerding out on some pet projects.  I thought it would be fun to share one of those pet projects with you.  If you had a chance to check out my last series, I walked through my first Python experiment which was to write a re-usable tool that could be used to pull data from Microsoft’s Graph API (Microsoft Graph).

For those of you unfamiliar with Microsoft Graph, it’s the Restful API (application programming interface) that is used to interact with Microsoft cloud offerings such as Office 365 and Azure.  You’ve probably been interacting with it without even knowing it if through the many PowerShell modules Microsoft has released to programmatically interact with those services.

One of the many resources which can be accessed through Microsoft Graph are Azure AD (Active Directory) security and audit reports.  If you’re using Office 365, Microsoft Azure, or simply Azure AD as an identity platform for SSO (single sign-on) to third-party applications like SalesForce, these reports provide critical security data.  You’re going to want to capture them, store them, and analyze them.  You’re also going to have to account for the window that Microsoft makes these logs available.

The challenge is they are not available via the means logs have traditionally been captured on-premises by using syslogd, installing an SIEM agent, or even Windows Event Log Forwarding.  Instead you’ll need to take a step forward in evolving the way you’re used to doing things. This is what moving to the cloud is all about.

Microsoft allows you to download the logs manually via the Azure Portal GUI (graphical user interface) or capture them by programmatically interacting with Microsoft Graph.  While the former option may work for ad-hoc use cases, it doesn’t scale.  Instead we’ll explore the latter method.

If you have an existing enterprise-class SIEM (Security Information and Event Management) solution such as Splunk, you’ll have an out of box integration.  However, what if you don’t have such a platform, your organization isn’t yet ready to let that platform reach out over the Internet, or you’re interested in doing this for a personal Office 365 subscription?  I fell into the last category and decided it would be an excellent use case to get some experience with Python, Microsoft Graph, and take advantage of some of the data services offered by AWS (Amazon Web Services).   This is the use case and solution I’m going to cover in this post.

Last year I had a great opportunity to dig into operational and security logs to extract useful data to address some business problems.  It was my first real opportunity to examine large amounts of data and to create different visualizations of that data to extract useful trends about user and application behavior.  I enjoyed the hell out of it and thought it would be fun to experiment with my own data.

I decided that my first use case would be Office 365 security logs.  As I covered in my last series my wife’s Office 365 account was hacked.  The damage was minor as she doesn’t use the account for much beyond some crafting sites (she’s a master crocheter as you can see from the crazy awesome Pennywise The Clown she made me for Christmas).

img_4301

The first step in the process was determining an architecture for the solution.  I gave myself a few requirements:

  1. The solution must not be dependent on my home lab infrastructure
  2. Storage for the logs must be cheap and readily available
  3. The credentials used in my Python code needs to be properly secured
  4. The solution must be automated and notify me of failures
  5. The data needs to be available in a form that it can be examined with an analytics solution

Based upon the requirements I decided to go the serverless (don’t hate me for using that tech buzzword 🙂 ) route.  My decisions were:

  • AWS Lambda would run my code
  • Amazon CloudWatch Events would be used to trigger the Lambda once a day to download the last 24 hours of logs
  • Amazon S3 (Simple Storage Service) would store the logs
  • AWS Systems Manager Parameter Store would store the parameters my code used leveraging AWS KMS (Key Management Service) to encrypt the credentials used to interact with Microsoft Graph
  • Amazon Athena would hold the schema for the logs and make the data queryable via SQL
  • Amazon QuickSight would be used to visualize the data by querying Amazon Athena

The high level architecture is pictured below.

untitled

I had never done a Lambda before so I spent a few days looking at some examples and doing the typical Hello World that we all do when we’re learning something new.  From there I took the framework of Python code I put together for general purpose queries to the Microsoft Graph, and adapted it into two Lambdas.  One Lambda would pull Sign-In logs while the other would pull Audit Logs.  I also wanted a repeatable way to provision the Lambdas to share with others and get some CloudFormation practice and brush up on my very dusty Bash scripting.   The results are located here in one of my Github repos.

I’m going to stop here for this post because we’ve covered a fair amount of material.  Hopefully after reading this post you understand that you have to take a new tact with getting logs for cloud-based services such as Azure AD.  Thankfully the cloud has brought us a whole new toolset we can use to automate the extraction and storage of those logs in a simple and secure manner.

In my next post I’ll walk through how I used Athena and QuickSight to put together some neat dashboards to satisfy my nerdy interests and get better insight into what’s happening on a daily basis with my Office 365 subscription.

See you next post and go Pats!