Capturing and Visualizing Office 365 Security Logs – Part 1

Welcome back again my fellow geeks!

I’ve been busy over the past month nerding out on some pet projects.  I thought it would be fun to share one of those pet projects with you.  If you had a chance to check out my last series, I walked through my first Python experiment, which was to write a reusable tool for pulling data from Microsoft’s Graph API (Microsoft Graph).

For those of you unfamiliar with Microsoft Graph, it’s the RESTful API (application programming interface) used to interact with Microsoft cloud offerings such as Office 365 and Azure.  You’ve probably been interacting with it without even knowing it through the many PowerShell modules Microsoft has released to programmatically interact with those services.

One of the many resources that can be accessed through Microsoft Graph is the set of Azure AD (Active Directory) security and audit reports.  If you’re using Office 365, Microsoft Azure, or simply Azure AD as an identity platform for SSO (single sign-on) to third-party applications like Salesforce, these reports provide critical security data.  You’re going to want to capture them, store them, and analyze them.  You’re also going to have to account for the limited window during which Microsoft makes these logs available.

The challenge is that these logs aren’t available via the means we’ve traditionally used to capture logs on-premises, such as syslogd, installing a SIEM agent, or even Windows Event Log Forwarding.  Instead you’ll need to take a step forward in evolving the way you’re used to doing things.  This is what moving to the cloud is all about.

Microsoft allows you to download the logs manually via the Azure Portal GUI (graphical user interface) or capture them by programmatically interacting with Microsoft Graph.  While the former option may work for ad-hoc use cases, it doesn’t scale.  Instead we’ll explore the latter method.

If you have an existing enterprise-class SIEM (Security Information and Event Management) solution such as Splunk, you’ll have an out-of-the-box integration.  However, what if you don’t have such a platform, your organization isn’t yet ready to let that platform reach out over the Internet, or you’re interested in doing this for a personal Office 365 subscription?  I fell into the last category and decided it would be an excellent use case to get some experience with Python and Microsoft Graph and to take advantage of some of the data services offered by AWS (Amazon Web Services).  This is the use case and solution I’m going to cover in this post.

Last year I had a great opportunity to dig into operational and security logs to extract useful data to address some business problems.  It was my first real opportunity to examine large amounts of data and to create different visualizations of that data to extract useful trends about user and application behavior.  I enjoyed the hell out of it and thought it would be fun to experiment with my own data.

I decided that my first use case would be Office 365 security logs.  As I covered in my last series my wife’s Office 365 account was hacked.  The damage was minor as she doesn’t use the account for much beyond some crafting sites (she’s a master crocheter as you can see from the crazy awesome Pennywise The Clown she made me for Christmas).


The first step in the process was determining an architecture for the solution.  I gave myself a few requirements:

  1. The solution must not be dependent on my home lab infrastructure
  2. Storage for the logs must be cheap and readily available
  3. The credentials used in my Python code need to be properly secured
  4. The solution must be automated and notify me of failures
  5. The data needs to be available in a form that it can be examined with an analytics solution

Based upon the requirements I decided to go the serverless (don’t hate me for using that tech buzzword 🙂 ) route.  My decisions were:

  • AWS Lambda would run my code
  • Amazon CloudWatch Events would be used to trigger the Lambda once a day to download the last 24 hours of logs (a rough sketch of this wiring follows the list)
  • Amazon S3 (Simple Storage Service) would store the logs
  • AWS Systems Manager Parameter Store would store the parameters my code used leveraging AWS KMS (Key Management Service) to encrypt the credentials used to interact with Microsoft Graph
  • Amazon Athena would hold the schema for the logs and make the data queryable via SQL
  • Amazon QuickSight would be used to visualize the data by querying Amazon Athena
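
Purely as an illustration of what that daily CloudWatch Events trigger amounts to (the actual provisioning in my case was handled differently, as covered below), the equivalent boto3 calls might look something like the sketch that follows.  The rule name, function name, and ARN here are made up.

import boto3

events = boto3.client('events', region_name='us-east-1')
awslambda = boto3.client('lambda', region_name='us-east-1')

# Fire the rule once every 24 hours
rule = events.put_rule(
    Name='daily-o365-log-pull',            # hypothetical rule name
    ScheduleExpression='rate(1 day)',
    State='ENABLED'
)

# Allow CloudWatch Events to invoke the Lambda, then point the rule at it
awslambda.add_permission(
    FunctionName='pull-signin-logs',        # hypothetical function name
    StatementId='AllowCloudWatchEventsInvoke',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)
events.put_targets(
    Rule='daily-o365-log-pull',
    Targets=[{
        'Id': 'pull-signin-logs',
        'Arn': 'arn:aws:lambda:us-east-1:123456789012:function:pull-signin-logs'  # hypothetical ARN
    }]
)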

The high level architecture is pictured below.


I had never done a Lambda before so I spent a few days looking at some examples and doing the typical Hello World that we all do when we’re learning something new.  From there I took the framework of Python code I put together for general-purpose queries to Microsoft Graph and adapted it into two Lambdas.  One Lambda would pull Sign-In logs while the other would pull Audit Logs.  I also wanted a repeatable way to provision the Lambdas to share with others, get some CloudFormation practice, and brush up on my very dusty Bash scripting.  The results are located here in one of my GitHub repos.
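
To give a sense of the shape of those Lambdas, here is a stripped-down sketch of a handler along those lines.  It’s purely illustrative rather than the code in the repo; the tenant, bucket, and parameter names are placeholders, and the graphapi and awsintegration functions are the ones from my earlier series.

import json
import time
import boto3
import graphapi        # token acquisition and Graph requests from the earlier series
import awsintegration  # Parameter Store helper from the earlier series

def lambda_handler(event, context):
    region = 'us-east-1'

    # Pull the Azure AD app credentials from Parameter Store (KMS-encrypted)
    clientid = awsintegration.get_parametersParameterStore('myclient_id', region)
    clientsecret = awsintegration.get_parametersParameterStore('myclient_secret', region)

    # Obtain an access token and pull the last 24 hours of sign-in events
    token = graphapi.obtain_accesstoken('mytenant.com', clientid, clientsecret,
        'https://graph.microsoft.com')
    yesterday = time.strftime('%Y-%m-%d', time.gmtime(time.time() - 86400))
    q_param = {'$filter': 'createdDateTime gt ' + yesterday}
    data = graphapi.makeapirequest(
        'https://graph.microsoft.com/beta/auditLogs/signIns', token, q_param)

    # Lambda can only write to /tmp; write a dated file and ship it to S3
    filename = '/tmp/' + time.strftime('%Y-%m-%d') + '-sign_in_logs.json'
    with open(filename, 'w') as f:
        f.write(json.dumps(data))
    boto3.client('s3', region_name=region).upload_file(
        filename, 'mybucket', 'signins/' + filename.split('/')[-1])

    return {'status': 'complete'}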

I’m going to stop here for this post because we’ve covered a fair amount of material.  Hopefully after reading this post you understand that you have to take a new tack with getting logs for cloud-based services such as Azure AD.  Thankfully the cloud has brought us a whole new toolset we can use to automate the extraction and storage of those logs in a simple and secure manner.

In my next post I’ll walk through how I used Athena and QuickSight to put together some neat dashboards to satisfy my nerdy interests and get better insight into what’s happening on a daily basis with my Office 365 subscription.

See you next post and go Pats!

Using Python to Pull Data from MS Graph API – Part 2

Welcome back my fellow geeks!

In this series I’m walking through my experience putting together some code to integrate with the Microsoft Graph API (Application Programming Interface).  In the last post I covered the logic behind this pet project and the tools I used to get it done.  In this post I’ll be walking through the code and covering what’s happening behind the scenes.

The project consists of three files.  The awsintegration.py file contains functions for the integration with AWS Systems Manager Parameter Store and Amazon S3 using the Python boto3 SDK (Software Development Kit).  The graphapi.py file contains two functions.  One function uses Microsoft’s Azure Active Directory Library for Python (ADAL) and the other function uses Python’s Requests library to make calls to the MS Graph API.  Finally, the main.py file contains the code that brings everything together.  There are a few trends you’ll notice across all of the code.  First off, it’s very simple, since I’m a long way from being able to do any fancy tricks; second, I tried to stay away from using too many third-party modules.

Let’s first dig into the awsintegration.py module.  In the first few lines below I import the required modules, which include AWS’s Boto3 library.

import json
import boto3
import logging

Python has a stellar standard logging module that makes logging to a centralized location across a package a breeze.  The line below configures modules called by the main package to inherit the logging configuration from the main package.  This way I was able to direct anything I wanted to log to the same log file.

log = logging.getLogger(__name__)

This next function uses Boto3 to call AWS Systems Manager Parameter Store to retrieve a secure string.  Be aware that if you’re using Parameter Store to store secure strings the security principal you’re using to make the call (in my case an IAM User via Cloud9) needs to have appropriate permissions to Parameter Store and the KMS CMK.  Notice I added a line here to log the call for the parameter to help debug any failures.  Using the parameter store with Boto3 is covered in detail here.

def get_parametersParameterStore(parameterName,region):
    log.info('Request %s from Parameter Store',parameterName)
    client = boto3.client('ssm', region_name=region)
    response = client.get_parameter(
        Name=parameterName,
        WithDecryption=True
    )
    return response['Parameter']['Value']

The last function in this module again uses Boto3 to upload the file to an Amazon S3 bucket with a specific prefix.  Using S3 is covered in detail here.

def put_s3(bucket,prefix,region,filename):
    s3 = boto3.client('s3', region_name=region)
    s3.upload_file(filename,bucket,prefix + "/" + filename)

Next up is the graphapi.py module.  In the first few lines I again import the necessary modules as well as the AuthenticationContext module from ADAL.  This module contains the AuthenticationContext class which is going to get the OAuth 2.0 access token needed to authenticate to the MS Graph API.

import json
import requests
import logging
from adal import AuthenticationContext

log = logging.getLogger(__name__)

In the function below an instance of the AuthenticationContext class is created and the acquire_token_with_client_credentials method is called.   It uses the OAuth 2.0 Client Credentials grant type which allows the script to access the MS Graph API without requiring a user context.  I’ve already gone ahead and provisioned and authorized the script with an identity in Azure AD and granted it the appropriate access scopes.

Behind the scenes Azure AD (authorization server in OAuth-speak) is contacted and the script (client in OAuth-speak) passes a unique client id and client secret.  The client id and client secret are used to authenticate the application to Azure AD which then looks within its directory to determine what resources the application is authorized to access (scope in OAuth-speak).  An access token is then returned from Azure AD which will be used in the next step.

def obtain_accesstoken(tenantname,clientid,clientsecret,resource):
    auth_context = AuthenticationContext('https://login.microsoftonline.com/' +
        tenantname)
    token = auth_context.acquire_token_with_client_credentials(
        resource=resource,client_id=clientid,
        client_secret=clientsecret)
    return token

A properly formatted header is created and the access token is included. The function checks to see if the q_param parameter has a value and if it does it passes it as a dictionary object to the Python Requests library, which includes the key/value pairs as query strings. The request is then made to the appropriate endpoint. If the response code is anything but 200 an exception is raised, written to the log, and the script terminates.  Assuming a 200 is received the Python JSON library is used to parse the response.  The JSON content is searched for an attribute of @odata.nextLink which indicates the results have been paged.  The function handles it by looping until there are no longer any paged results.  It additionally combines the paged results into a single JSON array to make it easier to work with moving forward.

def makeapirequest(endpoint,token,q_param=None):

    # Build the header and include the access token obtained from Azure AD
    headers = {'Content-Type':'application/json', \
    'Authorization':'Bearer {0}'.format(token['accessToken'])}

    log.info('Making request to %s...',endpoint)

    # Pass any query string parameters along with the request
    if q_param != None:
        response = requests.get(endpoint,headers=headers,params=q_param)
        print(response.url)
    else:
        response = requests.get(endpoint,headers=headers)
    if response.status_code == 200:
        json_data = json.loads(response.text)

        # If the results were paged, follow the @odata.nextLink and fold the
        # additional records into a single JSON array
        if '@odata.nextLink' in json_data.keys():
            log.info('Paged result returned...')
            record = makeapirequest(json_data['@odata.nextLink'],token)
            entries = len(record['value'])
            count = 0
            while count < entries:
                json_data['value'].append(record['value'][count])
                count += 1
        return(json_data)
    else:
        raise Exception('Request failed with ',response.status_code,' - ',
            response.text)

Lastly there is main.py which stitches the script together.  The first section adds the modules we’ve already covered in addition to the argparse library which is used to handle arguments added to the execution of the script.

import json
import requests
import logging
import time
import graphapi
import awsintegration
from argparse import ArgumentParser

A simple configuration for the logging module is set up, instructing it to write to msapiquery.log using a level of INFO and applying a standard format.

logging.basicConfig(filename='msapiquery.log', level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

This chunk of code creates an instance of the ArgumentParser class and configures two arguments.  The sourcefile argument is used to designate the JSON parameters file which contains all the necessary information.

The parameters file is then opened and processed.  Note that the S3 parameters are only pulled in if the --s3 switch was used.

parser = ArgumentParser()
parser.add_argument('sourcefile', type=str, help='JSON file with parameters')
parser.add_argument('--s3', help='Write results to S3 bucket',action='store_true')
args = parser.parse_args()

try:
    with open(args.sourcefile) as json_data:
        d = json.load(json_data)
        tenantname = d['parameters']['tenantname']
        resource = d['parameters']['resource']
        endpoint = d['parameters']['endpoint']
        filename = d['parameters']['filename']
        aws_region = d['parameters']['aws_region']
        q_param = d['parameters']['q_param']
        clientid_param = d['parameters']['clientid_param']
        clientsecret_param = d['parameters']['clientsecret_param']
        if args.s3:
            bucket = d['parameters']['bucket']
            prefix = d['parameters']['prefix']

Next up the get_parametersParameterStore function from the awsintegration module is executed twice.  Once to get the client id and once to get the client secret.  Note that the get_parameters method for Boto3 Systems Manager client could have been used to get both of the parameters in a single call, but I didn’t go that route.

    logging.info('Attempting to contact Parameter Store...')
    clientid = awsintegration.get_parametersParameterStore(clientid_param,aws_region)
    clientsecret = awsintegration.get_parametersParameterStore(clientsecret_param,aws_region)
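
For reference, the batch approach mentioned above might look something like the sketch below.  This is a hypothetical variant rather than part of the script; I stuck with individual get_parameter calls to keep the helper simple.

# Hypothetical batch variant of the helper using get_parameters
def get_parametersParameterStoreBatch(parameterNames, region):
    client = boto3.client('ssm', region_name=region)
    response = client.get_parameters(
        Names=parameterNames,
        WithDecryption=True
    )
    # Return a dictionary keyed by parameter name
    return {p['Name']: p['Value'] for p in response['Parameters']}

# e.g. params = get_parametersParameterStoreBatch([clientid_param, clientsecret_param], aws_region)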

In these next four lines the access token is obtained by calling the obtain_accesstoken function and the request to the MS Graph API is made using the makeapirequest function.

    logging.info('Attempting to obtain an access token...')
    token = graphapi.obtain_accesstoken(tenantname,clientid,clientsecret,resource)

    logging.info('Attempting to query %s ...',endpoint)
    data = graphapi.makeapirequest(endpoint,token,q_param)

This section creates a string representing the current day, month, and year and prepends it to the filename that was supplied in the parameters file.  The file is then opened using the with statement.  If you’re familiar with the using statement from C#, the with statement is similar in that it ensures resources are cleaned up after being used.

Before the data is written to file, I remove the @odata.nextLink key if it’s present.  This is totally optional and just something I did to pretty up the results.  The data is then written to the file as raw text by using the Python JSON encoder/decoder.

    logging.info('Attempting to write results to a file...')
    timestr = time.strftime("%Y-%m-%d")
    filename = timestr + '-' + filename
    with open(filename,'w') as f:
        
        ## If the data was paged remove the @odata.nextLink key
        ## to clean up the data before writing it to a file

        if '@odata.nextLink' in data.keys():
            del data['@odata.nextLink']
        f.write(json.dumps(data))

Finally, if the s3 argument was passed when the script was run, the put_s3 method from the awsintegration module is run and the file is uploaded to S3.

    if args.s3:
        logging.info('Attempting to write results to %s S3 bucket...',bucket)
        awsintegration.put_s3(bucket,prefix,aws_region,filename)

Exceptions thrown anywhere in the script are captured here and written to the log file.  I played around a lot with a few different ways of handling exceptions and everything was so interdependent that if there was a failure it was best for the script to stop altogether and inform the user.  Naftali Harris has an amazing blog that walks through the many different ways of handling exceptions in Python and the various advantages and disadvantages.  It’s a great read.

except Exception as e:
    logging.error('Exception thrown: %s',e)
    print('Error running script.  Review the log file for more details')

So that’s what the code is.  Let’s take a quick look at the parameters file below.  It’s very straightforward.  Keep in mind both the bucket and prefix parameters are only required when using the --s3 option.  Here are some details on the other options:

  • The tenantname attribute is the DNS name of the Azure AD tenant being queried.
  • The resource attribute specifies the resource the access token will be used for.  If you’re going to be hitting the MS Graph API, more than likely it will be https://graph.microsoft.com
  • The endpoint attribute specifies the endpoint the request is being made to including any query strings you plan on using
  • The clientid_param and clientsecret_param attributes are the AWS Systems Manager Parameter Store parameter names that hold the client id and client secret the script was provisioned with in Azure AD
  • The q_param attribute is an array of key value pairs intended to store OData query strings
  • The aws_region attribute is the region the S3 bucket and parameter store data is stored in
  • The filename attribute is the name you want to set for the file the script will produce
{
    "parameters":{
        "tenantname": "mytenant.com",
        "resource": "https://graph.microsoft.com",
        "endpoint": "https://graph.microsoft.com/beta/auditLogs/signIns",
        "clientid_param":"myclient_id",
        "clientsecret_param":"myclient_secret",
        "q_param":{"$filter":"createdDateTime gt 2019-01-09"},
        "aws_region":"us-east-1",
        "filename":"sign_in_logs.json",
        "bucket":"mybucket",
        "prefix":"myprefix"
    }
}

Now that the script has been covered, let’s see it in action.  First I’m going to demonstrate how it handles paging by querying the MS Graph API endpoint to list out the users in the directory.  I’m going to append the $select query parameter and set it to return just the user’s id to make the output simpler and set the $top query parameter to one to limit the results to one user per page.  The endpoint looks like this https://graph.microsoft.com/beta/users?$top=1&select=id.

I’ll be running the script from an instance of Cloud9.  The IAM user I’m using with AWS has appropriate permissions to the S3 bucket, KMS CMK, and parameters in the parameter store.  I’ve set each of the parameters in the parameters file to the appropriate values for the environment I’ve configured.  I’ll additionally be using the --s3 option.

 


Once the script is complete it’s time to look at the log file that was created.  As seen below, each step in the script is logged to aid with debugging if something were to fail.  The log also indicates the results were paged.


The output is nicely formatted JSON that could be further transformed or fed into something like Amazon Athena for further analysis (future post maybe?).


Cool right?  My original use case was sign-in logs so let’s take a glance at that.  Here I’m going to use an endpoint of https://graph.microsoft.com/beta/auditLogs/signIns with an OData filter option of createdDateTime gt 2019-01-08 which will limit the data returned to today’s sign-ins.

In the logs we see the script was successfully executed and included the filter specified.


The output is the raw JSON of the sign-ins over the past 24 hours.  For your entertainment purposes I’ve included one of the malicious sign-ins that was captured.  I SO can’t wait to examine this stuff in a BI tool.


Well that’s it folks.  It may be ugly, but it works!  This was a fun activity to undertake as a first stab at making something useful in Python.  I especially enjoyed the lack of documentation available on this integration.  It really made me dive deep and learn things I probably wouldn’t have if there were a billion examples out there.

I’ve pushed the code to GitHub so feel free to muck around with it to your heart’s content.

Using Python to Pull Data from MS Graph API – Part 1

Welcome to 2019 fellow geeks! I hope each of you had a wonderful holiday with friends and family.

It’s been a few months since my last post. As some of you may be aware I made a career move last September and took on a new role with a different organization. The first few months have been like drinking from multiple fire hoses at once and I’ve learned a ton. It’s been an amazing experience that I’m excited to continue in 2019.

One area I’ve been putting some focus in is learning the basics of Python. I’ve been a PowerShell guy (with a bit of C# thrown in there) for the past six years so diving into a new language was a welcome change. I picked up a few books on the language, watched a few videos, and it wasn’t clicking. At that point I decided it was time to jump into the deep end and come up with a use case to build out a script for. Thankfully I had one queued up that I had started in PowerShell.

Early last year my wife’s Office 365 account was hacked. Thankfully no real damage was done minus some spam email that was sent out. I went through the wonderful process of changing her passwords across her accounts, improving the complexity and length, getting her on-boarded with a password management service, and enabling Azure MFA (Multi-factor Authentication) on her Office 365 account and any additional services she was using that supported MFA options.  It was not fun.

Curious about what the logs would have shown, I had begun putting together a PowerShell script that was going to pull down the logs from Azure AD (Active Directory), extract the relevant data, and export it to CSV (comma-separated values) where I could play around with it in whatever analytics tool I could get my hands on. Unfortunately life happened and I never had a chance to finish the script or play with the data. This would be my use case for my first Python script.

Azure AD offers a few different types of logs which Microsoft divides into a security pillar and an activity pillar. For my use case I was interested in looking at the reports in the Activity pillar, specifically the Sign-ins report. This report is available for tenants with an Azure AD Premium P1 or P2 subscription (I added P2 subscriptions to our family accounts last year).  The sign-in logs have a retention period of 30 days and are available either through the Azure Portal or programmatically through the MS Graph API (Application Programming Interface).

My primary goals were to create as much reusable code as possible and experiment with as many APIs/SDKs (Software Development Kits) as I could.  This was accomplished by breaking the code into various reusable modules and leveraging AWS (Amazon Web Services) services for secure storage of Azure AD application credentials and cloud-based storage of the exported data.  Going this route forced me to use the MS Graph API, Microsoft’s Azure Active Directory Library for Python (or ADAL for short), and Amazon’s Boto3 Python SDK.

On the AWS side I used AWS Systems Manager Parameter Store to store the Azure AD credentials as secure strings encrypted with an AWS KMS (Key Management Service) customer-managed customer master key (CMK).  For cloud storage of the log files I used Amazon S3.

Lastly I needed a development environment and source control.  For about a day I simply used Sublime Text on my Mac and saved the file to a personal cloud storage account.  This was obviously not a great idea so I decided to finally get my GitHub repository up and running.  Additionally I moved over to using AWS’s Cloud9 for my IDE (integrated development environment).   Cloud9 has the wonderful perk of being web based and has the capability of creating temporary credentials that can do most of what my AWS IAM user can do.  This made it simple to handle permissions to the various resources I was using.

Once the instance of Cloud9 was spun up I needed to set the environment up for Python 3 and add the necessary libraries.  The AMI (Amazon Machine Image) used by the Cloud9 service to provision new instances includes both Python 2.7 and Python 3.6.  This fact matters when adding the ADAL and Boto3 modules via pip because if you simply run a pip install module_name it will be installed for Python 2.7.  Instead you’ll want to execute the command python3 -m pip install module_name which ensures that the two modules are installed in the appropriate location.

In my next post I’ll walk through and demonstrate the script.

Have a great week!

Azure AD Password Protection – Hybrid Deep Dive

Welcome back fellow geeks.  Today I’m going to be looking at a brand new capability Microsoft announced entered public preview this week.  With the introduction of Hybrid Azure Active Directory Password Protection, Microsoft continues to extend the protection it has baked into its Identity-as-a-Service (IDaaS) offering, Azure Active Directory (AAD).

If you’ve administered Windows Active Directory (AD) in an environment with a high security posture, you’re very familiar with the challenges of ensuring the use of “good” passwords.  In the on-premises world we’ve typically used the classic Password Policies that come out of the box with AD, which provide the bare minimum.  Some of you may even have leveraged third-party password filters to restrict the usage of commonly used passwords such as the classic “P@$$w0rd”.  While the third-party add-ins filled a gap, they also introduced additional operational complexity (ever tried to troubleshoot a misbehaving password filter?  Not fun) and compatibility issues.  Additionally, the filters that block “bad” passwords tend to use a static data set or a data set that has to be manually updated and distributed.

In comes Microsoft’s Hybrid Azure Active Directory Password Protection to save the day.  Here we have a solution that comes directly from the vendor (no more third-party nightmares) that uses the power of telemetry and security data collected from Microsoft’s cloud to block the use of some of the most commonly used passwords (extending that even further with the use of fuzzy logic) as well as custom passwords you can provide to the service yourself.  In a refreshing turn of events, Microsoft has finally stepped back from the insanity (yes I’m sorry it’s insanity for most organizations) of requiring Internet access on your domain controllers.

After I finished reading the announcement this week I was immediately interested in taking a peek behind the curtains on how the solution worked.  Custom password filters have been around for a long time, so I’m not going to focus on that piece of the solution.  Instead I’m going to look more closely at two areas, deployment and operation of the solution.  Since I hate re-creating existing documentation (and let’s face it, I’m not going to do it nearly as well as those who do it for a living) I’ll be referencing back to Microsoft documentation heavily during this post so get your dual monitors powered up.

I’ll be using my Geek In The Weeds tenant for this demonstration.  The tenant is synchronized to AAD with Azure Active Directory Connect (AADC) and federated with Active Directory Federation Services (AD FS).  The tenant is equipped with some Office 365 E5 and Enterprise Mobility+Security E5 licenses.  Since I’ll need some Windows Servers for this, I’ll be using the lab seen in the diagram below.


The first thing I needed to do was verify that my AAD tenant was configured for Azure Active Directory Password Protection.  For that I logged into the portal as a global administrator and selected the Azure Active Directory blade.  Scrolling down to the Security section of the menu shows an option named Authentication Methods.


After selecting the option a new blade opens with only one menu item, Password Protection.  Since it’s the only option there, it opens right up.  Here we can see the configuration options available for Azure Active Directory Password Protection and Smart Lockout.  Smart Lockout is at this time a feature specific to AAD so I’m not going to cover it.  You can read more about that feature in the Microsoft announcement.  The three options that we’re interested in for this post are within the Custom banned passwords and Password protection for Windows Server Active Directory sections.


The custom banned passwords section allows organizations to add additional blocked passwords beyond the ones Microsoft provides.  This is helpful if organizations have a good grasp on their users’ behavior and have some common words they want to block to keep users from creating passwords using those words.  Right now it’s limited to 1000 words with one word per line.  You can copy and paste from another document as long as what you paste is a single word per line.

I’d like to see Microsoft lift the cap of 1000 words as well as allowing for programmatic updating of this list.  I can see some real cool opportunities if this is combined with telemetry obtained from on-premises.  Let’s say the organization has some publicly facing endpoints that use a username and password for authentication.  That organization could capture the passwords used during password spray and brute force attacks, record the number of instances of their use, and add them to this list as the number of instances of those passwords reach certain thresholds.  Yes, I’m aware Microsoft is doing practically the same thing (but better) in Azure AD, but not everything within an organization uses Azure AD for authentication.  I’d like to see Microsoft allow for programmatic updates to this list to allow for such a use case.

Let’s enter two terms in the custom banned password list for this demonstration.  Let’s use geekintheweeds and journeyofthegeek.  We’ll do some testing later to see if the fuzzy matching capabilities extend to the custom banned list.

Next up we have the configuration options under Password protection for Windows Server Active Directory.  This will be my focus.  Notice that the Enable password protection on Windows Server Active Directory option is set to Yes by default.  This option controls whether or not I can register the Azure AD Password Protection proxy service to Azure AD, as you’ll see later in the post.  For now let’s set that to No because it’s always interesting to see how things fail.

I’m going to leave the Mode option at Audit for now.  This is Microsoft’s recommendation out of the gates.  It will give you time to get a handle on user behavior, determine how disruptive this will be to the user experience, gauge how big of a security issue this is for your organization, and scope the communication and education you’ll need to do within your organization.


There are two components we’ll need to install within on-premises infrastructure.  On the domain controllers we’ll be installing the Azure AD Password Protection DC Agent Service and the DC Agent Password Filter dynamic-link library (DLL).  On the member server we’ll be installing the Azure AD Password Protection Proxy Service.  The Microsoft documentation explains what these two services do at a high level.  In short, the DC Agent Password Filter functions like any other password filter and captures the clear text password as it is being changed.  It sends the password to the DC Agent Service which validates the password according to the locally cached copy of password policy that it has gotten from Azure AD.  The DC Agent Service also makes requests for new copies of the password policy by sending the request to the Proxy Service running on the member server which reaches out to Azure AD on the Agent Service’s behalf.  The new policies are stored in the SYSVOL folder so all domain controllers have access to them.  I sourced this diagram directly from Microsoft, so full credit goes to the product team for producing a wonderful visual representation of the process.


The necessary installation files are sourced from the Microsoft Download Center.  After downloading the two files I distributed the DC Agent to my sole domain controller and the Proxy Service file to the member server.

Per Microsoft instructions we’ll be installing the Proxy Service first.  I’d recommend installing multiple instances of the Proxy Service in a production environment to provide for failover.  During the public preview stage you can deploy a maximum of two proxy agents.

The agent installation could be pushed by your favorite management tool if you so choose.  For the purposes of the blog I’ll be installing it manually.  Double-clicking the MSI file initiates the installation as seen below.


The installation takes under a minute and then we receive confirmation the installation was successful.


Opening up the Services Microsoft Management Console (MMC) shows the new service having been registered and that it is running. The service runs as Local System.


Before I proceed further with the installation I’m going to start up Fiddler under the Local System security context using PSEXEC.  For that we open an elevated command prompt and run the command below.  The -s parameter opens the application under the LOCAL SYSTEM user context and the -i parameter makes the window interactive.


Additionally we’ll set up another instance of Fiddler that will run under the security context of the user that will be performing the PowerShell cmdlets below.  When running multiple instances of Fiddler, different ports need to be used, so agree to the default port suggested by Fiddler and proceed.

Now we need to configure the agent.  To do that we’ll use the PowerShell module that is installed when the proxy agent is installed.  We’ll use the Register-AzureADPasswordProtectionProxy cmdlet from the module to register the proxy with Azure Active Directory.  We’ll need a non-MFA enforced (public preview doesn’t support MFA-enforced global admins for registration) global admin account for this.  The account running the command also needs to be a domain administrator in the Active Directory domain (we’ll see why in a few minutes).

The cmdlet successfully runs.  This tells us the Enable password protection on Windows Server Active Directory option doesn’t prevent registration of the proxy service.   If we bounce back to the Fiddler capture we can see a few different web transactions.


First we see a non-authenticated HTTP GET sent to https://enterpriseregistration.windows.net/geekintheweeds.com/discover?api-version=1.6.  For those of you familiar with device registration, this endpoint will be familiar.  The endpoint returns a JSON response with a variety of endpoint information.  The data we care about is seen in the screenshot below.  I’m not going to bother hiding any of it since it’s a publicly accessible endpoint.


Breaking this down we can see a security principal identifier, a resource identifier indicating the device registration service, and a service endpoint which indicates the Azure Active Directory Password Protection service.  What this tells us is Microsoft is piggybacking off the existing Azure Active Directory Device Registration Service for onboarding of the proxy agents.

Next up an authenticated HTTP POST is made to https://enterpriseregistration.windows.net/aadpasswordpolicy/<tenantID>/proxy?api-version=1.0.  The bearer token for the global admin is used to authenticate to the endpoint.  Here we have the Proxy Service posting a certificate signing request (CSR) and providing its fully qualified domain name (FQDN).  The request for a CSR tells us the machine must have provisioned a new private/public key pair and once this transaction is complete we should have a signed certificate identifying the proxy.


The endpoint responds with a JSON response.


If we open up and base64 decode the value in the SignedProxyCertificateChain we see another signed JSON response. Decoding the response and dropping it into Visual Studio shows us three attributes of note, TenantID, CreationTime, and the CertificateChain.


Dropping the value of the CertificateChain attribute into Notepad and saving it as a certificate yields the result below. Note the alphanumeric string after the AzureADBPLRootPolicyCert in the issued to section below.


My first inclination after receiving the certificate was to look into the machine certificate stores. I did that and they were empty. After a few minutes of confusion I remembered the documentation stating that registration of the proxy is a one-time activity, that it requires domain admin in the forest root domain, and that there was a quick blurb about a service connection point (SCP) needing to be created once per forest. That was indication enough for me to pop open ADSIEDIT and check out the Configuration directory partition. Sure enough we see that a new container has been added to the CN=Services container named Azure AD Password Protection.


Within the container there is a container named Forest Certs and a service connection point named Proxy Presence. At this point the Forest Certs container is empty and the object itself doesn’t have any interesting attributes set. The Proxy Presence service connection point equally doesn’t have any interesting attributes set beyond the keywords attribute, which is set to an alphanumeric string of 636652933655882150_5EFEAA87-0B7C-44E9-B25C-4F665F2E0807. Notice the bolded part of the string has the same pattern as what was in the certificate included in the CertificateChain attribute. I tried deleting the Azure AD Password Protection container and re-registering to see if these two strings would match, but they didn’t. So I’m not sure what the purpose of that string is yet, just that it probably has some relationship to the certificate referenced above.

The next step in the Proxy Service configuration process is to run the Register-AzureADPasswordProtectionForest cmdlet. This cmdlet again requires the Azure identity being used is a member of the global admins role and that the security principal running the cmdlet has membership in the domain administrators group. The cmdlet takes a few seconds to run and completes successfully.

Opening up Fiddler shows additional conversation with Azure AD.


Session 12 is the same unauthenticated HTTP GET to the discovery endpoint that we saw above.  Session 13 is another authenticated HTTP POST using the global admin’s bearer token to the same endpoint we saw after running the last cmdlet.  What differs is the information posted to the endpoint.  Here we see another CSR being posted as well as providing the DNS name, however the attributes are now named ForestCertificateCSR and ForestFQDN.


The endpoint again returns a certificate chain but instead using the attribute SignedForestCertificateChain.


The contents of the attribute look very similar to what we saw during the last cmdlet.


Grabbing the certificate out of the CertificateChain attribute, pasting it into Notepad, and saving as a certificate yields a similar certificate.


Bouncing back to ADSIEDIT and refreshing the view I saw that the Proxy Presence SCP didn’t change.  We do see a new SCP was created under the Forest Certs container.  Opening up the SCP we have a keywords attribute of {DC7F004B-6D59-46BD-81D3-BFAC1AB75DDB}.  I’m not sure what the purpose of that is yet.  The other attribute we now have set is the msDS-Settings attribute.


Editing the msDS-Settings attribute within the GUI shows that it has no values which obviously isn’t true.  A quick Google search on the attribute shows it’s up to the object to store what it wants in there.


Because I’m nosey I wanted to see the entirety of the attribute so in comes PowerShell.  Using a simple Get-ADObject query I dumped the contents of the attribute to a text file.


The result is a 21,000+ character string.  We’ll come back to that later.

At this point I was convinced there was something I was missing.  I started up Wireshark, put a filter on to capture LDAP and LDAPS traffic, and restarted the proxy service.  LDAP traffic was there alright over port 389 but it was encrypted via Kerberos (Microsoft’s typical habit).  This meant a packet capture wouldn’t give me what I wanted so I needed to be a bit more creative.  To get around the encryption I needed to capture the LDAP queries on the domain controller as they were processed.  To do that I used a script. The script is quite simple in that it enables LDAP debug logging for a specific period of time with settings that capture every query made to the device.  It then parses the event log entries created in the Directory Services Event Log and creates a pipe-delimited file.


The query highlighted in red is what caught my eye.  Here we can see the service performing an LDAP query against Active Directory for any objects one level under the GIWSERVER5 computer object and requesting the objectClass, msDS-Settings, and keywords attributes.  Let’s replicate that query in PowerShell and see what the results look like.


The results, which are too lengthy to paste here, show that the computer object has two service connection point objects.  Here is a screenshot from the Active Directory Users and Computers MMC that makes it a bit easier to see.


In the keywords attribute we have a value of {EBEFB703-6113-413D-9167-9F8DD4D24468};Domain=geekintheweeds.com.  Again, I’m not sure what the purpose of the keyword attribute value is.  The msDS-Settings value is again far too large to paste.  However, when I dump the value into the TextWizard in Fiddler, base64 decode it, and drop it into Visual Studio, I have a nicely formatted signed JSON Web Token.


If we grab the value in the x509 Certificate (x5c) header and save it to a certificate, we see it’s signed using the same certificate we received when we registered the proxy using the PowerShell cmdlets mentioned earlier.
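
If you want to do the same extraction without Fiddler and Notepad, a few lines of Python will pull the certificate out of the token’s header.  This is just a convenience sketch based on the standard JWS layout (a base64url-encoded JSON header carrying an x5c certificate list); the output file name is arbitrary.

import base64
import json
import textwrap

def save_x5c_cert(jwt_token, outfile='proxy_signing_cert.cer'):
    # The JWS header is the first dot-separated segment, base64url encoded
    header_segment = jwt_token.split('.')[0]
    header_segment += '=' * (-len(header_segment) % 4)  # restore stripped padding
    header = json.loads(base64.urlsafe_b64decode(header_segment))

    # x5c holds base64-encoded DER certificates; wrap the first one as PEM
    der_b64 = header['x5c'][0]
    pem = ('-----BEGIN CERTIFICATE-----\n'
           + '\n'.join(textwrap.wrap(der_b64, 64))
           + '\n-----END CERTIFICATE-----\n')
    with open(outfile, 'w') as f:
        f.write(pem)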


Based upon what I’ve found in the directory, at this point I’m fairly confident the private key for the public/private key pair isn’t saved within the directory.  So my next step was to poke around the proxy agent installation directory of C:\Program Files\Azure AD Password Protection Proxy\.  I went directly to the logs directory and saw the following logs.


Opening up the most recent RegisterProxy log shows a line towards the bottom which was of interest. We can see that the encrypted proxy cert is saved to a folder on the computer running the proxy agent.


Opening the \Data directory shows the following three ppfxe files. I’ve never come across a ppfxe file extension before so I didn’t have a way of even attempting to open it. A Google search on the file extension comes up with nothing. I can only assume it is some type of modified PFX file.


Did you notice the RegisterForest log file in the screenshot above? I was curious about that one so I popped it open. Here were the lines that caught my eye.


Here we can see the certificate requested during the Register-AzureADPasswordProtectionForest cmdlet had the private key merged back into the certificate, then it was serialized to JSON, encoded in UTF8, encrypted, base64 encoded, and written to the msDS-Settings attribute in the directory.  That jibes with what we observed earlier in that dumping that attribute and base64 decoding it gave us nothing decipherable.

Let’s summarize what we’ve done and what we’ve learned at this point.

  • The Azure Active Directory Password Protection Proxy Service has been installed on GIWSERVER5.
  • The cmdlet Register-AzureADPasswordProtectionProxy was run successfully.
  • When the Register-AzureADPasswordProtectionProxy was run the following actions took place:
    • GIWSERVER5 created a new public/private keypair
    • Proxy service performs discovery against Azure AD to discover the Password Protection endpoints for the tenant
    • Proxy service opened a connection to the Password Protection endpoints for the tenant leveraging the capabilities of the Azure AD Device Registration Service and submits a CSR which includes the public key it generated
    • The endpoint generates a certificate using the public key the proxy service provided and returns this to the proxy service computer
    • The proxy service combines the private key with the public key certificate and saves it to the C:\Program Files\Azure AD Password Protection Proxy\Data directory as a PPFXE file type
    • The proxy service connects to Windows Active Directory domain controller over LDAP port 389 using Kerberos for encryption and creates the following containers and service connection points:
      • CN=Azure AD Password Protection,CN=Configuration,DC=XXX,DC=XXX
      • CN=Forest Certs,CN=Azure AD Password Protection,CN=Configuration,DC=XXX,DC=XXX
        • Writes keyword attribute
      • CN=Proxy Presence,CN=Azure AD Password Protection,CN=Configuration,DC=XXX,DC=XXX
      • CN=AzureADPasswordProtectionProxy,CN=GIWSERVER5,CN=Computers,DC=XXX,DC=XXX
        • Writes the signed JSON Web Token to the msDS-Settings attribute
        • Writes keyword attribute (can’t figure out what this does yet)
  • The cmdlet Register-AzureADPasswordProtectionForest was run successfully
  • When the Register-AzureADPasswordProtectionForest was run the following actions took place:
    • GIWSERVER5 created a new public/private keypair
    • Proxy service performs discovery against Azure AD to discover the Password Protection endpoints for the tenant
    • Proxy service opened a connection to the Password Protection endpoints for the tenant leveraging the capabilities of the Azure AD Device Registration Service and submits a CSR which includes the public key it generated
    • The endpoint generates a certificate using the public key the proxy service provided and returns this to the proxy service computer
    • The proxy service combines the private key with the public key certificate and saves it to the C:\Program Files\Azure AD Password Protection Proxy\Data directory as a PPFXE file type
    • The proxy service connects to Windows Active Directory domain controller over LDAP port 389 using Kerberos for encryption and creates the following containers:
      • CN=<UNIQUE IDENTIFIER>,CN=Forest Certs,CN=Azure AD Password Protection,CN=Configuration,DC=XXX,DC=XXX
        • Writes to msDS-Settings the encoded and encrypted certificate it received back from Azure AD including the private key
        • Writes to keyword attribute (not sure on this one either)

Based upon observation and review of the logs the proxy service creates when registering, I’m fairly certain the private key and certificate provisioned during the Register-AzureADPasswordProtectionProxy cmdlet is used by the proxy to make queries to Azure AD for updates on the banned passwords list.  Instead of storing the private key and certificate in the machine’s certificate store like most applications do, it stores them in a PPFXE file format.  I’m going to assume there is some symmetric key stored somewhere on the machine that is used to unlock the use of that information, but I couldn’t determine it with Rohitab API Monitor or Sysinternals Procmon.

I’m going to theorize that the private key and certificate provisioned during the Register-AzureADPasswordProtectionForest cmdlet are going to be used by the DC agents to communicate with the proxy service.  This would make sense because the private key and certificate are stored in the directory, which would make for easy access by the domain controllers.  In my next post I’ll do a deep dive into the DC agent so I’ll have a chance to get more evidence to determine if the theory holds.

On a side note, I attempted to capture the web traffic between the proxy service and Azure AD once the service was installed and registered.  Unfortunately the proxy service doesn’t honor the system proxy even when it’s configured in the global machine.config.  I confirmed that the public preview of the proxy service doesn’t support the usage of a web proxy.  Hopefully we’ll see that when it reaches general availability.

Have a great week.