Capturing and Visualizing Office 365 Security Logs – Part 1

Posted on January 30, 2019 by mattfeltonma

Welcome back again my fellow geeks!

I’ve been busy over the past month nerding out on some pet projects. I thought it would be fun to share one of those pet projects with you. If you had a chance to check out my last series, I walked through my first Python experiment which was to write a re-usable tool that could be used to pull data from Microsoft’s Graph API (Microsoft Graph).

For those of you unfamiliar with Microsoft Graph, it’s the RESTful API (application programming interface) that is used to interact with Microsoft cloud offerings such as Office 365 and Azure. You’ve probably been interacting with it without even knowing it through the many PowerShell modules Microsoft has released to programmatically interact with those services.

Among the many resources that can be accessed through Microsoft Graph are Azure AD (Active Directory) security and audit reports. If you’re using Office 365, Microsoft Azure, or simply Azure AD as an identity platform for SSO (single sign-on) to third-party applications like Salesforce, these reports provide critical security data. You’re going to want to capture them, store them, and analyze them. You’re also going to have to account for the limited window during which Microsoft makes these logs available.

The challenge is that they are not available through the means by which logs have traditionally been captured on-premises, such as using syslogd, installing a SIEM agent, or even Windows Event Log Forwarding. Instead you’ll need to take a step forward in evolving the way you’re used to doing things. This is what moving to the cloud is all about.

Microsoft allows you to download the logs manually via the Azure Portal GUI (graphical user interface) or capture them by programmatically interacting with Microsoft Graph. While the former option may work for ad-hoc use cases, it doesn’t scale. Instead we’ll explore the latter method.
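To make the programmatic route a little more concrete, here is a minimal sketch that builds a Microsoft Graph request URL for the last 24 hours of sign-in events. It assumes the `auditLogs/signIns` resource on the v1.0 endpoint and a filter on `createdDateTime`; the function name `build_signin_url` is made up for illustration, and the API version should be checked against the current Graph documentation before relying on it.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"  # assumed API version


def build_signin_url(now=None, hours=24):
    """Return a Graph URL asking for sign-ins created in the last `hours`."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    # OData filters take ISO 8601 timestamps in UTC
    stamp = start.strftime("%Y-%m-%dT%H:%M:%SZ")
    odata_filter = f"createdDateTime ge {stamp}"
    # Percent-encode the filter so spaces and colons are safe in the URL
    return f"{GRAPH_BASE}/auditLogs/signIns?$filter={quote(odata_filter)}"


if __name__ == "__main__":
    print(build_signin_url())
```

The URL still has to be sent with a bearer token in the Authorization header, which this sketch leaves out.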

If you have an existing enterprise-class SIEM (Security Information and Event Management) solution such as Splunk, you’ll have an out-of-the-box integration. However, what if you don’t have such a platform, your organization isn’t yet ready to let that platform reach out over the Internet, or you’re interested in doing this for a personal Office 365 subscription? I fell into the last category and decided it would be an excellent use case to get some experience with Python and Microsoft Graph, and to take advantage of some of the data services offered by AWS (Amazon Web Services). This is the use case and solution I’m going to cover in this post.

Last year I had a great opportunity to dig into operational and security logs to extract useful data to address some business problems. It was my first real opportunity to examine large amounts of data and to create different visualizations of that data to extract useful trends about user and application behavior. I enjoyed the hell out of it and thought it would be fun to experiment with my own data.

I decided that my first use case would be Office 365 security logs. As I covered in my last series, my wife’s Office 365 account was hacked. The damage was minor as she doesn’t use the account for much beyond some crafting sites (she’s a master crocheter as you can see from the crazy awesome Pennywise The Clown she made me for Christmas).

The first step in the process was determining an architecture for the solution. I gave myself a few requirements:

The solution must not be dependent on my home lab infrastructure
Storage for the logs must be cheap and readily available
The credentials used in my Python code need to be properly secured
The solution must be automated and notify me of failures
The data needs to be available in a form that it can be examined with an analytics solution

Based upon the requirements I decided to go the serverless (don’t hate me for using that tech buzzword 🙂 ) route. My decisions were:

AWS Lambda would run my code
Amazon CloudWatch Events would be used to trigger the Lambda once a day to download the last 24 hours of logs
Amazon S3 (Simple Storage Service) would store the logs
AWS Systems Manager Parameter Store would store the parameters my code used leveraging AWS KMS (Key Management Service) to encrypt the credentials used to interact with Microsoft Graph
Amazon Athena would hold the schema for the logs and make the data queryable via SQL
Amazon QuickSight would be used to visualize the data by querying Amazon Athena
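The way these pieces could fit together inside a Lambda can be sketched roughly as follows. This is an illustration under assumptions rather than the code from my repo: the bucket name, parameter name, key layout, and the `fetch_logs` callable are all hypothetical placeholders. Only the boto3 calls `get_parameter` and `put_object` are real APIs.

```python
import json
from datetime import datetime, timezone


def s3_key_for(log_type, day):
    """Build a date-partitioned object key, e.g. signin/2019/01/30/signin.json."""
    return f"{log_type}/{day.strftime('%Y/%m/%d')}/{log_type}.json"


def run_export(ssm, s3, fetch_logs, bucket, secret_name, log_type="signin", now=None):
    """Pull one batch of logs and store it in S3.

    ssm and s3 are boto3 clients; fetch_logs is any callable that takes the
    decrypted secret and returns a list of log records (hypothetical helper).
    """
    now = now or datetime.now(timezone.utc)
    # SecureString parameters are decrypted through KMS when WithDecryption=True
    secret = ssm.get_parameter(Name=secret_name, WithDecryption=True)["Parameter"]["Value"]
    records = fetch_logs(secret)
    key = s3_key_for(log_type, now)
    # One JSON object per line (an assumed layout, chosen for easy querying)
    body = "\n".join(json.dumps(r) for r in records)
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key


def lambda_handler(event, context):
    """Entry point invoked by the daily CloudWatch Events rule."""
    import boto3  # imported here so the helpers above can be tested without AWS

    def fetch_logs(secret):
        raise NotImplementedError("call Microsoft Graph here")

    return run_export(
        boto3.client("ssm"), boto3.client("s3"), fetch_logs,
        bucket="example-log-bucket",          # placeholder name
        secret_name="/example/graph/secret",  # placeholder name
    )
```

Passing the clients in as arguments keeps the export logic testable with simple stand-ins, and an unhandled exception in the handler is what surfaces as a failed invocation that can be alarmed on.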

The high level architecture is pictured below.

[Image: high-level architecture diagram]

I had never done a Lambda before so I spent a few days looking at some examples and doing the typical Hello World that we all do when we’re learning something new. From there I took the framework of Python code I put together for general purpose queries to the Microsoft Graph, and adapted it into two Lambdas. One Lambda would pull Sign-In logs while the other would pull Audit Logs. I also wanted a repeatable way to provision the Lambdas to share with others, get some CloudFormation practice, and brush up on my very dusty Bash scripting. The results are located here in one of my GitHub repos.

I’m going to stop here for this post because we’ve covered a fair amount of material. Hopefully after reading this post you understand that you have to take a new tack with getting logs for cloud-based services such as Azure AD. Thankfully the cloud has brought us a whole new toolset we can use to automate the extraction and storage of those logs in a simple and secure manner.

In my next post I’ll walk through how I used Athena and QuickSight to put together some neat dashboards to satisfy my nerdy interests and get better insight into what’s happening on a daily basis with my Office 365 subscription.

See you next post and go Pats!

Using Python to Pull Data from MS Graph API – Part 1

Posted on January 3, 2019 by mattfeltonma

Welcome to 2019 fellow geeks! I hope each of you had a wonderful holiday with friends and family.

It’s been a few months since my last post. As some of you may be aware I made a career move last September and took on a new role with a different organization. The first few months have been like drinking from multiple fire hoses at once and I’ve learned a ton. It’s been an amazing experience that I’m excited to continue in 2019.

One area I’ve been putting some focus in is learning the basics of Python. I’ve been a PowerShell guy (with a bit of C# thrown in there) for the past six years so diving into a new language was a welcome change. I picked up a few books on the language, watched a few videos, and it wasn’t clicking. At that point I decided it was time to jump into the deep end and come up with a use case to build out a script for. Thankfully I had one queued up that I had started in PowerShell.

Early last year my wife’s Office 365 account was hacked. Thankfully no real damage was done minus some spam email that was sent out. I went through the wonderful process of changing her passwords across her accounts, improving the complexity and length, getting her on-boarded with a password management service, and enabling Azure MFA (Multi-factor Authentication) on her Office 365 account and any additional services she was using that supported MFA options. It was not fun.

Curious about what the logs would have shown, I had begun putting together a PowerShell script that was going to pull down the logs from Azure AD (Active Directory), extract the relevant data, and export it to CSV (comma-separated values) where I could play around with it in whatever analytics tool I could get my hands on. Unfortunately life happened and I never had a chance to finish the script or play with the data. This would be my use case for my first Python script.

Azure AD offers a few different types of logs which Microsoft divides into a security pillar and an activity pillar. For my use case I was interested in looking at the reports in the Activity pillar, specifically the Sign-ins report. This report is available for tenants with an Azure AD Premium P1 or P2 subscription (I added P2 subscriptions to our family accounts last year). The sign-in logs have a retention period of 30 days and are available either through the Azure Portal or programmatically through the MS Graph API (Application Programming Interface).
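One practical detail when pulling a report like this through MS Graph is that results come back in pages: each response carries a `value` array and, when more data remains, an `@odata.nextLink` URL pointing at the next page. A minimal sketch of walking those pages is below. The `get_json` callable is a hypothetical stand-in for whatever sends the authenticated HTTP request and parses the JSON body.

```python
def collect_pages(first_url, get_json):
    """Follow @odata.nextLink until Graph stops returning one.

    get_json is any callable that takes a URL and returns the parsed JSON
    body (hypothetical; in practice it would attach the bearer token).
    """
    records = []
    url = first_url
    while url:
        page = get_json(url)
        # Each page holds its records under the "value" key
        records.extend(page.get("value", []))
        # The link is absent on the last page, which ends the loop
        url = page.get("@odata.nextLink")
    return records
```

Keeping the HTTP call behind a callable means the paging logic can be exercised with canned responses before pointing it at a real tenant.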

My primary goals were to create as much reusable code as possible and experiment with as many APIs/SDKs (Software Development Kits) as I could. This was accomplished by breaking the code into various reusable modules and leveraging AWS (Amazon Web Services) services for secure storage of Azure AD application credentials and cloud-based storage of the exported data. Going this route forced me to use the MS Graph API, Microsoft’s Azure Active Directory Library for Python (or ADAL for short), and Amazon’s Boto3 Python SDK.

On the AWS side I used AWS Systems Manager Parameter Store to store the Azure AD credentials as secure strings encrypted with an AWS KMS (Key Management Service) customer-managed customer master key (CMK). For cloud storage of the log files I used Amazon S3.
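A rough sketch of how those two libraries meet: Boto3 reads the secure string from Parameter Store, and ADAL exchanges the application credentials for a Microsoft Graph token. The helper names and any parameter names you pass in are hypothetical; the calls `get_parameter` (Boto3) and `acquire_token_with_client_credentials` (ADAL) are the real library methods as I understand them.

```python
def get_secure_parameter(ssm, name):
    """Read a SecureString from Parameter Store, decrypted through KMS."""
    response = ssm.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


def get_graph_token(tenant_id, client_id, client_secret):
    """Request an app-only access token for Microsoft Graph using ADAL."""
    import adal  # lazy import: only needed when actually calling Azure AD

    authority = f"https://login.microsoftonline.com/{tenant_id}"
    context = adal.AuthenticationContext(authority)
    token = context.acquire_token_with_client_credentials(
        "https://graph.microsoft.com", client_id, client_secret
    )
    # ADAL returns a dict; the bearer token sits under "accessToken"
    return token["accessToken"]
```

In use, `ssm` would be `boto3.client("ssm")`, and the identity running the code needs permission both to read the parameter and to decrypt with the KMS key that protects it.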

Lastly I needed a development environment and source control. For about a day I simply used Sublime Text on my Mac and saved the file to a personal cloud storage account. This was obviously not a great idea so I decided to finally get my GitHub repository up and running. Additionally I moved over to using AWS’s Cloud9 for my IDE (integrated development environment). Cloud9 has the wonderful perk of being web based and has the capability of creating temporary credentials that can do most of what my AWS IAM user can do. This made it simple to handle permissions to the various resources I was using.

Once the instance of Cloud9 was spun up I needed to set the environment up for Python 3 and add the necessary libraries. The AMI (Amazon Machine Image) used by the Cloud9 service to provision new instances includes both Python 2.7 and Python 3.6. This fact matters when adding the ADAL and Boto3 modules via pip because if you simply run a pip install module_name it will be installed for Python 2.7. Instead you’ll want to execute the command python3 -m pip install module_name which ensures that the two modules are installed in the appropriate location.
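A quick way to confirm the modules landed in the right place is to ask the interpreter itself. The short script below is only a convenience sketch: run it with python3 and it reports which of the listed modules that interpreter cannot import.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that this interpreter cannot import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    print("Running under Python", sys.version.split()[0])
    print("Missing:", missing_modules(["adal", "boto3"]) or "none")
```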

In my next post I’ll walk through and demonstrate the script.

Have a great week!

The Evolution of AD RMS to Azure Information Protection – Part 1

Posted on March 25, 2018 by mattfeltonma

Collaboration. It’s a term I hear at least a few times a day when speaking to my user base. The ability to seamlessly collaborate with team members, across the organization, with trusted partners, and with customers is a must. It’s a driving force behind much of the evolution of software-as-a-service collaboration offerings such as Office 365. While the industry is evolving to make collaboration easier than ever, it’s also introducing significant challenges for organizations to protect and control their data.

In a recent post I talked about Microsoft’s entry into the cloud access security broker (CASB) market with Cloud App Security (CAS) and its capability to provide auditing and alerting on activities performed in Amazon Web Services (AWS). Microsoft refers to this collection of features as the Investigate capability of CAS. Before I cover an example of the Control features in action, I want to talk about the product that works behind the scenes to provide CAS with many of the Control features.

That product is Azure Information Protection (AIP) and it provides the capability to classify, label, and protect files and email. The protection piece is provided by another Microsoft product, Azure Active Directory Rights Management Services (Azure RMS). Beyond just encrypting a file or email, Azure RMS can control what a user can do with a file such as preventing a user from printing a document or forwarding an email. The best part? The protection goes with the data even when it leaves your security boundary.

For those of you that have read my blog you can see that I am a huge fanboy of the predecessor to Azure RMS, Active Directory Rights Management Services (AD RMS, previously Rights Management Service or RMS for you super nerds). AD RMS has been a role available in Microsoft Windows Server since Windows Server 2003. It was a product well ahead of its time that unfortunately never really caught on. Given my love for AD RMS, I thought it would be really fun to do a series looking at how AIP has evolved from AD RMS. It’s a dramatic shift from a rather unknown product to a product that provides capabilities that will be as standard and as necessary as Antivirus was to the on-premises world.

I built a pretty robust lab environment (two actually) so that I could demonstrate the different ways the solutions work as well as demonstrate what it looks like to migrate from AD RMS to AIP. Given the complexity of the lab environment, I’m going to take this post to cover what I put together.

The layout looks like this:

On the modern end I have an Azure AD tenant with the custom domain geekintheweeds.com assigned. Attached to the tenant I have some Office 365 E5 and Enterprise Mobility + Security E5 trial licenses. For the legacy end I have two separate labs set up in Azure, each within its own resource group. Lab number one contains three virtual machines (VMs) that run a series of services including Active Directory Domain Services (AD DS), Active Directory Certificate Services (AD CS), AD RMS, and Microsoft SQL Server Express. Lab number two contains four VMs that run the same set of services as Lab 1 in addition to Active Directory Federation Services (AD FS) and Azure Active Directory Connect (AADC). The virtual networks (vnets) in the two resource groups have been peered and both resource groups contain a virtual gateway which has been configured with a site-to-site virtual private network (VPN) back to my home Hyper-V environment. In the Hyper-V environment I have two workstations.

Lab 1 is my “legacy” environment and consists of servers running Windows Server 2008 R2 and Windows Server 2012 R2 (AD RMS hasn’t changed in any meaningful manner since 2008 R2) and a Windows 7 Pro client running Office 2013. The DNS namespace for its Active Directory forest is JOG.LOCAL. Lab 2 is my “modern” environment and consists of servers running Windows Server 2016 and a Windows 10 client running Office 2016. It uses a DNS namespace of GEEKINTHEWEEDS.COM for its Active Directory forest and is synchronized with the Azure AD tenant I mentioned above. AD FS provides SSO to Office 365 for Geek in The Weeds users.

For AD RMS configuration, both environments will initially use Cryptographic Mode 1 and will have a trusted user domain (TUD). SQL Server Express will host the AD RMS database and I will store the cluster key locally within the database. The use of a TUD will make the configuration a bit more interesting for reasons you’ll see in a future post.

Got all that?

In my next post I’ll cover how the architecture changes when migrating from AD RMS to Azure Information Protection.

AWS and Microsoft’s Cloud App Security

Posted on March 13, 2018 by mattfeltonma

It seems like it’s become a weekly occurrence to have sensitive data exposed due to poorly managed cloud services. Due to Amazon’s large market share with Amazon Web Services (AWS) many of these instances involve publicly-accessible Simple Storage Service (S3) buckets. In the last six months alone there were highly publicized incidents with FedEx and Verizon. While the cloud can be empowering, it can also be very dangerous when there is a lack of governance, visibility, and acceptance of the different security mindset cloud requires.

Organizations that have been in operation for many years have grown to be very reliant on the network boundary acting as the primary security boundary. As these organizations begin to move to a software defined data center model this traditional boundary quickly becomes less and less effective. Unfortunately for these organizations this, in combination with a lack of sufficient understanding of cloud, gives rise to mistakes like sensitive data being exposed.

One way in which an organization can protect itself is to leverage technologies such as cloud access security brokers (cloud security gateways if you’re a Forrester reader) to help monitor and control data as it travels between on-premises and the cloud. If you’re unfamiliar with the concept of a CASB, I covered it in a previous entry and included a link to an article which provides a great overview.

Microsoft has its own CASB offering called Microsoft Cloud App Security (CAS). It’s offered as part of Microsoft’s Enterprise Mobility and Security (EMS) E5/A5 subscription. Over the past several months multiple connectors to third party software-as-a-service (SaaS) providers have been introduced, including one for AWS. The capabilities with AWS are limited at this point to pulling administrative logs and user information but it shows promise.

As per usual, Microsoft provides an integration guide which is detailed in button pushing, but not so much in concepts and technical details as to what is happening behind the scenes. Since the Azure AD and AWS blog series has attracted so many views, I thought it would be fun and informative to do an entry for how Cloud App Security can be used with AWS.

I’m not in the habit of re-creating documentation so I’ll be referencing the Microsoft integration guide throughout the post.

The first thing that needs doing is the creation of a security principal in AWS Identity and Access Management (AWS IAM) that will be used by your tenant’s instance of CAS to connect to resources in your AWS account. The first four steps are straightforward but step 5 could use a bit of an explanation.

Here we’re creating a custom IAM policy for the security principal granting it a number of permissions within the AWS account. IAM policies are sets of permissions which are attached to a human or non-human identity or AWS resource and are evaluated when a call to the resource is made. In the traditional on-premises world, you can think of it as something somewhat similar to a set of NTFS file permissions. When the policy pictured above is created the security principal is granted a set of permissions across all instances of CloudTrail, CloudWatch, and IAM within the account.

If you’re unfamiliar with AWS services, CloudTrail is a service which audits the API calls made to AWS resources. The information included in each event includes the action taken, the resource the action was taken upon, the security principal that performed the action, the date and time, and the source IP address of that security principal. The CloudWatch service allows for monitoring of metrics and optionally triggering events based upon metrics reaching specific thresholds. The IAM service is AWS’s identity store for the cloud management layer.

Now that we have a basic understanding of the services, let’s look at the permissions Microsoft is requiring for CAS to do its thing. The CloudTrail permissions of DescribeTrails, LookupEvents, and GetTrailStatus allow CAS to query for all trails enabled on an AWS account (CloudTrail is enabled by default on all AWS resources), look up events in a trail, and get information about the trail such as start and stop logging times. The CloudWatch permissions of Describe* and Get* are fancy ways of asking for READ permissions on CloudWatch resources. These permissions include describe-alarms-history, describe-alarms, describe-alarms-for-metric, get-dashboard, and get-metric-statistics. The IAM permissions are similar to what’s being asked for in CloudWatch, basically asking for full read.
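To make the shape of such a policy concrete, here is a sketch of a policy document built from the permissions named above, expressed as a Python dict and printed as JSON. The CloudTrail and CloudWatch actions come straight from the text; the two IAM actions are my assumption of what "full read" translates to, so check them against the Microsoft integration guide before using anything like this.

```python
import json

cas_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudtrail:DescribeTrails",
                "cloudtrail:LookupEvents",
                "cloudtrail:GetTrailStatus",
                "cloudwatch:Describe*",
                "cloudwatch:Get*",
                # Assumed read-only IAM actions; the text only says "full read"
                "iam:List*",
                "iam:Get*",
            ],
            # "*" matches the policy applying across all instances in the account
            "Resource": "*",
        }
    ],
}

print(json.dumps(cas_policy, indent=2))
```

The wildcard actions are what make this a broad read grant: `cloudwatch:Describe*` matches every CloudWatch action whose name begins with Describe.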

Step number 11 instructs us to create a new CloudTrail trail. AWS by default audits all events across all resources and stores them for 90 days. Trails enable you to direct events captured by CloudTrail to an S3 bucket for archiving, analysis, and responding to events.

The trail created is consumed by CAS to read the information captured via CloudTrail. The permissions requested above become a bit more clear now that we see CAS is requesting read access for all trails across an account for monitoring goodness. I’m unclear as to why CAS is asking for read for CloudWatch alarms unless it has some integration in that it monitors and reports on alarms configured for an AWS account. The IAM read permissions are required so it can pull user information it can use for the User Groups capability.

After the security principal is created and a sample trail is set up, it’s time to configure the connector for CAS. Steps 12 – 15 walk through the process. When it is complete AWS now shows as a connected app.

After a few hours data will start to trickle in from AWS. Navigating to the Users and Accounts section shows all of the accounts found in the IAM instance for my AWS account. Envision this as your one stop shop for identifying all of the user accounts across your many cloud services. A single pane of glass to identity across SaaS.

On the Activity Log I see all of the API activities captured by CloudTrail. If I wanted to capture more audit information, I can enable CloudTrail for the relevant resource and point it to the trail I configured for CAS. I haven’t tested what CAS does with multiple trails, but based upon the permissions we configured when we set up the security principal, it should technically be able to pull from any trail we create.

Since the CAS and AWS integration is limited to pulling logging information, let’s walk through an example of how we could use the data. Take an example where an organization has a policy that the AWS root user should not be used for administrative activities due to the level of access the account gets by default. The organization creates AWS IAM user accounts for each of its administrators who administer the cloud management layer. In this scenario we can create a new policy in CAS to detect and alert on instances where the AWS root user is used.

First we navigate to the Policies page under the Control section of CAS.

On the Policies page we’re going to create a new policy with the settings shown in the image below. We’ll designate this as a high severity privileged account alert. We’re interested in knowing anytime the account is used so we choose the Single Activity option.

We’ll pretend we were smart users of CAS and let it collect data for a few weeks to get a sampling of the types of events which are captured and to give us some data to analyze. We also went the extra mile and leveraged the ability of CAS to pull in user information from AWS IAM such that we can choose the appropriate users from the drop-down menus.

Since this is a demo and my AWS lab has a lot of activity by the root account we’re going to limit our alerts to the creation of new AWS IAM users. To do that we set our filter to look for an Activity type equal to Create user. Our main goal is to capture usage of the root account so we add another filter rule that searches for a User with the name equal to aws root user where it is the actor in an event.
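CAS applies these filters through its UI, but the logic is easy to picture against the raw CloudTrail records it consumes. The sketch below is an equivalent check written by hand, not how CAS implements it. It relies on the CloudTrail record fields `eventName` and `userIdentity.type`; the sample events are invented for illustration.

```python
def is_root_create_user(event):
    """True when a CloudTrail record shows the root user creating an IAM user."""
    identity = event.get("userIdentity", {})
    return event.get("eventName") == "CreateUser" and identity.get("type") == "Root"


# Invented sample records, trimmed to the fields the check needs
sample_events = [
    {"eventName": "CreateUser", "userIdentity": {"type": "Root"}},
    {"eventName": "CreateUser", "userIdentity": {"type": "IAMUser"}},
    {"eventName": "ConsoleLogin", "userIdentity": {"type": "Root"}},
]

alerts = [e for e in sample_events if is_root_create_user(e)]
print(len(alerts), "event(s) would raise an alert")
```

Both conditions have to hold, which mirrors stacking the two filter rules in the policy: root activity other than user creation, and user creation by anyone other than root, are both ignored.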

Finally we configure the alert to send an email to the administrator when the event occurs. The governance capabilities don’t come into play in this use case.

Next we jump back to AWS and create a new AWS IAM user named testuser1. A few minutes after the user is created we see the event appearing in CloudTrail.

After a few minutes, CAS generates an alert and I receive an email seen in the image below. I’m given information as to the activity, the app, the date and time it was performed, and the client’s IP address.

If I bounce back to CAS I see one new Alert. Navigating to the alert I’m able to dismiss it, adjust the policy that generated it, or resolve it and add some notes to the resolution.

I also have the option to dig deeper to see some of the patterns of the user’s behavior or the pattern of the behaviors from a specific IP address as seen below.

All this information is great, but what can we do with it? In this example, it delivers visibility into the administrative activities occurring at the AWS cloud management layer by centralizing the data into a single repository to which I can then send other data such as O365 activity, Box, Salesforce, etc. By centralizing the information I can begin doing some behavioral analytics to develop standard patterns of behavior of my user base. Understanding standard behavior patterns is key to being ahead of the bad guys whether they be insiders or outsiders. I can search for deviations from standard patterns to detect a threat before it becomes too widespread. I can also be proactive about putting alerts and enforcement (available for other app connectors in CAS but not AWS at this time) in place to stop the behavior before the threat is realized. If I supplemented this data with log information from my on-premises proxy via Cloud App Discovery, I get an even larger sampling, improving the quality of the data as well as giving me insight into shadow IT. Pulling those “shadow” cloud solutions into the light allows me to ensure the usage of the services complies with organizational policies and opens up the opportunity of reducing costs by eliminating redundant services.

Microsoft categorizes the capabilities that help realize these benefits as the Discover and Investigate capabilities of CAS. The solution also offers a growing number of enforcement mechanisms (Microsoft categorized these enforcement mechanisms as Control) which add a whole other layer of power behind the solution. Due to the limited integration with AWS I can’t demo those capabilities with this post. I’ll cover those in a future post.

I hope this post helped you better understand the value that CASB/CSGs like Microsoft’s Cloud App Security can bring to the table. While the product is still very new and a bit sparse on support with 3rd party applications, the list is growing every day. I predict the capabilities provided by technology such as Microsoft’s Cloud App Security will be as standard to IT as a firewall in the years to come. If you’re already in Office 365 you should be ensuring you integrate these capabilities into your arsenal to understand the value they can bring to your business.

Thanks and have a wonderful week!

Journey Of The Geek

The chronicles of a Bostonian tech geek navigating through life and technology
