Welcome back again, my fellow geeks!
I’ve been busy over the past month nerding out on some pet projects, and I thought it would be fun to share one of them with you. If you had a chance to check out my last series, I walked through my first Python experiment: a reusable tool to pull data from Microsoft’s Graph API (Microsoft Graph).
For those of you unfamiliar with Microsoft Graph, it’s the RESTful API (application programming interface) used to interact with Microsoft cloud offerings such as Office 365 and Azure. You’ve probably been interacting with it without even knowing it, through the many PowerShell modules Microsoft has released for programmatically interacting with those services.
One of the many resources that can be accessed through Microsoft Graph is the set of Azure AD (Active Directory) security and audit reports. If you’re using Office 365, Microsoft Azure, or simply Azure AD as an identity platform for SSO (single sign-on) to third-party applications like Salesforce, these reports provide critical security data. You’re going to want to capture them, store them, and analyze them. You’re also going to have to account for the limited window during which Microsoft makes these logs available.
The challenge is that they aren’t available via the means by which logs have traditionally been captured on-premises: running syslogd, installing a SIEM agent, or even Windows Event Log Forwarding. Instead you’ll need to take a step forward and evolve the way you’re used to doing things. This is what moving to the cloud is all about.
Microsoft allows you to download the logs manually via the Azure Portal GUI (graphical user interface) or capture them by programmatically interacting with Microsoft Graph. While the former option may work for ad-hoc use cases, it doesn’t scale. Instead we’ll explore the latter method.
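At its core, “programmatically interacting with Microsoft Graph” means sending authenticated HTTPS requests to the reporting endpoints (`auditLogs/signIns` for sign-in events, `auditLogs/directoryAudits` for audit events). Here’s a bare-bones sketch of what that looks like, assuming you’ve already obtained an OAuth 2.0 access token for the Graph API:

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"

def signins_url(hours=24):
    """Build a Graph query for sign-ins created in the last `hours` hours."""
    since = datetime.now(timezone.utc) - timedelta(hours=hours)
    stamp = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    # The $filter expression must be URL-encoded.
    return f"{GRAPH_BASE}/auditLogs/signIns?$filter=" + quote(f"createdDateTime ge {stamp}")

def fetch_signins(access_token, hours=24):
    """Fetch sign-in events, following @odata.nextLink pagination."""
    events, url = [], signins_url(hours)
    while url:
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {access_token}"}
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        events.extend(body.get("value", []))
        # Graph pages its results; keep following the nextLink until exhausted.
        url = body.get("@odata.nextLink")
    return events
```

The same pattern works for the audit logs by swapping in the `auditLogs/directoryAudits` endpoint.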
If you have an existing enterprise-class SIEM (Security Information and Event Management) solution such as Splunk, you’ll have an out-of-the-box integration. However, what if you don’t have such a platform, your organization isn’t yet ready to let that platform reach out over the Internet, or you’re interested in doing this for a personal Office 365 subscription? I fell into the last category and decided it would be an excellent use case to get some experience with Python and Microsoft Graph, and to take advantage of some of the data services offered by AWS (Amazon Web Services). This is the use case and solution I’m going to cover in this post.
Last year I had a great opportunity to dig into operational and security logs to extract useful data to address some business problems. It was my first real opportunity to examine large amounts of data and to create different visualizations of that data to extract useful trends about user and application behavior. I enjoyed the hell out of it and thought it would be fun to experiment with my own data.
I decided that my first use case would be Office 365 security logs. As I covered in my last series, my wife’s Office 365 account was hacked. The damage was minor, as she doesn’t use the account for much beyond some crafting sites (she’s a master crocheter, as you can see from the crazy awesome Pennywise the Clown she made me for Christmas).
The first step in the process was determining an architecture for the solution. I gave myself a few requirements:
- The solution must not be dependent on my home lab infrastructure
- Storage for the logs must be cheap and readily available
- The credentials used in my Python code need to be properly secured
- The solution must be automated and notify me of failures
- The data needs to be available in a form that can be examined with an analytics solution
Based upon the requirements I decided to go the serverless (don’t hate me for using that tech buzzword 🙂 ) route. My decisions were:
- AWS Lambda would run my code
- Amazon CloudWatch Events would be used to trigger the Lambda once a day to download the last 24 hours of logs
- Amazon S3 (Simple Storage Service) would store the logs
- AWS Systems Manager Parameter Store would store the parameters my code used, leveraging AWS KMS (Key Management Service) to encrypt the credentials used to interact with Microsoft Graph
- Amazon Athena would hold the schema for the logs and make the data queryable via SQL
- Amazon QuickSight would be used to visualize the data by querying Amazon Athena
The high-level architecture is pictured below.
I had never written a Lambda before, so I spent a few days looking at some examples and doing the typical Hello World we all do when learning something new. From there I took the framework of Python code I had put together for general-purpose queries to Microsoft Graph and adapted it into two Lambdas: one to pull sign-in logs, the other to pull audit logs. I also wanted a repeatable way to provision the Lambdas to share with others, get some CloudFormation practice, and brush up on my very dusty Bash scripting. The results are located here in one of my GitHub repos.
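In CloudFormation terms, the once-a-day trigger is just an Events rule pointed at the Lambda, plus a permission letting CloudWatch Events invoke it. A minimal sketch (the logical names like `SignInLogFunction` are illustrative, not taken from my actual templates):

```yaml
DailyPullRule:
  Type: AWS::Events::Rule
  Properties:
    Description: Run the log-pull Lambda once a day
    ScheduleExpression: rate(1 day)
    Targets:
      - Id: SignInLogLambda
        Arn: !GetAtt SignInLogFunction.Arn

PermissionForEvents:
  Type: AWS::Lambda::Permission
  Properties:
    FunctionName: !Ref SignInLogFunction
    Action: lambda:InvokeFunction
    Principal: events.amazonaws.com
    SourceArn: !GetAtt DailyPullRule.Arn
```

A `cron(...)` schedule expression works here too if you want the pull to fire at a specific time of day.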
I’m going to stop here for this post because we’ve covered a fair amount of material. Hopefully after reading this post you understand that you have to take a new tack when getting logs for cloud-based services such as Azure AD. Thankfully the cloud has brought us a whole new toolset we can use to automate the extraction and storage of those logs in a simple and secure manner.
In my next post I’ll walk through how I used Athena and QuickSight to put together some neat dashboards to satisfy my nerdy interests and get better insight into what’s happening on a daily basis with my Office 365 subscription.
See you next post and go Pats!