During brainstorming sessions with peers or customer conversations, “what if” type scenarios pop up. These are typically scenarios that aren’t documented at all or buried deep in the depths of the Internet. I thought it would be fun to create an ongoing series where I share some of the “what if” scenarios I run into and what I found when I labbed them out.
Lately I’ve been having a lot of internal and customer conversations around DNS and resolution with Azure Private Endpoints. Out of those conversations came some interesting “what if” scenarios. Those scenarios will be the subject of this post.
What if I create a second Private Endpoint for a single Azure resource and register it to the same Azure Private DNS zone?
This question popped up in some ongoing discussions around disaster recovery when Private Endpoints are in use. Specifically, the discussion was around Azure Storage accounts configured for GRS (geo-redundant storage). In this scenario a customer is accessing a storage account from on-premises via a Private Endpoints and is blocking all access to the storage account over the public endpoint. The customer has configured the storage account to register its record in the Azure Private DNS zone.
In the event an entire region fails, the private endpoint (seen above with IP of 10.0.1.4) in the primary region will become unavailable causing traffic to drop. If the customer creates a second private endpoint (seen above with the IP of 10.1.1.4) either ahead of time or after the failure what would happen when the private endpoint tried to register a duplicate record in Azure Private DNS?
Would the registration of the A record fail? Would it add an IP to the record set? Would it overwrite it?
The answer is it will overwrite the record. This means that the A record see in the above diagram for st1.privatelink.blob.core.windows.net would be overwritten to point to the new Private Endpoint address of 10.1.1.4.
What if I don’t want to depend solely on Azure Private DNS?
This conversation has popped up a lot lately with customers and in comments in my Azure DNS and Private Link blog series. As I discuss in that series, in its current state Private Endpoints depend heavily on Azure Private DNS. Azure Private DNS is relatively new so it’s a very basic DNS service with no fancy geo-load balancing capabilities or probes. This can create administrative overhead in the case of disaster recovery scenarios. Customers are eager to leverage their own DNS products such as InfoBlox or F5 GTMs due to the advanced capabilities of those products.
The challenge is customers are stuck using the privatelink namespaces (such as privatelink.database.windows.net) Microsoft has defined for the services. Now that wouldn’t be a problem if the customer could access the service directly by the privatelink FQDN (fully-qualified domain name), but that won’t work for encryption in transit to Microsoft PaaS (platform-as-a-service) offerings because the privatelink FQDN isn’t supported on the certificates provisioned to the services. This results in the dreaded certificate name mismatch scenario where the client (your machine) can’t verify the identity of the server because the server’s certificate doesn’t contain the identity you’re trying to access (mystuff.privatelink.database.windows.net). This requires you access the server using the public FQDN (mystuff.database.windows.net) which goes through the resolution path I discuss in my private link DNS series.
Unfortunately, there aren’t a ton of great options. I walk through some of the options in my series and Dan Mauser has done a wonderful job walking through some others in his postings. In his post, he discusses two solutions:
Creating a conditional forwarder for the FQDN of the Azure resource
Creating a forward lookup zone for the FQDN of the Azure resource.
The most common on-premises integration pattern looks something like the below In this pattern on-premises DNS servers are configured with a conditional forwarder to send all DNS queries for Azure PaaS services (such as database.windows.net) to a DNS resolver in Microsoft Azure. That resolver is configured with a standard forwarder to send all of its queries to the 220.127.116.11 virtual IP.
One of the downfalls of this pattern if you’re forwarding queries for all of public namespaces of the Azure PaaS DNS services you’re using up to Azure. If your ExpressRoute drops or S2S VPN (Site-to-Site VPN) drops, those queries will either timeout and fail to resolve or timeout and resolve to the public IP addresses.
In scenario 1, customers try to avoid that problem by creating conditional forwarders for each Azure resource’s private endpoint. For example, if you have a database named mydb.database.windows.net, you would create a conditional forwarder on-premises with the name of mydb.database.windows.net and point it to your upstream Azure DNS server which would go through the standard resolution path to resolve to the A record in Azure Private DNS.
In scenario 2, customers try to avoid the same problem by creating a forward zone for each Azure resource’s private endpoint. The concept is the same as a conditional forwarder where the forward zone is named the same as your resource (mydb.database.windows.net) but with the difference that your DNS server will resolve the record authoritatively.
So in short, both of these options work but in no way are they scalable to manage from a lifecycle perspective because of the scale and ephemeral nature of cloud. Creation or deletion of a forward zone or conditional forwarder is server-level change making it more likely someone will break something versus modification of an A record thus increasing the risk of this pattern. Finally, if you are using an Active Directory-integrated DNS zone, you get the added shi*t show of bloating your Active Directory DIT with the creations and deletions of all these records at scale.
My recommendation is to stick with the standard pattern I outlined above if you can. The Azure Private DNS service will evolve over time and more than likely new capabilities will be added to help address these gaps.
Hi there and welcome to the second post in my series about Azure Files integration with AD DS. In the first post I gave an overview of the service, the value proposition, its current limitations, and described the lab I’ll be using for this post. For this post I’ll be walking through the setup, examining some packet captures and Fiddler captures, and touching on a few of the gotchas I ran into.
Before I jump into the technical gooey goodness, I’m going to cover some prerequisites.
One obvious factoid is you’ll need a Windows AD domain up and running and the machine you connect to the share from will need to be joined to that domain. One disclaimer to keep in mind is you have a multiple Windows AD forest scenario, such as an account and resource forest, you’ll need to be aware of which domain you’re integrating the Azure File share with. If you integrate it with a resource forest but have your user accounts in the account forest, you’ll need to use name suffix routing. I’m not going to go into the details of name suffix routing, but if you’re curious you can read through this article. The short of it is the service principal name associated with the computer or service account used to represent the Azure Storage account the file share is created on uses the domain of files.core.windows.net. When performing the Kerberos authentication, the domain controller in the account forest wouldn’t know how to to direct the request to the resource forest because that domain will not be associated with your resource forest. For this lab I created a Windows AD domain with the namespace jogcloud.local
You will also need to ensure that you are synchronizing the users and groups from your Windows AD domain you want to be able to access the Azure File share to Azure AD. The tenant you synchronize to must be the same tenant the Azure subscription containing the Azure Storage account is associated with. Don’t worry about why right now, I’ll cover that later. For this lab I’ll be using my jogcloud.com tenant.
To store those wonderful files you’ll need an Azure Storage account. The storage account should be created in the same region (or closest region if on-premises) to the clients that will access the file share. This will ensure optimal performance and avoid cross region costs if your clients are in Azure. You can use either a storage account with the standard GPv2 (General Purpose v2) SKU or Premium FileStorage SKU if you need better performance and scale. For this lab I’ll be using the GPv2 SKU.
Lastly, networking requirements. Like all Azure PaaS offerings, Azure Storage is by default available over the public Internet. Since no sane human being wants to send SMB traffic over the public Internet, you have the option of using a private endpoint. For this lab I’ll be going the private endpoint route.
So prerequisites are now set, let’s jump into the setup.
Integrating Azure Storage account with Windows AD
The first step in the process to get this integration working is to get the Azure Storage account you’ll be using setup with an identity in Windows AD. A kind human being over at Microsoft wrote a wonderful Azure PowerShell module that makes what I’m about to do a hundred times easier and is the recommended way to go about this. I’m not going to use it for this demonstration because I want to walk through each of the steps in the process to better your understanding of the magic within the module.
Before you run any commands you’ll need to ensure you have the PowerShell modules below installed. You can validate this by running Get-Module -ListAvailable to display the PowerShell modules installed on the machine.
Now we need to create the security principal that is going to represent the Azure Storage account in Windows AD. You have the option of either using a traditional service account (user account) or a computer account. My advice is to use a service account since it’s a more traditional pattern for this type of use case and it’s more likely your organization has processes built around the lifecycles of service accounts than it is computer accounts.
One important thing to note here is you need to treat this just like you would a traditional service account. By this I mean you will want to create the account with a non-expiring password and put in appropriate controls to perform a controlled rotation of the credential to avoid service disruption.
Here I’ve created a service account with the name azurestorage and have set it with a non-expiring password.
Next up I’m going create a SPN (service principal name) for the service account. The SPN is going to identify the Azure Storage account to Windows AD and instruct the user’s system which service it needs to obtain a Kerberos ticket for. The SPN is going to use the CIFS service class and include the FQDN of the Azure Files endpoint on your storage account. It will look like cifs/stjogfileshare.file.core.windows.net. You can register the SPN using the setspn -S <SPN> <ACCOUNT_NAME>. The -S switch will validate there the SPN is not already registered to another security principal in the domain.
So you have a service account and an SPN. Now you need to create a credential in Azure Storage and associate that credential with the service account. To create that credential you’ll need to hop over to PowerShell and connect to Azure. Once connected you’ll use the New-AzStorageAccountKey and Get-AzStorageAccountKey cmdlets to create and retrieve the storage account key used for the integration. It’s important to note that this key (named kerb1 or kerb2) is only used to setup this integration and can’t be used for any control or data plane operations against the storage account.
Configure the service account with this key as its password.
The last piece in this step of the process is to enable the AD DS feature support for the storage account. To do this you’ll use the Set-AzStorageAccount cmdlet using the syntax below.
All of the inputs between Name and ActiveDirectoryAzureStorageSid can be obtained by using the Get-ADDomain cmdlet as seen below.
The ActiveDirectoryAzureStorageSid parameter can be obtain by using the Get-ADUser cmdlet as seen below.
Once you have the inputs, you’ll plug them in Set-AzStorageAccount cmdlet. If successful you’ll get a return similar to below.
If you’d like you can confirm the feature is enabled you can do that with the steps documented here. The AzFilesHybrid module I mentioned earlier also has some great debugging tools as outlined here.
Now that the integration is complete, I need to create a file share and configure authorization at the management plane. There are two separate layers of authorization occurring, one for access to the file share itself and the other for access to the files and folders within the file share. Access to the share itself is controlled by Azure RBAC and thus controlled by the Azure management plane. There are three roles built in roles provided that should service most use cases and these are:
Storage File Data SMB Share Read which allows read access to the file share over SMB
Storage File Data SMB Share Contributor which allows read, write, and delete access over the file share over SMB
Storage File Data SMB Share Elevated Contributor which allows read, write, delete, and modify of Windows ACLs of the file share over SMB
You are free to design your own custom roles, but those three built in roles are pretty much spot on as to what you’d see in your typical Windows File share-level permissions.
The second layer of authorization is controlled by the Windows ACLs (access control lists) associated with the share, files, and folders. These are your classic Windows ACLs you know and love and will be enforced by the Windows OS.
Just like on a traditional Windows file share, the most restrictive of controls will apply. This means if you’ve only been granted the Storage File Data SMB Share Read role, it won’t matter if you have full permissions in the Windows ACLs, you will only be able to read and will not be able to write.
Let me demonstrate this.
Here I have assigned the Bob Gray user the Storage File Data SMB Share Reader role on the stjogfileshare storage account. Bob Gray is Domain administrator on the jogcloud.local Windows AD domain and is a local administrator on the member server.
As seen below I’m able to successfully map the shared folder, but I’m unable to create folder on it because the management plane is restricting my access.
Running the klist command on the machine shows I successfully obtained a Kerberos ticket for file share.
The packet capture I ran when I mapped the share shows in the SMB conversation that the client and server are using the negotiate protocol (which includes the Kerberos protocol).
After the Kerberos ticket is obtained from the domain controller the client sets up the session with the storage account.
From this point forward, the encryption capabilities of SMB 3 are used to encrypt the session between the client and Azure storage account.
Note that the Kerberos encryption algorithm being used is RC4-HMAC. This is an important limitation to take note of in that the AES algorithm is not yet supported. If you have internal security policies that prohibits the use of RC4-HMAC and have created a group policy to block its use, you will not be able to connect to the service. What you’ll experience if that is the case is the share will not be accessible. If you do a packet capture you’ll see the Azure Storage account throw a KRB Error: KRB5RB_AP_ERR_MODIFIED indicating the Azure Storage account was unable to validate the ticket that was passed because it doesn’t support the encryption algorithm used to secure it.
I went through and created a group in Windows AD named engineering and added Bob Gray to it. I then removed the Storage File Data SMB Share Read role assignment for Bob Gray and created a role assignment for the engineering group for the Storage File Data SMB Share Contributor role. I’m now able to create files and folders on the share as seen below.
Bringing up the permissions on the folders you’ll observe that that are a few default permissions which come out of the box. You can modify these default permission if you’d like (for example by removing Authenticated Users Read/Modify which is overly permissive). You do this by mounting the share as a super user using the standard storage keys. The process is outlined here.
That is pretty much all there is to it to the technical configuration.
So when you use this service what are some of the best practices that I would recommend?
Use a service account instead of a computer account. Your enterprise is more than likely more adept at handling the lifecycle of service accounts and managing the password is far easier.
Ensure you rotate the password at whatever interval aligns with your organizational security policy and any laws and regulations you may be subject to.
When you create your Azure RBAC role assignments, use synchronized groups vs synchronized users. You do this for the same reason you would on-premises, granting access per user is not scalable.
Dedicate the storage account you use for the file share to only file shares. Storage accounts have fixed limits that are shared across blobs, queues, tables, and files. You don’t want to get into a situation where you have to share those limits.
Do your research to determine if Premium FileStorage makes more sense than GPv2. It’s more costly but provides better performance and scale.
Try to deploy one file share per storage account if possible to ensure you get the maximum IOPS available for that file share. You can certainly put multiple file shares in the same storage account, but they will share the total IOPS available for the storage account.
Ensure you are replicating files to another storage account. Unlike blobs, you can’t read from the second region if you’re using a RA-GRS storage account. If you’re using the Premium Files SKU, the storage account will only support LRS and ZRS which makes this replication to a storage account in another region so important. You could use AzCopy, PowerShell, or Azure Data Factory.
That’s it folks! Hope this post helped you understand feature that much better.
I recently had a few customers reach out to me with questions around Azure Files integration with Windows Active Directory Domain Services (AD DS). Since I had never used it, I decided to build a small lab and test the functionality and better understand the service. In this series I’ll be walking through what the functionality provides, how I observed it working, and how to set it up.
In any enterprise you will have a number of Windows file shares hosting critical corporate data. If you’ve ever maintained or supported those file shares you’re quite familiar with the absolute sh*t show that occurs across an organization with the coveted H drive is no longer accessible. Maintaining a large cluster of Windows Servers backing corporate file shares can be significantly complex to support. You have your upgrades, patches, needs to scale, failures with DFS-R (distributed file system replication) or FRS-R (file replication service replication) for some of you more unfortunate souls. Wouldn’t it be wonderful if all that infrastructure could be abstracted and managed by someone else besides you? That is the major value proposition of Azure Files.
Azure Files is a PaaS (platform-as-a-service) offering provided by Microsoft Azure that is built on top of Azure Storage. It provides fully managed file shares over a protocol you know and love, SMB (Server Message Block). You simply create an Azure File share within an Azure Storage account and connect to the file share using SMB from a Windows, Linux, or MacOS machine.
Magic right? Well what about authentication and authorization? How does Microsoft validate that you are who you say you are and that you’re authorized to connect to the file share? That my friends will be what we cover from this point on.
Azure File shares supports methods of authentication:
Storage account access keys
Azure AD Domain Services
AD DS (Active Directory Domain Services)
Of the three methods, I’m going cover authentication using AD DS (which I’ll refer to as Windows AD).
Support for Windows AD with Azure Files graduated to general availability last month. As much as we’d like it to not be true, Windows AD and traditional SMB file shares will be with us many years to come. This is especially true for enterprises taking the hybrid approach to cloud, which is a large majority of the customer base I work with. The Windows AD integration allows organizations to leverage their existing Windows AD identities for authentication and protect the files using Windows ACLs (access control lists) they’ve grown to love. This provides a number of benefits:
Single sign-on experience for users (via Kerberos)
Existing Windows ACLs can be preserved if moving files to a Azure File share integrated with Windows AD
Easier to supplement existing on-premises file servers without affecting the user experience
Better support for lift and shift workloads which may have dependencies on SMB
No infrastructure to manage
Support for Azure File Sync which allows you to store shares in Azure Files and create cache on Windows Servers
There are a few key dependencies and limitations I want to call out. Keep in mind you’ll want to visit the official documentation as these will change over time.
The Windows AD identities you want to use to access the file shares must be synchronized to Azure AD (we’ll cover the why later)
The storage account hosting the Azure File share must be in the same tenant you’re syncing the identities to
Linux VMs are not supported at this time
Services using the computer account will not be able to access an Azure File share so plan on using a traditional service account (aka User account) instead
Clients accessing the file share must be Window 7/Server 2008 R2 ore above
So I’ve given you the marketing pitch, let’s take a look at the lab environment I’ll be using for this walkthrough.
For my lab I’ve provisioned three VMs (virtual machines) in a VNet (virtual network). I have a domain controller which provides the jogcloud.local Windows AD forest, an Azure AD Connect server which is synchronizing users to the jogcloud Azure AD tenant, and a member server which I’ll use to access the file share.
In addition to the virtual machines, I also have an Azure Storage account where I’ll create the shares. I’ve also configured a PrivateLink Endpoint which will allow me to access the file share without having to traverse the Internet. Lastly, I have an Azure Private DNS zone hosting the necessary DNS namespace needed to handle resolution to my Private Endpoint. I won’t be covering the inner workings of Azure Private DNS and Private Endpoints, but you can read my series on how those two features work together here.
In my next post I’ll dive in how to setup the integration, walk through some Wireshark and Fiddler captures, and walk through some of the challenges I ran into when running through this lab for this series.
Welcome back folks! I recently had a few customers ask me about using certificates with Azure Key Vault and switching from using a client secret to a client certificate for their Azure AD (Active Directory) service principals. The questions put me on a path of diving deeper around the topics which results in some great learning and opportunity to create some Python code samples.
Azure Key Vault is Microsoft’s solution for secure secret, key, and credential management. If you’re coming from the AWS (Amazon Web Services) realm, you can think of it as AWS KMS (Key Management Services) with a little bit of AWS Secrets Manager and AWS Certificate Manager thrown in there. The use cases for secrets and keys are fairly well known and straightforward, so I’m going instead focus time on the certificates use case.
In a world where passwordless is the newest buzzword, there is an increasing usage of secrets (or passwords) in the non-human world. These secrets are often used to programmatically interact with APIs. In the Microsoft world you have your service principals and client secrets, in the AWS world you have your IAM Users with secret access keys, and many more third-parties out there require similar patterns that require the use of an access key. Vendors like Microsoft and AWS have worked to mitigate this growing problem in the scope of their APIs by introducing features such as Azure Managed Identities and AWS IAM Roles which use short lived dynamic secrets. However, both of these solutions work only if your workload is running within the relevant public cloud and the service it’s running within supports the feature. What about third-party APIs, multi-cloud workloads, or on-premises workloads? In those instances you’re many of times forced to fall back to the secret keys.
There is a better option to secret keys, and that is client certificates. While a secret falls into the “something you know” category, client certificates fall into the “something you have” category. They provide an higher assurance of identity (assuming you exercise good key management practices) and can have more flexibility in their secure storage and usage. Azure Service Principals support certificate-based authentication in addition to client secrets and Azure Key Vault supports the secure storage certificates. Used in combination, it can yield some pretty cool patterns.
Before I get into those patterns, I want to cover some of the basics in how Azure Key Vault stores certificates. There are some nuances to how it’s designed that is incredibly useful to understand. I’m not going to provide a deep dive on the inner workings of Key Vault, the public documentation does a decent enough job of that, but I am going to cover some of the basics which will help get you up and running.
Certificates can be both imported into and generated within Azure Key Vault. These certificates generated can be self-signed, generated from a selection of public CAs (certificate authorities) it is integrated with, or can be used to generate a CSR (certificate signing request) you can full-fill with your own CA. These processes are well detailed in the documentation, so I won’t be touching further on them.
Once you’ve imported or generated a certificate and private key into Key Vault, we get into the interesting stuff. The components of the certificate and private key are exposed in different ways through different interfaces as seen below.
Metadata about the certificate and the certificate itself are accessible via the certificates interface. This information includes the certificate itself provided in DER (distinguished encoded rules) format, properties of the certificate such as the expiration date, and metadata about the private key. You’ll use this interface to get a copy of the certificate (minus private key) or pull specific properties of the certificate such as the thumbprint.
Operations using the private key such as sign, verify, encrypt, and decrypt, are made available through the key interface. Say you want to sign a JWT (JSON Web Token) to authenticate to an API, you would use this interface.
Lastly, the private key is available through the secret interface. This is where you could retrieve the private key in PEM (privacy enhanced mail) or PKCS#12 (public key cryptography standards) format if you’ve set the private key to be exportable. Maybe you’re using a library like MSAL (Microsoft Authentication Library) which requires the private key as an input when obtaining an OAuth access token using a confidential client.
Now that you understand those basics, let’s look at some patterns that you could leverage.
In the first pattern consider that you have a CI/CD (continuous integration / continuous delivery) running on-premises that you wish to use to provision resources in Azure. You have a strict requirement from your security team that the infrastructure remain on-premises. In this scenario you could provision a service principal that is configured for certificate authentication and use the MSAL libraries to authenticate to Azure AD to obtain the access tokens needed to access the ARM API (Azure Resource Manager). Here is Python sample code demonstrating this pattern.
In the next pattern let’s consider you have a workload running in the Azure AD tenant you dedicate to internal enterprise workloads. You have a separate Azure AD tenant used for customer workloads. Within an Azure subscription associated with the customer tenant, there is an instance of Azure Event Hub you need to access from a workload running in the enterprise tenant. For this scenario you could use a pattern where the workload running in the enterprise tenant uses an Azure Managed Identity to retrieve a client certificate and private key from Key Vault to use with the MSAL library to obtain an access token for a service principal in the customer tenant which it will use to access the Event Hub.
For the last pattern, let’s consider you have the same use case as above, but you are using the Premium SKU of Azure Key Vault because you have a regulatory requirement that the private key never leaves the HSM (hardware security module) and all cryptographic operations are performed on the HSM. This takes MSAL out of the picture because MSAL requires the private key be provided as a variable when using a client certificate for authentication of the OAuth client. In this scenario you can use the key interface of Key Vault to sign the JWT used to obtain the access token from Azure AD. This same pattern could be leveraged for other third-party APIs that support certificate-based authentication.
Well folks I’m going to keep it short and sweet. Hopefully this brief blog post has helped to show you the value of Key Vault and provide some options to you for moving away from secret-based credentials for your non-human access to APIs. Additionally, I really hope you get some value out of the Python code samples. I know there is a fairly significant gap in Python sample code for these types of operations, so hopefully this begins filling it.