DNS in Microsoft Azure Part 2 – Azure Private DNS

Updates:

7/2025 – Removed bullet point highlighting lack of DNS query logging and added that this can be achieved with DNS Security Policy

This is part of my series on DNS in Microsoft Azure.

Welcome back fellow geek to part two of my series on DNS in Azure.  In the first post I covered some core concepts behind the DNS offerings in Azure.  One of the core concepts was the 168.63.129.16 virtual IP address which serves as a mechanism for services within an Azure Virtual Network to communicate with platform services such as the Azure DNS Service. I also covered the basic DNS offering, Azure-provided DNS.  For this post I’m going to cover the Azure Private DNS service.

Azure-provided DNS may serve your needs if you’re doing basic proof-of-concept testing, but it isn’t much use beyond that.  The limited capabilities around supported record types and the scaling challenges when requiring resolution across virtual networks make it a non-starter for anything with production-scale needs.  Prior to Azure Private DNS, customers were forced to roll their own DNS servers to host any private namespaces they wanted to use in Azure.  Programmatic management of records in traditional DNS servers can be limited, making it challenging to balance with the ephemeral nature of the cloud.

Microsoft introduced Azure Private DNS into public preview back in early 2018 to help address these problems.  The service officially became generally available in October 2019.  It addresses many of the gaps Azure-provided DNS has, such as support for:

  • Custom DNS namespaces
  • Manually created records
  • Common DNS record types such as A, MX, CNAME, PTR
  • Automatic lifecycle management of DNS records for some Azure resources such as virtual machines
  • Sharing of DNS namespaces across virtual networks

Before we jump into the weeds, we’ll first want to cover the basic concepts of the service.

Azure Private DNS zones are an Azure resource under the Microsoft.Network resource provider with a path of /providers/Microsoft.Network/privateDnsZones/.  Each DNS zone you want to create is represented as a separate resource.  Zones created in one subscription can be used for resolution within another subscription in the same tenant or even a different tenant.

Once the Private DNS Zone is created you need to create a virtual network link to the resource. This resource is also under the Microsoft.Network resource provider and has a path of /providers/Microsoft.Network/privateDnsZones/virtualNetworkLinks/. VNets can resolve and optionally register DNS records with the zones you create after you create a virtual network link between the VNet and the zone.  Each zone can be linked to multiple VNets for registration and resolution.  On the other hand, a VNet can be linked to multiple zones for resolution but only one zone for registration.  Once a zone is linked to the VNet, resources within the VNet can resolve and/or register DNS records for those zones through the 168.63.129.16 virtual IP.
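To make the linking rules concrete, here is a minimal Python sketch of the constraint described above: a VNet may link to many zones for resolution but only one for registration. This is a toy model, not the Azure SDK; the VNetLinks class and the zone names other than mydomain.com are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class VNetLinks:
    """Tracks which private DNS zones a single virtual network is linked to."""
    vnet_name: str
    resolution_zones: Set[str] = field(default_factory=set)
    registration_zone: Optional[str] = None

    def link(self, zone: str, registration_enabled: bool = False) -> None:
        """Link this VNet to a zone, enforcing the one-registration-zone rule."""
        if registration_enabled:
            if self.registration_zone not in (None, zone):
                raise ValueError(
                    f"{self.vnet_name} already registers into {self.registration_zone}"
                )
            self.registration_zone = zone
        # Every link, with or without registration, allows resolution.
        self.resolution_zones.add(zone)


vnet1 = VNetLinks("vnet1")
vnet1.link("mydomain.com", registration_enabled=True)
vnet1.link("otherdomain.com")  # resolution only: allowed

try:
    # A second registration link is rejected in this model.
    vnet1.link("thirddomain.com", registration_enabled=True)
    second_registration_allowed = True
except ValueError:
    second_registration_allowed = False
```

The model only checks the rule from the VNet side; a zone itself can still be linked to many VNets.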

In addition to DNS resolution, zones can be linked for auto-registration. When a virtual network is linked for auto-registration the VMs will register A records within the Azure Private DNS Zone and the Azure-provided DNS zone.  There are a few things to note about the A records automatically created in the private zone:

  • Each record has a property called isAutoRegistered which has a boolean value of true for any records created through the auto-registration process.
  • Auto-registered records have an extremely short TTL of 10 seconds.  If you plan on performing DNS scavenging, take note of this and of the fact that these records are automatically deleted when the VM is deleted.
  • Virtual networks can only be linked to one private DNS zone for auto-registration.
Private DNS zone viewed in portal
Records within an Azure Private DNS Zone

The Azure-provided DNS zone dynamically created for the VNet is still created even when linking an Azure Private DNS zone to a VNet.  Additionally, if you try to resolve the IP address using a single label hostname on a Windows machine, you’ll get back the A record for the Azure-provided DNS zone as seen in the image below.  This is by design and allows you to control the DNS suffix automatically appended by your VMs.  It also means you need to use the FQDN in any application configuration to ensure the record is resolved correctly.

Single label lookup results in Azure-provided DNS virtual network namespace

Let’s take a look at a resolution scenario.

Scenario: VM1 wants to resolve the IP address of VM3.mydomain.com

In the image below we have resolution between two virtual networks.  In this scenario we have Virtual Network 1 and Virtual Network 2.  Virtual Network 1 is linked for both registration and resolution to the Azure Private DNS Zone of mydomain.com.  Virtual Network 2 is linked to the same zone for resolution.  In this configuration, both Virtual Network 1 and Virtual Network 2 are able to resolve records in the mydomain.com Azure Private DNS Zone namespace.

Let’s walk through this query resolution process:

  1. VM1 creates a DNS query for vm3.mydomain.com. VM1 does not have a cached entry for vm3.mydomain.com so the query is passed on to the DNS Server configured for the VM’s virtual network interface (VNIC). The DNS Server has been configured by the Azure DHCP Service to the 168.63.129.16 virtual IP which is the default configuration for virtual network DNS Server settings.
  2. The DNS query is passed through the virtual IP and on to the Azure-provided DNS Service. The Azure-provided DNS Service identifies there is an Azure Private DNS Zone named mydomain.com linked to the virtual network so the query is resolved against this zone and returned to VM1.

Cross-virtual network resolution with Azure Private DNS Zones
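The walkthrough above can be sketched as a small Python model. It is a simplification under stated assumptions: the PrivateZone class, the resolve function, the vnet names, and the 10.1.0.5 address are all hypothetical, and the real service does far more than a dictionary lookup.

```python
from typing import Dict, List, Optional, Set


class PrivateZone:
    """Toy stand-in for an Azure Private DNS Zone and its virtual network links."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.records: Dict[str, str] = {}   # FQDN -> IP address
        self.linked_vnets: Set[str] = set()


def resolve(vnet: str, fqdn: str, zones: List[PrivateZone]) -> Optional[str]:
    """Answer a query from a VNet using only the zones linked to that VNet."""
    fqdn = fqdn.lower().rstrip(".")
    for zone in zones:
        in_zone = fqdn == zone.name or fqdn.endswith("." + zone.name)
        if vnet in zone.linked_vnets and in_zone:
            return zone.records.get(fqdn)
    return None  # no linked zone covers this name


zone = PrivateZone("mydomain.com")
zone.linked_vnets.update({"vnet1", "vnet2"})
zone.records["vm3.mydomain.com"] = "10.1.0.5"  # hypothetical address

vm1_answer = resolve("vnet1", "VM3.mydomain.com", [zone])
unlinked_answer = resolve("vnet3", "vm3.mydomain.com", [zone])
```

A query from a VNet with no link to the zone (vnet3 here) gets no answer from the private zone, which is why the link is the piece that matters.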

Beyond the auto-registration of records, you can also manually create a variety of record types as I mentioned above. There isn’t anything special or different in the way Azure is handling these records.  The only thing worth noting is the records have a standard 1 hour TTL.

One other important thing to note about Azure Private DNS Zones is they are a global resource. This means that the data from an Azure Private DNS Zone is replicated across all Azure regions. If you have a zone linked to a virtual network in regionA it can also be linked to a virtual network in regionB. This is best practice. Now the caveat of being a global resource is the Private DNS Zone still needs to exist within a resource group, which is regional. In the event of a regional outage in the region where the resource group exists, you will not be able to modify the Azure Private DNS Zone. It will however continue to service DNS queries for virtual networks in other regions.

Azure Private DNS Zones can be used to provide custom DNS namespaces for internally-facing applications you build within Azure. Remember that the Azure-provided DNS service cannot be used by on-premises machines without a DNS Proxy (such as BIND server, Windows DNS Server, InfoBlox, etc) or the Azure Private DNS Resolver. This means on-premises machines cannot resolve records in an Azure Private DNS Zone unless you have one of those in place.

Azure Private DNS Zones also play a very important role with Azure PaaS services that support Azure Private Link Private Endpoints. For those use cases you will not be able to pick the namespace; you’ll need to use what Microsoft provides. I cover this in a later post in this series.

When using custom namespaces for applications you develop in Azure where you control the certificate the service serves up there are a few strategies you can employ:

  • Separate private DNS zone for each application –  In this scenario you could grant business units full control of the zone letting them manage the records as they see fit.  This would improve the application team’s agility while reducing operational burden on central IT.
  • Separate private DNS zones for each environment (Dev/QA/Prod) – In this scenario you establish separate zones for each environment which are shared across business units and these zones are managed by central IT.

Summing up the Azure Private DNS Zones when used with only the Azure-provided DNS Service:

  • Benefits
    • Managed service where you don’t have to worry about managing the underlying infrastructure
    • Scalability and availability
    • Use of custom DNS namespaces
    • Global resource that can provide resolution for virtual networks spread across Azure regions
    • Lifecycle of Azure VM DNS records is automatically managed by the platform if using auto-registration
    • Applications could be assigned their own DNS zones and application owners delegated some level of control over that zone
    • Azure-provided DNS can support query logging when used in combination with DNS Security Policy
  • Considerations
    • The records in these zones cannot be resolved by on-premises endpoints unless you incorporate a DNS Proxy (such as BIND server, Windows DNS Server, InfoBlox, etc) or the Azure Private DNS Resolver.
    • While Azure Private DNS Zone resources are global, the service’s control plane is dependent on the region of the resource group the zone is deployed in
    • Linking the Private DNS Zones to every virtual network could risk hitting the limits of the service
    • No support for WINS or NetBIOS

In my next post I’ll cover how the Azure Private DNS Resolver builds on these two components and begins addressing some of these considerations.

DNS in Microsoft Azure Part 1 – Azure-provided DNS

Updates:

  • 7/2025 – Removed bullet point highlighting lack of DNS query logging and added that this can be achieved with DNS Security Policy

This is part of my series on DNS in Microsoft Azure.

Hi everyone,

In this series of posts I’m going to talk about a technology that, while old, still provides a critical foundational service.  Yes folks, we’re going to cover the Domain Name System (DNS).  Specifically, we’re going to look at how internal DNS (non-public) works in Microsoft Azure and what the positives and negatives are of each pattern.  I’m going to go into this assuming you have a basic knowledge of DNS and understand the namespaces, different record types, forward and reverse lookup zones, recursive and iterative queries, DNS forwarding and conditional forwarding, and other core DNS concepts. If those topics are unfamiliar to you, I’d suggest reading through DNS 101 by Red Hat.

I’m a big fan of establishing a shared vocabulary. Below I’m going to define some terms I’ll be using throughout the series.

  • A record – Resolves a hostname to an IP address such as www.journeyofthegeek.com to 5.5.5.5.
  • PTR Record – Resolves an IP address to a hostname.
  • CNAME record – Alias record where you can point one fully qualified domain name (FQDN) to another, either to provide a name a human can remember or for high availability configurations.
  • Recursive Name Resolution – A DNS query where the client asks for a definitive answer to a query and relies on the upstream DNS server to resolve it to completion.  Forwarders such as Google DNS function as recursive resolvers.
  • Iterative Name Resolution – A DNS query where a client receives back a referral to another server in place of resolving the query to completion.  Querying root hints often involves iterative name resolution.
  • Standard DNS Forwarder – Forward all queries the DNS service can’t resolve to an upstream DNS service.  These upstream servers are typically configured to perform recursive or iterative name resolution.
  • Conditional Forwarder – Forward queries for a specific DNS namespace to an upstream DNS service for resolution. This is referred to as a forward zone in BIND.
  • Split-brain / Split Horizon DNS – A DNS configuration where a DNS namespace exists authoritatively across one or more DNS implementations.  A common use case is to have a single DNS namespace defined on Internet-resolvable public facing DNS servers and also on Intranet private facing DNS servers.  This allows trusted clients to reach the service via a private IP address and untrusted clients to reach the service via a public IP address.

Now that I’ve established our vocabulary for DNS, I want to cover the 168.63.129.16 address.  If you’ve ever done anything even basic in Azure, you’ve probably run into this address or used it without knowing it.  This public IP address is owned by Microsoft and is presented as a virtual IP address serving as a communication channel to the host node for a number of platform resources.  It provides functionality such as virtual machine (VM) agent communication of the VM’s ready state and health state, enables the VM to obtain an IP address via DHCP, and, you guessed it, enables the VM to leverage Azure DNS services.  The address is static and is the same for any VNet you create in every Azure region.

Traffic is routed to and from this virtual IP address through the subnet gateway.  If you run a route print on a Windows machine, you can see this route defined in the routing table of the VM.

Output of route print on Azure VM

The IP address is also defined in the VirtualNetwork service tag, meaning the default rules within a network security group (NSG) allow this traffic to and from the VM. DNS traffic to the IP address is not filtered by NSGs by default, but you can block it with an NSG if you wish, using the instructions outlined here. You might do this if you do not want clients using the Azure platform for DNS and instead want all of these lookups to occur through another DNS mechanism such as the Azure Private DNS Resolver or a 3rd-party DNS service you have deployed.

Now that you understand what the 168.63.129.16 virtual IP address is, let’s first cover the very basics of DNS in Azure. You can configure Azure’s DHCP service to push a custom set of DNS servers to Azure resources within the virtual network or leave the default. The default DNS Server setting for a virtual network is the 168.63.129.16 IP address, which provides access to the Azure-provided DNS service. DNS Server settings pushed through the Azure DHCP Service can be configured at the virtual network or the virtual network interface (VNIC). Best practice is to configure this at the virtual network. I’ve never come across a use case to configure it at the VNIC.

Configure DNS on VNet
Configure DNS Server DHCP option on VNet

This brings us to the first option for DNS resolution in Azure, Azure-provided name resolution.  Each time you spin up a virtual network Azure assigns it a unique private DNS namespace using the format <randomly generated>.internal.cloudapp.net.  This namespace is pushed to any virtual machines with VNICs in the virtual network via DHCP Option 15. An A record for each VM deployed in the virtual network is automatically registered, which gives each VM the built-in ability to resolve the names of virtual machines within the same virtual network. The platform also creates PTR records in reverse lookup zones created for each of the subnets in the virtual network that contain VM VNICs.

Let’s look at an example with a single VNet.  I’ve created a single VNet named vnet1.  I’ve assigned the CIDR block of 10.101.0.0/16 and created a single subnet assigned the 10.101.0.0/24 block.  Two Windows Server 2016 VMs have been created named azuredns and azuredns1 with the IP addresses 10.101.0.4 and 10.101.0.5.  Azure has assigned a namespace of r0b5mqxog0hu5nbrf150v3iuuh.bx.internal.cloudapp.net to the VNet.  Note the DHCP Server and DNS Server settings in the ipconfig output of the azuredns VM shown below.

IPConfig output of Azure VM
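As a quick aside on the PTR records mentioned earlier, a reverse lookup queries a name derived from the IP address rather than the address itself. The short Python sketch below uses the standard library to show the conventional reverse name for the azuredns VM’s address. It only illustrates the naming convention; how Azure lays out the reverse zones internally is not something I’m asserting here.

```python
import ipaddress

# Address of the azuredns VM from the example above.
vm_ip = ipaddress.ip_address("10.101.0.4")

# The octets are reversed and placed under in-addr.arpa for IPv4.
ptr_name = vm_ip.reverse_pointer
print(ptr_name)  # 4.0.101.10.in-addr.arpa
```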

If azuredns1 is pinged from azuredns you can see in the Wireshark capture below that prior to executing the ping, azuredns performs a DNS query to the 168.63.129.16 VIP and gets back a query response with the IP address of azuredns1. Pinging the single label name of the virtual machine will work as well because the Azure-provided virtual network DNS namespace is automatically appended to the label by the operating system due to DHCP option 15 (assuming you haven’t configured the operating system to do anything different).

Wireshark packet capture of DNS query

An example of the resolution path is diagrammed below.

In this example, the following happens:

  1. VM1 creates a DNS query for vm2 and the DNS suffix configured for the virtual network is automatically appended to the single label, resulting in a query for vm2.random.internal.cloudapp.net. VM1 does not have a cached entry for vm2.random.internal.cloudapp.net so the query is passed on to the DNS Server configured for the VM’s virtual network interface (VNIC). The DNS Server has been configured by the Azure DHCP Service to the 168.63.129.16 virtual IP which is the default configuration for virtual network DNS Server settings.
  2. The DNS query is passed through the virtual IP and on to the Azure-provided DNS Service. The Azure-provided DNS Service resolves this query against the virtual network namespaces and returns the IP address for vm2.
Azure-provided DNS resolution within a virtual network
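The suffix handling in step 1 can be sketched in a few lines of Python. This is a deliberately simplified model: build_query_name is a hypothetical helper, and real operating system resolvers have more elaborate suffix search behavior than "append only to single-label names".

```python
def build_query_name(name: str, dhcp_suffix: str) -> str:
    """Return the name a client would query in this simplified model."""
    name = name.rstrip(".")
    if "." in name:
        # Already qualified, so it is left as-is here.
        return name
    # Single label: append the suffix handed out via DHCP option 15.
    return f"{name}.{dhcp_suffix}"


suffix = "random.internal.cloudapp.net"
single_label = build_query_name("vm2", suffix)
qualified = build_query_name("vm3.mydomain.com", suffix)

print(single_label)  # vm2.random.internal.cloudapp.net
print(qualified)     # vm3.mydomain.com
```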

That’s all well and good for very basic DNS resolution, but who the heck has a single VNet in anything but a test environment?  So can we expand Azure-provided DNS to multiple VNets?  The answer is yes.  Recall that each VNet has its own private DNS namespace.  The only way to resolve names contained within that namespace is for a VM in that VNet to send the query to the 168.63.129.16 address.  Yes folks, this means you would need to drop a DNS server in each VNet in order for VMs in another VNet to resolve the Azure-provided DNS host names assigned to VMs within that VNet, as illustrated in the diagram below.

In this example, the following happens:

  1. VM1 creates a DNS query for vm3.random2.internal.cloudapp.net. The fully-qualified domain name (FQDN) must be provided because each virtual network uses a different randomly generated namespace. VM1 does not have a cached entry for vm3.random2.internal.cloudapp.net so the query is passed on to the DNS Server configured for the VM’s virtual network interface (VNIC). Here the virtual network DNS server settings have been configured to 10.0.2.4, which has been passed down to VM1 through the Azure DHCP Service.
  2. The DNS query arrives at DNS Server 1. DNS Server 1 does not have a cached entry. It determines it is not authoritative for the random2.internal.cloudapp.net namespace but determines it has a conditional forwarder configured for the zone pointing to 10.100.2.4. The DNS query is recursively passed on to 10.100.2.4 over the virtual network peering.
  3. The DNS query arrives at DNS Server 2. DNS Server 2 does not have a cached entry. It determines it is not authoritative for the random2.internal.cloudapp.net namespace and it does not have a conditional forwarder configured. DNS Server 2 has been configured with a standard forwarder to the 168.63.129.16 virtual IP. The query is passed on to the virtual IP and to Azure-provided DNS, which resolves the query for the requested record and returns the response.
Azure-provided DNS with multiple virtual networks
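The forwarding chain above can be modeled roughly in Python as follows. This is a sketch under assumptions, not how any real DNS server is implemented: the DnsServer class and azure_provided_dns function are hypothetical, as are the random1 namespace and the IP addresses in the records.

```python
from typing import Callable, Dict, Optional

Resolver = Callable[[str], Optional[str]]


def azure_provided_dns(records: Dict[str, str]) -> Resolver:
    """Stand-in for the 168.63.129.16 virtual IP as seen from one VNet."""
    return lambda fqdn: records.get(fqdn)


class DnsServer:
    """A DNS server with a cache, conditional forwarders, and a standard forwarder."""

    def __init__(self, forwarder: Resolver,
                 conditional: Optional[Dict[str, Resolver]] = None) -> None:
        self.forwarder = forwarder
        self.conditional = conditional or {}
        self.cache: Dict[str, str] = {}

    def resolve(self, fqdn: str) -> Optional[str]:
        if fqdn in self.cache:
            return self.cache[fqdn]
        # Use a conditional forwarder when the name falls in its zone,
        # otherwise fall back to the standard forwarder.
        upstream = self.forwarder
        for zone, target in self.conditional.items():
            if fqdn.endswith("." + zone):
                upstream = target
                break
        answer = upstream(fqdn)
        if answer is not None:
            self.cache[fqdn] = answer
        return answer


vnet1_dns = azure_provided_dns({"vm1.random1.internal.cloudapp.net": "10.0.0.4"})
vnet2_dns = azure_provided_dns({"vm3.random2.internal.cloudapp.net": "10.100.0.4"})

dns_server_2 = DnsServer(forwarder=vnet2_dns)
dns_server_1 = DnsServer(
    forwarder=vnet1_dns,
    conditional={"random2.internal.cloudapp.net": dns_server_2.resolve},
)

answer = dns_server_1.resolve("vm3.random2.internal.cloudapp.net")
local_answer = dns_server_1.resolve("vm1.random1.internal.cloudapp.net")
```

Notice that every additional VNet needs its own server and its own conditional forwarder entry on the others, which is where the pattern starts to hurt.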

You can see as the number of VNets increases the scalability of this solution quickly breaks down, because who the heck wants to have a DNS Server deployed to every virtual network? Take note that if you wanted to resolve these host names from on-premises you could use a similar conditional forwarder pattern where you would pass the query to the DNS Server in Azure and on to Azure-provided DNS.

Let’s sum up the positives and negatives of Azure-provided DNS with the default virtual network namespaces.

  • Positives
    • No need to provision your own DNS servers and worry about high availability or scalability
    • DNS service provided by Azure automatically scales
    • VMs within a VNet can resolve each other’s IP addresses out of the box
    • VMs within a VNet can perform reverse lookups to get the hostname of another VM from its IP address
    • DNS Query Logging is supported with use of DNS Security Policy
  • Negatives
    • Solution doesn’t scale with multiple VNets
    • You’re stuck with the namespace assigned to the VNet
    • WINS and NetBIOS are not supported
    • Only A records and PTR records that are automatically registered by the service are supported (no manual registration of records)

As you can see from the above the negatives far outweigh the positives for using the default virtual network namespaces and you’ll likely never use them. The important thing to take away from this post is an understanding of how DNS Server settings are configured and how you can configure a DNS Server to communicate with the Azure-provided DNS service. This will be relevant for everything we talk about moving forward.

In the next post I’ll cover Azure’s new offering in the DNS space, Azure Private DNS Zones.  I’ll walk through how it works and how we can combine it with BYO DNS to create some pretty neat patterns.

See you then!

Capturing Azure Management Group Activity Logs Using Azure Automation – Part 1

Hello again fellow geeks!

Over the past few months I’ve been working with a customer who is just beginning their journey into the cloud.  We’ve had a ton of great conversations around security, governance, and operationalizing Microsoft Azure.  We recently finalized the RACI and identified the controls required by both their internal security policy and their industry compliance requirements.  With those two items complete, we put together our Azure RBAC model and narrowed down the Azure Policies we needed to put in place to satisfy our compliance controls.

After a lot of discussion about the customer’s organization, its geographical locations, business unit makeup, and how its developers and central IT operate, we came up with a subscription model.  This customer had decided on an Azure subscription model where each workload would exist in its own subscription.  Further, each workload’s production and non-production environment would be segmented in different subscriptions.  Keeping each workload in a different subscription ensures no workload will compete for resources with other workloads and hit any subscription limits.  Additionally, it allowed the customer to very easily track the costs associated with each workload.

Now why did we use separate production and non-production subscriptions for each workload?  One reason is to address the same risk as above where a non-production workload could potentially consume all resources within a subscription impacting a production workload.  The other more critical reason is it makes it easier for us to apply different governance and access controls on production workloads vs non-production workloads.  The way we do this is through the usage of Azure Management Groups.

Management Groups were introduced into general availability back in late 2018 to help address the challenges organizations were having operating subscriptions at scale.  They provided a hierarchical method to apply governance and access controls across a collection of subscriptions.  For those of you familiar with AWS, Management Groups are somewhat similar to AWS Organizations and Organizational Units.  For my fellow Windows AD peeps, you can think of Management Groups somewhat like the Active Directory container and organizational unit hierarchy in an Active Directory domain where you apply different access control entries and group policy at high levels in the OU hierarchy that are then enforced and inherited down to the children.  Management Groups work in a similar manner in that the Azure RBAC definitions and assignments and Azure Policy you assign to the parent Management Groups are inherited down into the children.

Every Azure AD tenant starts with a top-level management group called the tenant root group.  Additional management groups created within the tenant are children of this group, up to a maximum of 10,000 management groups and up to six levels of depth.  Any RBAC assignment or Azure Policy assigned to the tenant root group applies to all child management groups in the tenant.  It’s important to understand that Management Groups are a resource within the Azure AD tenant and not a resource of an Azure subscription.  This will matter for reasons we’ll see later.
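If the inheritance behavior is easier to follow in code, here is a minimal Python sketch of assignments flowing from parent to child. It is a toy model only: the ManagementGroup class, the group names other than the tenant root group, and the assignment strings are all made up for illustration.

```python
from typing import List, Optional


class ManagementGroup:
    """Toy model of a management group with inherited assignments."""

    def __init__(self, name: str, parent: Optional["ManagementGroup"] = None) -> None:
        self.name = name
        self.parent = parent
        self.assignments: List[str] = []

    def effective_assignments(self) -> List[str]:
        """Assignments made here plus everything inherited from ancestors."""
        inherited = self.parent.effective_assignments() if self.parent else []
        return inherited + self.assignments


root = ManagementGroup("Tenant Root Group")
prod = ManagementGroup("Production", parent=root)
workload = ManagementGroup("Workload A", parent=prod)

root.assignments.append("policy: allowed locations")
prod.assignments.append("rbac: Reader for auditors")

effective = workload.effective_assignments()
```

Here the workload group has no assignments of its own yet still ends up with both the root policy and the production role assignment.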

The tenant root management group can only be administered by a Global Admin by default and even this requires a configuration change in the tenant.  The method is described here; it places the global administrator performing the action in the User Access Administrator RBAC role at the root scope.  Once that is complete, the name of the root management group can be changed, role assignments created, or policy assigned.

Administering Tenant Root Group

Now there is one aspect of Management Groups that is a bit funky.  If you’re very observant you probably noticed the menu option below.

That’s right folks, Management Groups have their own Activity Log.  Every action you perform at the management group scope, such as creating an Azure RBAC role assignment or assigning or un-assigning an Azure Policy, is captured in this Activity Log.  Now as of today, the only way to access these logs is viewing them through the portal or through the Azure REST API.  Unlike the Activity Logs associated with a subscription, there isn’t native integration with Event Hubs or Azure Storage.  Don’t be fooled by the Export To Event Hub link seen in the screenshot below; this will simply send you to the standard menu where you would configure subscription Activity Logs to be exported.

Now you could log into the GUI every day and export the logs to a CSV (yes that does work with Management Groups) but that simply isn’t scalable and also prevents you from proactively monitoring the logs.  So how do we deal with this gap while the product team works on incorporating the feature?  This will be the challenge we address in this series.

Over the next few posts I’ll walk through the solution I put together using Azure Automation Runbooks to capture these Activity Logs and send them to Azure Storage for retention and an Azure Log Analytics Workspace for analysis and monitoring using Azure Monitor.

Continue the series in my second post.

Debugging Azure SDK for Python Using Fiddler

Hi there folks.  Recently I was experimenting with the Azure Python SDK when I was writing a solution to pull information about Azure resources within a subscription.  A function within the solution was used to pull a list of virtual machines in a given Azure subscription.  While writing the function, I recalled that I hadn’t yet had experience handling paged results from the Azure REST API, which is the underlying API being used by the SDK.

I hopped over to the public documentation to see how the API handles paging.  Come to find out the Azure REST API handles paging in a similar way as the Microsoft Graph API by returning a nextLink property which contains a reference used to retrieve the next page of results.  The Azure REST API will typically return paged results for operations such as list when the items being returned exceed 1,000 items (note this can vary depending on the method called).
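To illustrate the nextLink pattern, here is a small Python sketch of a generic paging loop. It is a sketch under assumptions: iterate_pages and fake_fetch are hypothetical helpers, the fake pages stand in for real HTTP responses, and a real client would issue an authenticated GET against the nextLink URL instead of a dictionary lookup.

```python
from typing import Callable, Dict, Iterator, Optional


def iterate_pages(fetch: Callable[[Optional[str]], Dict]) -> Iterator[dict]:
    """Yield every item across pages by following nextLink until it is absent."""
    next_link = None
    while True:
        page = fetch(next_link)
        yield from page.get("value", [])
        next_link = page.get("nextLink")
        if not next_link:
            break


# Fake responses keyed by the link used to request them (None = first call).
FAKE_PAGES = {
    None: {"value": [{"name": "vm1"}, {"name": "vm2"}], "nextLink": "page2"},
    "page2": {"value": [{"name": "vm3"}]},
}


def fake_fetch(next_link: Optional[str]) -> Dict:
    """Stand-in for an HTTP GET against the list URL or a nextLink URL."""
    return FAKE_PAGES[next_link]


names = [item["name"] for item in iterate_pages(fake_fetch)]
print(names)  # ['vm1', 'vm2', 'vm3']
```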

So great, I knew how paging was used.  The next question was how the SDK would handle paged results.  Would it be my responsibility or would it be handled by the SDK itself?

If you have experience with AWS’s Boto3 SDK for Python (absolutely stellar SDK by the way) and you’ve worked in large environments, you are probably familiar with the paginator subclass.  Paginators exist for most of the AWS service classes such as IAM and S3.  Here is an example of a code snippet from a solution I wrote to report on AWS access keys.

from datetime import datetime

import boto3


def query_iam_users():
    # parse_arn is a helper defined elsewhere in the solution
    todaydate = datetime.now().strftime("%Y-%m-%d")
    users = []
    client = boto3.client('iam')

    # The paginator follows the API's pagination markers for us
    paginator = client.get_paginator('list_users')
    response_iterator = paginator.paginate()
    for page in response_iterator:
        for user in page['Users']:
            user_rec = {
                'loggedDate': todaydate,
                'username': user['UserName'],
                'account_number': parse_arn(user['Arn'])
            }
            users.append(user_rec)
    return users

Paginators make handling paged results a breeze and allow for extensive flexibility in controlling how paging is handled by the underlying AWS API.

Circling back to the Azure SDK for Python, my next step was to hop over to the SDK public documentation.  Navigating the documentation for the Azure SDK (at least for the Python SDK, I can’t speak for the other languages) is a bit challenging.  There are a ton of excellent code samples, but if you want to get down and dirty and create something new you’re going to have to dig around a bit to find what you need.  To pull a listing of virtual machines, I would be using the list_all method in the VirtualMachinesOperations class.  Unfortunately I couldn’t find any reference in the documentation to how paging is handled with the method or class.

So where to now?  Well, the next step was the public GitHub repo for the SDK.  After poking around the repo I located the documentation on the VirtualMachinesOperations class.  Searching the class definition, I was able to locate the code for the list_all() method.  Right at the top of the definition was this comment:

Use the nextLink property in the response to get the next page of virtual
machines.

Sounds like handling paging is on you, right?  Not so fast.  Digging further into the method I came across the function below.  It looks like the method is handling paging itself, relieving the consumer of the SDK of the overhead of writing additional code.

        def internal_paging(next_link=None):
            request = prepare_request(next_link)

            response = self._client.send(request, stream=False, **operation_config)

            if response.status_code not in [200]:
                exp = CloudError(response)
                exp.request_id = response.headers.get('x-ms-request-id')
                raise exp

            return response

I wanted to validate the behavior but unfortunately I couldn’t find any documentation on how to control the page size within the Azure REST API.  I wasn’t about to create 1,001 virtual machines so instead I decided to use another class and method in the SDK.  So what type of service would be a service that would return a hell of a lot of items?  Logging of course!  This meant using the list method of the ActivityLogsOperations class which is a subclass of the module for Azure Monitor and is used to pull log entries from the Azure Activity Log.  Before I experimented with the class, I hopped back over to GitHub and pulled up the source code for the class.  Lo and behold, there is an internal_paging function within the list method that looks very similar to the one in the list_all method for VMs.

        def internal_paging(next_link=None):
            request = prepare_request(next_link)

            response = self._client.send(request, stream=False, **operation_config)

            if response.status_code not in [200]:
                raise models.ErrorResponseException(self._deserialize, response)

            return response

Awesome, so I have a method that will likely create paged results, but how do I validate it is creating paged results and the SDK is handling them?  For that I broke out one of my favorite tools, Telerik’s Fiddler.

There are plenty of guides on Fiddler out there so I’m going to skip the basics of how to install it and get it running.  Since the calls from the SDK are over HTTPS I needed to configure Fiddler to intercept secure web traffic.  Once Fiddler was up and running I popped open Visual Studio Code, set up a new workspace, configured a Python virtual environment, and threw together the lines of code below to get the Activity Logs.

from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.monitor import MonitorManagementClient

TENANT_ID = 'mytenant.com'
CLIENT = 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX'
KEY = 'XXXXXX'
SUBSCRIPTION = 'XXXXXX-XXXX-XXXX-XXXX-XXXXXXXX'

credentials = ServicePrincipalCredentials(
    client_id = CLIENT,
    secret = KEY,
    tenant = TENANT_ID
)
client = MonitorManagementClient(
    credentials = credentials,
    subscription_id = SUBSCRIPTION
)

log = client.activity_logs.list(
    filter="eventTimestamp ge '2019-08-01T00:00:00.0000000Z' and eventTimestamp le '2019-08-24T00:00:00.0000000Z'"
)

for entry in log:
    print(entry)

Let me walk through the code quickly.  To make the call I used an Azure AD Service Principal I had set up that was granted Reader permissions over the Azure subscription I was querying.  After obtaining an access token for the service principal, I set up a MonitorManagementClient that was associated with the Azure subscription and dumped the contents of the Activity Log for the past 20ish days.  Finally I iterated through the results to print out each log entry.
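
As an aside, the filter string in the code above is hard-coded.  Here is a minimal sketch of building it from datetime values instead.  The function name activity_log_filter is hypothetical (it is not part of the SDK), and the timestamp layout simply copies the one used in my filter above.

```python
from datetime import datetime, timezone


def _stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime in the layout used by the filter above."""
    if moment.tzinfo is None:
        raise ValueError("pass timezone-aware datetimes to avoid local-time surprises")
    # Convert to UTC, then append seven fractional digits and the Z suffix.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S") + ".0000000Z"


def activity_log_filter(start: datetime, end: datetime) -> str:
    """Build an eventTimestamp range filter for the Activity Log list call."""
    if start > end:
        raise ValueError("start must not be later than end")
    return (
        f"eventTimestamp ge '{_stamp(start)}' "
        f"and eventTimestamp le '{_stamp(end)}'"
    )


if __name__ == "__main__":
    print(activity_log_filter(
        datetime(2019, 8, 1, tzinfo=timezone.utc),
        datetime(2019, 8, 24, tzinfo=timezone.utc),
    ))
```

The returned string can be passed as the filter argument in place of the literal.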

When I ran the code in Visual Studio Code an exception was thrown stating there was a certificate verification error.

requests.exceptions.SSLError: HTTPSConnectionPool(host='login.microsoftonline.com', port=443): Max retries exceeded with url: /mytenant.com/oauth2/token (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)')))

The exception is being thrown by the Python requests module, which the SDK uses under the covers.  The module performs certificate validation by default.  Certificate verification fails because Fiddler, when acting as a proxy configured to intercept secure traffic, presents its own self-signed certificate to the client.  This is what allows it to decrypt secure web traffic sent by the client.

Python doesn’t use the Computer or User Windows certificate store, so even after you trust the self-signed certificate created by Fiddler, certificate validation still fails.  Like most cross-platform solutions it uses its own certificate store, which has to be managed separately as described in this Stack Overflow article.  You should use the method described in the article for any production-level code where you may be running into this error, such as when going through a corporate web proxy.

For the purposes of testing you can also pass the parameter verify with the value of False as seen below.  I can’t stress this enough: be smart and do not bypass certificate validation outside of a lab environment.
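
One way to manage that trust without touching code that makes the calls is the REQUESTS_CA_BUNDLE environment variable, which the requests module reads when verification is left on.  Below is a minimal sketch.  The helper name trust_ca_bundle and the file path in the usage line are hypothetical, and I'm assuming the certificate has already been exported and converted to PEM format.  Whether a given SDK version passes this through untouched is also an assumption worth testing.

```python
import os
from pathlib import Path


def trust_ca_bundle(pem_path: str) -> str:
    """Point the requests module at a PEM bundle via REQUESTS_CA_BUNDLE.

    The bundle replaces the default list of trusted roots, so it must contain
    every root certificate the process needs (for example the proxy's root).
    """
    path = Path(pem_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CA bundle not found: {path}")
    # Only affects this process and any child processes it starts.
    os.environ["REQUESTS_CA_BUNDLE"] = str(path)
    return str(path)


if __name__ == "__main__":
    # Hypothetical path: replace with wherever the exported PEM file lives.
    print(trust_ca_bundle("~/certs/proxy-root.pem"))
```

Call the helper before the first HTTPS request is made so the setting is in place when the connection is opened.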

requests.get('https://somewebsite.org', verify=False)

So this is all well and good when you’re using the requests module directly, but what if you’re using the Azure SDK?  To do it within the SDK we have to pass extra keyword arguments (kwargs), which the SDK refers to as an operation config.  These additional parameters are handed downstream to the underlying methods, including the ones in the requests module.
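
If the pass-through idea is new to you, here is a minimal sketch of the pattern in plain Python.  Neither function is from the SDK or from requests: transport_send and list_items are hypothetical stand-ins I wrote to show how a wrapper can forward keyword arguments it knows nothing about.

```python
def transport_send(url, verify=True, timeout=None):
    """Hypothetical HTTP layer (the role requests plays underneath the SDK)."""
    return {"url": url, "verify": verify, "timeout": timeout}


def list_items(filter_text, **operation_config):
    """Hypothetical SDK-style method.

    It never mentions `verify` itself; whatever extra keyword arguments the
    caller supplies are forwarded unchanged to the transport layer.
    """
    url = "https://example.invalid/items?$filter=" + filter_text
    return transport_send(url, **operation_config)


if __name__ == "__main__":
    print(list_items("x"))                # transport default: verify=True
    print(list_items("x", verify=False))  # caller's override reaches the transport
```

The wrapper's signature stays small while callers can still reach settings that belong to the layer below.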

Here I modified the earlier code to tell the requests methods to ignore certificate validation for the calls to obtain the access token and call the list method.

from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.monitor import MonitorManagementClient

TENANT_ID = 'mytenant.com'
CLIENT = 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX'
KEY = 'XXXXXX'
SUBSCRIPTION = 'XXXXXX-XXXX-XXXX-XXXX-XXXXXXXX'

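# Lab only: verify = False skips certificate validation when requesting the access token
credentials = ServicePrincipalCredentials(
    client_id = CLIENT,
    secret = KEY,
    tenant = TENANT_ID,
    verify = False
)
# Lab only: the same flag is passed to the management client
client = MonitorManagementClient(
    credentials = credentials,
    subscription_id = SUBSCRIPTION,
    verify = False
)

# Lab only: verify = False is passed as operation config for the list call
log = client.activity_logs.list(
    filter="eventTimestamp ge '2019-08-01T00:00:00.0000000Z' and eventTimestamp le '2019-08-24T00:00:00.0000000Z'",
    verify = False
)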

for entry in log:
    print(entry)

After the modifications the code ran successfully and I was able to verify that the SDK was handling paging for me.

fiddler.png

Let’s sum up what we learned:

  • When using an Azure SDK, leverage the Azure REST API reference to better understand the calls the SDK is making
  • Use Fiddler to analyze and debug issues with the Azure SDK
  • Never turn off certificate verification in a production environment.  Instead, confirm the cause of the certificate verification error is legitimate (such as a corporate web proxy) and, if so, add the certificate to the trusted store
  • In lab environments, certificate verification can be disabled by passing an additional parameter of verify=False to the SDK method

Hope that helps folks.  See you next time!