In this post I’ll be continuing my series on Azure Private Link and DNS. In my first post I gave some background into Private Link, how it came to be, and what it offers. For this post I’ll be diving into some DNS patterns you can use to support name resolution with Private Link Endpoints for Azure services. I’ll be covering the six scenarios below:
- Default DNS pattern without Private Link Endpoint
- Azure Private DNS pattern with a single virtual network
- BYODNS (Bring your own DNS) in a hub and spoke architecture
- BYODNS with a custom DNS forwarder in a hub and spoke architecture
- BYODNS with the use of root hints in a hub and spoke architecture
- BYODNS with the use of a custom DNS zone hosted in the BYODNS in a hub and spoke architecture
Before I jump into the scenarios, I want to cover some basic (and not so basic) DNS concepts. If you know nothing about DNS, I’d highly suggest you stop reading here and take a quick few minutes to read through this DNS 101 by RedHat. If you’ve operated a DNS service in a large enterprise, you can skip this section and jump into the scenarios. If you only know the basics, read through the below or else you may not get much out of this post.
- A record – Maps a hostname to an IP address, such as www.journeyofthegeek.com to 22.214.171.124
- CNAME record – An alias record that points one FQDN (fully qualified domain name) to another, commonly used to give a service a human-friendly name and to support high availability configurations
- Recursive Name Resolution – A DNS query where the client asks for a definitive answer and relies on the upstream DNS server to resolve the query to completion. Public resolvers such as Google DNS function as recursive resolvers.
- Iterative Name Resolution – A DNS query where a client receives back a referral to another server in place of resolving the query to completion. Querying root hints often involves iterative name resolution.
- DNS Forwarder – Forwards all queries the DNS service can’t resolve to an upstream DNS service. These upstream servers are typically configured to perform recursive name resolution, but depending on your DNS service (such as Infoblox), you can configure it to request iterative name resolution.
- Conditional Forwarder – Forward queries for a specific DNS namespace to an upstream DNS service for resolution.
- Split-brain / Split Horizon DNS – A DNS configuration where a DNS namespace exists authoritatively across one or more DNS implementations. A common use case is to have a single DNS namespace defined on Internet-resolvable public facing DNS servers and also on Intranet private facing DNS servers. This allows trusted clients to reach the service via a private IP address and untrusted clients to reach the service via a public IP address.
If you can grasp the topics above, you’ll be in good shape for the rest of this post.
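The recursive vs. iterative distinction above is the one that trips people up most, so here’s a quick toy model of the difference. This is a sketch, not a real resolver — the server names and zone data are entirely made up for illustration:

```python
# Toy model of recursive vs. iterative name resolution.
# Each "server" either knows the answer or refers you onward.
SERVERS = {
    "root":    {"referral": {"net.": "net-tld"}},
    "net-tld": {"referral": {"windows.net.": "msft-ns"}},
    "msft-ns": {"answer": {"db1.database.windows.net.": "40.0.0.1"}},
}

def iterative_resolve(name):
    """Client-side iteration: the client itself follows referrals
    from server to server until one answers authoritatively."""
    server = "root"
    hops = [server]
    while True:
        data = SERVERS[server]
        if "answer" in data and name in data["answer"]:
            return data["answer"][name], hops
        # Follow the referral whose zone suffix matches the queried name.
        for suffix, next_server in data["referral"].items():
            if name.endswith(suffix):
                server = next_server
                hops.append(server)
                break

def recursive_resolve(name):
    """A recursive resolver hides the iteration from the client:
    one question in, one final answer out."""
    ip, _ = iterative_resolve(name)
    return ip

ip, hops = iterative_resolve("db1.database.windows.net.")
print(ip, hops)  # the client walked root -> TLD -> authoritative itself
print(recursive_resolve("db1.database.windows.net."))
```

Real resolvers add caching, timeouts, and DNSSEC on top of this, but the shape of the two query styles is the same.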
Scenario 1 – Default DNS Pattern Without Private Link Endpoint
Before we jump into how DNS for Azure services works when a Private Link Endpoint is introduced, let’s first look at how it works without one. For this example, let’s look at a scenario where I’m using a VM (virtual machine) running in a VNet (virtual network) and am attempting to connect to an Azure SQL instance named db1.database.windows.net. No Private Link Endpoint has been configured for the Azure SQL instance and the VNet is configured to use Azure-provided DNS, and thus sends its DNS queries out the 168.63.129.16 virtual IP. I explain how Azure-provided DNS works with the virtual IP in a prior blog post. When I open SQL Server Management Studio and try to connect to db1.database.windows.net, my VM first needs to determine the IP address of the resource it needs to establish a TCP connection with. For this it issues a DNS query to the Azure DNS service.
The FQDN (fully-qualified domain name) for your specific instance of an Azure service will more than likely have two or more CNAME records associated with it. I don’t have any super secret information as to the official reasons behind these CNAMEs and can only theorize that they are used to orchestrate high availability of the service. By using the CNAMEs, Microsoft is able to provide you with a DNS record you can customize to your requirements and place in code. Any failure in the backend requires a simple modification of the alias the CNAME is pointing to, without requiring changes to your code such as modifications to the connection string.
Since Azure DNS is a recursive DNS resolver, it handles resolving each of these records for you and returns the public IP address of your Azure SQL instance. Your VM will then use this public IP address to setup a TCP connection and establish a connection to your database.
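The CNAME-chasing behavior described above can be sketched in a few lines. The intermediate names below mirror the dataslice example from this post, but the exact chain and IP for your instance will differ — treat everything here as illustrative:

```python
# Sketch of a recursive resolver chasing the CNAME chain for an
# Azure SQL hostname down to a final A record. Names/IPs are made up.
RECORDS = {
    "db1.database.windows.net":                ("CNAME", "dataslice4.eastus2.database.windows.net"),
    "dataslice4.eastus2.database.windows.net": ("CNAME", "gateway1.eastus2.cloudapp.azure.com"),
    "gateway1.eastus2.cloudapp.azure.com":     ("A", "40.121.0.10"),
}

def resolve(name, records):
    """Follow CNAME records until an A record is reached."""
    chain = [name]
    rtype, value = records[name]
    while rtype == "CNAME":
        chain.append(value)
        rtype, value = records[value]
    return value, chain

ip, chain = resolve("db1.database.windows.net", RECORDS)
print(ip)     # public IP of the regional gateway (illustrative)
print(chain)  # hostname -> dataslice -> gateway
```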
Scenario 2 – Azure Private DNS pattern with a single virtual network
Now let’s cover how things change when we add a Private Link Endpoint and configure it to integrate with Azure Private DNS. If you’re unfamiliar with how Azure Private DNS works take a read from my prior post on the topic.
In this scenario I’ve added a Private Link Endpoint for my Azure SQL instance. I’ve configured the Endpoint to integrate with an Azure Private DNS zone named privatelink.database.windows.net and have linked the VNet to the Azure Private DNS zone.
Notice the changes to the records in Azure Public DNS. The hostname for my Azure SQL instance now has a CNAME record with an alias defined for db1.privatelink.database.windows.net. There is also a new CNAME record for db1.privatelink.database.windows.net which points to the same dataslice4.eastus2.database.windows.net record as we saw in the last scenario. This is done for two reasons. The first reason is it allows clients accessing the instance through a public IP to continue to do so, because Microsoft has established a split-brain DNS configuration for the privatelink.database.windows.net zone. The second reason is it allows Microsoft to work some magic in the backend (I have no idea how they’re doing it) that redirects queries originating from an Azure VNet linked to the Azure Private DNS zone to be resolved against the record in the Azure Private DNS zone.
This means that clients outside the linked Azure VNet will receive back the public IP address of the Azure SQL instance and clients within the Azure VNet linked to the Azure Private DNS zone will receive back the private IP address of Private Link Endpoint.
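That split-horizon behavior can be modeled with a small sketch: the same query takes a different path depending on whether the client’s VNet is linked to the Private DNS zone. All names and IPs below are illustrative:

```python
# Toy model of the split-horizon resolution for a privatelink zone.
PUBLIC_ZONE = {
    "db1.database.windows.net":             ("CNAME", "db1.privatelink.database.windows.net"),
    "db1.privatelink.database.windows.net": ("A", "40.121.0.10"),  # public gateway
}
PRIVATE_ZONE = {  # privatelink.database.windows.net, linked to the VNet
    "db1.privatelink.database.windows.net": ("A", "10.0.0.5"),     # endpoint NIC
}

def resolve(name, vnet_linked):
    """Linked VNets get answers from the Private DNS zone where one
    exists; everyone else falls through to the public zone."""
    while True:
        if vnet_linked and name in PRIVATE_ZONE:
            rtype, value = PRIVATE_ZONE[name]
        else:
            rtype, value = PUBLIC_ZONE[name]
        if rtype == "A":
            return value
        name = value  # follow the CNAME

print(resolve("db1.database.windows.net", vnet_linked=False))  # 40.121.0.10
print(resolve("db1.database.windows.net", vnet_linked=True))   # 10.0.0.5
```

Note how the public CNAME to db1.privatelink.database.windows.net is the pivot point: it is the record the platform can answer differently for linked VNets.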
Scenario 3 – BYODNS in a Hub and Spoke Architecture
Scenarios 1 and 2 are important to understand, but the reality is very few organizations have such a simple DNS pattern for their Azure footprints. Most enterprises using Azure will be using a hub and spoke architecture. Shared services such as a DNS service (Windows DNS, InfoBlox, BIND, whatever) are placed in the hub VNet and are shared among spoke VNets containing various workloads. This DNS service will typically provide advanced features not provided by Azure Private DNS (at this time) such as conditional forwarders and DNS query logging. You can check out my prior post on this pattern if you want to understand the details.
In the scenario below I’ve provisioned a DNS service in the hub VNet and configured it to forward all queries it can’t resolve to the 168.63.129.16 virtual IP. Notice that I’ve now linked the Azure Private DNS zone to the hub VNet instead of the spoke VNet. This is to ensure the DNS service can resolve the queries to this Azure Private DNS zone. It also lets me take advantage of the advanced features of the DNS service such as those I discussed above.
The resolution with Azure-provided DNS occurs in the same manner as scenario 2 with the exception being that the DNS service performs the query and returns the results to the VM running in the spoke.
Scenario 4 – BYODNS With a Custom DNS Forwarder in a Hub and Spoke Architecture
Next up we have a scenario similar to the above where we have a hub and spoke architecture and have the DNS service in the hub configured to forward all queries it can’t resolve to an upstream forwarder. Maybe it’s to some on-premises DNS server, a 3rd party threat service, or simply Google’s DNS service. Whatever the case, this scenario means we now have to care about recursive resolution and conditional forwarders.
If the upstream DNS service you’re using supports recursive name resolution and the DNS service you’re using in your hub is configured to send recursive queries to it, then any queries for db1.database.windows.net will resolve to the public IP address of the service. The reason is that with recursion you’re asking the upstream DNS service to chase down the answer for you, and that upstream DNS service only knows about the public privatelink.database.windows.net DNS zone and does not have access to the Azure Private DNS zone.
To handle this scenario you want to create a conditional forwarder for database.windows.net (or the recommended zone for the service you’re using) and point it to Azure-provided DNS via the 168.63.129.16 virtual IP. This enables you to let the Azure platform handle the split-brain DNS challenge as it has been engineered to do.
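The decision the BYODNS server makes here boils down to longest-matching-suffix zone selection: a conditional forwarder for a matching zone wins over the server-level default forwarder. A simplified sketch (real DNS servers implement zone matching internally, and the Google DNS default forwarder is just one example):

```python
# Toy model of conditional forwarder selection on a BYODNS server.
CONDITIONAL_FORWARDERS = {
    "database.windows.net": "168.63.129.16",  # let Azure DNS handle split-horizon
}
DEFAULT_FORWARDER = "8.8.8.8"  # server-level upstream forwarder

def pick_forwarder(qname):
    """The most specific (longest) matching zone suffix wins;
    everything else goes to the default upstream forwarder."""
    matches = [z for z in CONDITIONAL_FORWARDERS
               if qname == z or qname.endswith("." + z)]
    if matches:
        return CONDITIONAL_FORWARDERS[max(matches, key=len)]
    return DEFAULT_FORWARDER

print(pick_forwarder("db1.database.windows.net"))  # 168.63.129.16
print(pick_forwarder("www.google.com"))            # 8.8.8.8
```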
Scenario 5 – BYODNS With The Use of Root Hints in a Hub and Spoke Architecture
In scenario 5 we again have the same architecture as the prior scenarios, with a few differences. First off, we are now sending iterative queries to the DNS root hints instead of an upstream forwarder. This means our DNS service will chase the entirety of the resolution, requesting referrals from each DNS server in the path to resolve the FQDN. The usage of iterative queries gives us the option of creating a conditional forwarder (our second difference) pointing to 168.63.129.16 for the privatelink.database.windows.net zone, or optionally sending that query to some other DNS service we’re running in an on-premises data center or another cloud.
The key takeaway of this configuration is that using root hints puts a bigger burden on your DNS service, because it is resolving many more queries itself versus handing them to an upstream DNS service like Azure DNS. Additionally, if you opt to maintain your own DNS zone, it’s on you to figure out how to manage the whole lifecycle of the DNS records for the Private Link Endpoints.
Scenario 6 – BYODNS With The Use of a Custom DNS zone Hosted in The BYODNS In a Hub and Spoke Architecture
The last scenario I’ll cover is the use of a custom DNS zone named something outside of the Microsoft recommended zones (more required than recommended) that is hosted in your BYODNS service. Let me save you any pain and suffering by telling you this will not work. You’re probably asking why it won’t work. The answer to that question requires understanding how data is secured in transit to Azure services.
Since you surely don’t want your data flowing through a network in clear text, most Azure services will either require or support encryption of data in transit using TLS (Transport Layer Security). If you’re not familiar with the TLS flow, you can get a reasonably good overview here. The key thing you want to understand is that the TLS session is often established using the certificate served up by the Azure service. In addition to confidentiality, it also authenticates the service to your client.
The authentication piece is what we care about here. Without going too deep into the weeds, the certificate contains a property called the SAN (subject alternative name) which lists the identities of the services the certificate should be used for. These identities are typically DNS names such as db1.database.windows.net. If you try to go ahead and create a custom DNS zone and attempt to access the Azure service through that name, you’ll run into a certificate mismatch error which is due to DNS name of the service you typed into your browser or that was called by your library not matching the identities listed in the certificate.
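The name check that fails can be sketched as follows. This is a simplified stand-in for what real TLS libraries do (they implement RFC 6125 validation with more rules), and the SAN list below is illustrative, not the actual certificate Azure SQL presents:

```python
# Simplified model of hostname-vs-SAN matching during TLS validation.
def matches_san(hostname, san_entries):
    """Return True if the hostname matches a SAN entry.
    Handles exact matches and single-label wildcards only."""
    for san in san_entries:
        if san.startswith("*."):
            # A wildcard covers exactly one left-most label.
            if "." in hostname and hostname.split(".", 1)[-1] == san[2:]:
                return True
        elif hostname == san:
            return True
    return False

SANS = ["*.database.windows.net"]  # illustrative SAN list

print(matches_san("db1.database.windows.net", SANS))  # True
print(matches_san("db1.contoso.internal", SANS))      # False -> certificate mismatch error
```

A custom zone like db1.contoso.internal will never appear in the service certificate’s SAN list, which is exactly why the connection fails validation.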
Yes I know there are ways to get around this by ignoring certificate mismatches (terrible security decision) or doing something funky like overriding database.windows.net (this is against Microsoft recommendations) with your own zone. Don’t do this. If you want the service to support this type of functionality, submit a feedback request.
Now if anyone is aware of a way to get around this limitation that is supported and not insane, I’d definitely be interested in hearing about it.
Before I conclude this series I want to provide one more gotcha. Take note that while Private Link Endpoints can be integrated with Azure Private DNS and the records can be automatically created, they do not share the full lifecycle. This means that if you delete a Private Link Endpoint and create a new one for the same resource, the NIC (network interface) associated with the endpoint may get a new IP. This will cause your connections to fail because queries will resolve to the prior IP. You will need to manually clean up the A record hosted in the Azure Private DNS zone before creating the new endpoint.
Well folks that wraps it up. Hopefully you found this information helpful and it cleared up some of the mystery of DNS patterns with Private Link Endpoints. I want to plug a stellar write-up by Dan Mauser, who is one of the networking all stars over at Microsoft. He wrote up an incredibly detailed post on this topic which covers the topic more exhaustively than I did above.
Fantastic post, thanks. I have on-premises Active Directory Domain Controllers that each run AD-Integrated DNS. Does this sound like a possible solution to you?
1. Extend my AD into the cloud by creating Azure VM’s and setting up AD + AD-DNS on them.
2. Configure conditional forwarders for privatelink domains on internal DC’s to point to Azure DC’s.
3. Configure conditional forwarders for privatelink domains on Azure DC’s to point to 168.63.129.16
Hi Nathan. That looks like it should work. The one thing you’ll want to plan around is ensuring that the conditional forwarders aren’t Active Directory integrated, or else they’ll be replicated as part of the directory data. That would inhibit your ability to have a different set of conditional forwarders across DCs in the same domain.
I’m fairly certain you can control that on a per conditional forwarder basis, but it’s been a long time since I’ve played around with it.
I got it working. I created a couple of new DC/DNS servers in Azure. I did not bother with conditional forwarders on these; I simply set one server-level forwarder and pointed it to 168.63.129.16. Then I configured conditional forwarders on my on-premises DC/DNS servers. I made sure these conditional forwarders were not replicated in my domain, and pointed them to the Azure DC/DNS servers. At first I tried making the conditional forwarders for the privatelink domains, but that didn’t work as expected. So, I changed them to the public domains (azurewebsites.net, database.windows.net, etc.) and that did the trick.
Great to hear!
Hi, great article here. I am in the scenario 4 situation, with an on-premises Infoblox grid. Do you have any specifics on the DNS forwarder? I see some example templates running a Linux machine, but no specifics on how to configure it afterward. https://azure.microsoft.com/en-us/resources/templates/301-dns-forwarder/
I’m wondering if it makes sense to make a DNS forwarder with windows instead, as well.
Hi there. My typical advice to customers is to use whatever DNS platform they’re comfortable with. If it’s Linux, throw together a BIND server. If it’s Microsoft, use a Microsoft server running the DNS service (or cheat and leverage an existing domain controller).
In your case where you have an existing InfoBlox grid, you may want to look at a tactical and strategic solution. Tactically, it sounds like Microsoft DNS is what you’re most comfortable with, so that might be good for the short term. Longer term you should begin planning on how you’re going to expand InfoBlox into Azure. Expand the existing grid or create a new one?
With Infoblox you have the Cadillac of DNS solutions, so no sense in driving the Corolla (Microsoft DNS / Bind DNS) for any longer than you need to.
Hope that helps!
Thanks for the write-up! I’m picking up my hair and putting it back on my head. One small correction: the virtual IP should be 168.63.129.16
Thanks Matt! I’ll fix that typo.
Can I not just create a forward lookup zone in my DNS / DC with the name privatelink.database.windows.net, and inside it create the record db1.privatelink.database.windows.net? I tested it in my environment and it worked.
Note that the DNS / DC has a forwarder configured for Google’s DNS (8.8.8.8).
Hi Felipe. That will work with Windows DNS. This is due to an odd behavior where it will take the results of a recursive query returned by an upstream forwarder, iterate through the returned records looking for CNAMEs of namespaces it is authoritative for, and return the record from its local forward lookup zone. This behavior is specific to Windows DNS and does not work on all third-party DNS products. That is why I did not list it.
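A toy model of that Windows DNS behavior might look like the following — the server scans the answer chain returned by the upstream forwarder for a CNAME that lands in a zone it hosts locally, and answers from its own data instead. All names and IPs are made up:

```python
# Toy model of the Windows DNS local-zone-override behavior.
UPSTREAM_ANSWER = [  # what the upstream forwarder returns for the query
    ("db1.database.windows.net", "CNAME", "db1.privatelink.database.windows.net"),
    ("db1.privatelink.database.windows.net", "A", "40.121.0.10"),
]
LOCAL_ZONES = {  # forward lookup zones hosted on the Windows DNS server
    "privatelink.database.windows.net": {
        "db1.privatelink.database.windows.net": "10.0.0.5",
    },
}

def resolve(upstream_answer, local_zones):
    """Scan the upstream chain; if any CNAME target falls inside a
    locally authoritative zone, answer from local data instead."""
    for name, rtype, value in upstream_answer:
        target = value if rtype == "CNAME" else name
        for zone, records in local_zones.items():
            if target.endswith("." + zone) and target in records:
                return records[target]  # authoritative local data wins
    # Otherwise fall through to the upstream A record.
    return upstream_answer[-1][2]

print(resolve(UPSTREAM_ANSWER, LOCAL_ZONES))  # 10.0.0.5
```

As the reply above notes, this override behavior is specific to Windows DNS; many third-party DNS products will simply return the upstream answer.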
Is it possible to create a conditional forwarder with the name of the storage account (storageteste.blob.core.windows.net), redirecting to 168.63.129.16, when the Windows DNS VM is inside the VNet linked to the Azure Private DNS zone (privatelink.blob.core.windows.net)?
Yes, that is possible and I have tested it. You could also create a forward lookup zone with the FQDN of the storage account and create a single A record within the forward lookup zone with a blank name (to reference the parent) and the IP of the private endpoint. While both of these solutions work, there are a ton of negatives.
1. This will not scale for large deployments.
2. Creation/modification/deletion of a forward lookup zone or conditional forwarder is a server-level change and thus presents more risk if a mistake is made.
3. This works in Windows DNS but I’m not sure this will work in 3rd party DNS products.
4. Specific to Active Directory-integrated DNS, in large deployments that are spinning up and spinning down resources, this could begin bloating the Active Directory DIT to the point where there are performance impacts within AD.
I would not recommend this pattern in a production environment.
I tested in my environment creating a zone called storageteste.blob.core.windows.net with a blank type A record pointing to the private IP of the private endpoint, and it worked.
I also tested creating a zone called storageteste.privatelink.blob.core.windows.net with a blank type A record pointing to the private IP of the private endpoint, and it worked.
Windows DNS has a Google DNS forwarder only (8.8.8.8).
What would be the difference between the 2 modes?
There is no real difference beyond the fact that the second scenario is specific to DNS products that exhibit the behavior I described above. If you opted to move away from Windows DNS to a third-party DNS product, you might not experience the same result.
So, as I understand it, I could use scenario 1 for both Windows and Linux; that is, creating a zone with the name of my storage account storageteste.blob.core.windows.net and pointing it to the private IP of the Private Link endpoint.
With this method, you would be skipping the following DNS hops:
No problem then, skipping these jumps?
I can’t say whether the first method would work for DNS products like BIND; you would need to test to confirm. In theory, method one should work.
On the second question, you are theoretically already skipping those records in the normal pattern of using Azure Private DNS, since the query is intercepted at the second CNAME and forwarded to the private DNS zone. You would want to contact Microsoft support to get an authoritative answer.
This doesn’t work across multiple subscriptions. Otherwise, great article.
Hi Tom. I read through your post on the GitHub issue, but can you describe it further? If I understand correctly, you are saying you are unable to create a Private Endpoint in SubscriptionA and have it register its A record in a Private DNS Zone in SubscriptionB. Is this an accurate description?
If this is your use case, this is actually possible. In the portal when you toggle the option to integrate with Private DNS, you’ll see a drop down menu for subscription. You would then select the appropriate zone from the subscription. To ensure this feature wasn’t broken, I tested it a few minutes ago using both the Portal experience and an ARM template.
Below is an ARM template I wrote that you can try that demonstrates this same concept.
Ahhh, this makes a lot more sense. So the way to do this is a) pre-create the matching required private DNS zone(s) matching the azure private link domains used for a particular service and b) create a private endpoint as a standalone resource to enable linking it back to those central existing zones. Thanks Matt! Note that when you are creating a brand new private link enabled service such as ACR, there is no option to find an existing zone in another subscription. That’s really the problem I was trying to raise here. Looking at the ARM API docs, it looks like that `privateEndpointResourceId` is the key field. The portal does not expose this during the “create resource wizard”.
None of the documentation explains this use case so it’s been a frustrating process to try and find out how to do this. Your blog post definitely helps illustrate the integration scenarios well – I just couldn’t figure out how to get the damn DNS record into the right zone in a predictable way!
Awesome! Glad it helped and cleared it up.
I agree there is a lot of room for improvement in the public documentation for Private DNS and Private Endpoints. I try to look at the bright side in that it gives me something to write about. 🙂
Disclaimer: I am not a networking expert.
I think you have addressed this already here – “The first reason is it allows clients accessing to instance through a public IP to continue to do so because Microsoft has established a split-brain DNS configuration for the privatelink.database.windows.net zone.”
But wanted to understand it in simpler terms.
I am trying to set up SQL Azure with Private Link, so that my new on-prem applications can access the database via the already established Express Route.
But there are other existing on-prem applications using SQL Azure without Private Link.
Am I right in assuming that configuring the DNS as per Scenario #2 (or other scenarios) may cause the applications using public FQDN to connect with SQL Azure to fail?
Hi Fred. Your scenario may be challenging to make work. If you want your on-premises workloads to be able to resolve Private Link IP addresses, you typically have to configure your underlying DNS infrastructure to support that resolution path. Typically this is done by configuring on-premises DNS servers to forward traffic for the public zone, such as blob.core.windows.net, to a DNS resolver in Azure. The challenge is that it’s typically a binary decision that affects all machines on-premises.
To make this work you’d need to have a different resolution path for this subset of machines that need to resolve to the Private Link IP address. This could be via HOSTS files configured directly on the machines (for testing purposes) or a whole separate DNS infrastructure dedicated to this type of resolution. My recommendation would be to make an organizational decision on how you’re going to handle PaaS services in the future. Personal opinion, you shouldn’t mix the two.
This is really a very helpful blog post. I am not a networking expert and have one query around private DNS zones.
Let’s say an organisation has its own DNS infrastructure and is creating A records for db1.database.windows.net to resolve to 10.0.0.2 (the private endpoint for SQL). In this case, we don’t actually need a private DNS zone in Azure named privatelink.database.windows.net. Correct?
Hi Gary. It sounds like you’re saying an organization has decided to create a split brain DNS scenario. In this scenario a DNS namespace is resolvable publicly, but the organization has decided to also create a non-public DNS namespace which they are hosting authoritatively within their own infrastructure.
Technically, the scenario you are describing would work. If an organization decided to host a private database.windows.net DNS zone, or a DNS zone named db1.database.windows.net, queries would likely resolve to the values in the A records they create (this could vary across DNS platforms). There are a few downfalls to this. Creating a split-brain DNS scenario for a zone you don’t own could introduce weird issues where that zone is used for other 3rd party services (this is especially true with PaaS zones). If you decided to create a zone to reflect a single record (like a zone for db1.database.windows.net), the downfall there is that it’s somewhat of an anti-pattern, which could be problematic when troubleshooting and would also introduce the additional operational overhead of maintaining those records at scale.
Would a reverse proxy / app gateway performing ssl termination work with scenario 6?
Funny you mention that because a peer and I were just talking about that pattern a few weeks ago. I haven’t tested it myself, but we were fairly certain it would work to get around the custom domain limitation at least for HTTP/HTTPS endpoints. Nothing would be stopping you from tossing in a 3rd party reverse proxy like an F5 LTM to handle the non-HTTP/HTTPS traffic.
That topic actually led into another topic during our discussion which pops up fairly often with Private Endpoints, and that’s multi-region / DR scenarios. Microsoft doesn’t give great guidance on how to handle DR/HA scenarios with Private Endpoints, largely because it’s a gap in Azure’s DNS offerings. Azure DNS is super simplistic from a DNS service perspective and doesn’t offer any type of probing. Traffic Manager and Front Door are public endpoint only, which creates a gaping hole for internally-facing applications.
I’m rambling a bit, but the gist of it is that if you’re going to solve the problem for custom domains, you may as well incorporate a product that can also solve the GSLB problem, addressing both at once. For example, using an F5 LTM/GTM combo.
Either way, you are right on track with potential workarounds for custom domains. Maybe I’ll make it a future blog post. 🙂
Helpful post thank you. My team and I were wondering about DR etc if we private endpoint all the things – in the event of vnet failure we lock ourselves out. I reached out to Microsoft and they recommended the vnet and private endpoint be deployed somewhere else as well (perhaps the B region) with vnet peering so there was another route in, advice that seems reasonable enough to take.
I was also wondering if you had any insight on best practice for private DNS zones. If we looked at privatelink.database.windows.net and we had Dev/Prod/Live subscriptions with separate vnets and Azure SQL servers in all three, would you create one main privatelink.database.windows.net zone that contained all records and was linked to multiple vnets or would you think a single zone per vnet would be more appropriate?
Many thanks for any insight you might have!
I’m a fan of maintaining a single authoritative zone for each Private Link namespace. The zone is then linked to the VNet containing your DNS proxy device. The zone itself is treated as production, so automation is important to ensure lower environments aren’t inhibited by a more tightly controlled resource.
The challenge with separate zones comes down to on-premises resolution requirements. If you have resources on-premises or in another cloud that have to resolve those private DNS zones, you need one authority to point those queries to. If you have separate zones for each environment, you’ll be in a bind.
If you don’t have requirements for resources outside of Azure to resolve records in those zones, you could create separate namespaces for dev/qa/prod/etc, I suppose. My concern would be unwinding that configuration if new requirements arrive in the future.
Thanks for replying.
Just to quickly clarify on my first thing the peering isn’t necessary apparently (https://docs.microsoft.com/en-us/azure/dns/dns-faq-private#will-azure-private-dns-zones-work-across-azure-regions-)
Regarding the single zone, I like that too it seems much easier to manage and I don’t see any harm in our internal envs being able to know about IPs in our Live env if all they can do is query them.
We’re Azure cloud native so no on-prem or alternate providers to worry about.
Thanks again for your insight.
Thank you for the great articles you have!
I have read them but still have an issue.
I have an Azure VNet 10.6.0.0/16 with several server VMs accessing Azure SQL using the dbname.database.windows.net DNS name. The servers are part of the domain, and we use AADDS with two DNS servers, 10.0.0.4 and 10.0.0.5.
I have created Azure SQL Private endpoint with the IP 10.6.0.8 and FQDN dbname.privatelink.database.windows.net.
Private DNS zone “privatelink.database.windows.net” created and has A record for my “dbname” and its 10.6.0.8 IP.
When I run SQL connectivity test provided by Microsoft from the VM server – it works but says: “This server has a private link alias but DNS is resolving to a regular Gateway IP address, running public endpoint tests.”
When I do “nslookup dbname.database.windows.net” it returns me:
Why is the FQDN not resolved to the private link IP? Should something be done on the AADDS DNS servers?
Thanks a lot for your help!
It sounds like the Private DNS zone may not have been linked to the virtual network the DNS servers are in. I haven’t touched AADDS in about 4 years, but I believe it gets delegated a subnet in a VNet. You need to ensure the Private DNS zone is linked to that VNet.
Thank you for your response!
I had my Private DNS zone linked to a VNet which was different from the AADDS VNet.
Once I added another link to the AADDS VNet, everything started working as expected!
You are the best!
Awesome! Glad you got it working.