Welcome back to part two of my series of posts on resolving problems with AD FS. You can check out part 1 here. In this post I’ll look at another problem you may encounter while administering the service.
With the introduction of AD FS 2012 R2, Microsoft decoupled AD FS from IIS. AD FS running on Windows Server 2012 R2 or later now uses the HTTP Server API (more often referred to as HTTP.SYS). HTTP.SYS is a kernel-mode driver that was introduced in Windows Server 2003 and is used by a Windows system to listen for HTTP and HTTPS requests (check out this article for a detailed breakdown of how it works). Infrastructure services such as IIS and WinRM use the driver. By integrating AD FS directly with HTTP.SYS, Microsoft was able to cut the footprint of the solution by eliminating the need for IIS. Awesome, right? Of course it is. However, it is a bit more challenging to troubleshoot.
Issue 2: Replacing the AD FS Service Communications certificate
The service communications certificate is one of the “big three” certificates used within an AD FS implementation. The certificate that is assigned as the service communications certificate is used to protect web communication between clients and the AD FS service (i.e. SSL/TLS). Like any certificate, it will have a standard lifecycle and will eventually need to be replaced. When that time comes, you can run into a very interesting problem depending on how you go about replacing that certificate.
If you’ve been managing an AD FS instance for any period of time, you’ve more than likely become quite familiar with the AD FS Management Console. When replacing the certificate in AD FS 2012 R2 or above, you may be tempted to use the Set Service Communications Certificate action seen below. Let’s give it a try, shall we?
I first requested a new web certificate from the instance of AD CS through the Certificates MMC and placed it in the Computer store. I then granted the service account AD FS is using READ access to the private key. After that I used the Set Service Communications Certificate action and selected the new certificate. A quick check confirms that the thumbprint of the certificate now in use matches the thumbprint of the new certificate (pay attention to the thumbprint, I’ll reference it again later). The last step is to restart the AD FS service.
Let’s now test the sample claim app I described in my first post.
Uh oh. What happened? A check of the Application, System, and AD FS Admin logs shows no errors or warnings, and neither does the AD FS Debug log after another attempt. Heck, even the log for the HTTP.SYS kernel driver, httperr.log in C:\Windows\System32\LogFiles\HTTPERR, is empty. This is yet another instance where the answer could not be found in any of the logs I reviewed, because it’s another error related to the integration with HTTP.SYS. What to do next?
Much of the administration of the integration with HTTP.SYS is done using netsh. Here we’re going to look at the certificate bindings configured for the HTTP listeners using the command http show sslcert from the netsh command prompt.
Well, our bindings are there, but look at the thumbprint: 12506a00b40617b096002089383015bbbb99e970. That thumbprint does not match the thumbprint of the new certificate I set as the Service Communications certificate. So what happened? My best guess is that when one of the HTTPS listeners is hit, the configuration in the AD FS database does not match the configuration of the HTTP.SYS listeners, causing AD FS to crash. How do we fix it? As I found out from this blog, there is one additional command that needs to be run to set up the listeners with the proper bindings: Set-AdfsSslCertificate. After running Set-AdfsSslCertificate with the new thumbprint and then restarting the AD FS service, netsh http show sslcert shows the correct thumbprint and the sample claim app works as expected.
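If you want to check for this mismatch without eyeballing the output, the comparison can be scripted. What follows is only a rough sketch in Python, not anything AD FS ships with. It assumes you have saved the text of netsh http show sslcert yourself and that each binding appears as a "Hostname:port" or "IP:port" line followed by a "Certificate Hash" line, which is how it looks on my English-language servers. The sample text is trimmed, and the host names, the new thumbprint, and the function names are all made up for illustration.

```python
# Trimmed, illustrative sample of `netsh http show sslcert` output.
# The host names are hypothetical; the hash is the stale thumbprint from this post.
SAMPLE_OUTPUT = """\
SSL Certificate bindings:
-------------------------

    Hostname:port                : sts.example.com:443
    Certificate Hash             : 12506a00b40617b096002089383015bbbb99e970

    Hostname:port                : localhost:443
    Certificate Hash             : 12506a00b40617b096002089383015bbbb99e970
"""

# Hypothetical thumbprint standing in for the new certificate.
NEW_THUMBPRINT = "0123456789abcdef0123456789abcdef01234567"

HEX_DIGITS = set("0123456789abcdefABCDEF")


def normalize_thumbprint(value):
    """Keep only hex digits and lowercase them.

    This drops spaces and any invisible characters that can come along
    when a thumbprint is copied out of a certificate dialog.
    """
    return "".join(ch for ch in value if ch in HEX_DIGITS).lower()


def parse_bindings(netsh_output):
    """Map each listener endpoint to the certificate hash bound to it."""
    bindings = {}
    endpoint = None
    for line in netsh_output.splitlines():
        key, sep, value = line.partition(" : ")
        if not sep:
            continue  # header, separator, or blank line
        key, value = key.strip(), value.strip()
        if key in ("Hostname:port", "IP:port"):
            endpoint = value
        elif key == "Certificate Hash" and endpoint is not None:
            bindings[endpoint] = normalize_thumbprint(value)
    return bindings


def stale_bindings(netsh_output, expected_thumbprint):
    """Return the endpoints whose bound hash differs from the expected one."""
    expected = normalize_thumbprint(expected_thumbprint)
    return [
        endpoint
        for endpoint, thumbprint in parse_bindings(netsh_output).items()
        if thumbprint != expected
    ]


if __name__ == "__main__":
    for endpoint in stale_bindings(SAMPLE_OUTPUT, NEW_THUMBPRINT):
        print("Binding still uses the old certificate:", endpoint)
```

Any endpoint the script reports is a listener that still needs Set-AdfsSslCertificate run against it. An empty result means the HTTP.SYS bindings agree with the thumbprint you passed in.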
What you should take from this post is that while integrating with HTTP.SYS helps to limit the AD FS footprint, it also adds some intricacies to troubleshooting the service when it stops working. In the next and final post in this series I will cover an issue that can pop up when a Web Application Proxy (WAP) is added to the mix.
See you next post!