Obscure Failures

This morning I installed the latest round of critical Microsoft security patches on a Windows host. It was more complicated than expected.

Microsoft has greatly improved the Windows Update service. And while there remain some issues with how it interacts with Automatic Updates and manual patching of the host, it is generally reliable. Its major failing, from a web services perspective, is that it requires Internet Explorer. But our problem today wasn’t with Internet Explorer per se; it was with Microsoft SQL Server.

When I asked Internet Explorer to get the Windows Update web site, it froze. By using my brilliant deductive skills, I observed that it was having a problem connecting. update.microsoft.com does not respond to ping, but traceroute showed that we were using the correct path to that host. I could resolve the name just fine from the Windows shell, so it was not a DNS issue.

Or was it? Internet Explorer was able to get the site by IP address, but froze when it had to resolve a name. And a brief look with netstat showed that it never opened a TCP connection when it attempted to use the name. While I cursed the lack of a sniffer on this host, I turned my attention to something else. Some time later I glanced back at the display and saw I was being prompted to permit Microsoft to muck with my system. It had found the site!
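The distinction between "slow to resolve" and "slow to connect" can be measured rather than guessed at. Below is a minimal sketch in Python, assuming an interpreter is available on the host. The helper names timed and probe are made up for illustration. Python's resolver call is not necessarily the same code path Internet Explorer uses, so a fast result here would not rule out a browser-side stall.

```python
import socket
import time


def timed(fn, *args):
    """Run fn(*args) and return (seconds_elapsed, result_or_exception)."""
    start = time.monotonic()
    try:
        outcome = fn(*args)
    except OSError as exc:  # resolver errors and socket timeouts are OSError subclasses
        outcome = exc
    return time.monotonic() - start, outcome


def probe(host, port=80, timeout=10.0):
    """Time name resolution and the TCP connect as two separate steps."""
    dns_secs, infos = timed(
        socket.getaddrinfo, host, port, socket.AF_INET, socket.SOCK_STREAM
    )
    if isinstance(infos, Exception):
        return {"dns_secs": dns_secs, "dns_error": str(infos)}

    addr = infos[0][4]  # (ip, port) of the first address returned
    tcp_secs, conn = timed(socket.create_connection, addr, timeout)
    report = {"dns_secs": dns_secs, "ip": addr[0], "tcp_secs": tcp_secs}
    if isinstance(conn, Exception):
        report["tcp_error"] = str(conn)
    else:
        conn.close()
    return report
```

Calling probe("update.microsoft.com") returns a small dictionary. A large dns_secs with a small tcp_secs would point at name resolution, which is what the netstat observation above suggests.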

One of the changes I’ve noticed in the current version of Windows Update is that if you do not run the ActiveX control within a certain period of time, the operation times out and the error 0x8DDD0004 is displayed. (This appears to mean ErrorControlFailed.) The error message is less than helpful in its explanation, and the site tends not to function again until you clear your cache. So I did that, and waited again.

While waiting, I noticed that the tiny SQL Server Service Manager icon in the tray was red.

That was wrong.

The Service Manager showed SQL Server to be in the “starting” state, as did the Service Control Manager. The event log did not record any startup events for SQL Server. It wasn’t in an uncontrollable state, and shut down just fine:

net stop mssqlserver

It started fine too:

net start mssqlserver

Immediately afterward, Internet Explorer began working. WTF?

I quickly patched the box and rebooted. Some would say that I should have resolved the SQL Server issue first, but the DBMS was coming up fine once the service was restarted. A post-reboot inventory showed that SQL Server was still just pretending to start. And in pretending to start it prevented the Local Security Authority from authenticating in a timely manner, prevented Terminal Services from accepting connections, prevented ASP pages from executing, and prevented Internet Explorer from resolving names.
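A post-reboot inventory like this one can be partly automated. The following is a rough sketch, not a definitive tool: it assumes the usual text layout printed by the Windows sc query command (a SERVICE_NAME: line followed by an indented STATE line such as "STATE : 2  START_PENDING"), and the function name pending_services is invented for this example.

```python
def pending_services(sc_output):
    """Return (name, state) pairs for services stuck in a *_PENDING state.

    Assumes the layout printed by "sc query": a SERVICE_NAME: line,
    later followed by a line like "STATE : 2  START_PENDING".
    """
    stuck = []
    name = None
    for raw in sc_output.splitlines():
        line = raw.strip()
        if line.startswith("SERVICE_NAME:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("STATE") and name and ":" in line:
            # After the colon come the numeric code and the symbolic state.
            fields = line.split(":", 1)[1].split()
            if len(fields) >= 2 and fields[1].endswith("_PENDING"):
                stuck.append((name, fields[1]))
    return stuck
```

On the host itself, the input could come from subprocess.run(["sc", "query", "state=", "all"], capture_output=True, text=True).stdout. The space after "state=" is part of sc's syntax. A service that still shows START_PENDING several minutes after boot, as SQL Server did here, is worth a closer look.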

What caused it to fail, I don’t yet know. Nor do I know why SQL Server’s failure should cause other applications to halt dead in their tracks. The latter is more disturbing.

But then, that’s why I hate Windows. It fails in obscure ways. It is inscrutable.

Inscrutable
Incapable of being analyzed and understood because the essential facts or factors are concealed.