I was debugging some IIS crashes last week and thought I’d follow up with a few basics here, as it’s a common enough problem. Another time I might write a series of posts on using the Windows debuggers in detail and how to go about this from scratch, but for the moment here’s a quick summary of some basic starting points. I have written more detailed examples of .NET debugging in the past on my MSDN blog, although those use slightly different CLR versions and extensions, which have since been updated.
Firstly, I walked into this situation blind, as you often do in such matters. The developers of the application in question told me that they had been experiencing crashes across all their web servers since their last code deploy. (Insert questions and comments here about the testing regime which allows this to occur.) The Windows application event log showed the following:
Faulting application name: w3wp.exe, version: 7.5.7601.17514, time stamp: 0x4ce7afa2
Faulting module name: MSVCR100_CLR0400.dll, version: 10.0.30319.1, time stamp: 0x4ba2211c
Exception code: 0xc00000fd
Fault offset: 0x0000000000057f91
Faulting process id: 0x11f0
Faulting application start time: 0x01cd29d083c0e51e
Faulting application path: c:\windows\system32\inetsrv\w3wp.exe
Faulting module path: C:\Windows\system32\MSVCR100_CLR0400.dll
Report Id: fdd757b8-95ee-11e1-94a4-005056bc00a6
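For reference, the Exception code field in an entry like this is an NTSTATUS value, and a small lookup table is enough to put a name to the common ones. The following is only a minimal sketch: the function name describe_exception_code and the table KNOWN_CODES are made up for illustration, and the table is deliberately incomplete.

```python
import re

# Deliberately incomplete table of well-known NTSTATUS values.
# The names are the standard Windows constants; the table itself is illustrative.
KNOWN_CODES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exception_code(event_text):
    """Find the 'Exception code:' field and return (code, name), or None."""
    match = re.search(r"Exception code:\s*0x([0-9a-fA-F]+)", event_text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, KNOWN_CODES.get(code, "unknown")


sample = "Exception code: 0xc00000fd\nFault offset: 0x0000000000057f91"
code, name = describe_exception_code(sample)
print(f"0x{code:08x} -> {name}")
```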
The key here is the Exception code: 0xc00000fd, which translates as stack overflow (never good!). I pulled the logs and agreed with their initial assessment, but they said that they couldn’t find any dumps that had been produced automatically. As such, I immediately attached DebugDiag to one of the web servers to ensure that I could capture a full dump the next time it occurred. Once this was in place, I went back through the logs and dug around the server in more detail to check whether it was really the case that the server had not produced any dumps automatically. Sometimes in Windows 2008 and above, WER logging is not particularly transparent about what it’s doing, so I checked manually. After a short while of searching for .dmp or .mdmp files I noted that the default WER location for these servers was
Once I browsed there I found a treasure trove of old dumps and error logs and all sorts of joy which helped me diagnose the issue. WER had not written to the event logs that it was taking dumps and collecting information, but all the same I wasn’t surprised to see that it had been doing its job, since there had been a lot of application crashes. It just goes to show that it’s always worth a look.
In this case the actual debugging was fairly simple, as a stack overflow crash is pretty easy to diagnose. If you’re familiar with the Windows debuggers, it’s just a matter of these steps:
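The manual search for .dmp and .mdmp files described above can be sketched as a short script. This is an illustration rather than what I actually ran: the function names find_dumps and newest_first are hypothetical, and the root folder to search is left to the caller.

```python
import os

# Extensions mentioned in the post; matched case-insensitively.
DUMP_EXTENSIONS = (".dmp", ".mdmp")


def find_dumps(root):
    """Yield (path, size_in_bytes, modified_time) for each dump file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(DUMP_EXTENSIONS):
                full_path = os.path.join(dirpath, name)
                try:
                    info = os.stat(full_path)
                except OSError:
                    continue  # file vanished or access was denied
                yield full_path, info.st_size, info.st_mtime


def newest_first(root):
    """Return the dumps found under root, most recently modified first."""
    return sorted(find_dumps(root), key=lambda item: item[2], reverse=True)
```

Calling newest_first on a drive or folder of your choice returns the most recent dumps at the top of the list, which is usually where you want to start. Folders the account cannot read are silently skipped by os.walk, so run it with sufficient rights.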
1. Load the dump
2. Set the symbols, ensuring you have private symbols for the customer code
3. Load your .NET debugger extension (I used psscor4)
4. Dump the stack of the thread that hit the stack overflow exception
5. Send the code to the developers and get them to fix it 🙂
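For anyone less familiar with the debuggers, steps 2 to 4 might look roughly like the command list generated below. This is a sketch under assumptions, not a record of the exact commands I typed: the symbol folder is a hypothetical path, the helper name build_debugger_script is made up, ".load psscor4" may need the full path to the extension DLL on your machine, and it assumes the debugger's current thread is the faulting one, which is usually the case when a crash dump is opened. Step 1 is done beforehand by opening the dump in WinDbg or starting cdb with the -z option.

```python
def build_debugger_script(private_symbol_path, extension="psscor4"):
    """Return WinDbg/cdb commands, one per line, covering steps 2 to 4."""
    commands = [
        ".symfix",                           # step 2: use the Microsoft public symbol server
        f".sympath+ {private_symbol_path}",  # step 2: append the private symbols folder
        ".reload",                           # reload symbols with the new path
        f".load {extension}",                # step 3: load the .NET debugger extension
        "!analyze -v",                       # report the exception and faulting thread
        "k",                                 # step 4: native stack of the current thread
        "!clrstack",                         # step 4: managed stack of the current thread
    ]
    return "\n".join(commands)


# Hypothetical private symbol folder, for illustration only.
print(build_debugger_script(r"C:\symbols\private"))
```

The output can be pasted into the debugger command window one line at a time, or saved to a text file and run from inside the debugger with the $< script command. With a stack overflow, expect the stack output to be very long and to show the same few frames repeating, which is exactly what you want to hand to the developers.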
Here’s hoping you don’t encounter any stack overflows yourselves!