Incident on SSC

A couple of weeks ago I came into work as usual, sat down at my desk, and after logging on to my workstation noticed an odd conversation on Twitter. Two people I follow, Paul White (@SQL_Kiwi) and Dave Dustin (@venzann), were discussing Red Gate’s publicly available installation of SQL Monitor in relation to a problem they were seeing on the SQL Server Central web site. As SSC is a site I am partly responsible for, the Twitter posts that kept popping up on my screen immediately got my attention. I had a look at SQL Monitor and could not see anything obviously wrong. Next I checked the SSC site itself, and something was very wrong: it was completely broken, and the main page was showing a ‘Service Unavailable’ type error.

@Venzann
Back on Twitter, there was some talk about whether anything in the SQL Monitor console pointed to a problem with the SSC databases. Particular attention was being paid to a CPU spike visible in SQL Monitor’s performance counter analysis tab. I had a look at this and saw that it started at 7.00am GMT, which I knew from experience is when the full backups of the SSC databases run. @davebally had mentioned that the high CPU was being caused by the SQL Backup process, so the spike was something I would expect at that time. After further investigation in SQL Monitor, I could not see anything else obviously wrong with SQL Server.

Transactions per second tweet

Analysis

The next step was to have a look at the web server. I logged in and first had a look at the event logs. It was immediately obvious that the server wasn’t happy: there was a series of events in the Application event log referring to the unexpected termination of an application pool process some time earlier. A further error stated that the application pool serving the SSC web site was failing to respond. It seemed fairly clear that something nasty had happened to the SSC web site application pool, so the first thing I tried was an application pool recycle. Immediately after this was done, the site came to life again. A quick check showed that it seemed to be working … except for the SSC blogs, which were still down, something that was also quickly mentioned on Twitter!

@SQL_Kiwi
A quick recycle of the application pool used by the blogs site also brought that back online, and all seemed to be good after this.

So it seems that the problem was ultimately with IIS. However, it was great to see people out there using the live SQL Monitor site to investigate potential SQL Server issues after noticing the site was unavailable, and also great to find out that there weren’t any! It was also an interesting experience (and a testament to the power of social networking) to find that I was alerted to the problem more quickly through Twitter than through our enterprise monitoring system!
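For illustration only, here is a minimal sketch of the kind of external HTTP check that would flag a failure like this one, where the web site returns ‘Service Unavailable’ while SQL Server itself looks healthy. It is not the monitoring we actually run; the function names `classify_status` and `check_site` are hypothetical, and it assumes that a stopped or failing application pool typically shows up to the outside world as an HTTP 503 response.

```python
import urllib.request
import urllib.error


def classify_status(status_code):
    """Map an HTTP status code to a coarse health label."""
    if 200 <= status_code < 400:
        return "ok"
    if status_code == 503:
        # Assumption: a stopped or failing application pool typically
        # surfaces as 503 Service Unavailable.
        return "service unavailable"
    return "error"


def check_site(url, timeout=10.0):
    """Request the URL once and return a (label, detail) tuple."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status), str(response.status)
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; the code is on the exception.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return classify_status(exc.code), str(exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout and similar problems.
        return "unreachable", str(exc)
```

Calling `check_site` separately for the main site and for the blogs would report each one independently, which matters here because the two were served by different application pools and recovered at different times.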

Support information on using SQL Monitor with IIS
