There are many technologies available at the software and hardware level to help achieve high availability on mission-critical servers. We’d like your views on what you’d expect a monitoring tool to tell you about each of them. To kick things off, we’d first like your requirements for monitoring mirrored environments.
When would you expect a monitoring tool to alert you?
Here are some ideas on when SQL Response might trigger a mirroring-based alert:
- When the mirroring state of the principal or mirror database changes to either DISCONNECTED or SUSPENDED
- When roles change, e.g. when the mirror becomes the principal or vice versa
- When the mirroring witness is not connected (if witness is configured)
- When, in the event of a failover, the estimated time for the mirror database to finish the redo and become available is longer than x minutes, or when the redo queue is larger than x KB
- When the estimated catch-up time* is longer than x minutes, or when the Log Send Queue is larger than x KB
*Catch-up time is the time it will take for the mirror to catch up with the principal.
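To make the ideas above easier to comment on, here is a minimal sketch of how they might be expressed as rules. It is only an illustration, not how SQL Response is implemented: the `MirrorSnapshot` and `Thresholds` types, their field names and the `mirroring_alerts` function are all hypothetical, and the "x" values from the list become configurable thresholds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MirrorSnapshot:
    """Hypothetical point-in-time view of one mirrored database."""
    state: str                 # e.g. "SYNCHRONIZED", "DISCONNECTED", "SUSPENDED"
    role: str                  # "PRINCIPAL" or "MIRROR"
    previous_role: str         # role seen at the previous check
    witness_configured: bool
    witness_connected: bool
    redo_minutes: float        # estimated time to finish redo after a failover
    redo_queue_kb: float
    catch_up_minutes: float    # estimated time for the mirror to catch up
    log_send_queue_kb: float


@dataclass
class Thresholds:
    """Hypothetical user-configurable limits (the 'x' values in the list)."""
    max_redo_minutes: float
    max_redo_queue_kb: float
    max_catch_up_minutes: float
    max_log_send_queue_kb: float


def mirroring_alerts(s: MirrorSnapshot, t: Thresholds) -> List[str]:
    """Return one message per alert condition that currently holds."""
    alerts = []
    if s.state in ("DISCONNECTED", "SUSPENDED"):
        alerts.append(f"mirroring state is {s.state}")
    if s.role != s.previous_role:
        alerts.append(f"role changed from {s.previous_role} to {s.role}")
    # Only meaningful when a witness has been configured at all.
    if s.witness_configured and not s.witness_connected:
        alerts.append("witness is not connected")
    if s.redo_minutes > t.max_redo_minutes or s.redo_queue_kb > t.max_redo_queue_kb:
        alerts.append("redo backlog exceeds threshold")
    if (s.catch_up_minutes > t.max_catch_up_minutes
            or s.log_send_queue_kb > t.max_log_send_queue_kb):
        alerts.append("log send backlog exceeds threshold")
    return alerts
```

In this sketch each condition raises its own message, so a single check can report several problems at once. Whether you would prefer one combined alert per database is exactly the kind of feedback we are after.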
What information would you like in the alert details?
As well as the cause of the alert and relevant metrics, what basic information about the mirror would you like to see in a raised alert? For example:
- Names of the principal, mirror and witness servers, and their current status
- Operating mode (High Safety, High Performance or High Safety with automatic failover)
Which performance counters are useful for you?
- On the principal: Log Bytes Sent/sec
- On the principal: Log Send Queue KB
- On the principal: Transaction Delay
- On the mirror: Log Bytes Received/sec
- On the mirror: Redo Bytes/sec
- On the mirror: Redo Queue KB
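As a rough illustration of how these counters could feed the time-based alerts, a queue size can be divided by the matching throughput counter to get an estimated time. This is a simplifying assumption on our part, not a description of how SQL Server or SQL Response calculates its estimates, and the `estimate_minutes` helper is hypothetical.

```python
from typing import Optional


def estimate_minutes(queue_kb: float, bytes_per_sec: float) -> Optional[float]:
    """Estimate how long a queue takes to drain at the current rate.

    queue_kb      -- e.g. Log Send Queue KB or Redo Queue KB
    bytes_per_sec -- e.g. Log Bytes Sent/sec or Redo Bytes/sec
    Returns None when there is a backlog but no throughput, because
    no meaningful estimate can be made in that case.
    """
    if queue_kb <= 0:
        return 0.0
    if bytes_per_sec <= 0:
        return None
    seconds = queue_kb * 1024 / bytes_per_sec
    return seconds / 60


# Catch-up time: Log Send Queue KB against Log Bytes Sent/sec (principal).
catch_up = estimate_minutes(queue_kb=6000, bytes_per_sec=102400)
# Redo time: Redo Queue KB against Redo Bytes/sec (mirror).
redo = estimate_minutes(queue_kb=3000, bytes_per_sec=0)
```

An instantaneous rate can swing widely, so a real estimate would probably average the rate over a recent window. The `None` case (a backlog with nothing moving) may deserve an alert of its own.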
Is the information mentioned above enough to monitor mirroring? Or are there things you’d like to monitor which we have not mentioned here? We’d love to hear your views.