What do you want from a Global Overview?

One of the really cool features we’re planning for SQL Response 2 is the Global Overview. This is a kind of ‘front page’ that is designed to draw your attention to any significant issues across your entire enterprise. The global overview is the top level of a hierarchy of ‘overview’ pages that include:

  • A Physical Server overview page that will provide current data about a selected machine, for example disk space usage and processor metrics. It will also list which instances are on that machine and various other useful data.
  • A SQL Server overview page for each monitored instance that will list current metrics and alerts for that instance. This page will aggregate lots of useful data about a specific instance.

Above both of these, however, lives the Global Overview. Its purpose is simple – to list those SQL Servers and machines that have a problem and require further investigation. Using graphics and colour coding, we want the Global Overview to give you a high-level view across your entire enterprise. From here, with a single click, you can drill down to a SQL Server overview or various other views to see more detailed information.

Our questions

The question we’re discussing at the moment is how best to present this global overview information to you.

  • Do you prefer a list of SQL Server instances only, or do you want to see them in the context of their physical server?
  • Or would you want to toggle between a list of SQL Server instances only and a list showing physical servers and instances (i.e. SQL Servers grouped by their host machine)?
  • What are the key metrics to see at the Global Overview level for a SQL Server/physical server? (These are the sparkline graphs in our example designs)

We’ve sketched out a couple of alternative ideas for how this Global Overview might look.

Our designs

This design shows a flat list of SQL Server instances. Each SQL Server also shows its host machine, which has its own status based on any problems specific to the physical server:

Global Overview: flat list of SQL Server instances


This design shows SQL Servers organised by their host machine. The status of the SQL Server may be different from that of the physical server, depending on what type of issues SQL Response has found:

Global Overview: list of SQL Servers grouped by machine


In both designs, clicking pretty much anywhere will take you to a relevant page with more detailed information. For example, clicking on an instance name will take you to the SQL Server Overview for that instance, and clicking on the list of alerts will take you to a dedicated alerts page.
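To make the difference between the two layouts concrete, here is a minimal Python sketch of the data behind them. It is an illustration only: the names (`Status`, `Instance`, `flat_list`, `group_by_host`), the three severity levels and the sample servers are all made up for this example and are not taken from SQL Response itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    # Hypothetical severity levels; a higher value means more serious.
    OK = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Instance:
    name: str       # SQL Server instance name
    host: str       # physical server the instance runs on
    status: Status  # status derived from instance-level issues only


def flat_list(instances, host_status):
    """First design: one row per instance, with its host's status alongside."""
    return [(i.name, i.status, i.host, host_status[i.host]) for i in instances]


def group_by_host(instances):
    """Second design: instances organised under their host machine."""
    groups = defaultdict(list)
    for inst in instances:
        groups[inst.host].append(inst)
    return dict(groups)


# Made-up sample data.
instances = [
    Instance("SALES01\\LIVE", "SALES01", Status.CRITICAL),
    Instance("SALES01\\DEV", "SALES01", Status.OK),
    Instance("HR02\\LIVE", "HR02", Status.OK),
]
# A machine's status comes from machine-specific problems (e.g. disk space),
# so it can differ from the status of the instances it hosts.
host_status = {"SALES01": Status.OK, "HR02": Status.WARNING}

rows = flat_list(instances, host_status)
grouped = group_by_host(instances)
```

In this sketch the same records feed both views, so toggling between the flat list and the grouped list would only change how the rows are arranged, not what is monitored.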

Your ideas

Please let us know whether our ideas for a Global Overview meet your expectations. If neither of these is quite what you want, then why not tell us what you’re looking for or send us your own design?

This entry was posted in Designs.

10 Comments

  1. Tom
    Posted September 11, 2009 at 1:00 pm | Permalink

    Here’s another design that combines aspects of both these.

    It keeps the machine grouping of the second design, but no longer separates the number of alerts by their severity.

    ‘Last alert’ is replaced with ‘Most serious’. I’ve also removed the Network and Disk columns in order to try and simplify things.

    The Memory column shows ‘Memory used’. I’d have liked to use ‘Memory available’ instead, but unless memory has been allocated to a SQL Server, this number would be the same for a machine and all its instances.

    Do you prefer this one?

    globaloverview7.png

    • Merrill Aldrich
      Posted September 18, 2009 at 11:48 pm | Permalink

      Grouping by machine is definitely helpful; one thing to watch out for is the complexity a cluster introduces. I might have 4 instances on 2 machines in a cluster; some data about the real, physical host is needed, and some about the instances, which might be on one or the other of the pair. Or there could be four machines in a cluster, etc., etc.

  2. Mike Knee
    Posted September 11, 2009 at 2:34 pm | Permalink

    The designs are very clean – personally I like that a lot.

    I’d like to see a container/folder structure in place so I can group servers & see an overview of very large infrastructures, e.g. dev/live or by client/dept with alert overviews “bubbling up” to the container level.

    So instead of 100 instances I’d see maybe 12 depts with an overview of problems within each dept & I could open up containers to drill down & see individual SQL instances.

    Other things I’d like to see (probably obvious & done already)

    - The ability to filter out healthy instances, if there aren’t any problems I don’t care :-)
    - Ordering choices – alphabetical, instances with the biggest/most issues at the top.

    Regarding instances/physical boxes – it would be great to hover over an instance & see a simple topology – physical boxes it is “related” to, other instances/Virtual machines in the cluster

    Maybe even go as far as showing business applications served/affected by this instance so you can see very quickly the implications of problems even if you aren’t familiar with the infrastructure.

    It would also be nice to be able to mark a set of alerts (or an “incident” in ITIL parlance) as acknowledged or work in progress & then drop its “severity” down. However I’d want it to tell me if the alert profile changed in the meantime, i.e. something different has gone wrong!


  3. Jonathan Allen
    Posted September 14, 2009 at 3:01 pm | Permalink

    I like the changes that Tom made and I would add that the servers be listed in descending severity (or the list is sortable by that value).

    Marking a call as in hand with a colleague would be useful so that a team can deal with the UI from the time they log on without having to contact other DBAs to confirm the state of play, just pick the top one that isn’t being handled already…

    Mike’s suggestion of being able to attach business systems to the servers/databases would make a lot of sense, and if the UI is exportable/printable you can drop it on a manager’s desk as your work list for the day or ask them to prioritise should the need arise.

    Just wondering if there could be room for HDD used and db size changes in the panel; knowing that a database just grew by 50% or that you only have 5MB left on C:\ could be useful. Maybe have a hitlist of traces that each show on the panel if they are over a select % variance?

    Jonathan

  4. Posted September 14, 2009 at 10:21 pm | Permalink

    C’mon. Surely we all want a nice topological map of our servers, just like we have on Whatsup Gold. I want something like a Visio diagram where I can place servers according to function or geographical location, put rings around them and links. I want them to glow red when there is a problem.

  5. Jonathan Allen
    Posted September 15, 2009 at 2:02 pm | Permalink

    Are you thinking about having the recommendations in a similar global overview too?
    Personally speaking, I would like to see more detail on how to resolve the situations where the recommendation is in alert state. Double click to get the clipboard full of the SQL to defrag the index or to run the log backup where it’s overdue or to shrink the database file etc, or even to execute the SQL from SQL Response…

    (It’s off topic so I’ll shut up for the minute.)

    Jonathan

  6. Tom
    Posted September 16, 2009 at 9:09 am | Permalink

    Thank you all for your feedback!

    Mike
    We’re planning on implementing ‘tagging’. This would allow you to tag a Machine / Instance with a Location or Department (or anything else of your choosing) then view just the servers with a particular tag.
    A top level view could then be restricted to show just servers with a particular tag.
    Which groups or tags currently have a problem isn’t something we’d really been thinking about, but sounds like a very good idea.
    There are some really good suggestions there!

    Jonathan
    Thanks!
    We’ve been discussing handling ‘work allocation’ at the Alert level. A user could potentially flag individual alerts as ‘in progress’.
    We’ll do our best to make sure the UI prints nicely, and we’re going to support a SQL Server back end as well as SQLite in Response 2 so your data will be fully portable.
    The problem we’re encountering with HDD used and DB size on the Global Overview is how to fit them on if there are multiple drives or dbs! With the designs shown, that level of detail would be another click away.
    Allowing the user to ‘fix’ problems from the UI is certainly something we’ve discussed! It’s a concept that scares some. We don’t want users to accidentally start re-indexing a production db – however if we implemented it in a way that gave you the script to run yourself it could be a very cool feature.

    Phil
    A topological map would be rather groovy… we’ll have to see!

  7. Mark Allison
    Posted September 17, 2009 at 10:02 am | Permalink

    Not sure how you’d implement this but let’s say you identify a poorly performing server one Monday morning. A frequent question that comes up in my mind is, well how did it perform last Monday, and the Monday before that, or yesterday? A history slider or some kind of report that can gather this information quickly and present the dashboard at a point-in-time in the past would be a very useful feature for me.

    This would be separate from a reporting capability which could be implemented in SSRS or other reporting package for reporting over long periods of time to identify trends – I would also like this.

  8. Merrill Aldrich
    Posted September 18, 2009 at 11:43 pm | Permalink

    I really like Phil’s notion about the graphic being better than a list. BUT, I don’t want a server “icon” that just turns red – there’s no information there. I would much prefer a small set of graphs/charts that actually show recent counter information at a high level, AND turn red when there’s a problem (plug: I just submitted a sample app like this via the Design a Dashboard contest :-)

  9. Tom
    Posted September 21, 2009 at 1:28 pm | Permalink

    Mark

    We are planning on implementing a ‘back in time’ control at the overview level that will allow you to pick a point or range in time to look at.

    We’re not sure whether this would show just the historical values or would be side by side with your current data.
    Side by side’s a tricky design problem – it might be something we only support for a single metric at a time.

    Merrill

    Definitely – we’re really keen to show that extra level of detail even at the ‘global overview’.
    We’re excited about checking out your entry!
