Sql Server Alerts Automated By miniDBA Monitoring Tool

Setup Alerts

Having timely alerts inform your team of changing SQL Server health is critical to every organization. Setting these up yourself requires substantial knowledge, development, implementation and testing time. MiniDBA aims to help you have comprehensive alerting setup without investing your time in the process.
MiniDBA provides 40+ alerts out of the box which cover all areas of SQL Server monitoring: sessions, memory, disk, cpu, deadlocks, agent jobs, Always On and much more. After registering a SQL Server with miniDBA, go to the Alerts screen for that server and under configure, view available alerts. Alerts are displayed by categories and expanding each one will show all under that category:

Enabling an alert is as simple as clicking the Enable check box and taking the default values. If you have never setup SQL Server alerts before then the default values can be a good starting point. Alert options are changeable after enabling it to either send you alerts more or less based on their threshold values.

Select an alert then review it's properties at the bottom of the screen. A description on the right of the properties tells you information about the alert. All other properties are shown for both major and minor priorities.

Copy Properties

If you have existing SQL Servers setup in miniDBA with alerts already setup you can copy these setting for your new server. Click the "Import Alerts Config" button at the top of the screen. This will give you the option to select a SQL Server to copy from. This will save you a lot of time if you have more extensive SQL Server estates.

Alert Priorities

All alerts have settings for both high and low priority. This allows you to differentiate between a potential emergency and just degraded performance. You do not have to use both priorities.

High and low come populated with different thresholds so they will fire at different times. You can set different notification settings for each allowing you to do things like only send emails on high alerts and send Slack notification for both high and low. If you want the high priority to notify when triggered but dont want low priority to, just set no threshold or an extreme value that will never be hit for low priority.

Notifications

miniDBA uses several ways to alert users when alerts fire:

In App SQL Server - alerts are shown in the alerts node of a SQL Server on the left of the screen
In App Global - alerts are shown for all SQL Servers in the Enterprise dashboard
In App Windows Popup - alert popups are shown in the Windows system tray
Windows Event Log - use this option to get alerts detected by Enterprise Management software like System Center Operations Manager (SCOM)
Slack - send alerts to channels you configure. More Details
Email - send to individual or groups, potentially different addresses depending on priority and use it to forward to apps like PagerDuty

Email

Email is the most common external notification method used by miniDBA. Click the email address text field to see the extended email properties on the right of the screen. Below is a screenshot of the email configuration for an alert:

You enter one or many email addresses in the box at top right. Clicking save puts that value in either the high or low priority email address field, depending on what you clicked. Email addresses that are already used for alerts can be selected from the list below, saving you time.

The options drop down at the top of the screen gives you access to email SMTP server setup.

MiniDBA aims to give context to alerts where possible to help drill down to the root cause of the problem. It goes beyond just telling you what threshold value has been breached. For example with memory alerts a memory snapshot of the SQL Server is included such as in the example email below:

This is extendended also for any alerts that feature sessions like long running and blocking queries as a list of active sessions is always included in the alert email.

Available Alerts

The following are the alerts that can be raised by miniDBA for SQL Server - click each one for details

Database Off-line

Any databases are not in an online state

Free Disk Space

Free disk space has fallen below a certain threshold
This may mean that the drive will fill completely and the server will not be able to function correctly

Blocking Processes

Processes being blocked by other processes
A degree of this is to be expected in busy environments as locks are taken out on objects
The MaxThreshold represents the number of blocking processes that it takes to trigger an alert

Deadlocks

Number of deadlocks detected in 1 minute
Processes deadlocked by other processes
Often caused by locks being held for too long, in turn caused by poor index design

Long Running Queries

Active processes that have been running the same sql batch for a long period of time
Huge queries may be consuming way too much of the servers resources
The MaxThreshold represents number of minutes the batch must run to trigger the alert

No Connection

The Sql Server cannot be contacted, either miniDBA does not have permission to connect or the server is down

Always On Enabled But Failed

The Always On feature is enabled but it failed to start properly and is an error state

Buffer Cache Hit Ratio

The percentage of reads that use the buffer cache, if dipped below will trigger the alarm

Plan Cache Hit Ratio

the percentage of queries that reuse execution plans from the plan cache, if dipped below will trigger the alarm

Maximum Connections

The number of concurrent client connections to the instance.
This may indicate periods of very high use or inefficient client application code.

Backup Restore

Backup/Restore in progress
This can negatively effect database performance of any databases that share the same drive as the database being backed up.
Can often occur at unexpected times and require better scheduling

File Size Limit

Data or Log file is near maximum size limit
Either the file will autogrow which may impact performance during the operation or it will run out of space if set to have a size limit

Instance Memory

Memory used by this instance (Mb)
Use this alert to tell you if the allocated memory is actually being used or not while tuning memory settings

Page Life Expectancy

If the life expectancy of memory in the buffer pool is very low it can indicate high memory pressure

Instance Cpu %

% of the Cpu used by this Sql Server instance
Sustained high usage may indicate poor T-SQL or index design

Non Instance Cpu %

% of the Cpu used by processes other than this instance
Ideally there should be minimal activity by any non SQL Server processes on the server.
Check virus scanners and other maintenance software is configured correctly

Disk Queue Length

IO items waiting to be serviced by the disk
High disk queue length can signify poor IO configuration.
Check if multiple databases sharing the same drive.
The thresholds are device dependant so experiment with the settings

Agent Not Running

SQL Server Agent is not running
No jobs will be able to run if the agent is not running

OS Memory State

The OS is lacking memory available for programs. SQL Server will attempt to trim its working set of memory pages

Failed Job

SQL Server Agent jobs that have failed
If the latest execution failed, it is raised as an alert
If this server is expected to have failing jobs then disable this alert

Tran Log % Used

Transaction Log file at high % use may need increasing in size
A full or near full transaction log may cause bad database performance and will likely have a run-away query with poor design causing it

File IO Stall Time

Average time in ms, waiting for the IO subsystem to perform operations
Lower waiting times are better - the IO Ms is calculated across total stall ms / read & write operations, giving an average rather than an on the spot calculation

Write Log Time

Time in ms, per second, the system has to wait for the IO subsystem to perform log writes
The figure is server wide not specific to any database and indicates a performance bottleneck around whichever drive the offending log file(s) are on

Error Log High Severity

Errors in the Sql Server error log of severity 17 or above

Error Log Medium Severity

Errors in the Sql Server error log of severity 15 or 16
These levels of severity covers user correctable errors such as problems with High Availability Groups or queries that throw errors

Failed Login

A connection was attempted to the SQL Server but the login was incorrect

Sort Warning

Indicates that sort operations do not fit into memory and are spilling to TempDb.
Performance of the query can be affected because multiple passes over the data are required to sort the data

File Auto Growths

Data or Log file auto growths occurred which will take longer to perform as the database gets bigger
Avoid autogrowths by giving the files a larger, more appropriate maximum size

Database Mirroring State Change

Indicates when the state of a mirrored database changes

Custom T-SQL Query

The alert is raised when the custom TSQL query is executed (every 1 minute) and the result is over a pre-defined threshold or boolean value

Always ON Failed To Start

The alert is raised when the HADR component os SQL Server is enabled but did not start properly

Always ON Failover

The alert is raised when the primary replica fails over from one cluster node to another

Always ON Replica Unhealthy

The alert is raised when replica synchronization state becomes unhealthy

Always ON Database Unhealthy

The alert is raised when database synchronization state becomes unhealthy

Always ON Transaction Delay Too Long

The alert is raised when the time a primary replica has to wait for secondary replicas to harden transmitted log data becomes too much.

IO (Checksum) Error

IO error (incorrect checksum) occurred when reading a data file. The read eventually completed successfully but encountered a checksum error. Check the device for other errors.