Hard disks, like any other type of data storage device, are critically important. Since a hard disk failure almost always means the loss of important data, disk health must be monitored so that possible problems are detected as early as possible.
There are several ways to do that; using S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) is preferable wherever possible, since it provides early warning of an unhealthy disk state (e.g., overheating). S.M.A.R.T. must be enabled before it can be queried; refer to your motherboard’s documentation to learn how.
All modern operating systems provide tools to query the disk state through S.M.A.R.T. However, although this is enough to view the overall current state, manually analyzing disk health reports can be too tedious when there are hundreds of HDDs to monitor.
First, use a command such as smartctl -q silent -H /disk/device (replace /disk/device with the actual disk device name; smartctl is available for all popular operating systems). It returns an exit code indicating the current and historical health state of the given device. In most cases, all that is required is to create a two-line script and use the SNMP ‘exec’ feature to run that script and assign its return value to a given SNMP value.
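As a rough illustration of the wrapper script described above, here is a minimal Python sketch. The function name smartctl_exit_code and its run parameter are hypothetical names introduced only for this example; the run parameter exists so the function can be exercised without a real disk. It assumes smartctl is installed and on the PATH, and that the SNMP agent reports the exit status of the script it executes.

```python
import subprocess


def smartctl_exit_code(device, run=subprocess.run):
    """Run a silent S.M.A.R.T. health check and return smartctl's exit code.

    `device` is the disk device name (for example /dev/sda on Linux).
    `run` defaults to subprocess.run; it is a parameter only so that the
    function can be tested with a stand-in instead of a real disk.
    """
    # -q silent suppresses all output; -H requests the overall health status.
    # check=False: a non-zero exit code is the result we want, not an error.
    completed = run(["smartctl", "-q", "silent", "-H", device], check=False)
    return completed.returncode
```

A script run by the SNMP agent would typically end with sys.exit(smartctl_exit_code("/dev/sda")), where /dev/sda is only an example device name, so that the agent sees the same status smartctl produced.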
Now comes the tricky part: the return value of the monitor is a bit mask. The script could clear the bits that are not needed (for example, a historical record of some value having crossed its threshold remains set forever). Even so, it is easy to set up checks for both the warning state and the down state of the monitor (the latter means that a critical parameter has crossed its threshold, that is, it is below the threshold value). If a down state is reported in this manner, disk health should be checked thoroughly at once, since the disk is either unhealthy or about to fail.
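To make the bit mask handling more concrete, the sketch below decodes a smartctl exit code in Python. The bit positions follow the exit status description in the smartctl manual page. The mapping of bits to "down", "warning" and "unknown" states, as well as the names classify and strip_historical, are assumptions made for this example rather than a fixed rule; adjust them to your own policy.

```python
# Bit groups of the smartctl exit status (see the smartctl manual page).
CHECK_FAILED = 0b0000_0111  # bits 0-2: bad command line, device open failed, SMART command failed
DISK_FAILING = 0b0000_1000  # bit 3: SMART status check returned "DISK FAILING"
PREFAIL_NOW = 0b0001_0000   # bit 4: prefail attributes currently at or below threshold
HISTORICAL = 0b1110_0000    # bits 5-7: past threshold crossings, error log, self-test log


def strip_historical(exit_code):
    """Clear the bits that record past events and therefore never reset."""
    return exit_code & ~HISTORICAL & 0xFF


def classify(exit_code):
    """Map an exit code to a monitor state (this mapping is an assumption)."""
    if exit_code & (DISK_FAILING | PREFAIL_NOW):
        return "down"      # the disk is failing now or a critical attribute is past its threshold
    if exit_code & CHECK_FAILED:
        return "unknown"   # the check itself could not be carried out
    if exit_code & HISTORICAL:
        return "warning"   # only historical problems are recorded
    return "ok"
```

Checking the down bits first means a failing disk is never masked by a historical flag that happens to be set in the same exit code.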
To detect problems even faster, we could use an SNMP trap monitor. A trap can trigger an action as soon as a given condition is met, so SNMP traps let us react immediately to a down state without waiting for the next poll. However, be wary of running the polls too often: each query raises server load and, in certain situations, may take a long time to complete and significantly increase disk response time while its data is being read. Polling intervals of 3 to 5 minutes should be adequate for the monitors described above.
A free 30-day trial version of IPHost Network Monitor is available. During the trial you can get support by e-mail; please use the contact form to send any inquiries about IPHost Network Monitor features and purchasing.
Download the free trial of IPHost Network Monitor and start to monitor your network and vital applications in a few minutes.
IPHost Network Monitor 5.3 build 14188 of September 03, 2021. File size: 68MB