Linux x86_64: Detecting Hardware Errors

last updated in Categories CentOS, Debian Linux, fedora linux, Gentoo Linux, Hardware, Howto, kernel, Linux, Linux distribution, Networking, package management, RedHat/Fedora Linux, Shell scripting, Sys admin, Tips, Troubleshooting, Ubuntu Linux

The Blue Screen of Death (BSoD) is used for the error screen displayed by Microsoft Windows, after encountering a critical system. Linux / UNIX like operating system may get a kernel panic. It is just like BSoD. The BSoD and a kernel panic generated using a Machine Check Exception (MCE). MCE is nothing but feature of AMD / Intel 64 bit systems which is used to detect an unrecoverable hardware problem.

Program such mcelog decodes machine check events (hardware errors) on x86-64 machines running a 64-bit Linux kernel. It should be run regularly as a cron job on any x86-64 Linux system. This is useful for predicting server hardware failure before actual server crash.