Linux kernel: Uhhuh. NMI received for unknown reason 30


Q. I’ve upgraded my CentOS / RHEL (Red Hat Enterprise Linux) 4.7 system on an HP ProLiant DL580 G5, and it is showing unknown NMI errors in the logs:

Uhhuh. NMI received for unknown reason 30.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?

Uhhuh. NMI received for unknown reason 20.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?

How do I fix this error?
A. This is caused when the system is hanging under load. Add any one of the following to the kernel line in your /boot/grub/grub.conf file:

  1. Disable the NMI watchdog by adding “nmi_watchdog=0”
  2. Disable the high precision event timer (HPET) by adding “nohpet”

Open grub.conf:
# vi /boot/grub/grub.conf
Modify the kernel line as follows:

title Red Hat Enterprise Linux AS (2.6.9-78.0.8.EL)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-78.0.8.EL ro root=/dev/VolGroup00/LogVol00 nohpet
        initrd /initrd-2.6.9-78.0.8.EL.img

Save and close the file. Reboot the server:
# reboot
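
The edit above can also be scripted. What follows is only a minimal sketch, not part of the original instructions: `add_kernel_param` is a hypothetical helper name, and the grub.conf path in the usage note is the usual GRUB legacy location on RHEL/CentOS 4, which you should verify on your own system. The function appends a parameter to every `kernel` line that does not already carry it and keeps a backup copy of the file.

```shell
#!/bin/sh
# add_kernel_param FILE PARAM
# Append PARAM to each GRUB legacy "kernel" line in FILE unless that
# line already contains PARAM as a separate word.
# A backup of the original file is saved as FILE.bak first.
# (Hypothetical helper for illustration only.)
add_kernel_param() {
    file=$1
    param=$2
    cp -p "$file" "$file.bak" || return 1
    awk -v p="$param" '
        $1 == "kernel" {
            found = 0
            for (i = 2; i <= NF; i++)
                if ($i == p) found = 1
            # Assigning to $0 keeps the original indentation intact.
            if (!found) $0 = $0 " " p
        }
        { print }
    ' "$file.bak" > "$file"
}
```

Run as root, a call such as `add_kernel_param /boot/grub/grub.conf nohpet` would then replace the manual edit. Because the function checks for an existing copy of the parameter, running it twice does not add the option twice.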

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin, DevOps engineer, and a trainer for the Linux operating system/Unix shell scripting.

6 comments

  1. Also try the “acpi=off” switch, because in some cases it works better than the two options proposed in this article.

    1. So, you recommend turning off the nmi_watchdog, which listens for hardware throwing errors that may compromise your system? Probably not smart. Just out of curiosity, why would you additionally suggest changing the kernel timer as a method to avoid a hardware device throwing kill signals? BTW, this is not caused by the system hanging under load.

      Even Wikipedia knows better.

      1. Well, for one, virtual machines cannot use the NMI watchdog, or HPET for that matter, so disabling them will stop the errors and prevent possible problems.

  2. I saw:
    kernel:Uhhuh. NMI received for unknown reason 31 on CPU 0.
    kernel:Uhhuh. NMI received for unknown reason 21 on CPU 0.
    kernel:Do you have a strange power saving mode enabled?
    kernel:Dazed and confused, but trying to continue

    I tried:
    adding ‘nmi_watchdog=0 pcie_aspm=off nohpet’ to the kernel parameters
    changing to an older kernel

    Use an older kernel: 2.6.32-131.21.1 (the default is 2.6.32-358.23.2).
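
When comparing kernels or boot parameters like this, it helps to count how often each reason code shows up in the log before and after a change. The following is a rough sketch under stated assumptions: `nmi_reason_counts` is a hypothetical helper name, the log path in the usage note is only the common RHEL/CentOS default, and the pattern accepts hexadecimal digits in case the code is printed in hex.

```shell
#!/bin/sh
# nmi_reason_counts LOGFILE
# Print one "COUNT REASON" line for each distinct reason code found in
# "NMI received for unknown reason NN" messages, most frequent first.
# (Hypothetical helper for illustration only.)
nmi_reason_counts() {
    grep 'NMI received for unknown reason' "$1" |
        sed 's/.*unknown reason \([0-9a-fA-F]*\).*/\1/' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

For example, `nmi_reason_counts /var/log/messages` run as root summarizes the current log. If the counts do not drop after a change, that change probably did not address the cause.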

  3. Hello,

    I am thankful for this posting, which has been a useful resource for testing one of my servers.

    My 2 cents is that while this mechanism is a great tool to diagnose possible problems, configuring it (turning it off) to get rid of the thousands of error messages in the log helped extend the time between system hangs from a few hours to a couple of days, but the hangs did not disappear.

    Nevertheless, as bob dobbs comments, you are only closing your eyes to a problem that needs to be addressed anyway. It is a temporary fix for your log that does not fix the underlying problem, and the system hangs will keep occurring.

    In our case this problem was triggered after a system upgrade, but we believe that the upgrade only turned on the watchdog, which started logging the errors and increasing the failure rate.

    So, basically, with the watchdog left on (as it is by default on this specific Debian kernel), these messages are most likely a sign of a memory or other hardware problem that needs to be addressed soon, one that caused one of your CPUs to hang and not respond to the 5-second checks by the NMI watchdog. The kernel is: 2.6.32-5-686-bigmem

    My advice is NOT to turn it off, except temporarily while you find out what the real error is. Once you find the cause of the failure, you must turn it back on and see what happens.

    One more piece of advice, in case you have a hard time “sudoing” and echoing 0 to /proc/sys/kernel/nmi_watchdog and get errors doing it, or even turning it off in /etc/sysctl.conf through:

    # Turn OFF NMI watchdog
    kernel.nmi_watchdog = 0

    The only way it worked for me was to set it at the kernel loading level by editing /boot/grub/grub.cfg, so that it gets set during the boot process (nmi_watchdog=0 turns the watchdog off, nmi_watchdog=1 turns it on):

    linux /boot/vmlinuz-2.6.32-5-686-bigmem root=UUID=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXX ro quiet nmi_watchdog=1

    where the UUID will be different and specific to your particular system.
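
After rebooting, it is worth confirming that the running kernel actually received the parameter. The following is a small sketch, assuming a Linux system that exposes the boot command line in /proc/cmdline; `has_boot_param` is a hypothetical helper name, and the optional second argument exists only so the check can be tried against a saved copy of a command line.

```shell
#!/bin/sh
# has_boot_param PARAM [CMDLINE_FILE]
# Succeed (exit status 0) if PARAM appears as a whole word on the kernel
# command line. Reads /proc/cmdline unless another file is given.
# (Hypothetical helper for illustration only.)
has_boot_param() {
    # Put each boot parameter on its own line, then look for an exact,
    # fixed-string, whole-line match.
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

A typical check would be `has_boot_param nmi_watchdog=0 && echo "watchdog disabled at boot"`, followed by `cat /proc/sys/kernel/nmi_watchdog` to see the live value. As for the errors with sudo, one common cause is that in `sudo echo 0 > /proc/sys/kernel/nmi_watchdog` the redirection is performed by the unprivileged shell rather than by sudo. `echo 0 | sudo tee /proc/sys/kernel/nmi_watchdog` avoids that, though it may not have been the issue in this case.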

  4. FWIW, I received these error messages when I was trying to silence a rattling case fan on a running system. I plugged the fan into and out of its power cable several times during the attempted fix, and then the error messages popped up on all the open consoles, one running as a regular user and one running as root (which was me). This was with kernel 3.13.0 on a dual Xeon with SMP (obviously) and ACPI, but I don’t think the messages came from the NMI watchdog, because that was not enabled on this particular system. Anyway, since I’m sure it was my diddling with the fan that caused it, I’m not looking into it any further; I’ve got too many other systems to admin to fret over problems caused by my own stupidity… :P
