SLES12SP2 KVM guest: Uhhuh. NMI received for unknown reason 30 on CPU0.

Franz Sirl
16-Dec-2016, 14:48
Hi,

the complete message is:

[ 3411.029025] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 3411.029028] Do you have a strange power saving mode enabled?
[ 3411.029028] Dazed and confused, but trying to continue
[ 3441.028990] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[ 3441.028996] Do you have a strange power saving mode enabled?
[ 3441.028996] Dazed and confused, but trying to continue

Once it starts, the message repeats about every 30 seconds, and the
guest typically locks up completely after 1-2 days.
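
The 30-second spacing can be checked from the kernel timestamps in the log excerpt above. The following is only a minimal Python sketch, assuming the dmesg output has been saved as text; the names NMI_LINE, nmi_events and nmi_intervals are made up for illustration and are not part of any tool.

```python
import re

# Matches lines such as:
# [ 3411.029025] Uhhuh. NMI received for unknown reason 30 on CPU 0.
NMI_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Uhhuh\. NMI received for unknown reason "
    r"(?P<reason>[0-9a-fA-F]+) on CPU\s*(?P<cpu>\d+)"
)


def nmi_events(log_text):
    """Return (timestamp_seconds, reason_string, cpu) for each NMI line."""
    events = []
    for line in log_text.splitlines():
        match = NMI_LINE.search(line)
        if match:
            events.append(
                (float(match.group("ts")), match.group("reason"), int(match.group("cpu")))
            )
    return events


def nmi_intervals(events):
    """Return the gaps in seconds between consecutive NMI messages."""
    times = [ts for ts, _, _ in events]
    return [later - earlier for earlier, later in zip(times, times[1:])]


# The excerpt quoted in this post, used here as sample input.
SAMPLE = """\
[ 3411.029025] Uhhuh. NMI received for unknown reason 30 on CPU 0.
[ 3411.029028] Do you have a strange power saving mode enabled?
[ 3411.029028] Dazed and confused, but trying to continue
[ 3441.028990] Uhhuh. NMI received for unknown reason 20 on CPU 0.
[ 3441.028996] Do you have a strange power saving mode enabled?
[ 3441.028996] Dazed and confused, but trying to continue
"""

if __name__ == "__main__":
    for gap in nmi_intervals(nmi_events(SAMPLE)):
        print(f"{gap:.3f} s between NMI messages")
```

Run against a longer saved log, the same functions would show whether the gap stays constant and whether the reason code keeps changing between messages.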

Host and guest are both SLES12SP2 with kernel-4.4.21-84/88. There is no
problem when the guest is freshly started (virsh start guest), but as
soon as I issue a 'reboot' within the guest, the messages start after a
few seconds.

Anyone seen this and knows a workaround?

Franz.

smflood
16-Dec-2016, 15:31
On 16/12/16 13:48, Franz Sirl wrote:

> Anyone seen this and knows a workaround?

Can you try updating a guest (and possibly the host) to latest kernel
available for SLES12 SP2 - 4.4.21-90.1 ?

HTH.
--
Simon
SUSE Knowledge Partner


Franz Sirl
16-Dec-2016, 19:31
On 2016-12-16 at 15:31, Simon Flood wrote:
> Can you try updating a guest (and possibly the host) to latest kernel
> available for SLES12 SP2 - 4.4.21-90.1 ?

Hi Simon,

I misremembered: it was with kernel-4.4.21-84/90, and the host is still
on kernel-4.4.21-84. qemu and xen-libs on the host have the latest
updates, though. I just retried the reboot in the guest, and after
~10 min the messages appeared again.

I'll try to fully update the host too over the weekend, but I don't have
much hope because only 2 CVEs are listed for the -90 kernel update.

Franz.

Franz Sirl
19-Dec-2016, 11:22
I was right: the -90 kernel update doesn't change anything. But I think
I have found a hint now: it only happens with the Q35 emulation; another
VM with the I440FX emulation reboots fine. So there is likely a problem
in qemu with the Q35 emulation. I'll try to find out more when there is time.
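
To tell which guests use the Q35 emulation, the machine type can be read from the libvirt domain XML (for example the output of `virsh dumpxml guest`). The following is only a rough Python sketch: the sample XML and the machine names pc-q35-2.6 and pc-i440fx-2.6 are assumed for illustration and are not taken from the affected systems, and machine_type and uses_q35 are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def machine_type(domain_xml):
    """Return the machine attribute of <os><type>, or None if it is absent."""
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    return None if os_type is None else os_type.get("machine")


def uses_q35(domain_xml):
    """True if the domain's machine type name contains 'q35'."""
    machine = machine_type(domain_xml)
    return machine is not None and "q35" in machine


# Hypothetical, heavily trimmed domain definitions for illustration only.
Q35_GUEST = """
<domain type='kvm'>
  <name>guest</name>
  <os><type arch='x86_64' machine='pc-q35-2.6'>hvm</type></os>
</domain>
"""

I440FX_GUEST = """
<domain type='kvm'>
  <name>other</name>
  <os><type arch='x86_64' machine='pc-i440fx-2.6'>hvm</type></os>
</domain>
"""

if __name__ == "__main__":
    for xml_text in (Q35_GUEST, I440FX_GUEST):
        print(machine_type(xml_text), "Q35" if uses_q35(xml_text) else "not Q35")
```

This only reports the configured machine type; it says nothing about whether a given guest actually shows the NMI messages after a reboot.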

Franz