NMI watchdog: BUG: soft lockup

I am seeing the kernel message "NMI watchdog: BUG: soft lockup". When it occurs, the system hangs and can no longer be reached from any terminal or by ping.
The issue happens in a virtual machine.
Host CPU information is below

# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                160
On-line CPU(s) list:   0-159
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 47
Stepping:              2
CPU MHz:               1064.000
BogoMIPS:              4800.28
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              30720K
NUMA node0 CPU(s):     1-10,41-50
NUMA node1 CPU(s):     11-20,51-60
NUMA node2 CPU(s):     21-30,61-70
NUMA node3 CPU(s):     31-40,71-80
NUMA node4 CPU(s):     0,81-89,120-129
NUMA node5 CPU(s):     90-99,130-139
NUMA node6 CPU(s):     100-109,140-149
NUMA node7 CPU(s):     110-119,150-159

The virt-manager version on the host is below.

# virt-manager --version
0.9.4

Host OS information is below

# uname -r
3.0.101-0.47.79-default
# cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3

Guest CPU information is below. In virt-manager, I selected "Copy host CPU configuration".

# lscpu
    Architecture:        x86_64
    CPU op-mode(s):      32-bit, 64-bit
    Byte Order:          Little Endian
    Address sizes:       42 bits physical, 48 bits virtual
    CPU(s):              4
    On-line CPU(s) list: 0-3
    Thread(s) per core:  1
    Core(s) per socket:  1
    Socket(s):           4
    NUMA node(s):        1
    Vendor ID:           GenuineIntel
    CPU family:          6
    Model:               2
    Model name:          QEMU Virtual CPU version 1.4.2
    Stepping:            3
    CPU MHz:             2400.084
    BogoMIPS:            4800.16
    Hypervisor vendor:   KVM
    Virtualization type: full
    L1d cache:           32K
    L1i cache:           32K
    L2 cache:            4096K
    NUMA node0 CPU(s):   0-3
    Flags:               fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm rep_good nopl cpuid tsc_known_freq pni cx16 popcnt hypervisor lahf_lm pti

Guest OS information

# uname -r
4.12.14-197.37-default
# lsb-release -a
LSB Version:    n/a
Distributor ID: SUSE
Description:    SUSE Linux Enterprise Server 15 SP1
Release:        15.1
Codename:       n/a

I checked the following settings in the guest system.

# cat /proc/sys/kernel/tainted
0
# cat /proc/sys/kernel/watchdog
1
# cat /proc/sys/kernel/watchdog_thresh
10
# cat /proc/sys/kernel/nmi_watchdog
0
# cat /proc/sys/kernel/soft_watchdog
1
# cat /proc/sys/kernel/softlockup_panic
0
# cat /proc/sys/kernel/unknown_nmi_panic
0
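
For anyone who wants to collect the same values in one pass, the checks above can be sketched as a small loop. This is only a convenience sketch: the function name show_watchdog_settings and its optional directory argument are my own (hypothetical), and the directory defaults to /proc/sys/kernel.

```shell
#!/bin/sh
# Print each watchdog-related kernel setting as "name = value".
# The optional first argument (hypothetical, for testing) is the directory
# to read from; it defaults to /proc/sys/kernel.
show_watchdog_settings() {
    dir="${1:-/proc/sys/kernel}"
    for name in tainted watchdog watchdog_thresh nmi_watchdog \
                soft_watchdog softlockup_panic unknown_nmi_panic; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            # Some kernels or containers do not expose every file.
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_watchdog_settings
```

Attaching this output from before and after a hang makes it easier to compare settings between runs.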

I then made the following changes in the guest system.

# echo 0 > /proc/sys/kernel/watchdog
# echo 0 > /proc/sys/kernel/soft_watchdog
# echo 20 > /proc/sys/kernel/watchdog_thresh
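
Values written to /proc/sys this way last only until the next reboot. A minimal sketch of how the same three settings could be expressed as a sysctl configuration fragment follows; the function name watchdog_sysctl_fragment is hypothetical, and the values are the ones from the writes above.

```shell
#!/bin/sh
# Emit sysctl.d-style lines for the three settings changed above.
# Arguments: watchdog, soft_watchdog, watchdog_thresh (in that order).
watchdog_sysctl_fragment() {
    printf 'kernel.watchdog = %s\n' "$1"
    printf 'kernel.soft_watchdog = %s\n' "$2"
    printf 'kernel.watchdog_thresh = %s\n' "$3"
}

watchdog_sysctl_fragment 0 0 20
```

Redirecting this output as root into a file under /etc/sysctl.d/ (for example 99-watchdog.conf, an assumed file name) and running `sysctl --system` should apply the settings on every boot.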

The issue still occurs after these changes. The difficulty is that the host OS cannot be upgraded to a newer SLES version. Could you please advise whether there is a solution or workaround?
