SLES11SP2 xen breaks current Fedora15/16/17 guests



Franz Sirl
03-Apr-2012, 09:39
Hi,

After upgrading one of our Xen hosts from SP1 to SP2, the Fedora15 DomU
stopped working. Further checking showed that it crashes very early in the
boot process:

# /usr/lib64/xen/bin/xenctx -s /mnt/tmp/boot/System.map-2.6.41.9-1.fc15.x86_64 17
rip: ffffffff810013aa hypercall_page+0x3aa
flags: 00001282 i s nz
rsp: ffffffff81a01a90
rax: 0000000000000000 rcx: ffffffff810013aa rdx: ffffffff81cae300
rbx: 0000000000000000 rsi: ffffffff81a01ab4 rdi: 0000000000000002
rbp: ffffffff81a01ab8 r8: 0000000000000000 r9: 0000000000000100
r10: 0000000000000000 r11: 0000000000000282 r12: ffffffff81cae300
r13: 00000000ffffffff r14: 0000000000000000 r15: 0000000000000000
cs: e033 ss: e02b ds: 0000 es: 0000
fs: 0000 @ 0000000000000000
gs: 0000 @ ffff88003ff99000/0000000000000000
Code (instr addr ffffffff810013aa)
cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc


Stack:
0000000000000001 00000000ffffffff ffffffff8100382e ffffffff8100a8df
00000003815af866 ffffffff81a01ac8 ffffffff81003903 ffffffff81a01b08
ffffffff815b345d ffffffff81a01d98 ffffffff817b76b0 ffffffff81a01d98
000000000000000b 0000000000000000 0000000000000000 ffffffff81a01b18

Call Trace:
[<ffffffff810013aa>] hypercall_page+0x3aa <--
[<ffffffff8100382e>] xen_reboot+0x1e
[<ffffffff8100a8df>] xen_restore_fl_direct_end
[<ffffffff81003903>] xen_panic_event+0x13
[<ffffffff815b345d>] notifier_call_chain+0x4d
[<ffffffff815b34ba>] atomic_notifier_call_chain+0x1a
[<ffffffff815a4cc3>] panic+0xbf
[<ffffffff8100a1ed>] xen_force_evtchn_callback+0xd
[<ffffffff8106ff7e>] do_exit+0x88e
[<ffffffff8100a8df>] xen_restore_fl_direct_end
[<ffffffff815af866>] _raw_spin_unlock_irqrestore+0x16
[<ffffffff8106d45a>] kmsg_dump+0x4a
[<ffffffff815b0b0b>] oops_end+0xab
[<ffffffff81016818>] die+0x58
[<ffffffff815b0214>] do_trap+0xc4
[<ffffffff81013eb5>] do_invalid_op+0x95
[<ffffffff8101d14c>] xstate_enable+0x3c
[<ffffffff8100a1ed>] xen_force_evtchn_callback+0xd
[<ffffffff8100a8f2>] check_events+0x12
[<ffffffff8100afb9>] get_phys_to_machine+0x9
[<ffffffff810065e9>] pte_mfn_to_pfn+0x89
[<ffffffff815b95eb>] invalid_op+0x1b
[<ffffffff8101d14c>] xstate_enable+0x3c
[<ffffffff8101d13c>] xstate_enable+0x2c
[<ffffffff81b8176b>] xstate_enable_boot_cpu+0xa9
[<ffffffff8100a8df>] xen_restore_fl_direct_end
[<ffffffff8100464d>] xen_clts+0x8d
[<ffffffff81597ec0>] xsave_init+0x26
[<ffffffff8159a0d7>] cpu_init+0x2dc
[<ffffffff81b7df17>] trap_init+0x169
[<ffffffff81b78a25>] start_kernel+0x1d0
[<ffffffff81b78347>] x86_64_start_reservations+0x132
[<ffffffff81b7bd6f>] xen_start_kernel+0x5b5
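
For readers of the dump above: the rip value only says where the guest stopped, not why it crashed. A minimal sketch of how hypercall_page+0x3aa can be decoded, assuming the usual layout of 32 bytes per stub in the hypercall page and the numbering from Xen's public headers (only the entries needed here are listed; the helper name decode_hypercall is made up for illustration):

```python
# Sketch: decode a "hypercall_page+0xNNN" rip into a hypercall number.
# Assumption: each stub in the Xen hypercall page is 32 bytes long.
HYPERCALL_STUB_SIZE = 32

# Excerpt of the hypercall numbering (Xen public header xen.h).
HYPERCALL_NAMES = {29: "sched_op"}

# Command codes for sched_op (Xen public header sched.h).
SCHEDOP_NAMES = {0: "yield", 1: "block", 2: "shutdown", 3: "poll"}


def decode_hypercall(offset, rdi):
    """Return (number, offset inside stub, hypercall name, command name)."""
    number, within = divmod(offset, HYPERCALL_STUB_SIZE)
    name = HYPERCALL_NAMES.get(number, "unknown")
    # For sched_op, the first argument (rdi) selects the command.
    command = SCHEDOP_NAMES.get(rdi, "unknown") if name == "sched_op" else None
    return number, within, name, command


if __name__ == "__main__":
    # Values taken from the register dump: rip offset 0x3aa, rdi = 2.
    print(decode_hypercall(0x3AA, 2))
```

The result matches the "b8 1d 00 00 00" bytes (mov $0x1d, %eax) in the code dump. In other words, the guest is sitting in the shutdown request issued by xen_reboot after the panic, and the actual fault has to be looked for further down the trace.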

After some experimenting I found that all recent Fedora15/16/17-alpha
kernels show the same crash (just try installing Fedora16 or 17-alpha as
a guest). The original Fedora15 kernel, 2.6.38.6-26.rc1.fc15.x86_64,
still works. So it seems that an incompatibility was introduced somewhere
along the way.
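
To check whether different kernels really die at the same place, it may help to pull the faulting frame out of each trace mechanically. A rough sketch, assuming the frame listed directly below invalid_op is the interrupted code (the xenctx trace appears to contain stale stack words too, so treat the result as a hint only; parse_frames and frame_below are hypothetical helper names):

```python
import re

# Matches lines such as "[<ffffffff8101d14c>] xstate_enable+0x3c";
# the "+0x.." part is optional because some entries have no offset.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)(?:\+0x([0-9a-f]+))?")


def parse_frames(text):
    """Return a list of (address, symbol, offset) tuples in trace order."""
    frames = []
    for match in FRAME_RE.finditer(text):
        addr, symbol, offset = match.groups()
        frames.append((int(addr, 16), symbol, int(offset, 16) if offset else 0))
    return frames


def frame_below(frames, marker):
    """Return the frame listed right after the first `marker` symbol."""
    for index, (_, symbol, _) in enumerate(frames[:-1]):
        if symbol == marker:
            return frames[index + 1]
    return None


# A few lines copied from the trace in this post.
SAMPLE = """\
[<ffffffff810065e9>] pte_mfn_to_pfn+0x89
[<ffffffff815b95eb>] invalid_op+0x1b
[<ffffffff8101d14c>] xstate_enable+0x3c
[<ffffffff8101d13c>] xstate_enable+0x2c
[<ffffffff8100a8df>] xen_restore_fl_direct_end
"""

if __name__ == "__main__":
    addr, symbol, offset = frame_below(parse_frames(SAMPLE), "invalid_op")
    print(f"{symbol}+{offset:#x} at {addr:#x}")
```

For the trace in this post the frame below invalid_op is xstate_enable+0x3c, which fits the xsave_init and xstate_enable_boot_cpu callers further down.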

Franz.

Automatic reply
12-Apr-2012, 13:30
Franz,

It appears that in the past few days you have not received a response to your
posting. That concerns us, and has triggered this automated reply.

Has your issue been resolved? If not, you might try one of the following options:

- Visit http://www.suse.com/support and search the knowledgebase and/or check all
the other support options available.
- You could also try posting your message again. Make sure it is posted in the
correct newsgroup. (http://forums.suse.com)

Be sure to read the forum FAQ about what to expect in the way of responses:
http://forums.suse.com/faq.php

If this is a reply to a duplicate posting, please ignore and accept our apologies
and rest assured we will issue a stern reprimand to our posting bot.

Good luck!

Your SUSE Forums Team
http://forums.suse.com

Franz Sirl
03-Sep-2012, 09:52
Hi,

I tried to produce a better bug report with the crash utility, but even
with crash-6.0.7 from Factory I still get:

# crash vmlinux-3.3.4-5.fc17.x86_64 /var/lib/xen/dump/dev-fedora/2012-0831-1610.20-dev-fedora.14.core

crash 6.0.7
Copyright (C) 2002-2012 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

crash: vmlinux-3.3.4-5.fc17.x86_64: no .gnu_debuglink section
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

crash: cannot find mfn 0 (0x0) in page index

crash: cannot read/find pud page

#

Any ideas?
Franz.

