Thread: SLES 11 kdumptool - error loading shared libraries

  1. #1

    SLES 11 kdumptool - error loading shared libraries

    Hi,
    We've been having issues with SLES 11 servers freezing/crashing/hanging intermittently. This has happened for almost a year now, and continues even though we've upgraded to SP1 and now to SP2.

    Anyway, I've enabled Magic SysRq on all SLES 11 servers and configured kdump. The goal is to capture crash dumps via the console on unresponsive systems for further analysis.

    I'm currently testing the crash dump combo on a SLES 11 SP2 VM. The console responds to Alt+SysRq+c, but then stops at a bash prompt - see screenshot at bottom of my post. The main errors I can see are:

    Code:
    /sbin/resume: error while loading shared libraries: libgcrypt.so.11: cannot open shared object file: No such file or directory
    ...
    kdumptool:  error while loading shared libraries: libelf.so.0: cannot open shared object file: No such file or directory
    However, both shared libs exist in library directories on the root partition, so I don't know why resume/kdumptool can't find them. It must be something about the "resume" kernel option that I do not understand.

    Code:
    ## libgcrypt
    myhost:~ # whereis libgcrypt.so.11
    libgcrypt.so: /lib/libgcrypt.so.11 /lib64/libgcrypt.so.11 /usr/lib64/libgcrypt.so /usr/local/lib/libgcrypt.so.11 /usr/local/lib/libgcrypt.so
    
    myhost:~ # dir /lib/libgcrypt.so*
    lrwxrwxrwx 1 root root     19 Feb 29 11:17 /lib/libgcrypt.so.11 -> libgcrypt.so.11.7.0
    -rwxr-xr-x 1 root root 545124 Jan 13 13:00 /lib/libgcrypt.so.11.7.0
    
    ## libelf
    myhost:~ # whereis libelf.so.0
    libelf.so: /usr/lib/libelf.so.1 /usr/lib/libelf.so.0 /usr/lib64/libelf.so.1 /usr/lib64/libelf.so.0 /usr/local/lib/libelf.so.0 /usr/local/lib/libelf.so
    
    myhost:~ # dir /usr/lib/libelf.so*
    lrwxrwxrwx 1 root root    16 Mar  9  2011 /usr/lib/libelf.so.0 -> libelf.so.0.8.12
    -rwxr-xr-x 1 root root 88312 May  5  2010 /usr/lib/libelf.so.0.8.12
    lrwxrwxrwx 1 root root    25 Mar  9  2011 /usr/lib/libelf.so.1 -> /usr/lib/libelf.so.0.8.12

    Any idea what the issue is here? There may be more informative messages higher up in the console, but I can't see them, and I'm not sure how to slow the messages down to capture them or how to dump them to a file. My VM is running on VMware.

    Screenshot: http://i.imgur.com/GqiI2.png

  2. #2

    Re: SLES 11 kdumptool - error loading shared libraries

    I don't know about those library errors... it could be that you need to check and fix the paths in /etc/ld.so.conf and rerun ldconfig. I have not needed to mess with that on SLES 11, though.

    Some questions to get a better feel of the environment SLES is running in/as:

    Which VMware version are you running there (version/build), and which VMware Tools version? Also, was that a tar install or an rpm?

    I'm also curious to know how these SLES servers have been set up (which version is it? the VMware release, and 32- or 64-bit?), and did you add any special parameters to the boot options?

    Lastly, what are these servers doing? Which software is running, and has any custom tuning been done?

    -Willem
    Knowledge Partner (voluntary sysop)

  3. #3

    Re: SLES 11 kdumptool - error loading shared libraries

    Quote Originally Posted by Magic31 View Post
    I don't know about those library errors... it could be that you need to check and fix the paths in /etc/ld.so.conf and rerun ldconfig. I have not needed to mess with that on SLES 11, though.
    One more thing to add to this: run ldconfig with the -v switch and check if the libraries kdump is fussing about appear in that list.

    It could also be worth a try to see if running "SuSEconfig --verbose" mentions anything particular.
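
    The library check suggested above can be scripted. What follows is only a minimal sketch: it parses the text printed by `ldconfig -p` (which lists the current linker cache, used here instead of `ldconfig -v`) and reports where each library resolves. The function name `libs_in_cache` and the `SAMPLE` text are hypothetical, made up for illustration. Note that this only describes the normally booted system, not whatever environment the crash kernel runs in.

```python
import re


def libs_in_cache(ldconfig_output, sonames):
    """Map each soname to the paths that `ldconfig -p` output lists for it."""
    found = {name: [] for name in sonames}
    for line in ldconfig_output.splitlines():
        # Cache lines look like:
        #   "\tlibelf.so.0 (libc6,x86-64) => /usr/lib64/libelf.so.0"
        match = re.match(r"\s*(\S+)\s+\(.*?\)\s+=>\s+(\S+)", line)
        if match and match.group(1) in found:
            found[match.group(1)].append(match.group(2))
    return found


# Hypothetical sample text, NOT taken from the poster's machine.
SAMPLE = """\
2 libs found in cache `/etc/ld.so.cache'
\tlibgcrypt.so.11 (libc6,x86-64) => /lib64/libgcrypt.so.11
\tlibgcrypt.so.11 (libc6) => /lib/libgcrypt.so.11
"""

if __name__ == "__main__":
    wanted = ["libgcrypt.so.11", "libelf.so.0"]
    for name, paths in libs_in_cache(SAMPLE, wanted).items():
        print(name, "->", ", ".join(paths) if paths else "NOT IN CACHE")
```

    To use it on a real system, save the output of `ldconfig -p` to a file and pass its contents in place of `SAMPLE`.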

    Cheers,
    Willem
    Knowledge Partner (voluntary sysop)

  4. #4

    Re: SLES 11 kdumptool - error loading shared libraries

    Hi Willem,

    The SLES 11 VM is a guest on a VMware host running ESXi 5.0.0, 623860.

    SLES 11 SP2 64-bit. The ISO I used to install it was from the Novell downloads, and not VMware-specific.

    Code:
    myhost:~ # uname -a
    Linux goat 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64 x86_64 x86_64 GNU/Linux
    Actually, the server was installed fresh with SLES 11 SP1, then upgraded in place to SP2 via YaST > Patch CD Upgrade. Here are my boot options:


    Code:
    myhost:~ # cat /boot/grub/menu.lst
    # Modified by YaST2. Last modification on Wed Feb 29 11:31:23 EST 2012
    default 0
    timeout 8
    ##YaST - generic_mbr
    gfxmenu (hd0,0)/boot/message
    ##YaST - activate
    
    ###Don't change this comment - YaST2 identifier: Original name: linux###
    title SUSE Linux Enterprise Server 11 SP2 - 3.0.13-0.27 (default)
        root (hd0,0)
        kernel /boot/vmlinuz-3.0.13-0.27-default root=/dev/sda1 insmod=qla4xxx resume=/dev/sdb1 splash=silent crashkernel=256M-:128M showopts
        initrd /boot/initrd-3.0.13-0.27-default
    
    ###Don't change this comment - YaST2 identifier: Original name: failsafe###
    title Failsafe -- SUSE Linux Enterprise Server 11 SP2 - 3.0.13-0.27
        root (hd0,0)
        kernel /boot/vmlinuz-3.0.13-0.27-default root=/dev/sda1 showopts ide=nodma apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 nomodeset x11failsafe
        initrd /boot/initrd-3.0.13-0.27-default
    
    ###Don't change this comment - YaST2 identifier: Original name: linux###
    title Trace -- SUSE Linux Enterprise Server 11 SP2 - 3.0.13-0.27
        root (hd0,0)
        kernel /boot/vmlinuz-3.0.13-0.27-trace root=/dev/sda1 insmod=qla4xxx resume=/dev/sdb1 splash=silent crashkernel=256M-:128M showopts
        initrd /boot/initrd-3.0.13-0.27-trace
    
    ###Don't change this comment - YaST2 identifier: Original name: floppy###
    title Floppy
        rootnoverify (fd0)
        chainloader +1
    This VM is used as a testing ground for proposed software in our SLES 11 environment, but it is currently not running any "extra" software. That said, I've seen two other VMs, both actively used by our developers, crash, and the same issue occurs on them when attempting a dump via Magic SysRq. Those two VMs run OpenLDAP and Shibboleth, but nothing else.

  5. #5

    Re: SLES 11 kdumptool - error loading shared libraries

    Quote Originally Posted by ashbyj
    Hi,
    We've been having issues with SLES 11 servers freezing/crashing/hanging
    intermittently. This has happened for almost a year now, and continues
    even though we've upgraded to SP1 and now to SP2.

    Anyway, I've enabled 'Magic SysRq'
    (http://www.suse.com/support/kb/doc.php?id=3374462) on all SLES 11
    servers and configured kdump. The goal is to capture crash dumps via the
    console on unresponsive systems for further analysis.

    <snip>

    Any idea what the issue is here? There may be more informative
    messages higher up in the console, but I can't see them and not sure how
    to slow down the messages to capture them, or how dump them to a file.
    My VM is running on VMWare.

    *Screenshot*:
    [image: http://i.imgur.com/GqiI2.png]
    Hi
    I would guess the filesystem containing the libraries (/dev/sda1) has not
    been mounted at that point, hence it can't find them.

    When you're at that prompt, what is mounted? (Just run the mount command.)
    The dmesg command may offer further information. In your VM, if you can
    get to tty10 (Ctrl+Alt+F10), it has the kernel messages.

    I would guess that /dev/sdb1 is your swap partition (and used for resume),
    so why would it be resuming? Are you sure the systems are not going into
    a power management mode (and so only seem to be frozen)? I would have
    thought anything related to power saving would be disabled.

    --
    Cheers Malcolm °¿° (Linux Counter #276890)
    openSUSE 12.1 (x86_64) Kernel 3.1.10-1.9-desktop
    up 5 days 18:26, 4 users, load average: 0.05, 0.04, 0.05
    CPU Intel i5 CPU M520@2.40GHz | Intel Arrandale GPU


  6. #6

    Re: SLES 11 kdumptool - error loading shared libraries

    Quote Originally Posted by malcolmlewis View Post
    I would guess the filesystem containing the libraries (/dev/sda1) has not
    been mounted at that point, hence it can't find them.

    When you're at that prompt, what is mounted? (Just run the mount command.)
    This is what the mount command returns at that bash prompt (after Alt+SysRq+c is pressed):

    Code:
    bash-3.2# mount
    proc on /proc type proc (rw)
    sysfs on /sys type sysfs (rw)
    udev on /dev type tmpfs (rw,mode=0755,nr_inodes=0)
    tmpfs on /dev/shm type tmpfs (rw,mod=1777)
    devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
    /dev/sda1 on /root type ext3 (rw,acl,user_xattr)
    bash-3.2#
    Quote Originally Posted by malcolmlewis View Post
    The dmesg command may offer further information. In your VM, if you can
    get to tty10 (Ctrl+Alt+F10), it has the kernel messages.
    Cool, I did not know about tty10. I think it's just Alt+F# in VMware to switch terminals, though. I was unable to get to tty10 while at the bash prompt. After typing "exit" it seems to finish booting OK and I end up at the normal login prompt. I'm not sure whether that is the kexec/kdump kernel or the regular one. Anyway, I'm able to get to tty10 at that point, but by then it's past the shared lib error messages I was interested in.



    Quote Originally Posted by malcolmlewis View Post
    I would guess that /dev/sdb1 is your swap partition (and used for resume),
    so why would it be resuming? Are you sure the systems are not going into
    a power management mode (and so only seem to be frozen)? I would have
    thought anything related to power saving would be disabled.
    Correct, /dev/sdb1 is swap and the setting for resume. Ya know, I've never questioned the "resume" kernel option. If it's for hibernating or suspending a system for power-saving reasons, I guess I don't need it? It's a server, hence the reason we run SLES.

  7. #7

    Re: SLES 11 kdumptool - error loading shared libraries

    Hi
    So /dev/sda1 is mounted on /root (normally the root user's home
    directory), not on /, hence it can't find the libraries. Unless there
    are library directories under /root?

    What does the mount command say when everything is back up and running?

    Maybe a browse through the system BIOS would give some additional
    information on any power saving features that are enabled?

    AFAIK anything power related should be disabled; check in /etc/pm
    and /etc/pm-profiler, if they exist.

    --
    Cheers Malcolm °¿° (Linux Counter #276890)
    openSUSE 12.1 (x86_64) Kernel 3.1.10-1.9-desktop
    up 5 days 19:40, 4 users, load average: 0.00, 0.02, 0.05
    CPU Intel i5 CPU M520@2.40GHz | Intel Arrandale GPU


  8. #8

    Re: SLES 11 kdumptool - error loading shared libraries

    Quote Originally Posted by malcolmlewis View Post
    Hi
    So /dev/sda1 is mounted on /root (normally the root user's home
    directory), not on /, hence it can't find the libraries. Unless there
    are library directories under /root?

    What does the mount command say when everything is back up and running?
    OK, I hadn't noticed that it was mounted on /root, but now it makes sense that it can't find the libs.
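
    To make the effect of that mount point concrete, here is a minimal sketch that parses `mount` output and works out where a file on /dev/sda1 is actually reachable. The two mount listings are abbreviated from the ones posted in this thread; the helper names `parse_mounts` and `where_is` are hypothetical. The dynamic loader looks for libraries relative to /, so while the root filesystem sits on /root the files are present on disk but not where the loader searches.

```python
def parse_mounts(mount_output):
    """Return {device: mount_point} from lines shaped like
    '<device> on <mount_point> type <fstype> (<options>)'."""
    mounts = {}
    for line in mount_output.splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[1] == "on" and parts[3] == "type":
            mounts[parts[0]] = parts[2]
    return mounts


def where_is(path, device, mounts):
    """Translate an absolute path on the device's filesystem into the
    path under which it is reachable with the given mounts."""
    return mounts[device].rstrip("/") + path


# Abbreviated from the prompt reached after Alt+SysRq+c.
CRASH_PROMPT = """\
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sda1 on /root type ext3 (rw,acl,user_xattr)
"""

# Abbreviated from the normally booted system.
NORMAL_BOOT = """\
/dev/sda1 on / type ext3 (rw,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
"""

if __name__ == "__main__":
    lib = "/lib64/libgcrypt.so.11"
    # prints /root/lib64/libgcrypt.so.11
    print(where_is(lib, "/dev/sda1", parse_mounts(CRASH_PROMPT)))
    # prints /lib64/libgcrypt.so.11
    print(where_is(lib, "/dev/sda1", parse_mounts(NORMAL_BOOT)))
```

    At the crash prompt, listing /root/lib64 and /root/usr/lib64 should confirm whether the libraries are there.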

    After issuing "exit" at the bash prompt, it boots to what looks like a healthy state and this is what mount says:

    Code:
    myhost:~ # mount
    /dev/sda1 on / type ext3 (rw,acl,user_xattr)
    proc on /proc type proc (rw)
    sysfs on /sys type sysfs (rw)
    devtmpfs on /dev type devtmpfs (rw,mode=0755)
    tmpfs on /dev/shm type tmpfs (rw,mode=1777)
    devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
    fusectl on /sys/fs/fuse/connections type fusectl (rw)
    securityfs on /sys/kernel/security type securityfs (rw)
    nfs1:/vol/local.sles11 on /usr/local type nfs (rw,nolock,addr=172.xxx.xxx.xxx)
    nfs1:/vol/source.sles11 on /usr/local/src type nfs (rw,nolock,addr=172.xxx.xxx.xxx)
    nfs1:/vol/misc on /usr/misc type nfs (rw,nolock,addr=172.xxx.xxx.xxx)
    rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
    none on /var/lib/ntp/proc type proc (ro,nosuid,nodev)
    (I masked out my IPs in the output.) I'm not concerned about the power-saving modules at this point, unless you think they're affecting the ability to invoke Magic SysRq commands and kdump.

  9. #9

    Re: SLES 11 kdumptool - error loading shared libraries

    On a related note, I can't seem to get the proper crashkernel setting for kdump.
    Code:
    myhost:~ # /etc/init.d/boot.kdump restart
    Loading kdump
    Then try loading kdump kernel
    Memory for crashkernel is not reserved
    Please reserve memory by passing "crashkernel=X@Y" parameter to the kernel
    failed
    You can see my menu.lst in an earlier post. I'm using the default, crashkernel=256M-:128M, and have also tried various other settings with and without the @YM offset. I'm clueless as to what this should be set to.
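
    For reference, the kernel accepts `crashkernel=` either as a plain size with an optional @offset, or as a list of `start-[end]:size` ranges, where the first range containing the machine's RAM size decides the reservation. So `256M-:128M` reads as "with 256M of RAM or more, reserve 128M". The sketch below evaluates such a value for a given RAM size. It is an illustration of that syntax only, not the kernel's own parser, and the names `to_bytes` and `crashkernel_reservation` are hypothetical. Also keep in mind that the reservation is made at boot, so an edited menu.lst has no effect until the next reboot.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def to_bytes(text):
    """Convert a size such as '128M', '1G' or '4096' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if not match:
        raise ValueError("bad size: %r" % text)
    return int(match.group(1)) * UNITS[match.group(2)]


def crashkernel_reservation(cmdline, ram_bytes):
    """Return the bytes requested by crashkernel= for this RAM size.

    None means the option is absent; 0 means no range matched."""
    value = None
    for word in cmdline.split():
        if word.startswith("crashkernel="):
            # If repeated, this sketch keeps the last one (arbitrary choice).
            value = word[len("crashkernel="):]
    if value is None:
        return None
    value = value.split("@")[0]  # the optional @offset is ignored here
    if ":" not in value:
        return to_bytes(value)  # simple form: crashkernel=128M
    for entry in value.split(","):
        ram_range, size = entry.split(":")
        start, _, end = ram_range.partition("-")
        # Start is inclusive; an empty end means "no upper limit".
        if ram_bytes >= to_bytes(start) and (end == "" or ram_bytes < to_bytes(end)):
            return to_bytes(size)
    return 0


if __name__ == "__main__":
    line = ("root=/dev/sda1 insmod=qla4xxx resume=/dev/sdb1 "
            "splash=silent crashkernel=256M-:128M showopts")
    two_gib = 2 * (1 << 30)
    print(crashkernel_reservation(line, two_gib))  # prints 134217728
```

    Feeding it the contents of /proc/cmdline instead of the menu.lst line shows what the running kernel was actually booted with, which may differ from what menu.lst currently says.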

  10. #10

    Re: SLES 11 kdumptool - error loading shared libraries

    Quote Originally Posted by ashbyj View Post
    Code:
        kernel /boot/vmlinuz-3.0.13-0.27-default root=/dev/sda1 insmod=qla4xxx resume=/dev/sdb1 splash=silent crashkernel=256M-:128M showopts
    I'm curious why the VM has that insmod entry for qla4xxx... are you using iSCSI within the VM? Unless it is also needed for the OS disk, that option is not needed in the grub section, AFAIK.

    Cheers,
    Willem
    Knowledge Partner (voluntary sysop)
