SLES 11 SP2 - unable to shutdown



lukasz_papierz
09-Jul-2014, 11:26
I installed SLES 11 SP2 on a few machines (System x3850 X5, Type 7143). On 3 of them (the ones that were configured together with the SANSurfer software) there is a problem with shutdown: I'm not able to restart or shut down the server normally. The remaining 2 (with a clean OS right after installation) do not seem to have that issue.

I thought this was related to the QLogic cards, but I disabled them and that didn't solve the problem.

I also noticed one more strange thing: when I type "shutdown -h now" the server seems to go to runlevel 0 and switches from the graphical interface to the console, but that's all; it keeps running. When I then type "shutdown -r now", it works and the server reboots. It also works the other way around. Below is some example output from the console:


12:05:41 root@my_server: /root
# tail /var/log/messages
Jul 9 12:04:32 my_server kernel: [ 51.004399] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004488] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004577] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004665] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004753] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004840] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_param_spec_flags: assertion `G_TYPE_IS_FLAGS (flags_type)' failed
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_object_class_install_property: assertion `G_IS_PARAM_SPEC (pspec)' failed
Jul 9 12:05:01 my_server /usr/sbin/cron[6051]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:01 my_server /usr/sbin/cron[6183]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)



12:06:04 root@my_server: /root
# shutdown -h now

Broadcast message from root (pts/3) (Wed Jul 9 12:06:15 2014):

The system is going down for system halt NOW!




12:06:15 root@my_server: /root
# tail /var/log/messages
Jul 9 12:04:32 my_server kernel: [ 51.004665] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004753] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004840] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_param_spec_flags: assertion `G_TYPE_IS_FLAGS (flags_type)' failed
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_object_class_install_property: assertion `G_IS_PARAM_SPEC (pspec)' failed
Jul 9 12:05:01 my_server /usr/sbin/cron[6051]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:01 my_server /usr/sbin/cron[6183]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:15 my_server shutdown[6188]: shutting down for system halt
Jul 9 12:06:15 my_server init: Switching to runlevel: 0
Jul 9 12:06:16 my_server kernel: [ 155.243882] bootsplash: status on console 0 changed to on





12:06:24 root@my_server: /root
# shutdown -h now

Broadcast message from root (pts/3) (Wed Jul 9 12:06:31 2014):

The system is going down for system halt NOW!



12:06:31 root@my_server: /root
# tail /var/log/messages
Jul 9 12:04:32 my_server kernel: [ 51.004753] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:32 my_server kernel: [ 51.004840] pci 0000:03:00.0: Invalid ROM contents
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_param_spec_flags: assertion `G_TYPE_IS_FLAGS (flags_type)' failed
Jul 9 12:04:36 my_server gdm-simple-greeter[5933]: GLib-GObject-CRITICAL: g_object_class_install_property: assertion `G_IS_PARAM_SPEC (pspec)' failed
Jul 9 12:05:01 my_server /usr/sbin/cron[6051]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:01 my_server /usr/sbin/cron[6183]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:15 my_server shutdown[6188]: shutting down for system halt
Jul 9 12:06:15 my_server init: Switching to runlevel: 0
Jul 9 12:06:16 my_server kernel: [ 155.243882] bootsplash: status on console 0 changed to on
Jul 9 12:06:31 my_server shutdown[6373]: shutting down for system halt



12:06:38 root@my_server: /root
# shutdown -r now

Broadcast message from root (pts/3) (Wed Jul 9 12:06:48 2014):

The system is going down for reboot NOW!




12:06:54 root@my_server: /root
# tail /var/log/messages | sed s/my_server/my_server/g
Jul 9 12:05:01 my_server /usr/sbin/cron[6051]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:01 my_server /usr/sbin/cron[6183]: (monitor) CMD (/usr/local/script/get_uptime > /dev/null 2>&1)
Jul 9 12:06:15 my_server shutdown[6188]: shutting down for system halt
Jul 9 12:06:15 my_server init: Switching to runlevel: 0
Jul 9 12:06:16 my_server kernel: [ 155.243882] bootsplash: status on console 0 changed to on
Jul 9 12:06:31 my_server shutdown[6373]: shutting down for system halt
Jul 9 12:06:48 my_server shutdown[6377]: shutting down for system reboot
Jul 9 12:06:48 my_server init: Switching to runlevel: 6
Jul 9 12:06:51 my_server multipathd: --------shut down-------
Jul 9 12:06:54 my_server kernel: [ 193.184703] bootsplash: status on console 0 changed to on




12:06:57 root@my_server: /root
#
Script done, file is /var/adm/sulogs/140709.1228.root-6060
Connection to my_server closed by remote host.


What can be the cause of that?

Thanks in advance for support.

smflood
09-Jul-2014, 12:42
Does TID 7009779[1] help? Note though that System z doesn't support ACPI.

HTH.

[1] https://www.suse.com/support/kb/doc.php?id=7009779
--
Simon
SUSE Knowledge Partner

------------------------------------------------------------------------
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below. Thanks.
------------------------------------------------------------------------

lukasz_papierz
09-Jul-2014, 14:28
Unfortunately this solution is not working for me.

By the way, it's System x, not System z :-)

smflood
09-Jul-2014, 15:02
On 09/07/2014 14:34, lukasz papierz wrote:

> Unfortunately this solution is not working for me.

So you've tried all the listed reboot parameters?

> btw. It's system X not Z :-)

Sorry I saw the "System" and my brain incorrectly auto-completed the "z"!

Since it's not System z then ACPI is applicable.

Are you able to open a Service Request with SUSE? Given you're running
SLES11 SP2 they'll probably advise upgrading to SLES11 SP3, so that
might be worth trying beforehand on one of the 3 problem servers.

HTH.
--
Simon
SUSE Knowledge Partner

jmozdzen
09-Jul-2014, 15:14
Hi lukasz_papierz,

> I installed SLES 11 SP 2 on few machines (System x3850 X5, Type 7143). On 3 of them (the ones that were configured together with SANSurfer Software) there is a problem with shutdown - I'm not able to restart or shutdown server normally. The remaining 2 (with clean OS right after installation) seems to not have that issue.
>
> I thought that this is related to QLogic cards but I disabled them and this didn't solve the problem.
>
> I noticed also one more strange thing - when I type "shutdown -h now" server seems to go to runlevel 0, it is switching from graphical interface to console but that's all, it is still working. When I type "shutdown -r now" after that it works and server is going to reboot. It is working also the other way around.

This typically points to a "hanging" init script. That said, what state are those servers in after the shutdown has run for a while, i.e. after 12:06:54 in the example you quoted?


> Below some example output from console:
> [...]
> Jul 9 12:06:48 my_server init: Switching to runlevel: 6
> Jul 9 12:06:51 my_server multipathd: --------shut down-------
> Jul 9 12:06:54 my_server kernel: [ 193.184703] bootsplash: status on console 0 changed to on
>
> What can be the cause of that?
>
> Thanks in advance for support.

What do you see when looking at "ps alx" (or similar) during that period? The focus should be on the "rc" scripts, and "pstree" might give you an easier overview. Do you see any unusual messages on tty1 (the text console via "F1") during the shutdown?
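
As a rough sketch of the "ps" check described above (not a SUSE-provided tool): the hypothetical helper find_rc_procs below filters a process listing for command lines that mention /etc/init.d/ or /etc/rc.d/. It assumes the init scripts are started from one of those paths; a script launched some other way would not show up.

```shell
#!/bin/sh
# Hypothetical helper: keep only process-list lines whose command line
# mentions an init script path (/etc/init.d/... or /etc/rc.d/...).
# Reads a process listing on stdin, prints the matching lines unchanged.
find_rc_procs() {
    awk '/\/etc\/(init\.d|rc\.d)\//'
}

# Intended use, from a second login while the shutdown appears stuck:
#   ps -eo pid,ppid,etime,args | find_rc_procs
# The etime column shows how long each script has been running.
```

An init script with a large elapsed time is the first candidate to look at; "pstree -p" on its PID then shows which child process it is waiting for.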

During shutdown (reboot, ...) the scripts in /etc/init.d/* are invoked, similar to how they are called at system start. When a shutdown "hangs", typically one of these scripts is delayed or even unable to finish. More often than not, this does *not* lead to messages in syslog, but rather on the console, where indications of "retries" would appear, or error messages, or both.

Once you have identified the hanging script (or at least have a solid idea), you *might* want to try to shut down the service manually ("rcYourServer stop" from the command line). Of course, there are some services for which that won't work, e.g. unmounting all file systems before all processes have been stopped ;) The manual shutdown gives you a chance to isolate the root cause.
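
One way to try such a manual stop without getting stuck yourself is to wrap it in a time limit. This is only a sketch: stop_with_timeout is a made-up name, and it assumes the "timeout" command from GNU coreutils is installed on the machine.

```shell
#!/bin/sh
# Hypothetical helper: run a stop command with a time limit and report
# on stderr if it did not finish in time.
# Usage: stop_with_timeout SECONDS COMMAND [ARGS...]
stop_with_timeout() {
    secs=$1
    shift
    status=0
    timeout "$secs" "$@" || status=$?
    # GNU timeout exits with 124 when it had to kill the command.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${secs}s: $*" >&2
    fi
    return "$status"
}

# Example call (the script name is a placeholder, not a real service):
#   stop_with_timeout 60 /etc/init.d/some_service stop
```

A return code of 124 means the stop command itself hung, which narrows the search to that one script.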

Regards,
Jens

lukasz_papierz
09-Jul-2014, 15:49
Thank you!

You were right: the ITM Agent was not stopping, just hanging and blocking the whole shutdown process. I killed it and the system went down.

I just need to review it now to find out where the problem is, but since the cause is clear now there is not much left to do.

Thank you for the advice, it was really helpful. I missed this because the script was not giving any output ...

Have a nice day!