SUSE Linux not booting



eolimal
25-Jan-2013, 18:32
Hi -


I have SUSE Linux 11 SP2. I recently tried to reboot the server and the boot failed: it threw errors and finally dropped me to a prompt.

By typing 'exit' twice, the bootup process continued and everything worked as expected, although every reboot needs me to type that 'exit' twice now. I was wondering what could be causing this.

The errors in boot.log are (is there any way to attach a file to this thread?!):

Kernel logging (ksyslog) stopped.
Kernel log daemon terminating.

Boot logging started on /dev/tty1(/dev/console) at Fri Jan 4 11:07:00 2013


udevadm settle - timeout of 30 seconds reached, the event queue contains:
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0 (1654)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.1 (1655)
/init: line 17: cat: command not found
/init: line 17: cat: command not found
Partial mode. Incomplete logical volumes will be processed.
6 logical volume(s) in volume group "vg00" now active
Partial mode. Incomplete logical volumes will be processed.
6 logical volume(s) in volume group "vg00" now active
Partial mode. Incomplete logical volumes will be processed.
6 logical volume(s) in volume group "vg00" now active

udevadm settle - timeout of 30 seconds reached, the event queue contains:
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0 (1654)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.1 (1655)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0/host4 (2024)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0/host4/scsi_host/host4 (2025)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0/host4/bsg/fc_host4 (2026)
/sys/devices/pci0000:00/0000:00:03.0/0000:04:00.0/host4/fc_host/host4 (2027)
Trying manual resume from /dev/vg00/swap


And so on. Any help on how to troubleshoot? I think the only thing I've done was to resize the LVM, and this is where my troubleshooting efforts are concentrated so far.

ab
25-Jan-2013, 21:27
> /init: line 17: cat: command not found
> /init: line 17: cat: command not found

If errors like this one are showing up before the hard drives are mounted
I wonder if your initrd (initialization ramdisk) file is corrupt. I've
seen this happen a time or two when the disk was full at the time that the
mkinitrd (the command that builds this file) was executed. If this is the
case, be sure you have several MB free both in / (the root of the
filesystem) and /boot (assuming it is a separate filesystem) and then run
'mkinitrd' as root (I am pretty sure this is a safe thing to do, but maybe
backup your current /boot/initrd* files first). Once done, reboot and see
if that helps.

Also, post something about your free space with the following command:

Code:
----------
df -h
----------
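
A rough sketch of the space check and backup described above, assuming a POSIX shell with df and awk. The function name has_free_mb, the 50 MB threshold and the backup directory are made up for illustration; the backup and mkinitrd steps are left as comments because they need root.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding DIR ($1) has at
# least MIN_MB ($2) megabytes available.
has_free_mb() (
    dir=$1
    min_mb=$2
    # df -P prints one POSIX-format line per filesystem; with -k the fourth
    # column is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_mb * 1024)) ]
)

# Usage sketch (as root; threshold and backup location are arbitrary):
#   has_free_mb / 50 && has_free_mb /boot 50 || echo "low on space"
#   mkdir -p /root/initrd-backup && cp -p /boot/initrd* /root/initrd-backup/
#   mkinitrd
```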

If the 'cat' errors come after mounting the filesystems then... well,
that's bad.

What'd you do last before this happened? Patches?

Good luck.

KBOYLE
27-Jan-2013, 03:41
ab wrote:

> I wonder if your initrd (initialization ramdisk) file is corrupt.

https://bugzilla.novell.com/show_bug.cgi?id=755924
This bug may not be publicly available but it describes a situation
where initrd may be built incorrectly.

If this is your problem, the solution is simple:


mkinitrd


--
Kevin Boyle - Knowledge Partner

eolimal
31-Jan-2013, 20:18
I tried mkinitrd: no change, still the same errors.

I didn't do anything except increase the partition sizes:


"/sbin/lvextend -l +6912 '/dev/vg00/tmp'"
"/sbin/resize2fs -f '/dev/vg00/tmp'"
"/sbin/lvextend -l +6912 '/dev/vg00/home'"
"/sbin/resize2fs -f '/dev/vg00/home'"
"/sbin/lvextend -l +4608 '/dev/vg00/root'"
"/sbin/resize2fs -f '/dev/vg00/root'"
"/sbin/lvextend -l +6144 '/dev/vg00/usr'"
"/sbin/resize2fs -f '/dev/vg00/usr'"


Finally, when I typed 'df -h' or 'who -r' at the prompt, I got 'Command not found' both times.

Then I can type 'exit' twice and the system boots.

Weird.

jmozdzen
04-Feb-2013, 16:09
Hi eolimal,

> /init: line 17: cat: command not found
> Any help on how to troubleshoot?

you might look into unpacking the initrd to some temporary directory and check what's going on at line 17 of the init script.

# mkdir /tmp/xxx && cd /tmp/xxx && gunzip -c < /boot/yourinitfilename | cpio -i && view init

My version of init has a call to "cat" in line 25:


24 name=$(eval echo $name | sed "s/[-_]/\*/g")
25 for i in `cat /proc/cmdline`
26 do
27 case $i in
28 ${name}.*=*)
29 param="${param}${param:+ }${i##${name}.}"
30 ;;
31 esac
32 done

Initrd is just a severely stripped-down Linux environment, mostly with only mission-critical files. After unpacking you might check if the "cat" binary is included:


# ls -l /tmp/xxx/bin/cat
-rwxr-xr-x 1 root root 48032 Feb 4 16:03 /tmp/xxx/bin/cat

Further diagnosis depends on the content of your initrd.

Regards,
Jens

eolimal
08-Feb-2013, 17:25
> My version of init has a call to "cat" in line 25:

So does mine. 'cat' is only present in line 25.

And it looks like the 'cat' command is included in my initrd:



# ls bin/cat
bin/cat


Tks for the info on initrd!

jmozdzen
11-Feb-2013, 17:38
Hi eolimal,

sorry for the late reply... I've been on the road

The problem is strange indeed - the reported line number is caused by the function call - seems we're indeed talking about the cat command in line 25. It would be interesting to see what happens if you'd call init manually after a reboot, probably with shell debugging turned on.

BTW, if you reverse the steps used to unpack the initrd, you can create your own initrd including additional tools you'd like to see... an editor, for example... so that you'll have a more helpful environment to debug these troubles.

Regards,
Jens

eolimal
13-Feb-2013, 19:07
> It would be interesting to see what happens if you'd call init manually after a reboot, probably with shell debugging turned on.



Hmmm, not sure how to do this, any help?

Btw, when I get dropped at the shell, I only have access to a subset of commands, and 'ls' and 'cat' are not among them, although 'cat' is included in my ramdisk (see previous post).

We might try a re-install of the OS. And we are also investigating HW + firmware. (then I'll retire early haha)

jmozdzen
13-Feb-2013, 20:02
Hi eolimal,


> Hmmm, not sure how to do this, any help?

using "bash -x init", for instance


> Btw, when I get dropped at the shell, I only have access to a subset of commands, and 'ls' and 'cat' are not among them, although 'cat' is included in my ramdisk (see previous post).

Yes, I'm fully aware that the default initrd contains only the most necessary elements. But since you already unpacked the initrd, it's only a matter of "cp" and "cpio" to enrich that (copy the wanted files to the appropriate initrd directory, repack initrd using cpio, configure grub to use your new initrd - voila).
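
A minimal sketch of the repack step, assuming the initrd is a gzip-compressed cpio archive (as the gunzip/cpio unpack command earlier in this thread implies) and that cpio is installed. The kernel expects the newc cpio format for an initramfs-style image, hence -H newc. The name repack_initrd and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pack the tree under SRC ($1) into a gzip-compressed,
# newc-format cpio archive at OUT ($2). OUT must be an absolute path,
# because the function changes into SRC before archiving.
repack_initrd() (
    src=$1
    out=$2
    command -v cpio >/dev/null 2>&1 || exit 1
    cd "$src" || exit 1
    # Archive paths relative to the tree root; cpio reports its block
    # count on stderr, which is discarded here.
    find . | cpio -o -H newc 2>/dev/null | gzip -9 > "$out"
)

# Usage sketch (as root; names are examples only):
#   cp -p /bin/ls /tmp/xxx/bin/
#   repack_initrd /tmp/xxx /boot/initrd-debug
```

Dynamically linked tools also need their shared libraries copied into the tree (ldd lists them). Writing the result under a new name and pointing a separate GRUB entry at it keeps the original initrd available as a fallback.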


> We might try a re-install of the OS. And we are also investigating HW + firmware. (then I'll retire early haha)

The error message seems strange enough to make your attempts plausible... otoh, were I in your situation, I'd like to know the root cause and therefore debug the init script. Something to learn and something to find, makes a good match :)

Regards,
Jens

eolimal
13-Feb-2013, 20:11
> Yes, I'm fully aware that the default initrd contains only the most necessary elements. But since you already unpacked the initrd, it's only a matter of "cp" and "cpio" to enrich that (copy the wanted files to the appropriate initrd directory, repack initrd using cpio, configure grub to use your new initrd - voila).



I should have been clearer. The reason I mentioned this (the missing 'cat' command) is that at the start of boot.msg you see errors about the command 'cat' not being found and, indeed, when I get dropped to the shell, that command is not available. Hence any script needing it won't run properly.

But then it's present in the ramdisk. So what the...? Am I missing something? I already recreated the ramdisk, so it can't be corrupted...





> The error message seems strange enough to make your attempts plausible... otoh, were I in your situation, I'd like to know the root cause and therefore debug the init script. Something to learn and something to find, makes a good match :)



Thanks, yes, I'm curious too. Unfortunately, I'm not the end customer, so we might have to go with whatever gives results.

eolimal
13-Feb-2013, 20:30
> using "bash -x init", for instance



# bash -x init
/sbin/init: /sbin/init: cannot execute binary file

:S - init is a command on Linux, not sure where you are going with this...

jmozdzen
13-Feb-2013, 20:39
Hi eolimal,

> when I get dropped to the shell, that command is not available

have you tried with path (/bin/cat), or only by specifying "cat"? I recall that when I hit that boot shell, no path settings were in effect.

Another thing that might be remotely possible: Have you checked that the architecture of the files matches the kernel? In my case, that's a 64 bit kernel, so the "cat" binary matches:


/tmp/xxx # file bin/cat
bin/cat: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), for GNU/Linux 2.6.4, dynamically linked (uses shared libs), stripped

But I couldn't think of any standard situation that might cause that to go wrong, after all the files are copied from the live system during "mkinitrd"...
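
If "file" is not at hand, the same check can be sketched by reading the ELF header directly: an ELF file starts with the bytes 0x7f 'E' 'L' 'F', and the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. This assumes od and tr are available; elf_class is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print 32 or 64 depending on the ELF class of FILE ($1).
# Returns non-zero if the file is not ELF or the class byte is unexpected.
elf_class() {
    # First four bytes as hex, with od's padding stripped.
    elf_magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    # Fifth byte (offset 4) as an unsigned decimal number.
    elf_cls=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    if [ "$elf_magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    case $elf_cls in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown; return 1 ;;
    esac
}

# Usage sketch against the unpacked tree:
#   elf_class /tmp/xxx/bin/cat
```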

Any way this turns out: I'd be interested to hear what steps in the end got things working for you/your customer!

Regards,
Jens

jmozdzen
13-Feb-2013, 20:43
> :S - init is a command on Linux, not sure where you are going with this...

sorry for the confusion, I meant to run this when you're dropped to the shell prompt during boot - to debug the /init script that obviously goes wrong, leading to that situation.
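
As an illustration of what the trace gives you, here is a small sketch that runs a script under "sh -x" and captures both the trace and any errors. The helper name trace_script and its optional PATH argument are made up; at the boot prompt the equivalent would simply be "bash -x /init", assuming bash is what the initrd provides.

```shell
#!/bin/sh
# Hypothetical helper: run SCRIPT ($1) with shell tracing enabled and merge
# the trace (written to stderr) into stdout. An optional second argument
# overrides PATH for the run, to mimic a shell with no usable path settings.
trace_script() {
    PATH=${2-$PATH} /bin/sh -x "$1" 2>&1
}

# Usage sketch:
#   trace_script /init                  # trace with the current PATH
#   trace_script /init /nonexistent     # see which lookups fail without PATH
```

Each traced command is printed with a leading "+" before it runs, so the last "+" line before a "not found" message shows which call failed.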

With regards,
Jens

eolimal
13-Feb-2013, 21:35
/bin/cat
sh: /bin/cat: Too many levels of symbolic links

Same thing with /bin/ls...

jmozdzen
14-Feb-2013, 17:21
> /bin/cat
> sh: /bin/cat: Too many levels of symbolic links
>
> Same thing with /bin/ls...

This looks more than strange. I'm without access to my test system atm, but will check tonight whether any of the config scripts run inside init is capable of messing up the directory structure.

Does traversing the directory structure work, i.e. "cd ..;cd ..;cd bin" to go to the /bin directory (the initial "cd .."s are meant to get you up to the top-level directory, you of course could try a "cd /" instead)?
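
"Too many levels of symbolic links" is the message for ELOOP, which the kernel returns when resolving a path runs into a symlink cycle or exceeds its link limit. A minimal sketch for spotting such links in an unpacked copy of the initrd (for example the /tmp/xxx tree), assuming find is available there; list_unresolvable_links is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: list symlinks under DIR ($1) that cannot be resolved,
# either because the target is missing or because the links form a loop.
list_unresolvable_links() {
    find "$1" -type l | while IFS= read -r link; do
        # -e follows the link; it fails for dangling and looping links alike.
        [ -e "$link" ] || printf '%s\n' "$link"
    done
}

# Usage sketch:
#   list_unresolvable_links /tmp/xxx
```

An empty result for the unpacked tree would suggest the loop is created at boot time rather than shipped inside the initrd.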

Did this happen with the initrd as created by mkinitrd or was this with a version enhanced by you (I guess the latter, since you tried to run "ls")? If you would like me to take a look at that initrd, contact me via private message so we can exchange contact info (URL to download the initrd or e-mail to send the file to).

Have you double-checked that you're actually looking at the initrd you're using during boot? (No offense intended!)

Do you have a file system structure that would be mountable easily from within initrd's shell (like a simple partition with ext3 file system, no LVM, no iSCSI etc)? Then you could try to put your tools (ls, vi, bash) there, mount it from the prompt and run the tools from there.

Currently, I'm getting kind of desperate, as you can tell from the nature of my questions... I assume that once the source is found, you'll go "oh my, *that* cause leading to so much trouble!"... and I hope we'll be getting you there soon.

If other measures get the machine up & running, or you have enough of my attempts, please let me know. Right now I'm mostly puzzled.

Regards,
Jens

eolimal
05-Mar-2013, 18:55
So... I don't have a conclusive answer for anyone. We decided to re-install and the issue is gone. Thanks all for your help.

jmozdzen
07-Mar-2013, 14:41
> So... I don't have a conclusive answer for anyone. We decided to re-install and the issue is gone.

yes, sometimes that's the only way to get back on track... sorry we couldn't help to isolate the problem, but I'm glad that in the end, things at least got back to working order. And thank you for reporting back!

Regards,
Jens