
View Full Version : Boot problem after Restore



stefan_1304
23-Dec-2013, 11:35
Hello,

I made a backup of my SLES 2011 with DriveSnapshot and wanted to restore it.
The restore worked quite fine, but when I try to boot, I get the following errors:



fsck failed for at least one filesystem (not /).
Please repair manually and reboot.
The root file system is already mounted read-write.
Attention: only Control-D will reboot the system in this
maintenance mode. shutdown or reboot will not work.
Give root password for maintenance


Then I type in the root password and it says



(repair filesystem) #


I have to say that I am a beginner working with SLES 2011. The restore was to different hardware.
Do you have any ideas to help me? Thank you very much.

smflood
23-Dec-2013, 13:53
On 23/12/2013 10:44, stefan 1304 wrote:

> I made a Backup via DriveSnapshot of my SLES 2011 and wanted to Restore
> it.
> The Restore worked quite fine, but when I want to boot, I get the
> following errors:
>
>>
>> fsck failed for at least one filesystem (not /).
>> Please repair manually and reboot.
>> The root file system is already mounted read-write.
>> Attention: only Control-D will reboot the system in this
>> maintenance mode. shutdown or reboot will not work.
>> Give root password for maintenance
>>
>
> Then I type in the root password and it says
>
>>
>> (repair filesystem) #
>>
>
> I have to say that I am a beginner working with SLES 2011. The restore was
> to different hardware.
> Do you have any ideas to help me? Thank you very much.

Since there is no SLES version 2011, what do you mean by "SLES 2011"?
Perhaps you mean SLES 11? Please post the output from "cat
/etc/*release" so we can see which version and Service Pack (if any)
you're using.

HTH.
--
Simon
SUSE Knowledge Partner

------------------------------------------------------------------------
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below. Thanks.
------------------------------------------------------------------------

stefan_1304
23-Dec-2013, 14:24
Hello,

At the moment I am doing the restore again. I think I did something wrong when I tried to use the system repair option on the installation DVD.
Yes, it is version 11, sorry.

I will give you the output on Friday. Thank you so far.

jmozdzen
23-Dec-2013, 14:59
Hi Stefan(_1304),

If you still run into that message, check the lines above where the output from fsck is listed to identify which file system(s) are affected. To manually repair those, run "fsck -f <the device name of the file system>" and, if I may recommend so, run it again until fsck reports it has nothing left to fix. Also, I'd run that on any other file system at least once.
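
For illustration, the repair loop could look like this (the device path is only a placeholder; use the device that fsck actually complains about):

fsck -f /dev/sdXN     # replace /dev/sdXN with the affected device
fsck -f /dev/sdXN     # repeat until fsck reports nothing left to fix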

Was the image created on a *running* system? Then chances are high you won't have too much trouble after repairing. If the image was taken from a shut-down system, it'd be interesting to find out why there are such problems.

Regards,
Jens

stefan_1304
23-Dec-2013, 15:40
Hi Jens,

It was an offline image via DriveSnapshot.
http://www.drivesnapshot.de/

stefan_1304
23-Dec-2013, 15:53
When I run fsck -f several times, I get errors like this:



fsck.ext3: No such file or directory while trying to open /dev/disk/......-partX
The superblock could not be read or does not describe a correct ext2 filesystem.
If the device is valid and it really contains an ext2 filesystem....


I restored a correctly running system and do not know why there are errors. The only change is the different hardware.

When I run df -ah, I only get /dev/mapper/system-root_lv.

When I run fdisk -l, I can see all the partitions correctly...

jmozdzen
23-Dec-2013, 16:15
Hi Stefan,

When I run fsck -f several times, I get errors like this:



I restored a correctly running system and do not know why there are errors. The only change is the different hardware.

When I run df -ah, I only get /dev/mapper/system-root_lv.

When I run fdisk -l, I can see all the partitions correctly...

You seem to be running your system with logical volume management (LVM). If you look at the /etc/fstab entries, are the partitions in question really mounted via partitions, or via LVM devices (/dev/mapper/system-*)?

If in doubt, please post the entries from /etc/fstab here. Unless these use e.g. "LABEL=somelabel" in column one, you should run fsck against the device entry listed in that first column (see the example below).

If you're totally confused by all this, it'd be helpful to also paste (in [CODE] blocks) the output of

- fdisk -l <disk device>
- vgscan -v
- vgdisplay system (and for any other VG found via vgscan)
- ls -l /dev/mapper
- cat /etc/fstab

plus the info which file system you're trying to repair.
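
To give a hedged example of what such first-column entries might look like (the device names, mount points and options below are purely illustrative, not taken from your system):

# LVM-backed root file system
/dev/mapper/system-root_lv          /      ext3  defaults        1 1
# plain partition referenced by disk ID
/dev/disk/by-id/scsi-3600...-part5  /data  ext3  acl,user_xattr  1 2

In the second case, fsck would be run against exactly that first-column device, e.g. "fsck -f /dev/disk/by-id/scsi-3600...-part5".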

Regards,
Jens

stefan_1304
23-Dec-2013, 19:23
Thanks. I will answer everything on Friday.
Merry Xmas :)

jmozdzen
23-Dec-2013, 21:15
Hi Stefan,

Thanks. I will answer everything on friday
Merry xmas :)

happy holidays to you (and anyone else reading this thread, too)! I may be online only a few moments until early January, so please don't feel forgotten if I won't be responding immediately :)

Regards,
Jens

stefan_1304
27-Dec-2013, 09:14
So, hello again. I hope the following information helps.

Because it would be a lot to type in, I hope photos are OK too. Thank you.

- /etc/fstab entries
http://s14.directupload.net/file/d/3484/clbarsng_jpg.htm

- fdisk -l <disk device>
http://s1.directupload.net/file/d/3484/tnndnaev_jpg.htm

- vgscan -v
http://s7.directupload.net/file/d/3484/z3mktszx_jpg.htm

- vgdisplay system
http://s1.directupload.net/file/d/3484/qvtefmv6_jpg.htm

- ls -l /dev/mapper
http://s14.directupload.net/file/d/3484/gdtql9t9_jpg.htm

- cat /etc/fstab
http://s14.directupload.net/file/d/3484/6wtgpw36_jpg.htm

stefan_1304
27-Dec-2013, 10:37
And when I try fsck:

http://s1.directupload.net/file/d/3484/yhtmvljd_jpg.htm

stefan_1304
27-Dec-2013, 10:48
I ran every command you listed on the server from which I made the backup, and the output is the same as on the server I restored to.

The live system is running; the restored server is not. I hope you can help. Thank you very much.

jmozdzen
28-Dec-2013, 00:14
Hi Stefan,

The trouble is that your partitions are referenced by disk ID in fstab - as the new server has a different physical disk, its disk ID is different, too.

Use an editor of your choice (e.g. "vi") to change the "/dev/disk/by-id/scsi-3600...-partX" device entries in /etc/fstab of the restored system to "/dev/sdaX" (e.g. to "/dev/sda5 /hana/shared ext3 acl,user..."). Of course, you may as well change the entry to the ID of the current SCSI disk (see "ls -l /dev/disk/by-id" on the new server).

You may want to manually run "fsck -f /dev/sda1" (and likewise for 5, 6, 7, 8, 9), until no more errors are reported, before rebooting the server.
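
For illustration only, an entry would then change along these lines (the ID string and the mount options are placeholders, not your real values):

# before: bound to the old disk's ID
/dev/disk/by-id/scsi-3600...-part5   /hana/shared   ext3   acl,user_xattr   1 2
# after: plain kernel device name, unambiguous while the server has only one disk
/dev/sda5                            /hana/shared   ext3   acl,user_xattr   1 2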

Regards,
Jens

stefan_1304
28-Dec-2013, 10:02
Hi Jens,

Thanks for your excellent answer; that sounds fine.
The disks of the new server are different ones.

If I understand it right, it is enough if the fstab says where every partition is mounted?
For example:
/dev/sda1
/dev/sda2
/dev/sda3
/dev/sda4
/dev/sda5 /hana/shared ext3
/dev/sda6 ...
/dev/sda7 ...
/dev/sda8 ...
/dev/sda9

Can I delete the information with the SCSI disk ID and so on from the fstab, because I do not need it there? Is that correct? Thank you very much.

Kind regards,
Stefan

stefan_1304
28-Dec-2013, 10:06
Or I could do ls -l /dev/disk/by-id, look up the correct SCSI entries there, and put them into the fstab instead of the wrong ones?
Would this way be faster?

jmozdzen
28-Dec-2013, 12:03
Hi Stefan,

Or I could do ls -l /dev/disk/by-id, look up the correct SCSI entries there, and put them into the fstab instead of the wrong ones?
Would this way be faster?
Unlike in your other response, here you state correctly that you need to *replace* the entries with the correct values.

Your disk device(s) (and partitions, too) can be referenced by different names in Linux. For example, for decades the typical name would have been /dev/sda for your first SCSI disk (/dev/sda1 being the first partition on that disk), but what if you have more than one disk? They'd be sda, sdb and so on, but in which *order*? Therefore, additional names were added, like "use disk ID", "use device location on system bus", "use file system label" and so on.

In your case, the partitions to mount are referenced by device ID plus partition number. As the device ID has changed, you'll either need to adjust that to the correct new value or use some other reference (hence my earlier suggestion to use /dev/sda*: as your server currently has only a single disk, no question of who's who will come up... and it's easier to type for you :D )

You must not *delete* the entries in /etc/fstab, but need to *correct* them. These entries tell the system which file systems on which devices (partitions) to mount.
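
If you prefer to keep ID-based references instead of /dev/sdaX, a possible way to do it is sketched below (the OLD-ID/NEW-ID strings are placeholders for the actual IDs you see):

ls -l /dev/disk/by-id                               # note the ID names of the new disk and its partitions
sed -i 's/scsi-OLD-ID/scsi-NEW-ID/g' /etc/fstab     # replace the old ID with the new one in every entry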

Regards,
Jens

stefan_1304
30-Dec-2013, 09:03
Hi Jens,

I have now tested the changes in fstab and it seems to be good.
I changed the SCSI ID for every entry. When I run df -ah, I see everything correctly.

Another problem is that I do not have a GUI after starting. I tried startx and it does not work.

When I try it, it says


Fatal server error:
no screens found


Is this the wrong command, or are there other commands to get the GUI?

So far, thank you so much for solving the first error.

Kind regards,
Stefan

stefan_1304
30-Dec-2013, 11:54
The GUI is now working too.

I tried these commands:



sudo /usr/sbin/sax2 -ra
su (to become root)
sax2 -r (reprobe your video card)
startx
init 5

jmozdzen
30-Dec-2013, 16:48
Hi Stefan,

Sounds like you got this up & running yourself - it most probably had to do with the change in video hardware. If any problems remain, please open a new thread for those, to keep threads focused on a single problem :)

If your machine is up & running now, then hey! You made it a problem of 2013 - 2014 may now come!

A happy new year to you all,

Jens

stefan_1304
13-Jan-2014, 14:55
Sorry that I am writing again. I tried to restore the server once more on the same hardware, but it does not work this time either.

I do not have the screen with



(repair filesystem) #


When I boot normally, I get:



Reading all physical volumes. This may take a while...
No volume groups found
Volume group "system" not found
Volume group "system" not found
Could not find /dev/system/root_lv.
Want me to fall back to /dev/system/root_lv? (Y/n)


It does not matter whether I choose yes or no; the screen comes up again on the next boot.

I tried the automatic repair with the installation DVD, with no success.
When I boot with the installation DVD and choose rescue mode, I can log on with user root.

rescue login:
root

When I execute fdisk -l I can see every partition.

The file /etc/fstab is empty, and when I fill it with the correct partitions the system does not seem to remember it.

And when I look in /dev there is no folder called system. Is there any way to avoid problems like this when I restore SLES 11 on other hardware?

Can anyone help? Thank you.

jmozdzen
13-Jan-2014, 15:23
Hi Stefan,

when you answer "n" to "want me to fail back to /dev/system/root_lv? (Y/n)", it should take you to a minimalistic shell (within the initial RAM disk environment). There you'd have (pretty limited) chances to get your root LV available. If you're able to activate the system VG from there manually, exiting that shell would get you running (once) again. It still then is required to find out why the system VG could not be found and fix that:

It's that the initrd knows that your root file system is on /dev/system/root_lv, but that cannot be accessed because the complete volume group is unavailable ("no volume group found"). Typically, this is because some hardware driver is missing, so that the physical volumes ("partitions", in your case), are unavailable.

> [...rescue system, then...] the file /etc/fstab is empty and when I fill it with the correct partitions the system does not seem to remember it.

When you boot the rescue system via DVD, you're not running from your disk - you have a completely separate environment set up, loaded from the DVD, even with its own root fs. So the first steps to take are:

- "vgscan" to let the system find the disks/physical volumes/volume groups containing your "real" system files
- activate the root vg ("vgchange -ay system", since your VG is called "system")
- mount the "real root" (i.e. "mount /dev/system/root_lv /mnt")
- mount any other required file system (var to /mnt/var, usr to /mnt/usr, ...)
- mount your boot file system to /mnt/boot
- mount /sys and /proc ("mount --bin /sys /mnt/sys; mount --bind /proc /mnt/proc)
- "chroot /mnt" to "switch" to your installed system - sort of. This is *not* your installed system (kernel etc), but only the file systems.

That environment then is pretty complete for any maintenance/repair work, e.g. to invoke "mkinitrd" to see and/or influence how the initrd is created. A consolidated sketch of the whole sequence follows below.
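
Put together, and assuming the VG/LV names discussed in this thread (the /dev/sda1 boot partition is only an example; use whatever your fstab actually lists), the session might look like this:

vgscan                              # detect the volume groups on the disks
vgchange -ay system                 # activate the "system" VG
mount /dev/system/root_lv /mnt      # mount the real root file system
mount /dev/sda1 /mnt/boot           # example only: adapt to your boot device
mount --bind /sys /mnt/sys
mount --bind /proc /mnt/proc
chroot /mnt                         # "switch" into the installed system
mkinitrd                            # rebuild the initrd so it includes the needed drivers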

> Is there any way to avoid problems like this when I restore SLES 11 on other hardware?

By staying as close as possible to the original hardware, and by setting up the system in a way that you know where it is bound to characteristics of your original hardware (e.g. hardware IDs, MAC addresses, port names/numbers etc.).

Regards,
Jens

stefan_1304
13-Jan-2014, 15:52
Do I have to execute the steps via the rescue system from the DVD?

From the normal system, when I run vgscan it says

Reading all physical volumes. This may take a while...
No volume groups found

stefan_1304
13-Jan-2014, 16:00
Thank you so far.

I did it via DVD:

vgscan found volume group "system".

vgchange -ay system said

2 logical volumes in volume group "system" now active

But then when I try to mount, it says that the folder system does not exist.

I looked into /dev and /etc and there was now folder.

jmozdzen
13-Jan-2014, 16:14
Thank you so far.

I did it via DVD:

vgscan found volume group "system".

vgchange -ay system said

2 logical volumes in volume group "system" now active

But then when I try to mount, it says that the folder system does not exist.

I looked into /dev and /etc and there was now folder.

What command exactly are you trying to execute to mount the logical volume?

When you activate a volume group, usually a folder is created in /dev with the name of the volume group (so it's /dev/system in your case, which I read does exist) and for each logical volume, a corresponding file pointing to the device mapper node is created in that folder. So when such a logical volume in VG "system" (let's call it "lv_test") contains a file system, you'd mount it at "/mnt" via the command "mount /dev/system/lv_test /mnt".
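
In concrete commands, and using the hypothetical LV name "lv_test" from above, that would be:

vgchange -ay system                 # activate all LVs in VG "system"
ls -l /dev/system                   # should now show one link per logical volume
mount /dev/system/lv_test /mnt      # mount the example LV "lv_test" at /mnt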

What LVs are linked in the /dev/system folder after activating the VG?

> From the normal system, when I run vgscan it says ...

What's that "normal system" you're writing about? I thought you couldn't start the clone... Is that "normal system" the "master system" where the backup was taken? Why would your backup reference a root volume on LVM (which we confirmed a few days ago), but not know about the VG?

Regards,
Jens

jmozdzen
13-Jan-2014, 16:21
Hi Stefan,

Seems I got that wrong - you probably meant "I looked ... and there was no folder." If that's the case, try deactivating the volume group and then reactivating it.

If the VG is already active but for some reason the files weren't created, "activating" does report the LVs as active, but doesn't create the files...

Regards,
Jens

stefan_1304
14-Jan-2014, 13:44
Hi Jens,

Sorry for the late response.
How can I deactivate the VG?

jmozdzen
14-Jan-2014, 14:51
Hi Stefan,

Hi Jens,

Sorry for the late response.
How can I deactivate the VG?

It's as easy as activating it: "vgchange -an system" ("-a" -> "should the VG be active?", with possible answers "y"es and "n"o).

Regards,
Jens

stefan_1304
14-Jan-2014, 15:01
OK, I did vgchange -an system and then vgchange -ay system.
After that there is still no folder system in /dev.

jmozdzen
14-Jan-2014, 16:17
Hi Stefan,


OK, I did vgchange -an system and then vgchange -ay system.
After that there is still no folder system in /dev.

What environment are you currently testing in? initrd? Recovery system? If recovery, booted from which DVD? Did vgscan find the VG? What does "vgdisplay system" report? ...

Regards,
Jens

jmozdzen
14-Jan-2014, 16:19
Stefan,

Have you tried running it with "--debug" to get more verbose output?

Regards,
Jens

stefan_1304
15-Jan-2014, 08:15
Hi Jens,

I am trying with the boot DVD in rescue mode.

vgscan finds volume group "system", and

vgdisplay system
displays this output:


VG Name               system
System ID
Format                lvm2
......
VG Size               150,00 GiB


And with --verbose, this is the output:

http://www.directupload.net/file/d/3503/qboenfzh_jpg.htm

Kind regards,
Stefan

jmozdzen
15-Jan-2014, 10:55
Hi Stefan,

From the messages I assume that only the DM files are created - I suspect you will find them as /dev/mapper/system-root_lv and /dev/mapper/system-swap_lv. Creation of the /dev/system/ links may depend on udevd, which is probably not running in the SLES recovery environment. (I have yet to use that... somehow, grabbing an openSUSE USB stick was more convenient at my place of work :D )

While you're testing all this - one of my suggestions was to answer "n" to the "fallback question" during initrd; that should drop you to a minimalistic shell. If you're able to activate the VG in that environment, exiting that shell (via "exit") ought to get you booting into the production system, or at least get you across the "root file system not found" hurdle ;) Have you had a chance to try that?

A more general question: how different from the original server's hardware is this new system? Is it basically the same, with just different IDs of the various hardware parts, or is it made of completely different components, e.g. different storage controllers, other disk types (4k instead of 512b blocks) or something along those lines? The trouble you're experiencing seems a bit unusual to me for "moving" a system from one server to another of the same build.

Regards,
Jens

stefan_1304
15-Jan-2014, 11:26
Hi Jens,

I found the folders under /dev/mapper, as you supposed.

The source system is installed on a single HDD (it is only a test system too), and the CPU and board are different too.
I also tried to restore the server with an extra RAID controller, with onboard RAID, and on a single HDD.

I tried with "n" and then "exit", and then I read:



invalid root filesystem -- exiting to /bin/sh

jmozdzen
15-Jan-2014, 11:42
Hi Jens,

I found the folders under /dev/mapper, as you supposed.

The source system is installed on a single HDD (it is only a test system too), and the CPU and board are different too.
I also tried to restore the server with an extra RAID controller, with onboard RAID, and on a single HDD.

I tried with "n" and then "exit", and then I read:

Yes, now you're in the "initrd" mini shell ("mini" in terms of accessible programs - close to none :[). Try activating the volume group... I have no comparable system at hand, so I don't know if the "helper symlinks" are available. If they are, the command sequence would be:

- vgscan
- if "system" could be found: "vgchange -an system; vgchange -ay system"
- exit

If already "vgscan" is not found, you'll use the "lvm shell":
- "lvm" (starts the program "lvm", which has its own command line)
- "vgscan"
- "vgchange -ay system"
- "exit" (to exit "lvm")
- "exit" (to exit the initrd shell and to continue booting)

*If* you're able to activate the VG there, then the boot sequence should continue normally. (A rough transcript of the second variant is sketched below.)
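
As a sketch of how that "lvm shell" session might look (assuming the VG really is visible there; the prompts are only illustrative):

lvm                         # start the builtin lvm command shell
lvm> vgscan                 # should list the VG "system"
lvm> vgchange -ay system    # activate it
lvm> exit                   # leave the lvm shell
exit                        # leave the initrd shell, boot continues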

> I also tried to restore the server with an extra RAID controller, with onboard RAID, and on a single HDD.

It might be too much hassle to try to adapt a cloned image from a too-different source machine - you'd be better off with a fresh install, unless you know what you're doing and know where to adjust the image prior to booting.

Regards,
Jens

stefan_1304
15-Jan-2014, 11:47
in the "initrd" mini shell vgscan says no volume groups found.

I typed lvm
lvm> vgscan



no volume groups found


king regards. stefan

jmozdzen
15-Jan-2014, 12:17
Hi Stefan,

in the "initrd" mini shell vgscan says no volume groups found.

I typed lvm
lvm> vgscan

Then I assume that the (disk) hardware needs drivers ("modules") that are not included in the current initrd image. Your best bet is to boot the rescue system, mount & chroot into the installed system and then rerun "mkinitrd". You might want to have a look at http://technik.blogs.nde.ag/2014/01/05/linux-initrd-command-line which gives some details and describes some possible steps to take. Especially important (from the "mkinitrd" point of view, when run from the chroot environment) is to properly mount /sys and /proc via the "--bind" option, or else mkinitrd will not have access to the required information.
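
As a hedged sketch of that last step (on SLES 11 the storage drivers for the initrd are normally listed in INITRD_MODULES in /etc/sysconfig/kernel; the driver name below is only an example, not necessarily the one your controller needs):

# inside the chroot on the restored system
grep INITRD_MODULES /etc/sysconfig/kernel   # check which modules the initrd will contain
# add the new controller's driver to that line if it is missing, for example:
#   INITRD_MODULES="... megaraid_sas"
mkinitrd                                    # rebuild the initrd with the updated module list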

Regards,
Jens