SLES12 SP1 and 2 Xen VMs which are Server 2012 R2. Ever since the update to SP1, these VMs take almost 10 minutes to boot. Is there something that can be done to bring these VMs back to a more acceptable boot time?
carnold6,
It appears that in the past few days you have not received a response to your
posting. That concerns us, and has triggered this automated reply.
These forums are peer-to-peer, best effort, volunteer run, so if your issue
is urgent or not getting a response, you might try one of the following options:
- Visit http://www.suse.com/support and search the knowledgebase and/or check all
the other support options available.
- Open a service request: https://www.suse.com/support
- You could also try posting your message again. Make sure it is posted in the
correct newsgroup. (http://forums.suse.com)
Be sure to read the forum FAQ about what to expect in the way of responses:
http://forums.suse.com/faq.php
If this is a reply to a duplicate posting or otherwise posted in error, please
ignore and accept our apologies and rest assured we will issue a stern reprimand
to our posting bot.
Good luck!
Your SUSE Forums Team
http://forums.suse.com
So I just cannot get the memory lowered on Dom0! I followed:
https://www.suse.com/documentation/s...st_memory.html
and
http://wiki.xen.org/wiki/Xen_Best_Pr...ory_ballooning
for GRUB 2. Then I rebooted the server and Dom0 still has all the memory. xl info shows:
Capture2.PNG
and xl list:
Capture3.PNG
OK, so I am trying to better understand Dom0 according to https://www.suse.com/documentation/s...omponents.html
Dom0 appears to be the OS (in this case SLES 12 SP1) running on the physical server?
So I changed /etc/default/grub with GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=512M,max:1024M" and upon reboot, the server OS was super slow to boot and to respond to whatever I typed or clicked. I then changed /etc/default/grub to GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=7096M" and the server OS is much better with boot performance and response time. However, the Xen guests (which are Server 2012 R2) are still incredibly slow in boot and response time. When I look at the connection details in virt-manager for Dom0, I see the current allocation is now 6996 (which is NOT the 7096 I entered into the grub file), but the max allocation is still 10240000, which is why, I believe, my guest VMs are so incredibly slow!
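For reference, a minimal sketch of what that change looks like on a default SLES 12 install with GRUB 2 (the 4096M below is only an example value, not a recommendation): /etc/default/grub is just a template, so the edit has to be propagated into grub.cfg before the reboot, otherwise Dom0 keeps booting with its old (full) memory allocation.
Code:
# /etc/default/grub -- options passed to the Xen hypervisor (example value only)
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=4096M,max:4096M"

# Regenerate the actual boot configuration, then reboot
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot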
So, I finally got the Dom0 memory worked out, and that does not fix the VMs' slow boot/response issue. It took 5 minutes to boot and 4 minutes to log in once it did boot. It took a little over 1 minute to open the services MMC! Both VMs are 2012 R2, have 4 GB RAM, and did NOT used to be this slow before the SP1 update.
So today, the systems stopped responding to any requests. There was no video, I couldn't VNC to the system, nothing! I rebooted the server and now I only get a black screen. No login screen. Same thing when trying a snapshot. Any ideas how to get the server back up and running without losing any data?
After doing everything I know of, including btrfs check --repair <device> (which did find errors; after a reboot it hung on loading kernel modules), I finally got the system back up. This time I selected the advanced SLES 12 with Xen option and chose a different kernel. The system now boots and the Xen VMs appear to be working, checking now...
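(As an aside, and only as a hedged sketch for anyone in the same spot: btrfs check --repair is usually treated as a last resort, and the read-only checks below are what is commonly tried first; <device> stays a placeholder.)
Code:
# Read-only consistency check (no --repair), ideally from a rescue system
# with the filesystem unmounted
btrfs check <device>

# Verify data/metadata checksums on the mounted root filesystem
btrfs scrub start -B /
btrfs scrub status /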
Any ideas out there?
Hi carnold6,
if the issue is still slow-booting VMs, I'd suggest starting with bottleneck analysis.
From previous messages, I see that Dom0 is taking its fair share of memory. You attempted to boot with 512M, which is awfully small, so no wonder booting the server took so long. Have you ever run "vmstat 1" to see the actual memory consumption of Dom0, so that you'd be able to set an appropriate value (which in effect means enough memory to run without major initial swapping and especially without constant swapping, then add some (hundreds of) MB for file system cache)?
Once you've restricted Dom0 to some reasonable value (to avoid the overhead of ballooning), you ought to run "vmstat 1" (on Dom0) during start-up of your VM. What is taking resources... CPU? (I doubt it) Memory? (Maybe Dom0 is now starting to swap) I/O?
Once you've identified the major slowing point, we can try to work out how to improve on that.
You never mentioned the resource setup of the VM - are the disks local, on NFS, on iSCSI, on SAN? Is the server's memory large enough to hold both Dom0 and DomU(s)? Such details will help to give advice on reducing the identified bottlenecks.
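As a concrete starting point, something like the following could be started on Dom0 right before powering the VM on and stopped once the guest's login screen appears (a sketch only; the log file name is just an example, and -t merely adds a timestamp column in recent procps versions):
Code:
# Sample memory/swap/io/cpu counters once per second while the DomU boots
vmstat -t 1 | tee /tmp/vmstat-vm-boot.log
The columns to watch first are si/so (swap traffic) and wa (time the CPUs spend waiting for i/o).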
With regards,
J
From the times when today's "old school" was "new school"
If you find this post helpful and are logged into the web interface, show your appreciation and click on the star below...
Your knowledge far exceeds mine in knowing which tools to use to find the bottleneck, and I am hoping for help here too: how do I read the output of vmstat 1? After looking here to interpret the output:
https://www.thomas-krenn.com/en/wiki...s_using_vmstat
Seems to be a whole lotta swapping going on?
As for the resource setup, forgive me for not posting that bit of helpful info. It seems I was in a panic when the system would not boot. Here it is:
Code:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  buff   cache   si   so    bi    bo    in    cs us sy id wa st
 1  0  69568  36024    12 5275148    0    0   605   247    19    15  2  1 89  8  0
 0  0  69568  33248    12 5277308    0    0  2048  3756  9727  6364  3  2 94  1  1
 0  0  69568  33776    12 5276992    0    0  4096     0 12421  8200  2  1 96  0  0
 0  0  69568  33248    12 5277128    0    0     0     0 12375  7601  1  2 96  0  0
 0  0  69568  35264    12 5275176    0    0  2048   496  8858  5668  2  1 96  0  0
 0  0  69568  33520    12 5277108    0    0  2048     0  7815  4995  2  1 97  0  0
 1  0  69568  33680    12 5277120    0    0     0   564 12163  7613  2  1 96  0  0
 0  0  69568  35364    12 5275232    0    0  2048     0  8938  5779  2  1 96  0  0
 0  0  69568  35568    12 5275120    0    0    32     0 11165  7067  2  1 96  0  0
 1  0  69568  35304    12 5275312    0    0  4064     0 11428  7212  1  1 96  1  1
 0  0  69568  31176    12 5279108    0    0  4096     0 10352  6522  2  2 96  0  0
 2  0  69568  35048    12 5275196    0    0     0   860 11276  7460  3  1 96  0  0
 0  0  69568  33912    12 5277228    0    0  2048     0 11314  7665  1  1 98  0  0
 0  0  69568  33984    12 5277196    0    0   140  8136 10859  7342  2  2 92  5  0
 0  0  69568  33832    12 5277192    0    0     0     0 15453 11446  4  2 93  0  1
 0  0  69568  31888    12 5279252    0    0  1908     0 12559  7889  1  1 97  1  0
 0  0  69568  36364    12 5273840    0    0  1880   544 11170  6972  1  1 94  4  0
 0  0  69568  32228    12 5277988    0    0  4256     0 12076  7457  1  1 98  0  0
 0  0  69568  30248    12 5280056    0    0  2048     0 10443  6423  0  1 98  0  0
 0  0  69568  30248    12 5280060    0    0    32     0  9347  5848  1  1 98  0  0
 0  1  69568  35212    12 5273668    0    0  4704     0 10905  7029  1  1 96  2  0
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  buff   cache   si   so    bi    bo    in    cs us sy id wa st
 2  0  69560  35768    12 5273820    0    0    80   488 12882  8007  0  1 97  1  0
 0  0  69560  31752    12 5277860    0    0  4096     0 11485  6982  1  1 97  1  0
 1  0  69560  31800    12 5277900    0    0     0     0 14506  8909  1  1 97  0  0
 0  0  69560  38400    12 5271000    0    0  2188     8 10443  6672  2  1 96  0  0
 1  1  69560  34500    12 5272536    0    0  1788     0  9969  6969  6  2 90  2  0
 0  1  69560  33372    12 5266800    0    0 21168   160 12942  8493  7  2 77 13  0
 1  0  69560  38816    12 5261724    0    0  5632     0  8764  6308  9  1 83  6  0
 0  0  69560  34832    12 5265868    0    0  4064     0 13449  9003  4  2 94  0  0
 1  0  69560  34800    12 5265868    0    0    32   296 18274 12935  4  2 93  0  1
 0  0  69560  32384    12 5268032    0    0  2048   336 13609  9179  2  2 96  0  1
 0  0  69560  32480    12 5268084    0    0     0 15152 16132 10623  1  2 96  0  1
 1  0  69560  37416    12 5262936    0    0  4064    68  9962  6314  1  1 92  6  0
 0  0  69560  37224    12 5263056    0    0    84    68 10148  6466  1  1 97  1  0
 1  0  69560  33000    12 5267068    0    0  4012     0 12332  7714  1  1 97  0  1
 1  0  69560  36768    12 5263184    0    0   220     0  9300  6322  6  1 91  1  0
 2  0  69560  32784    12 5267288    0    0  4096   280 11401  7551  3  2 95  0  0
 0  0  69560  36616    12 5262044    0    0    84     0 11856  7624  5  2 92  0  0
 0  0  69560  32696    12 5266200    0    0  4012     0 10369  6651  1  1 98  0  0
 0  0  69560  32736    12 5266124    0    0     0     0 10167  6567  1  1 97  0  1
 1  0  69560  30944    12 5268164    0    0  2048     0 12373  7664  1  1 98  0  1
 0  0  69560  36024    12 5262964    0    0  4096   140 10867  6760  1  2 97  1  0
16 GB RAM total, with 4 GB set aside for each of the 2 VMs. The rest, 8 GB, should be left over.
2 SAS drives (local) with a PERC 6 controller: 1x 300 GB and 1x 750 GB in a non-RAID setup (I didn't configure the server). The /opt directory is mounted on the 750 GB drive.
2 quad-core Xeons. Dom0 has 8 vCPUs and both of the VMs have 6 vCPUs apiece.
While, for now, I am OK with the system booting on a different kernel, I would like to get back to the latest kernel (like it was before).
Hi,
no, actually there's (during normal operations) *no* swapping going on, and that's good.
Here's how I read that output:
procs: let's ignore that for now
memory: while there's some "swapped" memory reported, this only tells you that at some point in time the kernel decided to swap out some (at that time) unused stuff, in favor of making better use of that physical memory. There's still some "free" memory (only saying there's no operational pressure on physical memory management), a bit is used for buffers (that number appears awfully small to me, I have no explanation for that) and a lot is used for caching. As any free memory is used for i/o caching, that big "cache" number seems to imply that you may have more memory committed to that Dom0 than actually required - but it's a trade-off versus block i/o (coming to that further below).
swap: Those zeros tell you that there's no current swap activity - nothing swapped in (stuff that got swapped out because of memory constraints, but now is needed again for current operations) nor swapped out (as memory might be tight - which it obviously isn't).
io: actual i/o to local block devices. You're doing some reads, and there are some writes, too... nothing to worry about. Depending on access patterns, file system settings and available cache memory (see above), these numbers *might* get high because every read/write needs to go to the actual block device, rather than being served by the cache. In your case I doubt that, and because of the low percentage of "waits" (see below, CPU "wa" number) you're not having an issue there, anyhow.
system: "in" are interrupts, i.e. caused by devices signaling available data. According to my personal experience, these numbers are a bit high - but depending on your setup, this may be normal, rather than an indicator of real problems. I'd look at the numbers in /proc/interrupts, to get a feeling where these are coming from (a quick example follows after this breakdown). "cs" are so-called "context switches", telling you the scheduler let a different process get its share of CPU.
cpu: "us" is "user space", hence programs you're running. "sy" is "system space", stuff that's being handled by the kernel. Then you have "idle time", which often (and in your case) is high, indicating that those CPUs are not doing much. More than enough horse-power for the actual workload. "wa" is the percentage of CPU time spent waiting for i/o to complete - for DomUs on local disks, that's definitely a number to watch. But in your case, watching gets mostly boring; only once does that number show a significant value.
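For the interrupt question above, a quick sketch of how to see where they come from (watch is just one convenient way to spot growing counters):
Code:
# Per-CPU interrupt counters, with the owning IRQ/device in the last column
cat /proc/interrupts

# Refresh once per second to see which lines are actually increasing
watch -n 1 cat /proc/interrupts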
So looking at these numbers gives the impression of a Dom0 with too much available memory (which might help the VMs instead; I'd immediately turn it down by 2 or 3 GB). The CPUs are mostly bored, there's a bit of reading and writing to disk going on, but nothing to actually worry about. The overall i/o wait is at 8 percent (shown by the first line of output), *that* looks a bit high, but may be explainable. That high number of interrupts may be a pointer to something, too, but I'm not sure about that.
You gave the VMs 4 GB each - I suspect that that's not enough to reach optimum performance, as the VMs themselves (the Windows OS) might be forced to swap out memory during boot. This might then be the cause of those 8 percent overall i/o wait reported by vmstat - you might want to take a look at vmstat's output during start of the VMs and check the i/o wait percentage at that time... if it's going high, I'd try to throw more memory at these VMs (of course, doing a similar analysis *inside* the VMs would be a better starting point, but might prove to be difficult during the startup phase). Swapping of a DomU slows it down significantly, even worse than if the Dom0 needs to swap.
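If it does turn out the guests are memory-starved, and assuming the domains are managed through libvirt/virt-manager as your screenshots suggest, bumping a guest's memory could look roughly like this (domain name and size are placeholders; virsh values are in KiB by default, so 6291456 equals 6 GiB):
Code:
# Raise the persistent maximum and current memory for one guest, then shut it
# down and start it again for the change to take effect
virsh setmaxmem <domain> 6291456 --config
virsh setmem <domain> 6291456 --config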
Hope this helps a bit to clear up the picture
Regards,
J
From the times when today's "old school" was "new school"
If you find this post helpful and are logged into the web interface, show your appreciation and click on the star below...