
Exceeded max scheduling attempts 3 for instance xyz



manteshpatil
03-Jul-2019, 16:31
Hi All,

I am using a SUSE OpenStack Cloud 8 environment, where I am able to create an instance through the dashboard
but unable to create one through the API; the API call fails with the following error.

Error: Failed to perform requested operation on instance "VMde64af", the instance has an error status:
Please try again later [Error: Exceeded maximum number of retries. Exceeded max scheduling attempts 3
for instance 9b8289e2-c5d6-4003-9202-91e452e65738. Last exception:
Unexpected error while running command. Command: rbd import --pool nova //var/lib/nova/instances/_base/cfe43c0bc7cc07].


API Call: https://<SUSE Server>:8774/v2.1/1efa5bcc87d948b5a484c9e868085ac9/servers
Body: {
"server": {
"name": "test1VM02306e",
"imageRef": "035aaca4-54fe-4501-b0ac-bfe5b82db36e",
"flavorRef": "1",
"key_name": "admin-key-01",
"security_groups": [
{
"name": "default"
}
],
"metadata": {},
"block_device_mapping_v2": [],
"networks": [
{
"uuid": "bcc96c6c-26cd-4c22-8a15-5d0911f183rg"
}
],
"personality": [],
"adminPass": "",
"availability_zone": "nova"
}
}

Please help me to resolve this issue.

Thanks,
Mantesh Patil

bmwiedemann
04-Jul-2019, 09:23
for instance 9b8289e2-c5d6-4003-9202-91e452e65738. Last exception:
Unexpected error while running command. Command: rbd import --pool nova //var/lib/nova/instances/_base/cfe43c0bc7cc07].

This "rbd" (RADOS Block Device) command indicates trouble with the Ceph storage backend. Please check "ceph status".

If it says
health HEALTH_ERR

then you need to investigate there.

Otherwise, /var/log/nova/nova-scheduler.log might have more hints.
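
If you want to script that check, here is a minimal Python sketch. It assumes the "ceph" CLI is installed on the node where you run it; the function names are made up for illustration. If the command is missing or hangs, it reports "unknown" rather than guessing.

```python
import subprocess


def classify_health(status_text: str) -> str:
    """Map `ceph health` / `ceph status` output to a short label."""
    for token, label in (
        ("HEALTH_ERR", "error"),
        ("HEALTH_WARN", "warning"),
        ("HEALTH_OK", "ok"),
    ):
        if token in status_text:
            return label
    return "unknown"


def ceph_health() -> str:
    """Run `ceph health` and classify its output (assumes the ceph CLI exists)."""
    try:
        result = subprocess.run(
            ["ceph", "health"], capture_output=True, text=True, timeout=30
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # No ceph client installed, or the cluster is unreachable.
        return "unknown"
    return classify_health(result.stdout)
```

A result of "unknown" is itself a hint: either there is no Ceph client on that node or the cluster cannot be reached.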

In general, the Horizon dashboard always uses the nova API, so if one consistently works and the other does not, it is because the API calls are different. For example, it can matter whether you have the OS on volumes or on local ephemeral disks. You can try variations there to see which part matters.
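
One way to find out which part matters is to capture the request body that works and compare it field by field with the one that fails. The following is only a sketch: the function name is made up, and the "working" body in the usage lines is a hypothetical example, not what Horizon actually sends in your setup.

```python
def diff_requests(working: dict, failing: dict, prefix: str = "") -> list:
    """Return (path, working_value, failing_value) for every differing field."""
    missing = object()  # sentinel for "key not present"
    differences = []
    for key in sorted(set(working) | set(failing)):
        path = f"{prefix}.{key}" if prefix else key
        left = working.get(key, missing)
        right = failing.get(key, missing)
        if isinstance(left, dict) and isinstance(right, dict):
            # Descend into nested objects such as "server".
            differences.extend(diff_requests(left, right, path))
        elif left != right:
            differences.append((
                path,
                "<absent>" if left is missing else left,
                "<absent>" if right is missing else right,
            ))
    return differences


# Hypothetical working request (assumption: boot from volume).
working = {"server": {"flavorRef": "1",
                      "block_device_mapping_v2": [{"source_type": "image"}]}}
# Fragment of the failing API request from the first post.
failing = {"server": {"flavorRef": "1", "block_device_mapping_v2": [],
                      "adminPass": ""}}

for path, ok_value, bad_value in diff_requests(working, failing):
    print(path, ok_value, bad_value)
```

Each reported path is a candidate to change one at a time in the failing request until the behaviour flips.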

eblock
04-Jul-2019, 15:42
You should also make sure that your base images are in raw format. The rbd import in your output indicates that your base images are downloaded from glance to local storage on the compute node (to /var/lib/nova/instances/_base/) and converted to raw, and then nova tries to create a new rbd object within the pool. Also check the disk space on your compute nodes, because the disks will fill up if all the base images reside there. I would also recommend disabling image caching if Ceph is your backend; check out this post (http://heiterbiswolkig.blogs.nde.ag/2018/10/08/openstack-slow-vm-boot/) about disk_format and image caching, maybe that helps.
Are the instances created with Horizon volume-based instances, or do they have ephemeral disks?
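
To check both points on a compute node, something like the following Python sketch could help. It assumes qemu-img is installed there; the function names are made up, and the path you pass in is whatever file you find under /var/lib/nova/instances/_base/.

```python
import json
import shutil
import subprocess


def image_format(qemu_img_json: str) -> str:
    """Extract the format field from `qemu-img info --output=json` output."""
    return json.loads(qemu_img_json).get("format", "unknown")


def inspect_image(path: str) -> str:
    """Ask qemu-img for the on-disk format of a cached base image."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        capture_output=True, text=True, check=True,
    )
    return image_format(result.stdout)


def free_gib(path: str) -> float:
    """Free space in GiB on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30
```

Anything other than "raw" from inspect_image() means the file still needs a conversion, and free_gib("/var/lib/nova/instances") shows how much room is left for cached images.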

manteshpatil
15-Jul-2019, 11:38
This "rbd" (RADOS Block Device) command indicates trouble with the Ceph storage backend. Please check "ceph status".

If it says
health HEALTH_ERR


Thank you for your response.

As you suggested, I tried to get the Ceph status, but it did not return any result, as in our environment we do not use SUSE Enterprise Storage or Ceph. What I do not understand is why rbd commands are being invoked through API calls.

eblock
15-Jul-2019, 11:50
Thank you for your response.

As you suggested, I tried to get the Ceph status, but it did not return any result, as in our environment we do not use SUSE Enterprise Storage or Ceph. What I do not understand is why rbd commands are being invoked through API calls.

Are we talking about SOC 8 CLM or Crowbar here? If you deployed SOC with Crowbar, how did you deploy the nova barclamp? If there is no ceph backend it's clear that nova will fail if it's configured to use ceph.
Can you paste the output of
grep -rvE "^#|^$" /etc/nova/ | grep rbd
from one of your compute nodes and
grep -rvE "^#|^$" /etc/cinder/ | grep rbd
from a cinder node? I'm not very familiar with CLM yet, so in that case someone else might be able to help you.
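
If you prefer a script that you can run on several nodes, here is a rough Python equivalent of those grep commands. It is a sketch only: the function name is made up, and unlike the grep pipeline it matches "rbd" in the line content only, not in the file path.

```python
import os


def find_rbd_settings(config_dir: str, needle: str = "rbd") -> list:
    """Return (path, line) for non-comment, non-empty lines containing needle."""
    matches = []
    for root, _dirs, files in os.walk(config_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for line in handle:
                        text = line.rstrip("\n")
                        # Same filter as grep -vE "^#|^$".
                        if not text or text.startswith("#"):
                            continue
                        if needle in text:
                            matches.append((path, text))
            except OSError:
                # Unreadable file (permissions etc.): skip it.
                continue
    return matches


if __name__ == "__main__":
    for directory in ("/etc/nova/", "/etc/cinder/"):
        for path, text in find_rbd_settings(directory):
            print(f"{path}:{text}")
```

An empty result on both node types would suggest the rbd setting comes from somewhere else, so it would be worth saying so when you post the output.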