Calamari / Romana don't monitor SES cluster



polezhaevdmi
04-Aug-2016, 14:21
After some time experimenting, the Calamari / Romana console lost its management connection to the SES cluster. The Romana web GUI now shows the following on every management page:

New Calamari Installation
This appears to be the first time you have started Calamari and there are no clusters currently configured.
7 Ceph servers are connected to Calamari, but no Ceph cluster has been created yet. Please use ceph-deploy to create a cluster; please see the SUSE Enterprise Storage documentation for more details.

However, the Ceph cluster is in good health:

admin:~ # ceph -w
cluster f4cbac17-f67d-4a05-b0df-24b5b056143a
health HEALTH_OK
monmap e2: 3 mons at {mon01=192.168.124.21:6789/0,mon02=192.168.124.22:6789/0,mon03=192.168.124.23:6789/0}
election epoch 4176, quorum 0,1,2 mon01,mon02,mon03
fsmap e330: 1/1/1 up {0=mds01=up:active}
osdmap e197: 4 osds: 4 up, 4 in
flags sortbitwise
pgmap v319416: 208 pgs, 14 pools, 2048 MB data, 715 objects
4272 MB used, 16163 MB / 20435 MB avail
208 active+clean
client io 35261 B/s rd, 1647 B/s wr, 11 op/s rd, 3 op/s wr

The main question is: how do I bring the management console back to a working state?

Recipe #1: reboot everything. No effect.

Recipe #2: reinitialize the Calamari databases. No effect.

calamari-ctl clear --yes-i-am-sure
calamari-ctl initialize

Recipe #3: uninstall Calamari / Romana and remove their data folders as described in the tip in the manual, then reinstall both and reconnect the cluster nodes. No effect.
https://www.suse.com/documentation/ses-3/singlehtml/book_storage_admin/book_storage_admin.html#ceph.install.calamari

Recipe #4: reinstall the Calamari server from scratch. Left as a last resort.

So the related question is: how do I troubleshoot the Calamari management?

Judging by this video, the task does not look simple...
https://www.youtube.com/watch?v=3ppxtwuKwwI

polezhaevdmi
04-Aug-2016, 14:34
Some investigation, following the steps shown in the video...


admin:~ # salt-key -L
Accepted Keys:
mon01.ses01.lab.local
mon02.ses01.lab.local
mon03.ses01.lab.local
osd01.ses01.lab.local
osd02.ses01.lab.local
osd03.ses01.lab.local
osd04.ses01.lab.local
Denied Keys:
Unaccepted Keys:
Rejected Keys:
The cluster still looks fine, but does the management server require a separate key?


admin:~ # salt '*' ceph.get_heartbeats
osd01.ses01.lab.local:
Minion did not return. [No response]
osd02.ses01.lab.local:
Minion did not return. [No response]
mon01.ses01.lab.local:
Minion did not return. [No response]
osd03.ses01.lab.local:
Minion did not return. [No response]
mon02.ses01.lab.local:
Minion did not return. [No response]
osd04.ses01.lab.local:
Minion did not return. [No response]
mon03.ses01.lab.local:
Minion did not return. [No response]
The minions are installed but do not respond... Why?

polezhaevdmi
05-Aug-2016, 12:38
The minions actually do respond, but the Ceph part of the counters is not accessible. It looks like the problem is in Salt on the cluster nodes...

admin:~ # salt '*' test.ping
osd01.ses01.lab.local:
True
osd04.ses01.lab.local:
True
osd03.ses01.lab.local:
True
mon03.ses01.lab.local:
True
mon02.ses01.lab.local:
True
osd02.ses01.lab.local:
True
mon01.ses01.lab.local:
True
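
To narrow this down, the two outputs can be compared mechanically. The sketch below is only an illustration: the function names `ids_with_result` and `module_problem_minions` are made up here, and it assumes the plain-text Salt output format shown in this thread (a minion ID line ending in `:` followed by its result line). It lists the minions that answer `test.ping` but return nothing for `ceph.get_heartbeats`, which would suggest that the Salt transport works and only the Ceph-specific call fails.

```shell
# ids_with_result PATTERN
# Read plain-text salt output on stdin and print, sorted, the IDs of the
# minions whose result line matches PATTERN.
ids_with_result() {
    awk -v pat="$1" '
        /^[^ \t].*:$/        { id = $0; sub(/:$/, "", id); next }  # minion ID line
        id != "" && $0 ~ pat { print id; id = "" }                 # matching result line
    ' | sort
}

# module_problem_minions PING_FILE HEARTBEAT_FILE
# Print minions that returned True for test.ping but did not return
# for ceph.get_heartbeats.
module_problem_minions() {
    ping_ok=$(mktemp)
    hb_missing=$(mktemp)
    ids_with_result 'True' < "$1" > "$ping_ok"
    ids_with_result 'Minion did not return' < "$2" > "$hb_missing"
    comm -12 "$ping_ok" "$hb_missing"   # IDs present in both lists
    rm -f "$ping_ok" "$hb_missing"
}

# Usage on the Salt master (the file names are arbitrary):
#   salt '*' test.ping           > ping.txt
#   salt '*' ceph.get_heartbeats > heartbeats.txt
#   module_problem_minions ping.txt heartbeats.txt
```

With the outputs posted above, every one of the seven minions would end up in that list.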

polezhaevdmi
05-Aug-2016, 16:12
I tried reinstalling not only Calamari + Romana + Salt, but Apache + PostgreSQL as well, and also removed all documented data folders. I rebooted, installed Romana (and so on), rebooted again, initialized Calamari (without yet connecting the Ceph nodes to management) and got:

New Calamari Installation
This appears to be the first time you have started Calamari and there are no clusters currently configured.
I have detected 5 host(s) requesting registration.
Hosts Requesting to Be Managed By Calamari
1.mon01.ses01.lab.local
2.mon03.ses01.lab.local
3.osd02.ses01.lab.local
4.osd03.ses01.lab.local
5.osd04.ses01.lab.local
ADD
I tried to add the nodes and got the same result as before (no cluster in Romana).
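
One detail in the list above: only 5 hosts requested registration, while `salt-key -L` earlier showed 7 accepted keys. A rough way to see which hosts are absent is sketched below. The function name `missing_from_registration` and both input files are hypothetical; the sketch assumes one file holds the accepted minion IDs, one per line, and the other holds the GUI list copied as-is, with its leading "N." numbering.

```shell
# missing_from_registration ACCEPTED_FILE REQUESTING_FILE
# ACCEPTED_FILE:   one minion ID per line (the "Accepted Keys" entries)
# REQUESTING_FILE: lines copied from the GUI, e.g. "1.mon01.ses01.lab.local"
# Prints the accepted minions that are absent from the registration list.
missing_from_registration() {
    accepted=$(mktemp)
    requesting=$(mktemp)
    sort "$1" > "$accepted"
    # Strip the leading "N." list numbering (and any space after it)
    sed 's/^[0-9][0-9]*\.[ ]*//' "$2" | sort > "$requesting"
    comm -23 "$accepted" "$requesting"   # lines only in the accepted list
    rm -f "$accepted" "$requesting"
}

# Usage (the file names are arbitrary):
#   missing_from_registration accepted.txt requesting.txt
```

For the two lists in this thread it prints mon02.ses01.lab.local and osd01.ses01.lab.local, the two nodes that did not show up in the registration request.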

Conclusion: the SES 3 documentation does not list all the places that contain Calamari databases. In case of a severe management inconsistency, the best recipe is to install the OS and software from scratch and not lose your time with that flotsam.