View Full Version : Missing o2dlm in sysfs



jhaemmer
25-Jun-2014, 17:06
Hi *

I'm currently looking into the debugging possibilities of my SLES11SP3 setup with Pacemaker and OCFS2. The problem I ran into is that the ocfs2 manpage, in its "DLM Debugging" section, mentions the path "/sys/kernel/debug/o2dlm/&lt;uuid&gt;/dlm_state" as useful, but the folder "o2dlm" is missing on my nodes.

Furthermore, when using debugfs.ocfs2, the "dlm_locks" command shows the following output (the filesystem is mounted!):



debugfs: dlm_locks <M0000000000000000ac......
Could not open debug state for "C63DB913D6......".
Perhaps that OCFS2 file system is not mounted?


To me this looks related, though I'm not sure.


Does anyone have an idea how to enable this o2dlm in sysfs?

Kind regards
Jochen

jmozdzen
25-Jun-2014, 17:27
Hi Jochen,


Hi *

I'm currently looking into the debugging possibilities of my SLES11SP3 setup with Pacemaker and OCFS2. The problem I ran into is that the ocfs2 manpage, in its "DLM Debugging" section, mentions the path "/sys/kernel/debug/o2dlm/&lt;uuid&gt;/dlm_state" as useful, but the folder "o2dlm" is missing on my nodes.

Shouldn't that be "/sys/kernel/debug/o2dlm/<domain>/dlm_state"?

But the root cause could be that you might not be using o2dlm at all - I assume you've configured the cluster glue to use Pacemaker's DLM, the typical scenario when running OCFS2 together with Pacemaker.
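As a quick check, something like the following should tell you which stack is in use (just a sketch; it assumes the stackglue file /sys/fs/ocfs2/cluster_stack exposed by the kernel's ocfs2 stack glue - please verify it exists on your nodes):

```shell
# Which cluster stack is OCFS2 glued to? "o2cb" means the classic stack
# (with o2dlm); anything else (e.g. "pcmk") means the fs/dlm stack used
# together with Pacemaker. Falls back to "unknown" if the file is absent.
STACK=$(cat /sys/fs/ocfs2/cluster_stack 2>/dev/null || echo unknown)
echo "cluster stack: $STACK"

# Only one of these debugfs trees will normally exist:
ls /sys/kernel/debug/o2dlm 2>/dev/null || true   # o2dlm domains
ls /sys/kernel/debug/dlm 2>/dev/null || true     # fs/dlm lockspaces
```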

With regards,
Jens

jhaemmer
26-Jun-2014, 08:06
Hi Jens,



But the root cause could be that you might not be using o2dlm at all - I assume you've configured the cluster glue to use Pacemaker's DLM, the typical scenario when running OCFS2 together with Pacemaker.


thanks for this hint. Somehow I didn't realize that these are separate implementations. I'll stick with my current "/sys/kernel/debug/dlm" for debugging then ;)

Are you aware of any documentation about the structure/data shown in the files of this folder and its subfolders (*_locks, *_all, *_waiters, ...)? I was able to identify some fields (pid, nodeid, lockres), but have no idea what the rest means.

thanks in advance
Jochen

jmozdzen
26-Jun-2014, 11:37
Hi Jochen,


Are you aware of any documentation about the structure/data shown in the files of this folder and its subfolders (*_locks, *_all, *_waiters, ...)? I was able to identify some fields (pid, nodeid, lockres), but have no idea what the rest means.

no, unfortunately I haven't seen any documentation yet - but I'll ask around to see if I can find any pointers. (But don't hold your breath...) Probably most of the details are only documented in "C" ;)
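One place the field meanings do turn up is the comment block in the kernel source, fs/dlm/debug_fs.c. As a sketch - and assuming the "format 2" column order from that source (id nodeid remid pid xid exflags flags sts grmode rqmode time_ms r_nodeid r_len r_name; please verify against your kernel version) - a line from a &lt;lockspace&gt;_locks file could be labeled like this:

```shell
# Hypothetical helper: print each column of a <lockspace>_locks line next
# to its (assumed) field name from fs/dlm/debug_fs.c's "format 2" comment.
label_locks_line() {
    echo "$1" | awk '{
        n = split("id nodeid remid pid xid exflags flags sts grmode rqmode time_ms r_nodeid r_len r_name", names, " ")
        for (i = 1; i <= NF && i <= n; i++) printf "%-8s %s\n", names[i], $i
    }'
}

# Example with made-up values:
label_locks_line '10203 2 30405 1234 0 0 0 2 5 -1 0 2 24 "M01"'
```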

With regards,
Jens

jhaemmer
26-Jun-2014, 14:08
Hi Jens,



.... Probably most of the details are only documented in "C" ;)


:(

That's what I feared.....
But maybe there's another way to achieve what I want.

I'm looking for a way to query the ocfs2/dlm for all current locks, including the nodes/PIDs which hold them and, if applicable, the queue of nodes waiting for the next available lock.

Where I'm currently stuck is that the locking information is spread across all nodes of the cluster (and across sources: debugfs, sysfs, ...), and it seems I have to collect it by hand. What I naively expected was one place in the debug output/debugfs that I could simply query for this information; after all, it has to be available to all nodes somewhere.
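For the time being, something along these lines is what I mean by "collecting it by hand" (just a sketch; the node names and the lockspace name are placeholders, and it assumes root ssh access between the nodes):

```shell
# Sketch: pull the fs/dlm lock dump for one lockspace from every node.
# Node names and the lockspace name are placeholders - substitute your own.
collect_locks() {
    lockspace="$1"
    shift
    for node in "$@"; do
        echo "=== $node ==="
        ssh "$node" "cat /sys/kernel/debug/dlm/${lockspace}_locks"
    done
}

# Usage on a real cluster (placeholders):
#   collect_locks MYLOCKSPACE node1 node2 node3
```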

Any hints where to start with in this case?

Regards
Jochen

David Gersic
26-Jun-2014, 15:00
On Thu, 26 Jun 2014 07:14:02 +0000, jhaemmer wrote:

> Are you aware of any documentation about the structure/data shown in the
> provided files of this folder/subfolders (*_locks, *_all, *_waiters
> ....). I was able to identify some (pid, nodeid, lockres), but have no
> idea what the rest of it means.

Several of the HA developers hang out on this mailing list:

http://lists.linux-ha.org/mailman/listinfo/linux-ha

you might ask there.


--
--------------------------------------------------------------------------
David Gersic dgersic_@_niu.edu
Knowledge Partner http://forums.netiq.com

Please post questions in the forums. No support provided via email.

jmozdzen
26-Jun-2014, 15:19
Hi Jochen,


I'm looking for a way to query the ocfs2/dlm for all current locks, including the nodes/PIDs which hold them and, if applicable, the queue of nodes waiting for the next available lock.

Where I'm currently stuck is that the locking information is spread across all nodes of the cluster (and across sources: debugfs, sysfs, ...), and it seems I have to collect it by hand. What I naively expected was one place in the debug output/debugfs that I could simply query for this information; after all, it has to be available to all nodes somewhere.

Any hints where to start with in this case?

I've received a pointer to the relevant mailing list (http://lists.linux-ha.org/mailman/listinfo/linux-ha); you might try placing your question there, since that's where the developers hang out. Another path would be to open a service request with SUSE, if you have a matching subscription. There are very HAE-knowledgeable engineers on the SUSE team who ought to be able to help with such specific questions.

With regards,
Jens

jhaemmer
27-Jun-2014, 14:29
Hi Jens,
thanks for the hint and for all your efforts. I'll see what I can find out on the HA mailing list and from SUSE support.

regards
Jochen