Hello all,

We are running SLES 11 (SP1) for zSeries on a z9 under z/VM,
using a mixture of CKD disks (OS only) and SAN (FC, zfcp) disks for the
OCFS2 database(s).

Our OCFS2 environment is as follows:


node1:~ # rpm -qa|grep ocfs2
ocfs2-kmp-xen-1.4_2.6.32.12_0.6-4.10.14
ocfs2-tools-1.4.3-0.11.20
ocfs2-tools-debuginfo-1.4.3-0.11.20
ocfs2-kmp-pae-1.4_2.6.32.12_0.6-4.10.14
ocfs2-tools-debugsource-1.4.3-0.11.20
ocfs2console-1.4.3-0.11.20
ocfs2-tools-o2cb-1.4.3-0.11.20

The OCFS2 filesystem was created with "mkfs.ocfs2 -N 8 <shared disk
device name>", without any other specific options, i.e. with the defaults.

The mount options we are using are all defaults:

type ocfs2 (rw,_netdev,cluster_stack=pcmk)

We are trying to copy (cp) a 100GB file from a local ext3 file system
to a shared file system created on OCFS2.

Immediately after starting the copy, we see the following:


kernel BUG at /usr/src/packages/BUILD/ocfs2-1.4/default/ocfs2/heartbeat.c:68!
illegal operation: 0001 [#1] SMP
Modules linked in: iptable_filter ip_tables x_tables sr_mod cdrom af_packet ocf
Supported: Yes
CPU: 3 Not tainted 2.6.32.12-0.7-default #1
Process ocfs2_controld. (pid: 5783, task: 000000007424e438, ksp: 00000000760e38
Krnl PSW : 0704000180000000 000003c005899f0c (ocfs2_do_node_down+0xc0/0xc4 [ocf
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000038 000003c005899e4c 0000000000000001 000000007e448000
0000000000000000 000000000000001f 00000000760e3e1e 000000000000002f
000000007327d240 0000000000000001 0000000000000001 000000007e448000
000003c005854000 000003c0058f81f8 00000000760e3d70 00000000760e3d08
Krnl Code: 000003c005899efe: c0e5fffdd15f brasl %r14,3c0058541bc
000003c005899f04: a7f4ffce brc 15,3c005899ea0
000003c005899f08: a7f40001 brc 15,3c005899f0a
>000003c005899f0c: a7f40000 brc 15,3c005899f0c
000003c005899f10: a7180000 lhi %r1,0
000003c005899f14: 501020a8 st %r1,168(%r2)
000003c005899f18: a7180100 lhi %r1,256
000003c005899f1c: 40102850 sth %r1,2128(%r2)
Call Trace:
(<00000000760e3d00> 0x760e3d00)
<000003c004fa5ed8> ocfs2_control_write+0x3b0/0x4dc [ocfs2_stack_user]
<000000000020c160> vfs_write+0xac/0x1a4
<000000000020c354> SyS_write+0x58/0xb4
<0000000000117e7e> sysc_noemu+0x10/0x16
<000002000024ea50> 0x2000024ea50
Last Breaking-Event-Address:
<000003c005899f08> ocfs2_do_node_down+0xbc/0xc4 [ocfs2]
---[ end trace b0ce64c3e38cc8f4 ]---
Unable to handle kernel pointer dereference at virtual kernel address 000000060
Oops: 003b [#2] SMP
Modules linked in: iptable_filter ip_tables x_tables sr_mod cdrom af_packet ocf
Supported: Yes
CPU: 3 Tainted: G D 2.6.32.12-0.7-default #1
Process ocfs2_controld. (pid: 5783, task: 000000007424e438, ksp: 00000000760e38
Krnl PSW : 0704200180000000 000000000030aba4 (kref_put+0x3c/0x88)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3
Krnl GPRS: 000003e000000003 0000000000000001 0000000600000008 00000000002d71f0
0000000600000008 0000000000000003 0000000000000000 000000000000002f
000000007e9e7400 0000000074974d40 00000000740bc108 000000007cf2e300
000000007cca0e20 000000000047a428 00000000760e3870 00000000760e3850
Krnl Code: 000000000030ab96: e330d0000020 cg %r3,0(%r13)
000000000030ab9c: a7840022 brc 8,30abe0
000000000030aba0: a7180001 lhi %r1,1
>000000000030aba4: 58504000 l %r5,0(%r4)
000000000030aba8: 1825 lr %r2,%r5
000000000030abaa: 1b21 sr %r2,%r1
000000000030abac: ba524000 cs %r5,%r2,0(%r4)
000000000030abb0: a744fffc brc 4,30aba8
Call Trace:
(<00000000760e3898> 0x760e3898)
<00000000002da89c> apparmor_file_free_security+0x40/0x58
<000000000020cf6a> __fput+0x112/0x240
<00000000001e2690> remove_vma+0x60/0xa0
<00000000001e286c> exit_mmap+0x19c/0x2d8
<000000000013f2e2> mmput+0x62/0x150
<0000000000144e4a> exit_mm+0x196/0x1bc
<00000000001470a8> do_exit+0x12c/0x364
<0000000000105e2e> die+0x17a/0x17c
<000000000010768a> illegal_op+0x1f2/0x1f8
<0000000000117e84> sysc_return+0x0/0x8
<000003c005899f0c> ocfs2_do_node_down+0xc0/0xc4 [ocfs2]
(<00000000760e3d00> 0x760e3d00)
<000003c004fa5ed8> ocfs2_control_write+0x3b0/0x4dc [ocfs2_stack_user]
<000000000020c160> vfs_write+0xac/0x1a4
<000000000020c354> SyS_write+0x58/0xb4
<0000000000117e7e> sysc_noemu+0x10/0x16
<000002000024ea50> 0x2000024ea50
Last Breaking-Event-Address:
<00000000002da896> apparmor_file_free_security+0x3a/0x58
---[ end trace b0ce64c3e38cc8f5 ]---
Fixing recursive fault but reboot is needed!
Nov 10 10:38:25 LXCPOA ocfs2_controld[5783]: this node is not in the ocfs2_cont
Nov 10 10:38:25 LXCPOA kernel: illegal operation: 0001 [#1] SMP
Nov 10 10:38:25 LXCPOA kernel: Modules linked in: iptable_filter ip_tables x_ta
Nov 10 10:38:25 LXCPOA kernel: Supported: Yes
Nov 10 10:38:25 LXCPOA kernel: CPU: 3 Not tainted 2.6.32.12-0.7-default #1
Nov 10 10:38:25 LXCPOA kernel: Process ocfs2_controld. (pid: 5783, task: 000000
Nov 10 10:38:25 LXCPOA kernel: Krnl PSW : 0704000180000000 000003c005899f0c (oc
Nov 10 10:38:25 LXCPOA kernel: R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 A
Nov 10 10:38:25 LXCPOA kernel: Krnl GPRS: 0000000000000038 000003c005899e4c 000
Nov 10 10:38:25 LXCPOA kernel: 0000000000000000 000000000000001f 000
Nov 10 10:38:25 LXCPOA kernel: 000000007327d240 0000000000000001 000
Nov 10 10:38:25 LXCPOA kernel: 000003c005854000 000003c0058f81f8 000
Nov 10 10:38:25 LXCPOA kernel: Krnl Code: 000003c005899efe: c0e5fffdd15f br
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f04: a7f4ffce b
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f08: a7f40001
Nov 10 10:38:25 LXCPOA kernel: >000003c005899f0c: a7f40000
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f10: a7180000
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f14: 501020a8
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f18: a7180100
Nov 10 10:38:25 LXCPOA kernel: 000003c005899f1c: 40102850
Nov 10 10:38:25 LXCPOA kernel: Call Trace:
Nov 10 10:38:25 LXCPOA kernel: (<00000000760e3d00> 0x760e3d00)
Nov 10 10:38:25 LXCPOA kernel: <000003c004fa5ed8> ocfs2_control_write+0x3b0/
Nov 10 10:38:25 LXCPOA kernel: <000000000020c160> vfs_write+0xac/0x1a4
Nov 10 10:38:25 LXCPOA kernel: <000000000020c354> SyS_write+0x58/0xb4
Nov 10 10:38:25 LXCPOA kernel: <0000000000117e7e> sysc_noemu+0x10/0x16
Nov 10 10:38:25 LXCPOA kernel: <000002000024ea50> 0x2000024ea50
Nov 10 10:38:25 LXCPOA kernel: Last Breaking-Event-Address:
Nov 10 10:38:25 LXCPOA kernel: <000003c005899f08> ocfs2_do_node_down+0xbc/0x
Nov 10 10:38:26 LXCPOA kernel:
Nov 10 10:38:26 LXCPOA kernel: ---[ end trace b0ce64c3e38cc8f4 ]---
Nov 10 10:38:26 LXCPOA kernel: Oops: 003b [#2] SMP
Nov 10 10:38:26 LXCPOA kernel: Modules linked in: iptable_filter ip_tables x_ta
Nov 10 10:38:26 LXCPOA kernel: Supported: Yes
Nov 10 10:38:26 LXCPOA kernel: CPU: 3 Tainted: G D 2.6.32.12-0.7-defa
Nov 10 10:38:26 LXCPOA kernel: Process ocfs2_controld. (pid: 5783, task: 000000
Nov 10 10:38:26 LXCPOA kernel: Krnl PSW : 0704200180000000 000000000030aba4 (kr
Nov 10 10:38:26 LXCPOA kernel: R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 A
Nov 10 10:38:26 LXCPOA kernel: Krnl GPRS: 000003e000000003 0000000000000001 000
Nov 10 10:38:26 LXCPOA kernel: 0000000600000008 0000000000000003 000
Nov 10 10:38:26 LXCPOA kernel: 000000007e9e7400 0000000074974d40 000
Nov 10 10:38:26 LXCPOA kernel: 000000007cca0e20 000000000047a428 000
Nov 10 10:38:26 LXCPOA kernel: Krnl Code: 000000000030ab96: e330d0000020 cg
Nov 10 10:38:26 LXCPOA kernel: 000000000030ab9c: a7840022
Nov 10 10:38:26 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined retry
Nov 10 10:38:26 LXCPOA kernel: 000000000030aba0: a7180001
Nov 10 10:38:26 LXCPOA kernel: >000000000030aba4: 58504000
Nov 10 10:38:26 LXCPOA kernel: 000000000030aba8: 1825 lr
Nov 10 10:38:26 LXCPOA kernel: 000000000030abaa: 1b21
Nov 10 10:38:26 LXCPOA kernel: 000000000030abac: ba524000
Nov 10 10:38:26 LXCPOA kernel: 000000000030abb0: a744fffc
Nov 10 10:38:26 LXCPOA kernel: Call Trace:
Nov 10 10:38:26 LXCPOA kernel: (<00000000760e3898> 0x760e3898)
Nov 10 10:38:26 LXCPOA kernel: <00000000002da89c> apparmor_file_free_securit
Nov 10 10:38:26 LXCPOA kernel: <000000000020cf6a> __fput+0x112/0x240
Nov 10 10:38:26 LXCPOA kernel: <00000000001e2690> remove_vma+0x60/0xa0
Nov 10 10:38:26 LXCPOA kernel: <00000000001e286c> exit_mmap+0x19c/0x2d8
Nov 10 10:38:26 LXCPOA kernel: <000000000013f2e2> mmput+0x62/0x150
Nov 10 10:38:26 LXCPOA kernel: <0000000000144e4a> exit_mm+0x196/0x1bc
Nov 10 10:38:26 LXCPOA kernel: <00000000001470a8> do_exit+0x12c/0x364
Nov 10 10:38:26 LXCPOA kernel: <0000000000105e2e> die+0x17a/0x17c
Nov 10 10:38:26 LXCPOA kernel: <000000000010768a> illegal_op+0x1f2/0x1f8
Nov 10 10:38:26 LXCPOA kernel: <0000000000117e84> sysc_return+0x0/0x8
Nov 10 10:38:26 LXCPOA kernel: <000003c005899f0c> ocfs2_do_node_down+0xc0/0x
Nov 10 10:38:26 LXCPOA kernel: (<00000000760e3d00> 0x760e3d00)
Nov 10 10:38:26 LXCPOA kernel: <000003c004fa5ed8> ocfs2_control_write+0x3b0/
Nov 10 10:38:26 LXCPOA kernel: <000000000020c160> vfs_write+0xac/0x1a4
Nov 10 10:38:26 LXCPOA kernel: <000000000020c354> SyS_write+0x58/0xb4
Nov 10 10:38:26 LXCPOA kernel: <0000000000117e7e> sysc_noemu+0x10/0x16
Nov 10 10:38:26 LXCPOA kernel: <000002000024ea50> 0x2000024ea50
Nov 10 10:38:26 LXCPOA kernel: Last Breaking-Event-Address:
Nov 10 10:38:26 LXCPOA kernel: <00000000002da896> apparmor_file_free_securit
Nov 10 10:38:26 LXCPOA kernel:
Nov 10 10:38:26 LXCPOA kernel: ---[ end trace b0ce64c3e38cc8f5 ]---
Nov 10 10:38:26 LXCPOA stonith: external/sbd device OK.
Nov 10 10:38:26 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined retry
Nov 10 10:38:26 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:26 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:26 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:27 LXCPOA cluster-dlm[5740]: _send_message: cpg_mcast_joined error
Nov 10 10:38:42 LXCPOA stonith: external/sbd device OK.
Nov 10 10:38:58 LXCPOA stonith: external/sbd device OK.
Nov 10 10:39:14 LXCPOA stonith: external/sbd device OK.
Nov 10 10:39:31 LXCPOA stonith: external/sbd device OK.
Nov 10 10:39:47 LXCPOA stonith: external/sbd device OK.
Nov 10 10:40:03 LXCPOA stonith: external/sbd device OK.
Nov 10 10:40:20 LXCPOA stonith: external/sbd device OK.
Nov 10 10:40:36 LXCPOA stonith: external/sbd device OK.
Nov 10 10:40:52 LXCPOA stonith: external/sbd device OK.
Nov 10 10:41:08 LXCPOA stonith: external/sbd device OK.
Nov 10 10:41:24 LXCPOA stonith: external/sbd device OK.
Nov 10 10:41:41 LXCPOA stonith: external/sbd device OK.
Nov 10 10:41:57 LXCPOA stonith: external/sbd device OK.
Nov 10 10:42:13 LXCPOA stonith: external/sbd device OK.
Nov 10 10:42:29 LXCPOA stonith: external/sbd device OK.
Nov 10 10:42:46 LXCPOA stonith: external/sbd device OK.
Nov 10 10:43:02 LXCPOA stonith: external/sbd device OK.
Nov 10 10:43:18 LXCPOA stonith: external/sbd device OK.
Nov 10 10:43:34 LXCPOA stonith: external/sbd device OK.
Nov 10 10:43:50 LXCPOA stonith: external/sbd device OK.
Nov 10 10:44:06 LXCPOA stonith: external/sbd device OK.
Nov 10 10:44:22 LXCPOA stonith: external/sbd device OK.
Nov 10 10:44:39 LXCPOA stonith: external/sbd device OK.
Nov 10 10:44:55 LXCPOA stonith: external/sbd device OK.
Nov 10 10:45:11 LXCPOA stonith: external/sbd device OK.
Nov 10 10:45:27 LXCPOA stonith: external/sbd device OK.
Nov 10 10:45:43 LXCPOA stonith: external/sbd device OK.
Nov 10 10:46:00 LXCPOA stonith: external/sbd device OK.
Nov 10 10:46:16 LXCPOA stonith: external/sbd device OK.
Nov 10 10:46:32 LXCPOA stonith: external/sbd device OK.
Nov 10 10:46:48 LXCPOA stonith: external/sbd device OK.
Nov 10 10:47:04 LXCPOA stonith: external/sbd device OK.
Nov 10 10:47:20 LXCPOA stonith: external/sbd device OK.
Nov 10 10:47:37 LXCPOA stonith: external/sbd device OK.
Nov 10 10:47:53 LXCPOA stonith: external/sbd device OK.
Nov 10 10:48:09 LXCPOA stonith: external/sbd device OK.
Nov 10 10:48:25 LXCPOA stonith: external/sbd device OK.
Nov 10 10:48:41 LXCPOA stonith: external/sbd device OK.
Nov 10 10:48:57 LXCPOA stonith: external/sbd device OK.
Nov 10 10:50:02 LXCPOA crmd: [5683]: ERROR: process_lrm_event: LRM operation sb
Nov 10 10:51:08 LXCPOA crmd: [5683]: ERROR: process_lrm_event: LRM operation sb
Nov 10 10:51:48 LXCPOA crmd: [5683]: ERROR: process_lrm_event: LRM operation sb

Any helpful tips or ideas would be most appreciated.


J.


--
Andre_Massena
------------------------------------------------------------------------
View this thread: http://forums.novell.com/showthread.php?t=448661