
Thread: CephFS & Oracle database, 8k blocks, slow write


  1. #1

    CephFS & Oracle database, 8k blocks, slow write

    I built a Ceph cluster on SES 4 (10.2.5) with ceph-deploy:
    Supermicro servers (nx1: mon+osd, nx3: mon+osd+mds, nx4: mon+osd+mds, sm5: osd), each with a 2x10Gbit network card.
    The OSDs are plain HDDs (600GB, 1TB and 2TB capacities), no SSD, no flash.

    Code:
       cluster 35f51cdd-7a5c-4c9f-921b-8d63ed8e4da7
         health HEALTH_OK
         monmap e2: 3 mons at {nx1=10.117.28.209:6789/0,nx3=10.117.28.13:6789/0,nx4=10.117.28.14:6789/0}
                election epoch 544, quorum 0,1,2 nx3,nx4,nx1
          fsmap e126: 1/1/1 up {0=nx4=up:active}, 1 up:standby
         osdmap e9552: 96 osds: 96 up, 96 in
                flags sortbitwise,require_jewel_osds
          pgmap v1212475: 4096 pgs, 2 pools, 21535 GB data, 5384 kobjects
                43124 GB used, 67745 GB / 108 TB avail
                    4096 active+clean
    CephFS is mounted via fstab like this:
    Code:
    10.117.28.14:6789:/ /mnt/cephfs ceph  name=admin,secret=bla-lbla-bla==,noatime,_netdev    0       0
    We have a virtual server with SLES 12 SP1 and an Oracle 11.2.0.4 database.
    Everything looks fine, but the DB admins report:
    DB async wait > 1000ms

    I tested with fio at an 8k block size, and random write speed is the problem:
    Code:
    random-write: (g=0): rw=randwrite, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=1
    ...
    fio-2.13
    Starting 8 processes
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    random-write: Laying out IO file(s) (1 file(s) / 1024MB)
    Jobs: 8 (f=8): [w(8)] [6.8% done] [0KB/4304KB/0KB /s] [0/538/0 iops] [eta 01h:08m:43s]
    random-write: (groupid=0, jobs=1): err= 0: pid=468113: Mon Mar 27 10:56:33 2017
      write: io=72600KB, bw=247751B/s, iops=30, runt=300069msec
        slat (usec): min=10, max=91, avg=19.85, stdev= 6.16
        clat (msec): min=2, max=936, avg=33.04, stdev=64.11
         lat (msec): min=2, max=936, avg=33.06, stdev=64.11
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   18], 80.00th=[   28], 90.00th=[   81], 95.00th=[  169],
         | 99.00th=[  330], 99.50th=[  392], 99.90th=[  529], 99.95th=[  668],
         | 99.99th=[  938]
        bw (KB  /s): min=   16, max=  718, per=12.53%, avg=243.82, stdev=153.94
        lat (msec) : 4=0.71%, 10=36.42%, 20=36.15%, 50=13.79%, 100=4.25%
        lat (msec) : 250=6.31%, 500=2.20%, 750=0.14%, 1000=0.02%
      cpu          : usr=0.04%, sys=0.07%, ctx=9089, majf=0, minf=9
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=9075/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468114: Mon Mar 27 10:56:33 2017
      write: io=75968KB, bw=259215B/s, iops=31, runt=300102msec
        slat (usec): min=10, max=133, avg=19.86, stdev= 6.43
        clat (msec): min=1, max=1087, avg=31.58, stdev=63.13
         lat (msec): min=1, max=1087, avg=31.60, stdev=63.13
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   18], 80.00th=[   28], 90.00th=[   74], 95.00th=[  157],
         | 99.00th=[  306], 99.50th=[  388], 99.90th=[  652], 99.95th=[  717],
         | 99.99th=[ 1090]
        bw (KB  /s): min=   16, max=  759, per=13.24%, avg=257.70, stdev=163.21
        lat (msec) : 2=0.01%, 4=0.76%, 10=37.28%, 20=36.15%, 50=13.63%
        lat (msec) : 100=3.99%, 250=6.38%, 500=1.60%, 750=0.17%, 1000=0.02%
        lat (msec) : 2000=0.01%
      cpu          : usr=0.03%, sys=0.08%, ctx=9503, majf=0, minf=9
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=9496/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468115: Mon Mar 27 10:56:33 2017
      write: io=71984KB, bw=245621B/s, iops=29, runt=300103msec
        slat (usec): min=10, max=160, avg=19.81, stdev= 6.36
        clat (msec): min=3, max=961, avg=33.33, stdev=65.31
         lat (msec): min=3, max=961, avg=33.35, stdev=65.31
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   19], 80.00th=[   29], 90.00th=[   84], 95.00th=[  165],
         | 99.00th=[  330], 99.50th=[  408], 99.90th=[  619], 99.95th=[  725],
         | 99.99th=[  963]
        bw (KB  /s): min=   15, max=  720, per=12.48%, avg=242.93, stdev=156.65
        lat (msec) : 4=0.43%, 10=35.69%, 20=36.14%, 50=14.35%, 100=4.77%
        lat (msec) : 250=6.55%, 500=1.87%, 750=0.17%, 1000=0.04%
      cpu          : usr=0.04%, sys=0.07%, ctx=9008, majf=0, minf=11
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=8998/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468116: Mon Mar 27 10:56:33 2017
      write: io=76976KB, bw=262655B/s, iops=32, runt=300102msec
        slat (usec): min=10, max=98, avg=19.89, stdev= 6.36
        clat (msec): min=2, max=1002, avg=31.16, stdev=61.88
         lat (msec): min=2, max=1002, avg=31.18, stdev=61.88
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   18], 80.00th=[   27], 90.00th=[   69], 95.00th=[  159],
         | 99.00th=[  310], 99.50th=[  388], 99.90th=[  570], 99.95th=[  676],
         | 99.99th=[ 1004]
        bw (KB  /s): min=   16, max=  720, per=13.28%, avg=258.50, stdev=165.14
        lat (msec) : 4=0.50%, 10=38.10%, 20=35.17%, 50=14.11%, 100=4.62%
        lat (msec) : 250=5.55%, 500=1.78%, 750=0.14%, 1000=0.02%, 2000=0.01%
      cpu          : usr=0.04%, sys=0.07%, ctx=9626, majf=0, minf=11
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=9622/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468117: Mon Mar 27 10:56:33 2017
      write: io=74000KB, bw=252500B/s, iops=30, runt=300102msec
        slat (usec): min=10, max=87, avg=20.14, stdev= 6.60
        clat (msec): min=2, max=990, avg=32.42, stdev=64.20
         lat (msec): min=2, max=990, avg=32.44, stdev=64.20
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   18], 80.00th=[   28], 90.00th=[   76], 95.00th=[  165],
         | 99.00th=[  326], 99.50th=[  400], 99.90th=[  594], 99.95th=[  644],
         | 99.99th=[  988]
        bw (KB  /s): min=   16, max=  688, per=12.77%, avg=248.47, stdev=159.62
        lat (msec) : 4=0.59%, 10=36.74%, 20=36.21%, 50=13.94%, 100=4.17%
        lat (msec) : 250=6.34%, 500=1.77%, 750=0.23%, 1000=0.02%
      cpu          : usr=0.04%, sys=0.07%, ctx=9258, majf=0, minf=12
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=9250/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468118: Mon Mar 27 10:56:33 2017
      write: io=72232KB, bw=246495B/s, iops=30, runt=300069msec
        slat (usec): min=9, max=116, avg=20.42, stdev= 6.69
        clat (msec): min=2, max=1217, avg=33.21, stdev=63.28
         lat (msec): min=2, max=1217, avg=33.23, stdev=63.28
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   14],
         | 70.00th=[   19], 80.00th=[   31], 90.00th=[   83], 95.00th=[  169],
         | 99.00th=[  322], 99.50th=[  388], 99.90th=[  553], 99.95th=[  594],
         | 99.99th=[ 1221]
        bw (KB  /s): min=   15, max=  752, per=12.46%, avg=242.55, stdev=153.01
        lat (msec) : 4=0.64%, 10=36.21%, 20=35.22%, 50=14.54%, 100=4.56%
        lat (msec) : 250=6.82%, 500=1.86%, 750=0.13%, 2000=0.01%
      cpu          : usr=0.02%, sys=0.08%, ctx=9040, majf=0, minf=11
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=9029/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468119: Mon Mar 27 10:56:33 2017
      write: io=71344KB, bw=243437B/s, iops=29, runt=300103msec
        slat (usec): min=9, max=129, avg=19.82, stdev= 6.31
        clat (msec): min=2, max=898, avg=33.63, stdev=66.03
         lat (msec): min=2, max=898, avg=33.65, stdev=66.03
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   13],
         | 70.00th=[   18], 80.00th=[   29], 90.00th=[   82], 95.00th=[  174],
         | 99.00th=[  334], 99.50th=[  404], 99.90th=[  570], 99.95th=[  627],
         | 99.99th=[  898]
        bw (KB  /s): min=   16, max=  798, per=12.33%, avg=240.00, stdev=160.43
        lat (msec) : 4=0.58%, 10=36.82%, 20=35.71%, 50=13.58%, 100=4.54%
        lat (msec) : 250=6.31%, 500=2.20%, 750=0.24%, 1000=0.01%
      cpu          : usr=0.03%, sys=0.08%, ctx=8929, majf=0, minf=11
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=8918/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-write: (groupid=0, jobs=1): err= 0: pid=468120: Mon Mar 27 10:56:33 2017
      write: io=69112KB, bw=235822B/s, iops=28, runt=300102msec
        slat (usec): min=10, max=175, avg=20.09, stdev= 6.68
        clat (msec): min=2, max=1280, avg=34.71, stdev=68.95
         lat (msec): min=2, max=1280, avg=34.73, stdev=68.95
        clat percentiles (msec):
         |  1.00th=[    5],  5.00th=[    6], 10.00th=[    7], 20.00th=[    9],
         | 30.00th=[   10], 40.00th=[   11], 50.00th=[   12], 60.00th=[   14],
         | 70.00th=[   19], 80.00th=[   30], 90.00th=[   86], 95.00th=[  176],
         | 99.00th=[  343], 99.50th=[  420], 99.90th=[  603], 99.95th=[  685],
         | 99.99th=[ 1287]
        bw (KB  /s): min=   16, max=  752, per=11.94%, avg=232.45, stdev=152.69
        lat (msec) : 4=0.57%, 10=35.06%, 20=36.36%, 50=14.37%, 100=4.65%
        lat (msec) : 250=6.44%, 500=2.30%, 750=0.22%, 1000=0.01%, 2000=0.02%
      cpu          : usr=0.03%, sys=0.07%, ctx=8643, majf=0, minf=11
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=0/w=8639/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    
    Run status group 0 (all jobs):
      WRITE: io=584216KB, aggrb=1946KB/s, minb=230KB/s, maxb=256KB/s, mint=300069msec, maxt=300103msec
    Reads, however, look good:

    Code:
    random-read: (g=0): rw=randread, bs=8K-8K/8K-8K/8K-8K, ioengine=libaio, iodepth=1
    ...
    fio-2.13
    Starting 8 processes
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    random-read: Laying out IO file(s) (1 file(s) / 1024MB)
    Jobs: 8 (f=8): [r(8)] [100.0% done] [204.8MB/0KB/0KB /s] [26.3K/0/0 iops] [eta 00m:00s]
    random-read: (groupid=0, jobs=1): err= 0: pid=466737: Mon Mar 27 10:47:33 2017
      read : io=6633.4MB, bw=22642KB/s, iops=2830, runt=300001msec
        slat (usec): min=6, max=393, avg=13.01, stdev= 3.57
        clat (usec): min=103, max=285454, avg=338.18, stdev=1401.74
         lat (usec): min=112, max=285464, avg=351.36, stdev=1401.78
        clat percentiles (usec):
         |  1.00th=[  171],  5.00th=[  183], 10.00th=[  191], 20.00th=[  201],
         | 30.00th=[  219], 40.00th=[  258], 50.00th=[  278], 60.00th=[  298],
         | 70.00th=[  318], 80.00th=[  342], 90.00th=[  398], 95.00th=[  462],
         | 99.00th=[  852], 99.50th=[ 1896], 99.90th=[12480], 99.95th=[17280],
         | 99.99th=[62208]
        bw (KB  /s): min= 3232, max=30272, per=12.43%, avg=22633.21, stdev=4542.55
        lat (usec) : 250=37.77%, 500=58.50%, 750=2.52%, 1000=0.40%
        lat (msec) : 2=0.32%, 4=0.16%, 10=0.17%, 20=0.12%, 50=0.02%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=0.99%, sys=4.27%, ctx=849418, majf=0, minf=104
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=849071/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466738: Mon Mar 27 10:47:33 2017
      read : io=6700.5MB, bw=22871KB/s, iops=2858, runt=300001msec
        slat (usec): min=6, max=567, avg=13.00, stdev= 3.61
        clat (usec): min=107, max=234255, avg=334.65, stdev=1323.11
         lat (usec): min=116, max=234275, avg=347.83, stdev=1323.16
        clat percentiles (usec):
         |  1.00th=[  171],  5.00th=[  183], 10.00th=[  189], 20.00th=[  201],
         | 30.00th=[  219], 40.00th=[  258], 50.00th=[  278], 60.00th=[  294],
         | 70.00th=[  314], 80.00th=[  342], 90.00th=[  390], 95.00th=[  446],
         | 99.00th=[  852], 99.50th=[ 1880], 99.90th=[12608], 99.95th=[17536],
         | 99.99th=[52992]
        bw (KB  /s): min= 4192, max=29360, per=12.56%, avg=22862.49, stdev=4520.73
        lat (usec) : 250=37.87%, 500=58.99%, 750=1.94%, 1000=0.38%
        lat (msec) : 2=0.33%, 4=0.16%, 10=0.17%, 20=0.12%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%
      cpu          : usr=1.03%, sys=4.29%, ctx=858044, majf=0, minf=54
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=857656/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466739: Mon Mar 27 10:47:33 2017
      read : io=6775.7MB, bw=23127KB/s, iops=2890, runt=300001msec
        slat (usec): min=6, max=1190, avg=13.03, stdev= 3.77
        clat (usec): min=108, max=292723, avg=330.74, stdev=1446.30
         lat (usec): min=117, max=292737, avg=343.95, stdev=1446.33
        clat percentiles (usec):
         |  1.00th=[  169],  5.00th=[  181], 10.00th=[  189], 20.00th=[  199],
         | 30.00th=[  211], 40.00th=[  239], 50.00th=[  270], 60.00th=[  290],
         | 70.00th=[  310], 80.00th=[  338], 90.00th=[  390], 95.00th=[  454],
         | 99.00th=[  836], 99.50th=[ 1816], 99.90th=[12480], 99.95th=[18560],
         | 99.99th=[59136]
        bw (KB  /s): min= 5456, max=30304, per=12.70%, avg=23120.34, stdev=4770.42
        lat (usec) : 250=42.88%, 500=53.73%, 750=2.22%, 1000=0.38%
        lat (msec) : 2=0.32%, 4=0.16%, 10=0.16%, 20=0.11%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=0.98%, sys=4.40%, ctx=867625, majf=0, minf=42
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=867281/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466740: Mon Mar 27 10:47:33 2017
      read : io=6585.8MB, bw=22479KB/s, iops=2809, runt=300001msec
        slat (usec): min=6, max=529, avg=12.97, stdev= 3.69
        clat (usec): min=103, max=416495, avg=340.76, stdev=1541.99
         lat (usec): min=111, max=416509, avg=353.91, stdev=1542.02
        clat percentiles (usec):
         |  1.00th=[  171],  5.00th=[  183], 10.00th=[  189], 20.00th=[  201],
         | 30.00th=[  221], 40.00th=[  258], 50.00th=[  278], 60.00th=[  298],
         | 70.00th=[  318], 80.00th=[  346], 90.00th=[  406], 95.00th=[  482],
         | 99.00th=[  860], 99.50th=[ 1992], 99.90th=[12096], 99.95th=[16768],
         | 99.99th=[51456]
        bw (KB  /s): min= 2944, max=28368, per=12.34%, avg=22470.13, stdev=4666.43
        lat (usec) : 250=37.39%, 500=58.25%, 750=3.13%, 1000=0.40%
        lat (msec) : 2=0.33%, 4=0.16%, 10=0.18%, 20=0.12%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=1.07%, sys=4.15%, ctx=843360, majf=0, minf=39
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=842969/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466741: Mon Mar 27 10:47:33 2017
      read : io=6669.4MB, bw=22765KB/s, iops=2845, runt=300001msec
        slat (usec): min=6, max=573, avg=12.99, stdev= 3.62
        clat (usec): min=111, max=361292, avg=336.28, stdev=1537.57
         lat (usec): min=120, max=361301, avg=349.46, stdev=1537.60
        clat percentiles (usec):
         |  1.00th=[  169],  5.00th=[  183], 10.00th=[  189], 20.00th=[  199],
         | 30.00th=[  213], 40.00th=[  245], 50.00th=[  270], 60.00th=[  290],
         | 70.00th=[  314], 80.00th=[  338], 90.00th=[  394], 95.00th=[  462],
         | 99.00th=[  852], 99.50th=[ 1992], 99.90th=[12736], 99.95th=[17792],
         | 99.99th=[60160]
        bw (KB  /s): min= 3936, max=29776, per=12.50%, avg=22757.01, stdev=4739.17
        lat (usec) : 250=41.15%, 500=55.05%, 750=2.59%, 1000=0.39%
        lat (msec) : 2=0.32%, 4=0.16%, 10=0.17%, 20=0.12%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=1.07%, sys=4.22%, ctx=854048, majf=0, minf=70
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=853677/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466742: Mon Mar 27 10:47:33 2017
      read : io=6761.0MB, bw=23077KB/s, iops=2884, runt=300001msec
        slat (usec): min=6, max=405, avg=12.89, stdev= 3.52
        clat (usec): min=103, max=374092, avg=331.64, stdev=1584.80
         lat (usec): min=112, max=374105, avg=344.70, stdev=1584.84
        clat percentiles (usec):
         |  1.00th=[  169],  5.00th=[  181], 10.00th=[  187], 20.00th=[  197],
         | 30.00th=[  207], 40.00th=[  229], 50.00th=[  266], 60.00th=[  286],
         | 70.00th=[  310], 80.00th=[  338], 90.00th=[  390], 95.00th=[  462],
         | 99.00th=[  820], 99.50th=[ 1848], 99.90th=[12480], 99.95th=[17792],
         | 99.99th=[63744]
        bw (KB  /s): min= 6032, max=30640, per=12.67%, avg=23069.45, stdev=4839.61
        lat (usec) : 250=44.92%, 500=51.36%, 750=2.58%, 1000=0.37%
        lat (msec) : 2=0.29%, 4=0.16%, 10=0.17%, 20=0.11%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=1.01%, sys=4.30%, ctx=865719, majf=0, minf=21
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=865408/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466743: Mon Mar 27 10:47:33 2017
      read : io=6659.4MB, bw=22731KB/s, iops=2841, runt=300001msec
        slat (usec): min=7, max=624, avg=13.09, stdev= 3.65
        clat (usec): min=110, max=406051, avg=336.72, stdev=1549.74
         lat (usec): min=119, max=406064, avg=349.98, stdev=1549.78
        clat percentiles (usec):
         |  1.00th=[  169],  5.00th=[  183], 10.00th=[  189], 20.00th=[  199],
         | 30.00th=[  213], 40.00th=[  247], 50.00th=[  274], 60.00th=[  294],
         | 70.00th=[  314], 80.00th=[  342], 90.00th=[  394], 95.00th=[  462],
         | 99.00th=[  852], 99.50th=[ 1992], 99.90th=[12480], 99.95th=[17536],
         | 99.99th=[59648]
        bw (KB  /s): min= 2592, max=30064, per=12.48%, avg=22721.81, stdev=4705.44
        lat (usec) : 250=40.61%, 500=55.66%, 750=2.52%, 1000=0.40%
        lat (msec) : 2=0.32%, 4=0.16%, 10=0.18%, 20=0.12%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=0.99%, sys=4.31%, ctx=852733, majf=0, minf=104
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=852398/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    random-read: (groupid=0, jobs=1): err= 0: pid=466744: Mon Mar 27 10:47:33 2017
      read : io=6553.3MB, bw=22368KB/s, iops=2796, runt=300001msec
        slat (usec): min=6, max=535, avg=13.05, stdev= 3.59
        clat (usec): min=3, max=426544, avg=342.44, stdev=1597.32
         lat (usec): min=110, max=426555, avg=355.67, stdev=1597.35
        clat percentiles (usec):
         |  1.00th=[  171],  5.00th=[  183], 10.00th=[  191], 20.00th=[  201],
         | 30.00th=[  219], 40.00th=[  258], 50.00th=[  278], 60.00th=[  298],
         | 70.00th=[  318], 80.00th=[  346], 90.00th=[  402], 95.00th=[  478],
         | 99.00th=[  852], 99.50th=[ 2024], 99.90th=[12480], 99.95th=[17792],
         | 99.99th=[57088]
        bw (KB  /s): min= 2736, max=28944, per=12.29%, avg=22366.13, stdev=4610.85
        lat (usec) : 4=0.01%, 100=0.01%, 250=37.41%, 500=58.41%, 750=2.97%
        lat (usec) : 1000=0.39%
        lat (msec) : 2=0.32%, 4=0.16%, 10=0.18%, 20=0.12%, 50=0.03%
        lat (msec) : 100=0.01%, 250=0.01%, 500=0.01%
      cpu          : usr=1.05%, sys=4.17%, ctx=839148, majf=0, minf=64
      IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued    : total=r=838811/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    
    Run status group 0 (all jobs):
       READ: io=53338MB, aggrb=182059KB/s, minb=22368KB/s, maxb=23127KB/s, mint=300001msec, maxt=300001msec

  2. #2

    Re: CephFS & Oracle database, 8k blocks, slow write

    While I'm not sure it's the root cause of the issue, I would probably not
    use this kind of storage for an oft-written relational database. In order
    to guarantee the system is as reliable as claimed (by default, three
    replicas of all data in a pool), every write confirms that all replicas
    have the write committed, at least in their transaction logs, before
    letting the client move on to do something else. This means you're
    waiting on three writes, not just one, spread across multiple OSDs
    (preferably on multiple machines, if not multiple racks/datacenters),
    so that the redundancy is complete. If you have something that writes a
    lot of little things, you pay that overhead on every one of them.

    Ceph is great for workloads with a lot of big objects, especially static
    ones, because the write penalty is paid per object, not per disk block:
    one big object incurs one bit of overhead, instead of a million blocks
    making up an object each incurring their own bit of overhead.

    None of that is meant to claim that Ceph cannot do better, or cannot be
    used for a lot of little writes, but because of the redundancy guarantees
    you're not going to get the same performance as from a system tuned for
    fast writes or block-based access.
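    The effect described above can be sketched numerically (plain Python, not Ceph code; the per-OSD latencies are made-up illustrative numbers):

    ```python
    # Why replicated writes are tail-latency bound: the client is acknowledged
    # only after ALL replicas have committed the write, so the latency it sees
    # is the MAXIMUM of the per-replica commit latencies, not the average.

    def client_write_latency_ms(replica_latencies_ms):
        """Latency the client observes: the slowest replica's commit time."""
        return max(replica_latencies_ms)

    # Hypothetical commit times for one 8k write on three HDD-backed OSDs (ms):
    osd_latencies = [9.0, 11.0, 80.0]  # one busy spinner stalls the whole write

    print(client_write_latency_ms(osd_latencies[:1]))  # single copy: 9.0 ms
    print(client_write_latency_ms(osd_latencies))      # three replicas: 80.0 ms
    ```

    With spinning disks a single seeking OSD routinely adds tens of milliseconds, which is consistent with the long latency tail in the fio percentiles above.
    
    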

    --
    Good luck.


  3. #3

    Re: CephFS & Oracle database, 8k blocks, slow write

    Sébastien Han's presentation explains why out-of-the-box Ceph needs tuning for almost every particular use case. Still, the SES cluster's performance can likely be tuned to do better. The approaches I would try:

    1. Tune OS and SES parameters for small random operations. Link 1, Link 2, Link 3.

    2. Switch to the experimental BlueStore backend. That might give a ~40% IOPS gain on the same hardware. Link 1, Link 2, Link 3.
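    As a sketch of what approach 1 might touch, here is a ceph.conf fragment with FileStore/journal knobs that existed in the Jewel (SES 4) era. The values are placeholders for illustration only, not recommendations; the right numbers depend on the hardware and should come from the linked guides and testing:

    ```ini
    # Illustrative only -- Jewel-era FileStore/journal tuning knobs.
    # Values below are placeholders, not recommendations.
    [osd]
    osd op threads = 8
    filestore max sync interval = 10
    filestore queue max ops = 500
    journal queue max ops = 3000
    journal max write bytes = 1073741824
    ```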

  4. #4

    Re: CephFS & Oracle database, 8k blocks, slow write

    I created 2 pools for CephFS: ceph_data (pg_num 2048, size 2) and ceph_metadata (pg_num 2048, size 2).
    I have written ~7 TB.
    openATTIC shows this usage:
    ceph_data 25%
    ceph_metadata 0.0%

    My idea is to re-create the pools with pg_nums in a different proportion:
    ceph_data 4096
    ceph_metadata 1024

    Is it a good idea to make the metadata pool smaller?

    thanks

  5. #5

    Re: CephFS & Oracle database, 8k blocks, slow write

    Quote Originally Posted by mkov871 View Post
    ceph_data 4096
    ceph_metadata 1024
    Is it good idea to make a metadata pool smaller?
    Yes, the metadata pool usually stores only a small amount of data. I suspect it would be fine to start with 32 or 64 PGs for ceph_metadata, and to double the PG count whenever the pool's utilization gets high (or whenever 'ceph -s' suggests it). The PG number can be increased on the fly.
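    As a rough sanity check on the data pool's size, the common community rule of thumb (a guideline, not an official formula) is about 100 PGs per OSD divided by the replica size, rounded to a power of two. A sketch:

    ```python
    # Rule-of-thumb PG count: (num_osds * target_pgs_per_osd) / replica_size,
    # rounded to the nearest power of two. Community guideline, not gospel.

    def suggested_pg_count(num_osds, replica_size, target_pgs_per_osd=100):
        raw = num_osds * target_pgs_per_osd / replica_size
        power = 1
        while power * 2 <= raw:  # largest power of two not exceeding raw
            power *= 2
        # pick whichever neighbouring power of two is closer to raw
        return power if raw - power < power * 2 - raw else power * 2

    print(suggested_pg_count(96, 2))  # the 96-OSD, size=2 cluster above -> 4096
    ```

    So 4096 for ceph_data lines up with this cluster, while the metadata pool can stay far smaller. If a pool's pg_num later needs raising, it can be done on a live pool with 'ceph osd pool set &lt;pool&gt; pg_num &lt;n&gt;' (and pgp_num must be raised to match before rebalancing happens).
    
    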
