Wednesday, May 25, 2011

LSI MegaRAID & SATA SSDs

So, continuing from my last post: we had such great success with our first stab at an SSD/FC disk array that we wanted more. This time we plan to use these arrays not just for replica datastores, but also for OS / persistent-data volumes.

We ordered three new 2U SuperMicro systems (configured and built by New Tech Solutions); one is for development, and the other two are for production.

I will detail the specs on these machines in my next article, but for this post, our development system looks something like this:
8 GB RAM usable (12 GB installed, with memory sparing enabled); 24 logical CPUs (2 x Intel Xeon E5645, 6 cores each); vanilla Linux 2.6.36.2 kernel; LSI MegaRAID SAS 9280-24i4e (FW: 2.120.43-1223); (3) Crucial CTFDDAC256MAG SSDs -> RAID5

I wanted to look at "raw" performance numbers from the MegaRAID adapter with these SSDs, and at how the different attributes of a RAID5 volume (strip size, read cache, write cache, etc.) affect them. We also purchased the FastPath license for these systems, which supposedly promises better IOPS performance. I tested with the fio tool, using a 4K IO size and either random-read or random-write workloads.
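All of the tests below are really just the same two fio invocations repeated after each configuration change. A small wrapper along these lines makes that easy to do (just a sketch; the device path and 60-second runtime are the same ones used in the tests below):

#!/bin/bash
# run-4k-tests.sh -- hypothetical helper: runs the same 4K random-read and
# random-write fio jobs used in this post against the given block device.
DEV=${1:-/dev/sdb}
for RW in randread randwrite; do
    echo "=== ${RW} on ${DEV} ==="
    fio --bs=4k --direct=1 --rw=${RW} --ioengine=libaio \
        --iodepth=64 --runtime=60 --name=${DEV}
done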


Initially RAID5, 64KB strip size (the adapter default), no read-ahead, write-through (no write cache):
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLDAdd -R5[245:1,245:2,245:3] WT NORA -a0

Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [340M/0K /s] [85K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=14573
read : io=19,539MB, bw=326MB/s, iops=83,364, runt= 60001msec
slat (usec): min=3, max=140, avg= 5.04, stdev= 5.95
clat (usec): min=301, max=8,230, avg=761.32, stdev=147.70
bw (KB/s) : min=298136, max=346048, per=100.00%, avg=333468.71, stdev=10766.04
cpu : usr=16.11%, sys=51.23%, ctx=198046, majf=0, minf=4326
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=5001929/0, short=0/0
lat (usec): 500=5.22%, 750=37.49%, 1000=54.79%
lat (msec): 2=2.49%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=19,539MB, aggrb=326MB/s, minb=333MB/s, maxb=333MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=4993426/0, merge=0/0, ticks=3016385/0, in_queue=3015085, util=99.50%

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/55M /s] [0/14K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=14578
write: io=3,213MB, bw=54,826KB/s, iops=13,706, runt= 60003msec
slat (usec): min=3, max=54, avg= 6.23, stdev= 3.77
clat (msec): min=1, max=14, avg= 4.66, stdev= 1.18
bw (KB/s) : min=53664, max=55848, per=100.04%, avg=54846.86, stdev=443.66
cpu : usr=4.26%, sys=10.83%, ctx=143019, majf=0, minf=3893
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/822432, short=0/0

lat (msec): 2=0.14%, 4=32.40%, 10=67.46%, 20=0.01%

Run status group 0 (all jobs):
WRITE: io=3,213MB, aggrb=54,826KB/s, minb=56,141KB/s, maxb=56,141KB/s, mint=60003msec, maxt=60003msec

Disk stats (read/write):
sdb: ios=2/820949, merge=0/0, ticks=0/3788757, in_queue=3788666, util=99.83%


Turn read-ahead on:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp RA -L1 -a0

Set Read Policy to ReadAhead on Adapter 0, VD 1 (target id: 1) success

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [274M/0K /s] [68K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=14598
read : io=15,889MB, bw=265MB/s, iops=67,792, runt= 60001msec
slat (usec): min=3, max=472, avg= 4.99, stdev= 7.50
clat (usec): min=305, max=8,823, avg=937.73, stdev=166.98
bw (KB/s) : min=252448, max=275832, per=100.01%, avg=271188.24, stdev=3913.90
cpu : usr=12.16%, sys=39.26%, ctx=163051, majf=0, minf=4288
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=4067634/0, short=0/0
lat (usec): 500=2.68%, 750=11.75%, 1000=42.42%
lat (msec): 2=43.14%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=15,889MB, aggrb=265MB/s, minb=271MB/s, maxb=271MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=4060446/0, merge=0/0, ticks=2946651/0, in_queue=2945570, util=96.54%


Turn adaptive read-ahead on:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp ADRA -L1 -a0

Set Read Policy to Adaptive ReadAhead on Adapter 0, VD 1 (target id: 1) success

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [274M/0K /s] [69K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=14601
read : io=15,938MB, bw=266MB/s, iops=67,999, runt= 60001msec
slat (usec): min=3, max=155, avg= 4.95, stdev= 7.43
clat (usec): min=197, max=9,330, avg=934.95, stdev=166.96
bw (KB/s) : min=254872, max=275872, per=100.01%, avg=272026.69, stdev=3426.24
cpu : usr=11.89%, sys=39.05%, ctx=165357, majf=0, minf=4268
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=4080027/0, short=0/0
lat (usec): 250=0.01%, 500=2.73%, 750=11.65%, 1000=43.68%
lat (msec): 2=41.93%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=15,938MB, aggrb=266MB/s, minb=272MB/s, maxb=272MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=4072842/0, merge=0/0, ticks=2951169/0, in_queue=2950138, util=96.58%


Turn write-back cache on:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -L1 -a0

Set Write Policy to WriteBack on Adapter 0, VD 1 (target id: 1) success

Exit Code: 0x00

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/46M /s] [0/12K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=14612
write: io=2,722MB, bw=46,451KB/s, iops=11,612, runt= 60005msec
slat (usec): min=3, max=97, avg= 6.94, stdev= 4.81
clat (usec): min=319, max=121K, avg=5502.21, stdev=2138.39
bw (KB/s) : min=34994, max=74472, per=100.06%, avg=46477.89, stdev=2958.57
cpu : usr=2.84%, sys=9.90%, ctx=94985, majf=0, minf=3830
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/696820, short=0/0
lat (usec): 500=0.01%, 750=0.04%, 1000=0.23%
lat (msec): 2=2.27%, 4=20.39%, 10=76.54%, 20=0.52%, 250=0.01%

Run status group 0 (all jobs):
WRITE: io=2,722MB, aggrb=46,450KB/s, minb=47,565KB/s, maxb=47,565KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
sdb: ios=6/695544, merge=0/0, ticks=1/3793599, in_queue=3793665, util=99.83%


Now enabling FastPath:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -ELF -Applykey key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX -a0

Successfully applied the Activation key. Please restart the system for the changes to take effect.

FW error description:
To complete the requested operation, please reboot the system.

Exit Code: 0x59

Reboot... FastPath enabled, NORA, WT:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -ELF -ControllerFeatures -a0

Activated Advanced Software Options
---------------------------

Advanced Software Option : MegaRAID FastPath
Time Remaining : Unlimited

Advanced Software Option : MegaRAID RAID6
Time Remaining : Unlimited

Advanced Software Option : MegaRAID RAID5
Time Remaining : Unlimited


Re-host Information
--------------------

Needs Re-hosting : No

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [344M/0K /s] [86K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4001
read : io=19,689MB, bw=328MB/s, iops=84,005, runt= 60001msec
slat (usec): min=3, max=147, avg= 4.99, stdev= 5.82
clat (usec): min=306, max=7,969, avg=755.55, stdev=144.88
bw (KB/s) : min=292720, max=350320, per=100.00%, avg=336026.69, stdev=11191.43
cpu : usr=16.16%, sys=50.87%, ctx=204438, majf=0, minf=4366
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=5040440/0, short=0/0
lat (usec): 500=5.11%, 750=39.35%, 1000=53.38%
lat (msec): 2=2.16%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=19,689MB, aggrb=328MB/s, minb=336MB/s, maxb=336MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=5031387/0, merge=0/0, ticks=3048876/0, in_queue=3047559, util=99.53%

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/62M /s] [0/16K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4007
write: io=3,657MB, bw=62,412KB/s, iops=15,602, runt= 60005msec
slat (usec): min=3, max=59, avg= 6.34, stdev= 3.15
clat (usec): min=758, max=13,638, avg=4093.62, stdev=1405.59
bw (KB/s) : min=58856, max=64280, per=100.03%, avg=62427.56, stdev=832.47
cpu : usr=5.45%, sys=12.53%, ctx=205523, majf=0, minf=3926
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/936258, short=0/0
lat (usec): 1000=0.01%
lat (msec): 2=3.66%, 4=47.02%, 10=49.30%, 20=0.01%

Run status group 0 (all jobs):
WRITE: io=3,657MB, aggrb=62,411KB/s, minb=63,909KB/s, maxb=63,909KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
sdb: ios=5/934504, merge=0/0, ticks=0/3788023, in_queue=3787869, util=99.82%


Now with a RAID5 8KB strip size, NORA, WT, FastPath:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0

Adapter 0: Deleted Virtual Drive-1(target id-1)

Exit Code: 0x00

apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLDAdd -R5[245:1,245:2,245:3] WT NORA -strpsz8 -a0

Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [340M/0K /s] [85K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4083
read : io=19,528MB, bw=325MB/s, iops=83,318, runt= 60001msec
slat (usec): min=3, max=176, avg= 5.06, stdev= 5.98
clat (usec): min=304, max=8,223, avg=761.79, stdev=149.12
bw (KB/s) : min=282992, max=345560, per=100.00%, avg=333283.97, stdev=11634.94
cpu : usr=15.29%, sys=51.53%, ctx=199083, majf=0, minf=4357
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=4999217/0, short=0/0
lat (usec): 500=5.26%, 750=37.42%, 1000=54.64%
lat (msec): 2=2.67%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=19,528MB, aggrb=325MB/s, minb=333MB/s, maxb=333MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=4990350/0, merge=0/0, ticks=3019107/0, in_queue=3017794, util=99.43%

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/63M /s] [0/16K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4088
write: io=3,664MB, bw=62,521KB/s, iops=15,630, runt= 60005msec
slat (usec): min=3, max=55, avg= 6.41, stdev= 3.16
clat (usec): min=680, max=12,283, avg=4086.43, stdev=1412.30
bw (KB/s) : min=60384, max=64584, per=100.04%, avg=62548.03, stdev=787.59
cpu : usr=5.25%, sys=12.94%, ctx=207933, majf=0, minf=3921
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/937890, short=0/0
lat (usec): 750=0.01%, 1000=0.01%
lat (msec): 2=4.11%, 4=46.66%, 10=49.22%, 20=0.01%

Run status group 0 (all jobs):
WRITE: io=3,664MB, aggrb=62,520KB/s, minb=64,021KB/s, maxb=64,021KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
sdb: ios=6/936182, merge=0/0, ticks=1/3788157, in_queue=3787968, util=99.83%


Now with a RAID5 512KB strip size, NORA, WT, FastPath:
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0

Adapter 0: Deleted Virtual Drive-1(target id-1)

Exit Code: 0x00

apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLDAdd -R5[245:1,245:2,245:3] WT NORA -strpsz512 -a0

Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [342M/0K /s] [86K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4128
read : io=19,622MB, bw=327MB/s, iops=83,720, runt= 60001msec
slat (usec): min=3, max=144, avg= 5.11, stdev= 5.90
clat (usec): min=309, max=8,598, avg=758.01, stdev=144.97
bw (KB/s) : min=305024, max=344864, per=99.99%, avg=334864.07, stdev=11241.95
cpu : usr=16.00%, sys=52.34%, ctx=199592, majf=0, minf=4302
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=5023310/0, short=0/0
lat (usec): 500=4.85%, 750=39.31%, 1000=53.38%
lat (msec): 2=2.45%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=19,622MB, aggrb=327MB/s, minb=335MB/s, maxb=335MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=5014359/0, merge=0/0, ticks=3055952/0, in_queue=3054696, util=99.57%

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/62M /s] [0/15K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4121
write: io=3,630MB, bw=61,948KB/s, iops=15,486, runt= 60004msec
slat (usec): min=3, max=41, avg= 6.24, stdev= 2.96
clat (usec): min=755, max=13,862, avg=4124.50, stdev=1337.97
bw (KB/s) : min=60056, max=64792, per=100.05%, avg=61978.19, stdev=876.92
cpu : usr=4.81%, sys=12.06%, ctx=195535, majf=0, minf=3910
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/929279, short=0/0
lat (usec): 1000=0.01%
lat (msec): 2=2.60%, 4=47.48%, 10=49.91%, 20=0.01%

Run status group 0 (all jobs):
WRITE: io=3,630MB, aggrb=61,947KB/s, minb=63,434KB/s, maxb=63,434KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
sdb: ios=6/927586, merge=0/0, ticks=0/3792792, in_queue=3792684, util=99.83%


Now with a RAID5 64KB strip size, NORA, WT, FastPath, and the "Cached" I/O option set (instead of the default "Direct" mode -- I'm not sure exactly what this means):
apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdDel -L1 -a0

Adapter 0: Deleted Virtual Drive-1(target id-1)

Exit Code: 0x00

apricot ~ # /opt/MegaRAID/MegaCli/MegaCli64 -CfgLDAdd -R5[245:1,245:2,245:3] WT NORA Cached -a0

Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00
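
For what it's worth, LSI's documentation describes Direct I/O as passing reads to the host without buffering them in the controller's cache memory, while Cached I/O routes all reads through the cache first. The policy can also be flipped on an existing VD with -LDSetProp instead of deleting and recreating it -- something like the following should work (a sketch; not verified on this exact firmware), and -LDInfo reports the policies currently in effect:

/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Cached -L1 -a0    # switch VD 1 to Cached I/O
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -L1 -a0    # back to the default Direct I/O
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -L1 -a0              # show the current cache/IO policies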

Random read test:
apricot ~ # fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [234M/0K /s] [59K/0 iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4153
read : io=13,625MB, bw=227MB/s, iops=58,131, runt= 60001msec
slat (usec): min=3, max=133, avg= 4.78, stdev= 4.93
clat (usec): min=323, max=8,419, avg=1094.91, stdev=160.15
bw (KB/s) : min=216768, max=237352, per=100.00%, avg=232534.52, stdev=3063.09
cpu : usr=12.05%, sys=35.69%, ctx=200778, majf=0, minf=4351
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=3487946/0, short=0/0
lat (usec): 500=0.01%, 750=2.97%, 1000=21.33%
lat (msec): 2=75.68%, 4=0.01%, 10=0.01%

Run status group 0 (all jobs):
READ: io=13,625MB, aggrb=227MB/s, minb=233MB/s, maxb=233MB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
sdb: ios=3481717/0, merge=0/0, ticks=3351413/0, in_queue=3350535, util=99.55%

Random write test:
apricot ~ # fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb
/dev/sdb: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0K/63M /s] [0/16K iops] [eta 00m:00s]
/dev/sdb: (groupid=0, jobs=1): err= 0: pid=4158
write: io=3,650MB, bw=62,295KB/s, iops=15,573, runt= 60004msec
slat (usec): min=4, max=56, avg= 6.43, stdev= 3.07
clat (usec): min=691, max=12,874, avg=4101.26, stdev=1420.81
bw (KB/s) : min=60544, max=64520, per=100.03%, avg=62313.63, stdev=734.21
cpu : usr=5.27%, sys=12.74%, ctx=206551, majf=0, minf=3936
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued r/w: total=0/934486, short=0/0
lat (usec): 750=0.01%, 1000=0.01%
lat (msec): 2=3.90%, 4=46.66%, 10=49.42%, 20=0.01%

Run status group 0 (all jobs):
WRITE: io=3,650MB, aggrb=62,294KB/s, minb=63,789KB/s, maxb=63,789KB/s, mint=60004msec, maxt=60004msec

Disk stats (read/write):
sdb: ios=8/932805, merge=0/0, ticks=2/3786854, in_queue=3786689, util=99.83%


Conclusion
Here is a table summarizing the results (all configurations are (3) SATA SSDs in RAID5):

Setup                                   | Random Read (4K IOPS) | Random Write (4K IOPS)
----------------------------------------+-----------------------+-----------------------
WT, NORA, 64K Strip, Direct             | 85K                   | 14K
WT, RA, 64K Strip, Direct               | 68K                   | 14K
WT, ADRA, 64K Strip, Direct             | 69K                   | 14K
WB, NORA, 64K Strip, Direct             | 85K                   | 12K
WT, NORA, 64K Strip, Direct, FastPath   | 86K                   | 16K
WT, NORA, 8K Strip, Direct, FastPath    | 85K                   | 16K
WT, NORA, 512K Strip, Direct, FastPath  | 86K                   | 15K
WT, NORA, 64K Strip, Cached, FastPath   | 59K                   | 16K


So, from the numbers above we can definitely see that, for this random 4K workload, not using the read cache (read-ahead or adaptive read-ahead) or the write cache (write-back) is better. FastPath didn't seem to make much of a difference -- maybe a thousand or two IOPS? If I tested multiple times and averaged, it would probably come out about the same anyway.

Strip size doesn't seem to have much effect on performance either -- although with an IO size other than 4K it might.
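Checking that would just be a matter of changing the block size in the same fio commands used above and rerunning them against each strip size, e.g. something like:

fio --bs=64k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --runtime=60 --name=/dev/sdb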

Look for another article soon on using these new 24-slot systems with SCST...

2 comments:

  1. Both of the read-ahead modes cause the controller to read a bit more than it otherwise would whenever it seeks somewhere. If your workload is completely random, this will always degrade performance. If the workload is mostly sequential, it should improve it, sometimes quite significantly.

    If your workload is completely random all the time, these tests are fair. But a lot of use-cases involve periodic longer sequential reads, and those should benefit from read-ahead. You might want to try this out using a tool that does large sequential read/write tests, such as bonnie++, if you want to capture that aspect of things.

    For SSD, I don't think this really matters very much though, and you may not see any improvement even for them by tweaking read-ahead up. I wanted to clarify the expected situation on regular drives though.

  2. Great data points. I have a feeling that the lower stripe sizes make even more of a difference with a higher drive count -- do you agree? I am thinking of using this adapter with 6x460GB Intel 520s, and this makes me feel more at ease that it will work out.
