So, I refreshed my home machine recently, or really just replaced it with an entirely new one... I picked up a used Dell Precision T7500 workstation (24 GB memory, 2 x Xeon W5590 processors). I also bought a used Fusion-io ioDrive 160 GB SLC flash memory device. I knew it was going to be fast, but I was surprised at just how fast such a little card is.
I'm running Fedora 21 "Workstation" on this system. The driver for this card, called "VSL", is available from fusionio.com, but you need to create an account first to access it. It also appears there is a newer version of the driver/firmware if you pay for a support contract. I used version 2.3.11 of the driver, which lists Fedora 17 as supported. The driver is written for older kernels, so I had to change it a bit to work with 3.x -- let me know if you're interested in the changes needed for newer kernels.
Anyhow, here is a quick peek at the performance numbers on this system using the fio tool...
--snip--
# fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [750.3MB/0KB/0KB /s] [192K/0/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1406: Sat Jan 17 11:00:38 2015
read : io=10240MB, bw=763767KB/s, iops=190941, runt= 13729msec
slat (usec): min=1, max=172, avg= 2.85, stdev= 2.61
clat (usec): min=199, max=3604, avg=331.24, stdev=77.36
lat (usec): min=201, max=3625, avg=334.22, stdev=77.33
clat percentiles (usec):
| 1.00th=[ 245], 5.00th=[ 253], 10.00th=[ 270], 20.00th=[ 294],
| 30.00th=[ 318], 40.00th=[ 326], 50.00th=[ 330], 60.00th=[ 330],
| 70.00th=[ 334], 80.00th=[ 350], 90.00th=[ 402], 95.00th=[ 426],
| 99.00th=[ 454], 99.50th=[ 462], 99.90th=[ 540], 99.95th=[ 2544],
| 99.99th=[ 2992]
bw (KB /s): min=673840, max=768568, per=100.00%, avg=763737.48, stdev=18102.33
lat (usec) : 250=3.48%, 500=96.39%, 750=0.04%, 1000=0.01%
lat (msec) : 2=0.03%, 4=0.06%
cpu : usr=23.24%, sys=62.81%, ctx=254638, majf=0, minf=664
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=2621440/w=0/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=10240MB, aggrb=763767KB/s, minb=763767KB/s, maxb=763767KB/s, mint=13729msec, maxt=13729msec
Disk stats (read/write):
fioa: ios=2607327/0, merge=31/0, ticks=815401/0, in_queue=815145, util=99.34%
--snip--
--snip--
# fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0KB/747.3MB/0KB /s] [0/191K/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1433: Sat Jan 17 11:01:49 2015
write: io=10240MB, bw=746955KB/s, iops=186738, runt= 14038msec
slat (usec): min=1, max=192, avg= 3.33, stdev= 2.83
clat (usec): min=192, max=3048, avg=338.28, stdev=70.32
lat (usec): min=194, max=3052, avg=341.74, stdev=70.41
clat percentiles (usec):
| 1.00th=[ 262], 5.00th=[ 282], 10.00th=[ 298], 20.00th=[ 310],
| 30.00th=[ 318], 40.00th=[ 322], 50.00th=[ 330], 60.00th=[ 334],
| 70.00th=[ 342], 80.00th=[ 366], 90.00th=[ 398], 95.00th=[ 414],
| 99.00th=[ 454], 99.50th=[ 478], 99.90th=[ 1144], 99.95th=[ 2024],
| 99.99th=[ 2800]
bw (KB /s): min=660624, max=765872, per=99.99%, avg=746907.14, stdev=25759.49
lat (usec) : 250=0.32%, 500=99.39%, 750=0.18%, 1000=0.01%
lat (msec) : 2=0.06%, 4=0.05%
cpu : usr=23.67%, sys=68.75%, ctx=110028, majf=0, minf=431
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=2621440/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=10240MB, aggrb=746955KB/s, minb=746955KB/s, maxb=746955KB/s, mint=14038msec, maxt=14038msec
Disk stats (read/write):
fioa: ios=109/2595463, merge=110/28, ticks=9/814160, in_queue=813744, util=99.39%
--snip--
So, in the first test (100% random, 100% read, 4K IOs) I'm averaging about 191K (190,941) IOPS, with the final status line showing 192K! And in the second test (100% random, 100% write, 4K IOs) it's about 187K (186,738) IOPS on average, with the final status line showing 191K! That's pretty fast for such a little package... just a single PCIe flash device.
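As a quick sanity check on those averages, the IOPS, block size, and bandwidth that fio reports should agree with each other, and the queue depth divided by the average latency should land near the IOPS figure (Little's law). Here is a minimal Python sketch of that arithmetic; the numbers are copied from the two runs above, and the function names are just illustrative, not anything from fio itself:

```python
# Sanity-check fio averages. All figures are copied from the runs above.
BLOCK_KIB = 4   # --bs=4k
IODEPTH = 64    # --iodepth=64


def bandwidth_kib_s(iops, block_kib=BLOCK_KIB):
    """Bandwidth (KiB/s) implied by an IOPS figure at a fixed block size."""
    return iops * block_kib


def iops_from_latency(avg_lat_usec, iodepth=IODEPTH):
    """Little's law: IOs in flight / average latency = IOs per second."""
    return iodepth / (avg_lat_usec / 1_000_000)


# iops and bw (KB/s) from the "read :" / "write:" lines, lat from "lat (usec)".
runs = {
    "randread": {"iops": 190941, "bw": 763767, "lat_usec": 334.22},
    "randwrite": {"iops": 186738, "bw": 746955, "lat_usec": 341.74},
}

for name, r in runs.items():
    implied_bw = bandwidth_kib_s(r["iops"])
    implied_iops = iops_from_latency(r["lat_usec"])
    print(f"{name}: implied bw {implied_bw} KiB/s vs reported {r['bw']} KB/s")
    print(f"{name}: implied IOPS {implied_iops:.0f} vs reported {r['iops']}")
```

Both checks come out within about 1% of what fio reported, which suggests the device really is keeping all 64 IOs in flight at roughly a third of a millisecond each.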
And for some sequential IO tests with a much larger IO size...
--snip--
# fio --bs=4m --direct=1 --rw=read --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [R] [92.9% done] [800.0MB/0KB/0KB /s] [200/0/0 iops] [eta 00m:01s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1452: Sat Jan 17 11:06:41 2015
read : io=10240MB, bw=819392KB/s, iops=200, runt= 12797msec
slat (usec): min=110, max=19743, avg=4959.85, stdev=8374.12
clat (msec): min=92, max=392, avg=312.69, stdev=20.35
lat (msec): min=92, max=411, avg=317.65, stdev=18.69
clat percentiles (msec):
| 1.00th=[ 212], 5.00th=[ 302], 10.00th=[ 302], 20.00th=[ 302],
| 30.00th=[ 322], 40.00th=[ 322], 50.00th=[ 322], 60.00th=[ 322],
| 70.00th=[ 322], 80.00th=[ 322], 90.00th=[ 322], 95.00th=[ 322],
| 99.00th=[ 322], 99.50th=[ 334], 99.90th=[ 392], 99.95th=[ 392],
| 99.99th=[ 392]
bw (KB /s): min=442593, max=835584, per=97.73%, avg=800802.08, stdev=80018.74
lat (msec) : 100=0.12%, 250=1.33%, 500=98.55%
cpu : usr=0.07%, sys=3.44%, ctx=1028, majf=0, minf=65543
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=2560/w=0/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=10240MB, aggrb=819392KB/s, minb=819392KB/s, maxb=819392KB/s, mint=12797msec, maxt=12797msec
Disk stats (read/write):
fioa: ios=20256/0, merge=0/0, ticks=1799659/0, in_queue=1806118, util=99.29%
--snip--
--snip--
# fio --bs=4m --direct=1 --rw=write --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0KB/767.3MB/0KB /s] [0/191/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1448: Sat Jan 17 11:06:11 2015
write: io=10240MB, bw=786157KB/s, iops=191, runt= 13338msec
slat (usec): min=124, max=20466, avg=5167.94, stdev=8529.73
clat (msec): min=99, max=412, avg=326.06, stdev=21.06
lat (msec): min=99, max=413, avg=331.23, stdev=19.41
clat percentiles (msec):
| 1.00th=[ 225], 5.00th=[ 314], 10.00th=[ 314], 20.00th=[ 314],
| 30.00th=[ 334], 40.00th=[ 334], 50.00th=[ 334], 60.00th=[ 334],
| 70.00th=[ 334], 80.00th=[ 334], 90.00th=[ 334], 95.00th=[ 334],
| 99.00th=[ 334], 99.50th=[ 351], 99.90th=[ 412], 99.95th=[ 412],
| 99.99th=[ 412]
bw (KB /s): min=407157, max=802816, per=98.28%, avg=772616.08, stdev=74921.31
lat (msec) : 100=0.12%, 250=1.17%, 500=98.71%
cpu : usr=3.31%, sys=2.05%, ctx=1139, majf=0, minf=7
IO depths : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=2560/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=10240MB, aggrb=786156KB/s, minb=786156KB/s, maxb=786156KB/s, mint=13338msec, maxt=13338msec
Disk stats (read/write):
fioa: ios=59/20405, merge=55/0, ticks=7/1888035, in_queue=1893181, util=98.52%
--snip--
So with 100% sequential, 100% read using 4M IOs we see 800 MB/sec; with the same test using writes I'm seeing 767 MB/sec. Pretty fast! I'm not sure where the bottleneck is here... I believe this card is PCIe 2.0 x4, so the bus may be the limiting factor; I'll have to look into it. Either way, the random IO performance is really where it's at, and I am very much impressed.
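On the bus question, a rough back-of-the-envelope estimate may help. The sketch below assumes the standard PCIe line rates (2.5 GT/s per lane for 1.x, 5 GT/s per lane for 2.0) with 8b/10b encoding, and it ignores packet and protocol overhead, so real usable throughput is lower than what it prints. It also assumes fio's "KB" here means 1024 bytes. Whether this card actually links at 2.0 x4 is still just my guess, so both generations are shown:

```python
# Rough theoretical PCIe bandwidth per direction, before protocol overhead.
LINE_RATE_GT_S = {"1.x": 2.5, "2.0": 5.0}   # per-lane line rate
ENCODING_EFFICIENCY = 8 / 10                # 8b/10b: 8 data bits per 10 on the wire


def pcie_mb_s(gen, lanes):
    """Theoretical link bandwidth in decimal MB/s for a PCIe generation and width."""
    gbit_s = LINE_RATE_GT_S[gen] * ENCODING_EFFICIENCY * lanes
    return gbit_s * 1000 / 8


def fio_kb_s_to_mb_s(kb_s):
    """Convert fio's KB/s (assumed to be 1024-byte units) to decimal MB/s."""
    return kb_s * 1024 / 1_000_000


# Averages copied from the sequential runs above.
measured = {"seq read": 819392, "seq write": 786157}

for gen in ("1.x", "2.0"):
    print(f"PCIe {gen} x4: ~{pcie_mb_s(gen, 4):.0f} MB/s theoretical")
for name, kb_s in measured.items():
    print(f"{name}: ~{fio_kb_s_to_mb_s(kb_s):.0f} MB/s measured")
```

Under these assumptions a 2.0 x4 link tops out around 2000 MB/s and a 1.x x4 link around 1000 MB/s, so the measured numbers would be well under the former but fairly close to the latter once overhead is counted. Checking the negotiated link speed and width (the LnkSta line in `lspci -vv`) should settle which case applies.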
Saturday, January 17, 2015
Marc,
Can you please share your patch to the driver/kernel sources, so I can build it for the RHEL 7 3.10.x kernel? Thanks
PS: please share to matorola /at/ gmail.com
Marc,
Thanks for the article on the Fusion-io card. I was hoping you'd had a chance to get one working with ESOS. I'm just about to start researching, and any help is much appreciated.
Regards
Alasdair Smith
It would actually be pretty easy to support these cards with ESOS, but it would need to be a build-time option for ESOS since the driver for these cards is proprietary. To get it working with my home machine, I had to hack up the driver that is "free" to normal users, but if you have a paid support account with them, you get the "new" driver which I believe supports recent kernel versions.
I'd be happy to add support for this into ESOS if you can provide the new driver and test with hardware for me.
--Marc
Hi Marc,
The latest driver that I could see on the SanDisk website was for Fedora 20; would that be a recent enough kernel version? I would be happy to test on hardware. Feel free to contact me directly at alidsmith /at/ gmail.com
Regards
Alasdair