Saturday, January 17, 2015

Crazy Performance From Something So Small

So, I did a refresh on my home machine recently, or really just bought an entirely new machine... I picked up a used Dell Precision T7500 workstation (24 GB memory, 2 x Xeon W5590 processors). I also bought a used Fusion-io ioDrive 160 GB SLC flash memory device. I knew it was going to be fast, but I was surprised at just how fast such a little card could be.

I'm running Fedora 21 "Workstation" on this system. The driver for this card, called "VSL", is available from fusionio.com, but you need to create an account first to access it. It also appears there is a newer version of the driver/firmware if you pay for a support contract. I used the 2.3.11 version of the driver, which lists support for Fedora 17. The driver is written for older kernels, so I had to change it a bit to work with 3.x -- let me know if you're interested in the changes needed for newer kernels.

Anyhow, here is a quick peek at the performance numbers on this system using the fio tool...

--snip--
# fio --bs=4k --direct=1 --rw=randread --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [r] [100.0% done] [750.3MB/0KB/0KB /s] [192K/0/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1406: Sat Jan 17 11:00:38 2015
  read : io=10240MB, bw=763767KB/s, iops=190941, runt= 13729msec
    slat (usec): min=1, max=172, avg= 2.85, stdev= 2.61
    clat (usec): min=199, max=3604, avg=331.24, stdev=77.36
     lat (usec): min=201, max=3625, avg=334.22, stdev=77.33
    clat percentiles (usec):
     |  1.00th=[  245],  5.00th=[  253], 10.00th=[  270], 20.00th=[  294],
     | 30.00th=[  318], 40.00th=[  326], 50.00th=[  330], 60.00th=[  330],
     | 70.00th=[  334], 80.00th=[  350], 90.00th=[  402], 95.00th=[  426],
     | 99.00th=[  454], 99.50th=[  462], 99.90th=[  540], 99.95th=[ 2544],
     | 99.99th=[ 2992]
    bw (KB  /s): min=673840, max=768568, per=100.00%, avg=763737.48, stdev=18102.33
    lat (usec) : 250=3.48%, 500=96.39%, 750=0.04%, 1000=0.01%
    lat (msec) : 2=0.03%, 4=0.06%
  cpu          : usr=23.24%, sys=62.81%, ctx=254638, majf=0, minf=664
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=2621440/w=0/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=10240MB, aggrb=763767KB/s, minb=763767KB/s, maxb=763767KB/s, mint=13729msec, maxt=13729msec

Disk stats (read/write):
  fioa: ios=2607327/0, merge=31/0, ticks=815401/0, in_queue=815145, util=99.34%
--snip--

--snip--
# fio --bs=4k --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [w] [100.0% done] [0KB/747.3MB/0KB /s] [0/191K/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1433: Sat Jan 17 11:01:49 2015
  write: io=10240MB, bw=746955KB/s, iops=186738, runt= 14038msec
    slat (usec): min=1, max=192, avg= 3.33, stdev= 2.83
    clat (usec): min=192, max=3048, avg=338.28, stdev=70.32
     lat (usec): min=194, max=3052, avg=341.74, stdev=70.41
    clat percentiles (usec):
     |  1.00th=[  262],  5.00th=[  282], 10.00th=[  298], 20.00th=[  310],
     | 30.00th=[  318], 40.00th=[  322], 50.00th=[  330], 60.00th=[  334],
     | 70.00th=[  342], 80.00th=[  366], 90.00th=[  398], 95.00th=[  414],
     | 99.00th=[  454], 99.50th=[  478], 99.90th=[ 1144], 99.95th=[ 2024],
     | 99.99th=[ 2800]
    bw (KB  /s): min=660624, max=765872, per=99.99%, avg=746907.14, stdev=25759.49
    lat (usec) : 250=0.32%, 500=99.39%, 750=0.18%, 1000=0.01%
    lat (msec) : 2=0.06%, 4=0.05%
  cpu          : usr=23.67%, sys=68.75%, ctx=110028, majf=0, minf=431
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=0/w=2621440/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=746955KB/s, minb=746955KB/s, maxb=746955KB/s, mint=14038msec, maxt=14038msec

Disk stats (read/write):
  fioa: ios=109/2595463, merge=110/28, ticks=9/814160, in_queue=813744, util=99.39%
--snip--

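So, in the first of those tests (100% random, 100% read with 4K IOs), I'm getting about 191K IOPS averaged over the run, with fio's status line showing 192K (192,000) IOPS at the end! And in the second test (100% random, 100% write with 4K IOs), it's about 187K IOPS averaged over the run, with the status line showing 191K (191,000) IOPS! That's pretty fast for such a little package... just a single PCIe flash device.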
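As a rough sanity check on those figures, the reported IOPS can be compared against two simple relationships: bandwidth is IOPS times block size, and (by Little's law) a queue that stays full at depth 64 should complete about `iodepth / average latency` IOs per second. The sketch below only uses numbers copied from the two fio runs above; the function names are made up for illustration, and Little's law is only a fair estimate here because fio reports the queue at ">=64" for 100% of the run.

```python
# Values copied from the two 4K random runs above (fio 2.1.10 output).
IODEPTH = 64    # --iodepth=64
BLOCK_KB = 4    # --bs=4k

runs = {
    "randread":  {"iops": 190941, "bw_kb_s": 763767, "avg_lat_us": 334.22},
    "randwrite": {"iops": 186738, "bw_kb_s": 746955, "avg_lat_us": 341.74},
}


def iops_from_littles_law(iodepth, avg_lat_us):
    """Outstanding IOs divided by the average total latency in seconds."""
    return iodepth / (avg_lat_us / 1_000_000)


def bandwidth_kb_s(iops, block_kb):
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * block_kb


for name, r in runs.items():
    predicted = iops_from_littles_law(IODEPTH, r["avg_lat_us"])
    implied_bw = bandwidth_kb_s(r["iops"], BLOCK_KB)
    print(f"{name}: fio reported {r['iops']} IOPS, "
          f"Little's law estimate {predicted:,.0f} IOPS")
    print(f"{name}: {r['iops']} x {BLOCK_KB}K = {implied_bw} KB/s, "
          f"fio reported {r['bw_kb_s']} KB/s")
```

Both estimates land within about half a percent of what fio reported, so the IOPS, latency, and bandwidth lines in the output are consistent with each other.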

And for some sequential IO tests with a much larger IO size...

--snip--
# fio --bs=4m --direct=1 --rw=read --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [R] [92.9% done] [800.0MB/0KB/0KB /s] [200/0/0 iops] [eta 00m:01s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1452: Sat Jan 17 11:06:41 2015
  read : io=10240MB, bw=819392KB/s, iops=200, runt= 12797msec
    slat (usec): min=110, max=19743, avg=4959.85, stdev=8374.12
    clat (msec): min=92, max=392, avg=312.69, stdev=20.35
     lat (msec): min=92, max=411, avg=317.65, stdev=18.69
    clat percentiles (msec):
     |  1.00th=[  212],  5.00th=[  302], 10.00th=[  302], 20.00th=[  302],
     | 30.00th=[  322], 40.00th=[  322], 50.00th=[  322], 60.00th=[  322],
     | 70.00th=[  322], 80.00th=[  322], 90.00th=[  322], 95.00th=[  322],
     | 99.00th=[  322], 99.50th=[  334], 99.90th=[  392], 99.95th=[  392],
     | 99.99th=[  392]
    bw (KB  /s): min=442593, max=835584, per=97.73%, avg=800802.08, stdev=80018.74
    lat (msec) : 100=0.12%, 250=1.33%, 500=98.55%
  cpu          : usr=0.07%, sys=3.44%, ctx=1028, majf=0, minf=65543
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=2560/w=0/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: io=10240MB, aggrb=819392KB/s, minb=819392KB/s, maxb=819392KB/s, mint=12797msec, maxt=12797msec

Disk stats (read/write):
  fioa: ios=20256/0, merge=0/0, ticks=1799659/0, in_queue=1806118, util=99.29%
--snip--

--snip--
# fio --bs=4m --direct=1 --rw=write --ioengine=libaio --iodepth=64 --name=/dev/fioa --size=10G
/dev/fioa: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=64
fio-2.1.10
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0KB/767.3MB/0KB /s] [0/191/0 iops] [eta 00m:00s]
/dev/fioa: (groupid=0, jobs=1): err= 0: pid=1448: Sat Jan 17 11:06:11 2015
  write: io=10240MB, bw=786157KB/s, iops=191, runt= 13338msec
    slat (usec): min=124, max=20466, avg=5167.94, stdev=8529.73
    clat (msec): min=99, max=412, avg=326.06, stdev=21.06
     lat (msec): min=99, max=413, avg=331.23, stdev=19.41
    clat percentiles (msec):
     |  1.00th=[  225],  5.00th=[  314], 10.00th=[  314], 20.00th=[  314],
     | 30.00th=[  334], 40.00th=[  334], 50.00th=[  334], 60.00th=[  334],
     | 70.00th=[  334], 80.00th=[  334], 90.00th=[  334], 95.00th=[  334],
     | 99.00th=[  334], 99.50th=[  351], 99.90th=[  412], 99.95th=[  412],
     | 99.99th=[  412]
    bw (KB  /s): min=407157, max=802816, per=98.28%, avg=772616.08, stdev=74921.31
    lat (msec) : 100=0.12%, 250=1.17%, 500=98.71%
  cpu          : usr=3.31%, sys=2.05%, ctx=1139, majf=0, minf=7
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=0.6%, 32=1.2%, >=64=97.5%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=0/w=2560/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=786156KB/s, minb=786156KB/s, maxb=786156KB/s, mint=13338msec, maxt=13338msec

Disk stats (read/write):
  fioa: ios=59/20405, merge=55/0, ticks=7/1888035, in_queue=1893181, util=98.52%
--snip--
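To compare the four runs side by side, the summary lines can be pulled out of the fio output with a small script. This is only a sketch: the regular expression matches the fio 2.1.10 summary format shown above (other fio versions print a different layout), the function and dictionary names are invented for illustration, and the conversion assumes fio's "KB" here means 1024 bytes, which matches the issued IO counts (2621440 x 4K = 10240MB).

```python
import re

# Summary lines copied verbatim from the four fio runs above.
SUMMARY_LINES = {
    "4K random read":  "  read : io=10240MB, bw=763767KB/s, iops=190941, runt= 13729msec",
    "4K random write": "  write: io=10240MB, bw=746955KB/s, iops=186738, runt= 14038msec",
    "4M seq read":     "  read : io=10240MB, bw=819392KB/s, iops=200, runt= 12797msec",
    "4M seq write":    "  write: io=10240MB, bw=786157KB/s, iops=191, runt= 13338msec",
}

# Matches the bw/iops/runt fields of a fio 2.1.x summary line.
PATTERN = re.compile(r"bw=(\d+)KB/s, iops=(\d+), runt=\s*(\d+)msec")


def parse_summary(line):
    """Return bandwidth, IOPS, and runtime from one fio summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a recognised fio summary line: {line!r}")
    bw_kb_s, iops, runt_ms = (int(group) for group in match.groups())
    return {
        "bw_kb_s": bw_kb_s,
        "bw_mb_s": bw_kb_s / 1024,   # same binary units fio uses for "MB"
        "iops": iops,
        "runt_ms": runt_ms,
    }


for label, line in SUMMARY_LINES.items():
    result = parse_summary(line)
    print(f"{label:16s} {result['iops']:>7d} IOPS  {result['bw_mb_s']:7.1f} MB/s")
```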

So with 100% sequential, 100% read using 4M IOs we see 800 MB/sec; with the same test using writes I'm seeing 767 MB/sec. Pretty fast! I'm not sure where the bottleneck is here... I believe this card is PCIe 2.0 x4, so that bus may be the crippler; not sure, I'll have to look into it. Either way, the random IO performance is really where it's at, and I am very much impressed.
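A quick back-of-the-envelope calculation helps frame the bus question. The sketch below assumes the standard per-lane signalling rates (2.5 GT/s for PCIe 1.x, 5 GT/s for PCIe 2.0) and the 8b/10b encoding both generations use; it ignores packet and protocol overhead, so real-world ceilings are somewhat lower. Whether this card and slot actually negotiate a 2.0 x4 link is not confirmed here, so both cases are shown, and the function name is just for illustration.

```python
# Per-lane signalling rate in GT/s. PCIe 1.x and 2.0 both use 8b/10b
# encoding, so only 8 of every 10 bits on the wire carry data.
GT_PER_S = {"1.x": 2.5, "2.0": 5.0}


def link_mb_per_s(gen, lanes):
    """Theoretical one-direction data rate in decimal MB/s, before packet overhead."""
    data_bits_per_s = GT_PER_S[gen] * 1e9 * 8 / 10 * lanes
    return data_bits_per_s / 8 / 1e6


# Sequential read result from fio above; fio's KB is 1024 bytes here,
# so convert to decimal MB/s to match the link figures.
measured_mb_s = 819392 * 1024 / 1e6

for gen in ("1.x", "2.0"):
    ceiling = link_mb_per_s(gen, lanes=4)
    share = 100 * measured_mb_s / ceiling
    print(f"PCIe {gen} x4: ceiling {ceiling:.0f} MB/s, "
          f"measured {measured_mb_s:.0f} MB/s ({share:.0f}% of ceiling)")
```

By this estimate the measured read rate is well under half of a 2.0 x4 link, but a large fraction of a 1.x x4 link, so the bus would only be a plausible limit if the link is running at the older speed. On Linux, `lspci -vv` shows the negotiated speed and width in the `LnkSta` field for the device.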

4 comments:

  1. Marc,

    Can you please share your patch to the driver/kernel sources, so I can build it for the RHEL 7 3.10.x kernel? Thanks

    PS: please share to matorola /at/ gmail.com

  2. Marc,

    Thanks for the article on the Fusion-io card. I was hoping you'd had a chance to get one working with ESOS. I'm just about to start researching, and any help would be much appreciated.

    Regards

    Alasdair Smith

    Replies
    1. It would actually be pretty easy to support these cards with ESOS, but it would need to be a build-time option for ESOS since the driver for these cards is proprietary. To get it working with my home machine, I had to hack up the driver that is "free" to normal users, but if you have a paid support account with them, you get the "new" driver which I believe supports recent kernel versions.

      I'd be happy to add support for this into ESOS if you can provide the new driver and test with hardware for me.


      --Marc

    2. Hi Marc,

      The latest driver that I could see on the SanDisk website was for Fedora 20. Would that be a recent enough kernel version? I would be happy to test on hardware. Feel free to contact me directly on alidsmith /at/ gmail.com

      Regards

      Alasdair
