TL;DR – ashift=12 performed noticeably better than ashift=13.
I recently installed Proxmox on a new build and couldn’t find any information about the best ashift values for my new NVMe SSD drives. Since I was setting up a brand-new server anyway, I had a chance to do some quick testing. Here’s what I found.
Hardware setup
- Asus Pro WS W680-ACE IPMI (without the IPMI card installed)
- 64GB of DDR5-4800 ECC memory
- 2x 2TB Samsung 990 PRO NVMe SSDs
The SSDs are installed on the motherboard’s M.2 slots, one in the slot directly attached to the CPU and the other in a slot connected through the W680 controller.
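If you want to confirm which drive is CPU-attached and which sits behind the chipset, the PCIe topology shows it. A quick sketch (device names and addresses will differ on other boards):

lspci -nn | grep -i "non-volatile memory"   # list the NVMe controllers and their PCIe addresses
lspci -tv                                   # full PCIe tree: CPU root port vs. behind the W680 chipset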
Software setup
For each ashift value, I did a clean install of Proxmox VE 8.1.3 with both SSDs in a ZFS RAID1 (mirror) configuration. All zpool/vdev settings were left at their defaults (compression=lz4, checksum=on, copies=1, recordsize=128K, etc.); only the ashift value was changed between tests. After the installation, I ran the standard apt-get updates to get current, and then installed fio to do the testing.
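If you want to double-check what the installer actually set, the pool and dataset properties can be read back directly. A quick sketch, assuming the default Proxmox pool name rpool:

zpool get ashift rpool                                  # ashift used when the pool was created
zfs get compression,checksum,copies,recordsize rpool    # dataset-level defaults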
Test setup
The tests I ran are from Jim Salter’s Ars Technica article, which gives good detail on how to use fio to test disk performance. The three tests run back-to-back, and I ran the whole sequence 4 times. Here’s the script I used to run the tests:
echo "Test 1 - Single 4KiB random write process"
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
echo "Test 2 - 16 parallel 64KiB random write processes"
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=64k --size=256m --numjobs=16 --iodepth=16 --runtime=60 --time_based --end_fsync=1
echo "Test 3 - Single 1MiB random write process"
fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=1m --size=16g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
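If you want to automate the repeats and pull out just the write bandwidth, fio can also emit JSON and jq can extract the number. A rough sketch for Test 1 (assumes jq is installed; .jobs[0].write.bw is fio's JSON field for write bandwidth, reported in KiB/s):

for i in 1 2 3 4; do
  # Test 1, same parameters as above, but with machine-readable output
  fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1 \
    --output-format=json | jq '.jobs[0].write.bw'   # write bandwidth in KiB/s
done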
Results
ashift=12 was faster (on average) than ashift=13 for every test. For Test 1, it was 10.3% (15 MB/s) faster. For Test 2, it was a whopping 22.3% (499 MB/s) faster. And for Test 3, it was 8.5% (117 MB/s) faster. Those are bigger differences than I expected – I was surprised they weren’t lost in the noise. That made it easy to make the call and set my drives to ashift=12, which also happens to be the common wisdom for all drives today.
All values are write bandwidth in MB/s.

Test | ashift | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Average |
---- | ------ | ----------- | ----------- | ----------- | ----------- | ------- |
1 | 12 | 176 | 92.6 | 150 | 163 | 145.4 |
2 | 12 | 2844 | 1126 | 3865 | 1104 | 2234.75 |
3 | 12 | 1139 | 1320 | 1550 | 1473 | 1370.5 |
1 | 13 | 177 | 72.7 | 118 | 154 | 130.425 |
2 | 13 | 3199 | 1473 | 993 | 1280 | 1736.25 |
3 | 13 | 1030 | 1221 | 1476 | 1287 | 1253.5 |
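For reference, the percentage deltas quoted above are relative to the ashift=12 average. Test 2, for example:

(2234.75 - 1736.25) / 2234.75 ≈ 0.223, i.e. ashift=12 was 22.3% (about 499 MB/s) faster.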
Why didn’t you test ashift 9?
It looks to be faster than both ashift 12 and 13: https://feldspaten.org/2024/05/05/Performance-impact-of-different-blocksizes-on-a-Samsung-Pro-SSD-while-using-zfs/
The drives report a physical block size of 512 bytes but might be lying:
# nvme id-ns -H /dev/nvme0n1 | grep "Relative Performance"
LBA Format 0 : Metadata Size: 0 bytes - Data Size: 512 bytes - Relative Performance: 0 Best (in use)
Everything I read at the time suggested 512-byte sectors (ashift=9) weren’t worth testing, and the general guidance was essentially “if you’re going to be wrong, be wrong high”. If you run tests with it, please let me know what you find.
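If anyone does want to run that comparison, creating a throwaway pool with an explicit ashift is straightforward. A minimal sketch (the pool name and device here are placeholders, and zpool create will wipe whatever is on that disk):

zpool create -o ashift=9 testpool /dev/nvme1n1   # force 512-byte sector alignment
zpool get ashift testpool                        # verify the value took
zpool destroy testpool                           # clean up afterwards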