Without having tried it myself, and if I have understood what you have done
correctly, my guess is that it is the imbalance combined with software RAID
that is doing it.
With RAID-1 (a mirror set, I think) being done in software, the total write
time from the writer's point of view is the time taken to write the blocks
to the first device (the primary in the mirror pair); the software then
mirrors this to the second, so the whole thing is decoupled a touch.
Whereas once you start to stripe, the total write time from the writer's
point of view is the time taken to write serially across both devices, not
just one.
So in the first case you get the speed of the primary per write, whereas in
the second it is the speed of writing serially to both devices.
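If it helps, here is a toy Python sketch of that write-time argument. The
latency figures are made up purely for illustration, and the idea that a
software RAID-1 write is acknowledged once the primary has it is an
assumption of my model above, not something I have checked in the md driver.

# Toy model of the write-time argument above.
# Assumptions (not verified against md): RAID-1 acknowledges the write once
# the primary has it; striping writes serially across both devices.

NVME_WRITE_MS = 0.05  # hypothetical per-request write latency, fast device
SSD_WRITE_MS = 0.20   # hypothetical per-request write latency, slow device

# Mirror: the writer only waits for the primary; the copy to the second
# device happens behind the scenes.
raid1_write_ms = NVME_WRITE_MS

# Stripe: the writer waits for the request to be written serially across
# both devices.
striped_write_ms = NVME_WRITE_MS + SSD_WRITE_MS

print(f"mirror, primary only : {raid1_write_ms:.2f} ms per write")
print(f"serial stripe        : {striped_write_ms:.2f} ms per write")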
Of course this is a generalization and ignores the fact that under Unixes
and Linuxes of all types writes are buffered by the OS. But if you are doing
metrics, how and where you take your measurements will be affected by this.
Striping can be faster, but only where the writes/reads are queued optimally
across disks with synchronized spindles. Software RAID across unsynchronized
disks will never achieve the same performance; that is the province of
hardware RAID and clever drive electronics.
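To put some toy numbers on the read side too, the sketch below compares
"send every read to the faster device" against "split reads evenly across
both devices". Both policies and both latency figures are invented for
illustration and are not a claim about what md actually implements.

# Toy model of random reads across two imbalanced devices.
# Both balancing policies below are assumptions for illustration only.

N_READS = 10_000
NVME_READ_MS = 0.08  # hypothetical random-read latency on the NVMe
SSD_READ_MS = 0.25   # hypothetical random-read latency on the SATA SSD

# Policy A: prefer the faster device for every read.
prefer_fast_ms = N_READS * NVME_READ_MS

# Policy B: split reads evenly across both devices; they work in parallel,
# so the run finishes when the slower device finishes its half.
split_evenly_ms = max((N_READS / 2) * NVME_READ_MS,
                      (N_READS / 2) * SSD_READ_MS)

print(f"prefer fast device : {prefer_fast_ms:.0f} ms for {N_READS} reads")
print(f"split evenly       : {split_evenly_ms:.0f} ms for {N_READS} reads")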
There again, I may be completely wrong; I suspect there are factors at play
that I have not accounted for in my idealized view.
Cheers
Kirbs
On 30/05/2019 03:11, Andy Smith wrote:
Hello,
A new BitFolk server that I will put into service soon has 1x SSD
and 1x NVMe instead of 2x SSD. I tried this because the NVMe,
despite being vastly more performant than the SATA SSD, is actually
a fair bit cheaper. On the downside it only has a 3 year warranty
(vs 5) and 26% of the write endurance (5466TBW vs 21024TBW)¹.
So anyway, a pair of very imbalanced devices. I decided to take some
time to play around with RAID configurations to see how Linux MD
handled that. The results surprised me, and I still have many open
questions.
As a background, for a long time it's generally been advised that
Linux RAID-10 gives the highest random IO performance. This is
because it can stripe read IO across multiple devices, whereas with
RAID-1, a single process will do IO to a single device.
Linux's non-standard implementation of the RAID-10 algorithm can
also generalise to any number of devices: conventional RAID-10
requires an even number of devices with a minimum of 4, but Linux
RAID-10 can work with 2 or even an odd number.
More info about that:
https://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
As a result I have rarely felt the need to use RAID-1 for 10+ years.
But, I ran these benchmarks and what I found is that RAID-1 is THREE
TIMES FASTER than RAID-10 on a random read workload with these
imbalanced devices.
Here is a full write up:
http://strugglers.net/~andy/blog/2019/05/29/linux-raid-10-may-not-always-be…
I can see and replicate the results, and I can tell that it's
because RAID-1 is able to direct the vast majority of reads to the
NVMe, but I don't know why that is or if it is by design.
I also have some other open questions, for example one of my tests
against HDD is clearly wrong as it achieves 256 IOPS, which is
impossible for a 5,400RPM rotational drive.
So if you have any comments, explanations, ideas how my testing
methodology might be wrong, I would be interested in hearing.
Cheers,
Andy
¹ I do however monitor the write capacity of BitFolk's SSDs and they
all show 100+ years of expected life, so I am not really bothered
if that drops to 25 years.