Re: [bitfolk] Interesting (?) Linux RAID-10 performance caveat

Author: admins
Date:  
To: users
Subject: Re: [bitfolk] Interesting (?) Linux RAID-10 performance caveat
Without having tried it myself, and assuming I have understood what you
have done correctly, I would guess it is the imbalance between the
devices combined with software RAID that is doing it.

With RAID 1 (a mirror set, I think) being done in software, the total
write time from the writer's point of view is the time taken to write
the blocks to the first device (the primary) in the mirror pair; the
software then mirrors this to the second device, so the whole thing is
decoupled a touch.

Whereas once you start to stripe, the total write time from the
writer's point of view is the time taken to write serially across both
devices, not just one.

So in the first case you get the speed of the primary per write,
whereas in the second it is the speed of serially writing to both
devices.
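
If the goal is to keep reads on the faster half of a RAID-1 pair, one
knob that might be worth a look is mdadm's write-mostly flag, which
tells MD's RAID-1 read balancer to avoid a device for reads. Only a
sketch, with purely hypothetical device names (NVMe on /dev/nvme0n1p2,
SATA SSD on /dev/sda2):

    # Mark the SATA SSD write-mostly so the RAID-1 read balancer
    # prefers the NVMe for reads.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
        /dev/nvme0n1p2 --write-mostly /dev/sda2

    # Or toggle it on an existing array via sysfs:
    echo writemostly > /sys/block/md0/md/dev-sda2/state

Whether that would reproduce the difference you are seeing from plain
RAID-1 I could not say, but it makes the read bias explicit rather than
leaving it to the balancer's heuristics.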

Of course this is a generalization and ignores the fact that under
Unix/Linux systems of all types writes are buffered by the OS. But
depending on how you take your metrics, your measurements will be
affected by this.
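
On the buffering point: if the benchmark tool allows it (I am assuming
something fio-like here, I do not know what you actually used), direct
I/O takes the page cache out of the picture so the numbers come from
the array rather than from RAM. A rough sketch only:

    # Hypothetical fio job; direct=1 bypasses the page cache so the
    # random read figures reflect the MD array, not buffered data.
    fio --name=randread --filename=/dev/md0 --direct=1 \
        --ioengine=libaio --rw=randread --bs=4k --iodepth=32 \
        --runtime=60 --time_based --group_reporting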

Striping can be faster, but only where the writes/reads are queued
optimally across disks with synchronized spindles. Software RAID
across unsynchronized disks will never achieve the same performance;
that is the province of hardware RAID and clever drive electronics.
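
For what it is worth, if you do want to experiment further, Linux MD's
RAID-10 layout (near/far/offset) changes how reads are spread across
the pair, which may matter more than usual with devices this
imbalanced. Again only a sketch with hypothetical device names:

    # Two-device RAID-10: far layout (f2) lets reads be striped across
    # both devices, near layout (n2) behaves much more like a mirror.
    mdadm --create /dev/md1 --level=10 --layout=f2 --raid-devices=2 \
        /dev/nvme0n1p3 /dev/sda3

    # Watch how reads are actually distributed while a test runs:
    iostat -x nvme0n1 sda 1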

Then again, I may be completely wrong; I suspect there are factors at
play I have not accounted for in my idealized view.



Cheers

Kirbs

On 30/05/2019 03:11, Andy Smith wrote:
> Hello,
>
> A new BitFolk server that I will put into service soon has 1x SSD
> and 1x NVMe instead of 2x SSD. I tried this because the NVMe,
> despite being vastly more performant than the SATA SSD, is actually
> a fair bit cheaper. On the downside it only has a 3 year warranty
> (vs 5) and 26% of the write endurance (5466TBW vs 21024TBW)¹.
>
> So anyway, a pair of very imbalanced devices. I decided to take some
> time to play around with RAID configurations to see how Linux MD
> handled that. The results surprised me, and I still have many open
> questions.
>
> As a background, for a long time it's generally been advised that
> Linux RAID-10 gives the highest random IO performance. This is
> because it can stripe read IO across multiple devices, whereas with
> RAID-1, a single process will do IO to a single device.
>
> Linux's non-standard implementation of the RAID-10 algorithm can
> also generalise to any number of devices: conventional RAID-10
> requires an even number of devices with a minimum of 4, but Linux
> RAID-10 can work with 2 or even an odd number.
>
> More info about that:
> https://en.wikipedia.org/wiki/Non-standard_RAID_levels#Linux_MD_RAID_10
>
> As a result I have rarely felt the need to use RAID-1 for 10+ years.
>
> But, I ran these benchmarks and what I found is that RAID-1 is THREE
> TIMES FASTER than RAID-10 on a random read workload with these
> imbalanced devices.
>
> Here is a full write up:
> http://strugglers.net/~andy/blog/2019/05/29/linux-raid-10-may-not-always-be-the-best-performer-but-i-dont-know-why/
>
> I can see and replicate the results, and I can tell that it's
> because RAID-1 is able to direct the vast majority of reads to the
> NVMe, but I don't know why that is or if it is by design.
>
> I also have some other open questions, for example one of my tests
> against HDD is clearly wrong as it achieves 256 IOPS, which is
> impossible for a 5,400RPM rotational drive.
>
> So if you have any comments, explanations, ideas how my testing
> methodology might be wrong, I would be interested in hearing.
>
> Cheers,
> Andy
>
> ¹ I do however monitor the write capacity of BitFolk's SSDs and they
> all show 100+ years of expected life, so I am not really bothered
> if that drops to 25 years.
>
>
> _______________________________________________
> users mailing list
> users@???
> https://lists.bitfolk.com/mailman/listinfo/users


--
admins@???
www.sheffieldhackspace.org.uk