Hi Matt,
On Wed, Oct 05, 2011 at 10:03:10AM +0100, Matt Molyneaux wrote:
> What solutions have you looked at? Is bcache stable
> enough for use yet?
I haven't yet actually bought any hardware. I'm still researching.
If I can come up with a plan that I think might work then I will
likely buy the next server a lot sooner to allow time for
experimentation.
- bcache
http://bcache.evilpiepirate.org/
The documentation online is massively out of date, to the point of
being useless as far as I can see. Read the docs inside the source.
It will require a patched kernel.
The authors (primarily Kent Overstreet, a Google employee; unclear
if this is a Google project or not) have been seeking inclusion in
upstream but it doesn't seem to be going well:
https://lkml.org/lkml/2011/1/3/3
https://lkml.org/lkml/2011/9/10/13
- flashcache
https://github.com/facebook/flashcache/
This seems better documented.
It again would require a patched kernel.
No attempt to get it included upstream from what I can tell.
It's a Facebook project and looks like it's in production use
there:
https://www.facebook.com/note.php?note_id=388112370932
- Ad-hoc pvmove of "hot" extents to a PV that's backed by SSD
https://bbs.archlinux.org/viewtopic.php?id=113529
Since that was posted, he appears to have refined his daemon to
the point where you just run it and it spits out command lines for
you to execute the pvmove:
https://github.com/tomato42/lvmts
Seems rather hacky and throws up a lot of questions, such as:
Can you be running that daemon (which is doing a blktrace) all
the time?
What do you do when your SSD cache is full? Which extents would
you evict?
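To make the eviction question concrete, here is a minimal sketch of one
possible policy: keep the busiest extents on the SSD and evict whatever
is cached but no longer among them. The function name, the extent
numbering and the idea of feeding it per-extent I/O counts are all
assumptions for illustration; this is not how lvmts decides, and
turning the plan into actual pvmove invocations is deliberately left
out.

```python
from collections import Counter


def plan_moves(io_counts, cached, capacity):
    """Return (promote, evict) lists of extent numbers.

    io_counts: Counter mapping extent number -> recent I/O count
               (hypothetical input, e.g. aggregated from a block trace)
    cached:    set of extent numbers currently held on the SSD
    capacity:  number of extents the SSD can hold
    """
    # The `capacity` busiest extents are the ones we want on the SSD.
    wanted = {extent for extent, _ in io_counts.most_common(capacity)}
    # Promote hot extents that are not cached yet.
    promote = sorted(wanted - cached)
    # Evict cached extents that are no longer among the hottest.
    evict = sorted(cached - wanted)
    return promote, evict


if __name__ == "__main__":
    counts = Counter({10: 500, 11: 3, 12: 250, 13: 40})
    promote, evict = plan_moves(counts, cached={11, 12}, capacity=2)
    print("promote:", promote)
    print("evict:", evict)
```

Even this toy version shows the cost: every change in the hot set
means two moves, one off the SSD and one onto it, so a workload whose
hot set shifts quickly could spend most of its time in pvmove.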
A fly in the ointment:
SSDs can lose transactions if they are abruptly powered off, even if
you disable their write cache:
http://www.evanjones.ca/intel-ssd-durability.html
http://www.afewmoreamps.com/2011/08/fsync-durability.html
Of course, spinning discs can *also* lose transactions when power is
yanked, but we avoid this issue by putting them behind a hardware
RAID controller with battery-backed cache.
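For anyone wanting to check a particular device, a rough sketch of the
kind of test involved: append a record, call fsync(), and only then
treat the record as acknowledged. The function names and file layout
are my own assumptions rather than anything taken from the articles
above, and a real test also needs to report each acknowledgement to
another machine before the power is pulled.

```python
import os


def durable_append(path, record):
    """Append one record; return only after fsync() reports success."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record)
        # Ask the kernel to flush this file's data to the device.
        # Whether the device itself then keeps it across a power cut
        # is exactly what the test is meant to find out.
        os.fsync(fd)
    finally:
        os.close(fd)


def count_records(path):
    """Count newline-terminated records present in the file."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    # Hypothetical file name; point this at the device under test.
    for seq in range(1000):
        durable_append("durability.log", b"%d\n" % seq)
        print("acknowledged", seq)
```

After an abrupt power-off, any sequence number that was acknowledged
but is missing from the file is a lost transaction.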
I had been hoping that I could put two SSDs behind Linux md
(software RAID) with their write caches turned off, for two reasons:
- md is technically superior and has useful features like being
able to grow arrays online.
- Hardware RAID controller slots are expensive; while I might be
able to go down to 6 SATA disks from 8 and use the other two
slots for an SSD RAID-1, that would be a shame.
So now it looks like using md is not going to work, and if this is to
be done it'll have to be behind a RAID controller, as losing
acknowledged writes on power failure is not an acceptable risk.
Cheers,
Andy
--
http://bitfolk.com/ -- No-nonsense VPS hosting