Something broke. It’s unrelated to Ask an Atheist hosting, but important enough to the project (and my business) that I needed to focus on it rather than on video production. So we might not have video for this episode until the end of the week, later than our usual Wednesday posting.
Sorry about that.
If you want to know what broke, check after the break. Warning: Unix sysadmin crap lies ahead.
I’ve got a ten-element RAID6 using Linux software RAID. Given the older processors on that machine (pre-EM64T Xeons) and PCI-X bus issues, when the file server crashes and resyncs, it’s prone to I/O starvation, and then it craps the bed. So I get this loop where the machine boots, the RAID tries to resync, disk I/O gets screwed up, the machine crashes, the watchdog notices, the machine reboots…
Fantastic.
The solution is to boot into Finnix, copy the entire root filesystem into RAM, and resync the array from there. All the libraries and other stuff live in RAM, and there’s enough memory that even with the live CD taking up half of it, I don’t touch swap.
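If you want to keep an eye on a resync like this from the live environment, here is a rough sketch of how you could do it in Python by parsing `/proc/mdstat`. This isn’t what I actually ran; the function name `resync_status` and the dictionary keys are made up for illustration, and the regex assumes the usual progress line format the md driver prints (percent done, `finish=` in minutes, `speed=` in K/sec).

```python
import re

# Matches the progress line md prints during a resync or recovery, e.g.
#   [=>....]  resync =  8.5% (83333/976754688) finish=123.4min speed=12345K/sec
# Assumption: this layout matches what your kernel prints.
PROGRESS = re.compile(
    r"(?P<action>resync|recovery|reshape|check)\s*=\s*(?P<pct>[\d.]+)%"
    r".*?finish=(?P<finish>[\d.]+)min\s+speed=(?P<speed>\d+)K/sec"
)

# Matches the first line of an array's entry, e.g. "md0 : active raid6 ..."
ARRAY_HEADER = re.compile(r"^(md\S*)\s*:")


def resync_status(mdstat_text):
    """Return {array_name: progress_info} for arrays with a progress line.

    Hypothetical helper: takes the text of /proc/mdstat and reports only
    the arrays that are currently resyncing, recovering, or similar.
    """
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_HEADER.match(line)
        if header:
            # Remember which array the following indented lines belong to.
            current = header.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current:
            status[current] = {
                "action": progress.group("action"),
                "percent": float(progress.group("pct")),
                "finish_min": float(progress.group("finish")),
                "speed_kps": int(progress.group("speed")),
            }
    return status
```

To use it, read the file and pass the text in, something like `resync_status(open("/proc/mdstat").read())`, and poll every minute or so. An array that has finished (or never started) resyncing simply won’t show up in the result.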
The long-term solution is to upgrade the motherboard to something that can handle a little more I/O, or (more likely) move to a new RAID with fewer elements. It wasn’t really supposed to get that big in the first place.