I had a rather brutal lesson in LVM recovery last week. I had four SATA disks in an LVM VG, one of which failed. Despite the failed disk *not* being the first drive in the VG, all the LVs in the VG were toast. Why? Because when I created the LVs, I striped them over the two disks that were in the VG at the time.
I figured it would be a performance improvement; it probably was. I’d forgotten that I did it. When I added two more disks to the VG and was able to extend the LVs without any problems, I thought that I must *not* have striped them. It turned out that because I’d added a *pair* of new disks, LVM was able to extend the two-striped LVs onto the two new disks just fine.
So, back to the failure: the disk that I lost was the second PV in the VG (the second one in the original pair). Recovery procedures for LVM involve either substituting a new PV of the same size in place of any PVs from the failed disk, or creating a special device node called “/dev/ioerror” that LVM can refer to instead of a missing PV (usually you link /dev/ioerror to /dev/zero). Having done either of those, you can add the “--partial” option to your LVM commands and LVM will do its best to make your LVs available (even though they’d have crashing-great gaps in them).
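To get a feel for what those gaps look like on a striped LV, here is a rough Python sketch. It is a toy model only, not LVM's real on-disk layout: the 4-byte `CHUNK` size and the `stripe` and `read_back` helpers are made up for illustration, and the missing PV is assumed to read back as zeros (as it would if /dev/ioerror pointed at /dev/zero).

```python
CHUNK = 4  # bytes per stripe chunk; tiny so the output is readable (assumption)

def stripe(data, n):
    """Split data round-robin, one chunk at a time, across n pretend PVs."""
    pvs = [bytearray() for _ in range(n)]
    for i in range(0, len(data), CHUNK):
        pvs[(i // CHUNK) % n] += data[i:i + CHUNK]
    return [bytes(p) for p in pvs]

def read_back(pvs, length):
    """Reassemble the logical volume by reading chunks from each PV in turn."""
    out = bytearray()
    offsets = [0] * len(pvs)
    k = 0
    while len(out) < length:
        pv = k % len(pvs)
        chunk = pvs[pv][offsets[pv]:offsets[pv] + CHUNK]
        if not chunk:  # ran out of data on this PV; stop rather than loop forever
            break
        out += chunk
        offsets[pv] += CHUNK
        k += 1
    return bytes(out[:length])

data = b"AAAABBBBCCCCDDDD"
pv1, pv2 = stripe(data, 2)

# Pretend the second PV died and was replaced by a same-sized run of zeros.
damaged = read_back([pv1, bytes(len(pv2))], len(data))
print(damaged)  # b'AAAA\x00\x00\x00\x00CCCC\x00\x00\x00\x00'
```

Every other chunk is gone, all the way through the volume, which is why a filesystem on top of it has very little chance of surviving.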
The one rule given in this procedure is that you cannot recover any LVs that *started* in the failed PV. “No worries,” thought I, “I lost the second PV, all my LVs start on the first PV, I’ll be fine.” WRONG. Because I was striping, the LVs all started in the first *and* the second PV. So a failure of either disk was totally destructive to every LV in the VG. (Of course, if I had created a new LV when I added the two new disks, that LV would have been fine, since it would have started on, and striped over, the third and fourth PVs.)
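The “started in the failed PV” rule is easy to misjudge with striping, so here is a small Python sketch of it. Everything in it is hypothetical: the `Segment` class, the `pv1` to `pv4` names, and the chunk counts are just a simplified model of my setup, not anything LVM exposes.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass(frozen=True)
class Segment:
    """A run of an LV striped across one set of PVs (toy model)."""
    pvs: Tuple[str, ...]  # PVs this segment stripes over
    chunks: int           # number of stripe chunks in the segment

def pv_for_chunk(segments: List[Segment], index: int) -> str:
    """Return the PV holding chunk `index`, round-robin within its segment."""
    for seg in segments:
        if index < seg.chunks:
            return seg.pvs[index % len(seg.pvs)]
        index -= seg.chunks
    raise IndexError("chunk is beyond the end of the LV")

def starts_on(segments: List[Segment]) -> Set[str]:
    """PVs the LV 'starts' on: every PV in the first segment's stripe set."""
    return set(segments[0].pvs)

def recoverable(segments: List[Segment], failed_pv: str) -> bool:
    """Apply the rule: an LV that starts on the failed PV cannot be recovered."""
    return failed_pv not in starts_on(segments)

# My old LVs: striped over the first pair, later extended onto the second pair.
old_lv = [Segment(("pv1", "pv2"), 8), Segment(("pv3", "pv4"), 8)]
# An LV created only after the second pair of disks was added.
new_lv = [Segment(("pv3", "pv4"), 8)]
# What I thought I had: a linear LV that merely continues onto pv2.
linear_lv = [Segment(("pv1",), 8), Segment(("pv2",), 8)]

for name, lv in (("old", old_lv), ("new", new_lv), ("linear", linear_lv)):
    print(name, recoverable(lv, "pv2"))
# old False
# new True
# linear True
```

The very second chunk of the old LVs already lives on `pv2`, so from LVM's point of view they start there just as much as on `pv1`.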
So what’s my config now? LVM over a RAID5 array built from my four SATA disks. Since I’m coming to rely on this stuff more, I figure it’s time to give up a little performance to gain some stability and recoverability (besides, the performance has been just fine so far).