This week Adam talks with Andy Klein from Backblaze about hard drive reliability at scale.
Adam Stacoviak: Yeah, I mean, realistically… I mean, you’ve done a lot of the hard work in quantifying the value of the data, and you’ve been consistent with the ability to capture it, and then report on it on a quarterly and yearly basis, which I just commend you on. Thank you for that. And you give it out for free. You don’t say, “Hey, for Backblaze customers, you can see the data.” It’s free for everybody to see. And I think you even have like downloads of the raw data, if I recall correctly. I didn’t know what to do with it, but I’m like “Great, it’s there.” If I wanted to dig into it further, then I could. But yeah, there should be some sort of drive testing…
But what a hard thing to do. I mean, especially, as you probably know, models change so quickly, and the model numbers don’t seem to be like there’s some sort of rhyme or reason to them; they just seem to be like “Okay, we’re done with building that one, and now we’re going here.” And it’s also based on geography; it may be made in Taiwan, it might be made in Vietnam, it may be made somewhere else… And these things also play a role into it. It could have been something geographical in that area; there could have been a storm, there could have been an earthquake, or a hurricane, or something catastrophic, or who knows what. There’s things that happen in these manufacturing plants when they make these drives to get consistency.
[01:21:45.12] I’ve even heard to buy not in the same batch. So don’t buy more than x drives from, let’s say B&H. Buy two from B&H, two from CDW… Obviously, buy the same model, if you can, to try and keep the model number parity… But I’ve heard all these different – essentially, old wives’ tales on how to buy hard drives as a consumer. And really, it seems to be cargo-culted, or learned, from somebody else, or just fear, essentially. “This is why I do it, because it’s a fear.”
And the way I’ve kind of done it is based on the capacity, first. So I think, “How big do I need?” So I begin with my capacity, because I’m different. I want to get to the price curve eventually, but my deal is “How much do I want to have? How many drives can I actually handle?” and then at that level, what’s my parity level? Can I afford to have a couple extra, so if those two fail in that parity, let’s say a RAID-Z2 given a ZFS file system array, as an example… If those two drives fail, can I replace them? Do I have two more drives to replace them if two did fail?
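The capacity-then-parity reasoning above can be sketched as a small calculation. This is a rough illustration, not a sizing tool: the function name `raidz2_plan` and its parameters are hypothetical, the six-drive, 18 TB figures are borrowed from the array mentioned later in the conversation, and the result ignores ZFS metadata and padding overhead.

```python
def raidz2_plan(drive_count: int, drive_tb: float, spares: int = 2) -> dict:
    """Rough RAID-Z2 sizing: two drives' worth of space goes to parity."""
    parity = 2  # RAID-Z2 tolerates two simultaneous drive failures
    if drive_count <= parity:
        raise ValueError("RAID-Z2 needs more than two drives")
    return {
        # Raw usable space, before any ZFS overhead (assumption: ignored here)
        "usable_tb": (drive_count - parity) * drive_tb,
        # Drives in the array plus cold spares kept on the shelf
        "drives_to_buy": drive_count + spares,
    }


# Assumed example: six 18 TB drives, plus two spares to cover a double failure
plan = raidz2_plan(drive_count=6, drive_tb=18)
print(plan)
```

Keeping two spares mirrors the question in the conversation: if both parity-covered drives fail, are there two replacements on hand?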
I hadn’t considered your cloning idea, which I think is super-smart. I’m gonna have to consider that. I might just do some hard drive failure tests just to see how that could work. That seems so smart, to clone versus resilver… Although I don’t know how that would work with ZFS, if that’s a thing or not. But capacity is where I begin. Then it’s like “Okay, for price, did I get that?” And then the final thing I do once I actually get the drives - I hadn’t considered running the SMART test right away to consider how many power-on hours it had, because I didn’t consider they’re doing tests in there… But I thought, “Well, hey, if Seagate is doing a burn-in of sorts on my drives, or some sort of test beforehand, let me know.” I would buy a model that has burn-in testing beforehand. Save me the week, if I’m gonna burn-in an 18 terabyte drive.
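Checking power-on hours on a freshly delivered drive could look something like the sketch below. It assumes the report comes from smartmontools' JSON output (for example `smartctl -a -j` against the device) and that the hours appear under a `power_on_time` key; that layout can vary by drive type and smartmontools version, so treat the key names as an assumption. The sample report is made up for illustration.

```python
import json
from typing import Optional


def power_on_hours(smartctl_json: str) -> Optional[int]:
    """Return the drive's power-on hours from a smartctl JSON report, if present."""
    report = json.loads(smartctl_json)
    # Assumed layout: {"power_on_time": {"hours": N}}
    return report.get("power_on_time", {}).get("hours")


# Made-up sample, not real drive output
sample = '{"model_name": "EXAMPLE-18TB", "power_on_time": {"hours": 3}}'
hours = power_on_hours(sample)
print(f"Power-on hours: {hours}")
```

A small non-zero number on a new drive would be consistent with some factory testing, though it does not say what that testing covered.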
So when I bought this new array recently, the burn-in test lasted seven full days. I don’t know if you use this software now, it’s called badblocks… But you can run a series of tests, it writes three different test patterns, and then a final one, which is the zeros across it… But for each write, there’s a read comparison. So it’s a write across the whole disk, in one pattern, then a read, another write, then a read, another write, then a read, and then finally, a zero pass write, and then a recomparison to confirm that the drive is actually clean. But for an 18-terabyte drive, six drives, it took an entire week. And that’s just a tremendous amount of time for somebody who’s like “I just want to get onto building my thing… Come on now.”
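The week-long figure is roughly what the arithmetic predicts. Four patterns, each written and then read back, means eight full passes over the disk. The sketch below estimates the duration for one drive; the 240 MB/s average throughput is an assumption chosen for illustration, not a number from the conversation, and the function name `badblocks_days` is hypothetical. Drives tested in parallel would each take about this long.

```python
def badblocks_days(drive_tb: float, mb_per_s: float, patterns: int = 4) -> float:
    """Estimate days for a destructive write-and-verify test of one drive."""
    passes = patterns * 2  # each pattern is written once, then read back once
    seconds = passes * (drive_tb * 1e12) / (mb_per_s * 1e6)
    return seconds / 86400  # seconds per day


# Assumed average sustained throughput: 240 MB/s
print(f"{badblocks_days(18, 240):.1f} days")
```

Real drives slow down toward the inner tracks, so the average over a full pass is lower than the headline sequential speed, which pushes the estimate up rather than down.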
But that’s the way I look at it. Like, that’s how I’ve learned to buy, is like “What capacity do I want to have?” And then “Can I afford it?” Just the drives alone. And then “Can I afford the extras if I need parity and replacement for that parity?” Of course you want parity. And then finally, doing a final burn-in before actually putting the drives in service… Which I feel like is a little overkill, to some degree… But you know what? The worst thing to do is to build this full array. I’m not a business, I have limited time… And then I’ve got to deal with failures a week or so later. Now, that burn-in test may not predict a failure a week later, but it might mitigate it, because like, well, if drive four of six did not pass the sectors test in badblocks, well then, let’s send that one back for an RMA, or just a simple straight-up return kind of thing. And you know before you even build the array, you’ve got a problem child, essentially.