Drive tests OK yet QTS temporarily rejected it? Ideas?

storageneeded
Starting out
Posts: 46
Joined: Sun Jan 01, 2012 2:40 am

Post by storageneeded » Wed Apr 07, 2021 1:47 am

While performing a large copy, my Qnap decided a drive had failed and removed it from the array. The problem is that I pulled the drive and it tests perfectly. I put it back in and the NAS is entirely happy with it, as if nothing happened. Is this a NAS hardware problem? A bug in QTS? A weird intermittent drive problem that I'll never get replaced under warranty? When I discovered the problem, disk 4 was spun down and the system acted like it wasn't even present. Here are the relevant system log entries (newest first):

Error 4/6/2021 4:27:57 System 127.0.0.1 localhost [Hardware Status] "Host: Disk 4": Disconnected.
Error 4/6/2021 4:27:49 System 127.0.0.1 localhost [Hardware Status] Failed to access the installed drive. Check the drive for errors.
Error 4/6/2021 4:27:47 System 127.0.0.1 localhost [Storage & Snapshots] Disk "Host: Disk 4" failed. Volume: Archive Storage pool: 1.
Warning 4/6/2021 4:27:47 System 127.0.0.1 localhost [Storage & Snapshots] RAID group "1" is degraded. Storage pool: 1.
Information 4/6/2021 4:27:43 System 127.0.0.1 localhost [Storage & Snapshots] Finished hot-removing disk "Host: Disk 4".
Warning 4/6/2021 4:27:23 System 127.0.0.1 localhost [Bad Block Log]: Invalid data found on Host: Disk 4 at sector (163763200, 8, 1).
Information 4/6/2021 4:26:46 System 127.0.0.1 localhost [Hardware Status] Host: Disk 4": NCQ disabled. Timeout error.
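
The export lists the newest entry first, so the chain of events is easier to follow when reordered. Below is a minimal Python sketch that does this for the seven entries above. The helper names `parse_line` and `chronological` are made up for illustration, and the column layout (severity, date, time, then three fixed columns before the message) is an assumption based only on these lines, not on any documented QTS export format.

```python
from datetime import datetime

# The seven entries from the post, newest first as exported.
LOG = '''\
Error 4/6/2021 4:27:57 System 127.0.0.1 localhost [Hardware Status] Host: Disk 4": Disconnected."
Error 4/6/2021 4:27:49 System 127.0.0.1 localhost [Hardware Status] Failed to access the installed drive. Check the drive for errors.
Error 4/6/2021 4:27:47 System 127.0.0.1 localhost [Storage & Snapshots] Disk Host: Disk 4" failed. Volume: Archive Storage pool: 1."
Warning 4/6/2021 4:27:47 System 127.0.0.1 localhost [Storage & Snapshots] RAID group 1" is degraded. Storage pool: 1."
Information 4/6/2021 4:27:43 System 127.0.0.1 localhost [Storage & Snapshots] Finished hot-removing disk Host: Disk 4"."
Warning 4/6/2021 4:27:23 System 127.0.0.1 localhost [Bad Block Log]: Invalid data found on Host: Disk 4 at sector (163763200, 8, 1).
Information 4/6/2021 4:26:46 System 127.0.0.1 localhost [Hardware Status] Host: Disk 4": NCQ disabled. Timeout error."
'''


def parse_line(line):
    """Split one log line into (timestamp, severity, message)."""
    severity, date, time, rest = line.split(" ", 3)
    stamp = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
    # Assumed layout: three fixed columns (user, source IP, host name)
    # sit between the time and the message itself.
    message = rest.split(" ", 3)[3]
    # Quote marks are dropped because the export placed them inconsistently.
    return stamp, severity, message.replace('"', "")


def chronological(text):
    """Return the entries oldest first."""
    entries = [parse_line(line) for line in text.splitlines() if line.strip()]
    # The export is newest first; flip it so that entries sharing a
    # timestamp keep their real order through the stable sort below.
    entries.reverse()
    return sorted(entries, key=lambda entry: entry[0])


if __name__ == "__main__":
    for stamp, severity, message in chronological(LOG):
        print(f"{stamp:%H:%M:%S}  {severity:<11}  {message}")
```

Read oldest first, the sequence starts with the NCQ timeout at 4:26:46 and ends with the disk being reported as disconnected at 4:27:57, about 71 seconds later.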

This is QTS 4.4.3 and a relatively new 4 bay Intel NAS with relatively new WD Red 10TB drives all still under warranty. Does anyone have any ideas besides spending $260 to replace a drive that passes all tests but the NAS decided to randomly take offline?

dolbyman
Guru
Posts: 22754
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by dolbyman » Wed Apr 07, 2021 1:50 am

Hard to say from afar .. did you do a full surface test on the drive ? .. it would take a couple of days on that drive size

You are also allowed to state your NAS model.. "a relatively newish intel NAS" is not really helping

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by storageneeded » Wed Apr 07, 2021 2:01 am

@dolbyman Thanks for the reply. It's a TS-451 and I ran the WD Data Lifeguard diagnostics assuming it would fail and qualify for a warranty replacement but it passed. I also ran HD Sentinel which passed and rates the drive as 100%. There's nothing in the SMART data to indicate a drive problem. There's no way WD is going to warranty this drive.

There are various people who've complained about weird inexplicable "NCQ disabled" errors with WD Red drives in this forum. But those seem to be different problems than what I experienced.

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by dolbyman » Wed Apr 07, 2021 2:32 am

Do a full surface test then .. and see if it comes up clean or not.

SMART is not always perfect.

Also on a 451 ... the Intel bug that slowly degrades a timing signal could also be lurking

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by storageneeded » Thu Apr 08, 2021 8:00 am

@dolbyman The full surface test is complete and passed. I'm guessing this is related to the weirdness with "NCQ disabled" errors that several others have reported with WD drives. As for Intel issues, this is a fairly new TS-451DeU with the relatively new Intel J4025 CPU. I'm not aware of any issues with that CPU, unless you can point me to a link?

This really undermines my confidence in the NAS when it drops a drive for a single seemingly minor error on a drive that tests 100% perfect and is not eligible for warranty replacement. If it matters, I'm using thick volumes, and perhaps that's the root cause, as I've previously only used simple static volumes.

Qnap can't seem to get much of anything 100% right that deviates from basic open source Linux. Just look at their security update page for dozens of examples every month of how their proprietary code is full of bugs, security holes, and worse. And newer versions are often less stable than older versions. It's a mess. I think the TS-451 likely belongs on eBay, as it seems another unexplained failure is just waiting to happen. Perhaps with Seagate drives it would be happy, but buying 5 Seagate 10TB IronWolf drives would be $1400 and seems like throwing good money after bad. The WD Red drives are on the Qnap qualified list, but apparently that's not enough to have a stable NAS. :(

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by dolbyman » Thu Apr 08, 2021 11:03 am

QNAP and their confusing naming convention

the 451 and 451+ have this bug .. my 951X does not..and I don't think your 451DeU either...but further up you said it's a 451..no further model indications .. hence the confusion

Haven't seen the NCQ issue lately .. so not sure what happened and who is to blame, QNAP or WD

Re: Drive tests OK yet QTS temporarily rejected it? Ideas?

Post by storageneeded » Fri Apr 09, 2021 6:02 am

@dolbyman, I appreciate your help. If Qnap certifies the WD Red drives in my NAS, which they do, and one such WD Red drive is rejected by the NAS even when it passes all tests including a full surface scan and is not eligible for replacement, there's clearly a problem that points to Qnap. Any time a drive is rejected from an array, rightly or wrongly, it puts the entire array at risk. And even replacing the drive puts the array at further risk from all the intensive disk activity needed to rebuild it. So this is NOT a minor problem, especially when you're dealing with 30TB of data. Qnap needs to stop focusing on adding fluff and features few want and start putting more emphasis on solid, secure, reliable file storage, which is what a NAS is all about. Qnap doesn't seem to understand what really matters. They seem more interested in features few want or need. It's like owning a car that can't even get you to the grocery store, yet the car maker is redesigning the infotainment screen to include new animations instead of making a more reliable car. That's Qnap. They just don't get what matters.
