Raid Scrubbing always finding blocks to repair

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Raid Scrubbing always finding blocks to repair

Post by gabeyd » Fri Jun 12, 2020 7:24 am

I have a TVS-951X with 5 WD Red 4TB (older model WD40EFRX) and two 512GB Samsung 860 Pro SSDs. The 5 WD drives are in RAID6, and the 2 SSDs are in a RAID1 SSD cache. (I tried turning off the SSD cache for some of the weeks; it made no difference.)

Starting back in April, I noticed that my weekly scrubbing was finding bad blocks to repair:
Information 2020/04/12 08:15:22 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 120.
Information 2020/04/19 09:03:24 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 560.

And it's just an "Information" not a Warning or Error, which seems strange - properly functioning hardware should not need to repair blocks.

This has been going on for weeks now:
Information 2020/04/26 10:00:31 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 608.
Information 2020/05/03 10:01:22 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 592.
Information 2020/05/10 10:02:35 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 584.
Information 2020/05/17 10:11:35 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 472.
Information 2020/05/24 10:08:09 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 504.
Information 2020/06/02 09:38:34 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 1048.
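
For anyone wanting to track a history like this over time, here is a minimal sketch that pulls the timestamp and repaired-block count out of exported log lines. It assumes the log text has the same layout as the entries quoted above; the names `LOG_LINES`, `ENTRY` and `parse_scrub_log` are made up for this example and are not part of any QNAP tool.

```python
import re
from datetime import datetime

# Log entries copied from the post above.
LOG_LINES = [
    'Information 2020/04/12 08:15:22 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 120.',
    'Information 2020/04/19 09:03:24 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 560.',
    'Information 2020/04/26 10:00:31 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 608.',
    'Information 2020/05/03 10:01:22 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 592.',
    'Information 2020/05/10 10:02:35 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 584.',
    'Information 2020/05/17 10:11:35 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 472.',
    'Information 2020/05/24 10:08:09 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 504.',
    'Information 2020/06/02 09:38:34 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Volume: DataVol1, Blocks repaired: 1048.',
]

# \s* between date and time also accepts copies where the columns ran together.
ENTRY = re.compile(
    r"(\d{4}/\d{2}/\d{2})\s*(\d{2}:\d{2}:\d{2} [AP]M).*?Blocks repaired: (\d+)"
)


def parse_scrub_log(lines):
    """Return (timestamp, blocks_repaired) for every scrub result line."""
    results = []
    for line in lines:
        match = ENTRY.search(line)
        if match is None:
            continue  # not a scrub result line
        date, time, blocks = match.groups()
        stamp = datetime.strptime(f"{date} {time}", "%Y/%m/%d %I:%M:%S %p")
        results.append((stamp, int(blocks)))
    return results


if __name__ == "__main__":
    for stamp, blocks in parse_scrub_log(LOG_LINES):
        print(f"{stamp:%Y-%m-%d}  {blocks:>5}")
```

A non-zero count on every single run, as here, is the pattern worth flagging; a one-off non-zero result after an unclean shutdown would be less alarming.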

Support, after taking months, suggested I back up and completely rebuild the RAID array, so I reluctantly did.

Backed everything up, rebuilt the RAID array. Did a scrub after the array synced, and it reported 0 blocks to repair. Restored all my data from backup, ran a scrub, and already, within 1 day:

Information 2020/06/06 09:37:07 AM System 127.0.0.1 Storage & Snapshots RAID Group [Storage & Snapshots] Finished scrubbing RAID group "1". Storage pool: 1, Blocks repaired: 552.

Note, I switched to a storage pool when I rebuilt the RAID, instead of the fixed volume before.

I've run extended SMART tests on all drives. No issues; every attribute that counts, like ID 5, is 0 on all drives.

I've run a full bad block scan on all 5 drives, no issues.

I installed Memtest86 on a USB stick and ran for 24 hours (3 passes on the 32G of RAM) and it found 0 errors.

The NAS is not reporting anything wrong, other than those "Information" messages.

Currently running the "Help Desk" HDD Stress Test, which has been running for 44 hours, and found 0 issues, at least that it's reporting through the app UI. (How long does this take? It seems to have started over on drive 1 now, as that's the only drive flashing, and I could see it progressing through about 1 drive every 6 or 8 hours or something).

Anyone have any ideas on what else to try?

Thanks!

dolbyman
Guru
Posts: 19691
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Raid Scrubbing always finding blocks to repair

Post by dolbyman » Fri Jun 12, 2020 11:06 pm

I have only ever had one RAID scrubbing event find errors.

That was when I took out a RAID member, formatted it and synced it back (the disk was immediately accepted with no rebuild), so it probably had some missing chunks.

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Sun Jun 14, 2020 11:36 am

Right, it should be some kind of exception that raid scrubbing finds blocks to repair. I get that it's better to find them during scrubbing than during a raid rebuild, but in the normal case, it should not find any to fix anyway...

So why does it find some every single time I run a scrub, every week, or even if I run it back to back?
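
To illustrate what a scrub is checking, here is a simplified sketch assuming a single XOR parity chunk per stripe (RAID 5 style). It is not QNAP's or the Linux md driver's actual code, RAID 6's second parity is left out, and the names `xor_parity` and `scrub` are invented for the example.

```python
from functools import reduce


def xor_parity(chunks):
    """XOR equal-length data chunks byte by byte to get the parity chunk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def scrub(stripes):
    """Recompute parity for each stripe and rewrite it where it disagrees.

    `stripes` is a list of dicts with "data" (list of bytes) and "parity"
    (bytes). Returns the number of stripes whose parity had to be rewritten.
    """
    repaired = 0
    for stripe in stripes:
        expected = xor_parity(stripe["data"])
        if stripe["parity"] != expected:
            stripe["parity"] = expected
            repaired += 1
    return repaired


# Three data chunks plus parity, as on a 4-disk single-parity array.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
healthy = {"data": list(data), "parity": xor_parity(data)}

# Simulate one flipped bit in the parity that ended up on disk.
bad_parity = bytearray(xor_parity(data))
bad_parity[2] ^= 0x08
damaged = {"data": list(data), "parity": bytes(bad_parity)}

stripes = [healthy, damaged]
print(scrub(stripes))  # 1: the damaged stripe is rewritten
print(scrub(stripes))  # 0: nothing left to fix on the second pass
```

In this model a second pass right after a repair finds nothing. Mismatches that reappear on back-to-back scrubs therefore point at something that keeps corrupting data on its way to or from the disks, rather than at a one-time event.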

dolbyman
Guru
Posts: 19691
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Raid Scrubbing always finding blocks to repair

Post by dolbyman » Sun Jun 14, 2020 2:24 pm

Something ain't right then.

best to open a ticket with qnap

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Tue Jun 16, 2020 3:58 am

Appreciate the reply. I already have a ticket open with support, they don't see anything wrong. After months of back and forth, their suggestion is to buy 5 new drives. Even though SMART and bad block scan and stress test don't show anything wrong with the drives. I'm not, at least not yet, up to spending 600 or 700 for new drives.

I was hoping someone here had seen similar or had some ideas.

Here's my new plan:
1) Rebuild the RAID array again, this time as RAID5 instead of RAID6 - using all 5 drives. Then scrub that, then restore backup data, then scrub again. On RAID6, that gives errors already. RAID6 bug?
2) If that still shows blocks to repair, I'm going to run Western Digital Data Lifeguard on each of the 5 drives and see if it finds any errors. (Wish it had a write test, not just a read test - I think I've already done plenty of read tests in QNAP, but worth a try).
3) Open a support ticket with Western Digital.

I don't mind replacing a disk, if it's one disk that is failing, but it's disturbing that there is no indication of a failing disk, and yet still RAID blocks to repair.

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Thu Jun 25, 2020 4:08 am

All 5 disks have completed an extended test with 0 errors in Western Digital Data Lifeguard (read-only test, takes about 8 hours per disk).
All 5 disks have completed extended write/verify test in Hard Disk Validator (write,read test, takes 17 hours per disk).
All 5 disks are showing SMART status "Good" (with all 0s) after the two above tests.

This was done by removing each disk, one by one from the NAS and putting it on an internal connector in a Windows PC.

It is now VERY unlikely that there is a disk problem.

I found this thread: viewtopic.php?t=153717

I am testing that theory now, I put back the factory 2G DIMM, and removed my 32G upgrade. I am now running a RAID 5, 4 disk build and scrubbing it. Same test I did yesterday that failed with 184 blocks repaired.

This, even though I ran 24 hours of MemTest86 and it found ZERO errors on my 32G RAM.

We'll see...

jacobite1
Easy as a breeze
Posts: 338
Joined: Fri Aug 07, 2015 7:02 pm
Location: London, England

Re: Raid Scrubbing always finding blocks to repair

Post by jacobite1 » Thu Jun 25, 2020 4:49 am

It seems that putting 'too much' RAM in QNAP products is causing or potentially causes a whole host of issues. If it scrubs fine at 2GB, you might have solved your problem, as irritating as that is.
TVS-872XT-i5-16GB with 2*WD60EFRX and 4*ST12000VNZ008 in RAID 6. The reshape from RAID 5 only took 8 days...
Backed up to a stack of a half dozen 'cold' external 12TB and 8TB HDDs - please back up your data, RAID is not the same as a backup!

Formerly TVS-463 with 4*WD60EFRX in RAID5, planning to reuse as a backup destination once the EFRXs are freed up.

Protected by an APC SMT750VA UPS - protect your NASes folks!

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Thu Jun 25, 2020 6:56 am

Well, I will have figured out my problem, but not solved it, as I need significantly more than 2GB of RAM. I may be able to get by with 16G, but certainly prefer 32G. I run a couple of VMs.

If it seems to work with the factory 2G stick, I will probably buy another set of 32G, probably the Gskill set that was recommended in the other thread, and run some tests with that.

It would be nice if Qnap wasn't charging 5x market price for their RAM upgrades.

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Thu Jun 25, 2020 9:01 am

First test is a success!

- Only the factory 2GB RAM stick. (Remove my Crucial CT2K16G4SFD824A 32GB kit)
- Only 4 disks, as RAID 5 (same as the test yesterday)
- Built and let sync a new RAID 5 array, with just a pool, no volume. Immediately scrub as soon as the sync is done.
- 0 Blocks repaired!

So, we MAY have found the problem, even though there are no other symptoms of bad RAM, and MemTest86 doesn't find errors in 24 hours of testing.

But I need 32GB of RAM - it's part of why I chose this model NAS. So I ordered a Gskill 32GB kit, and will try that when it comes.

In the meantime, to confirm that RAM is the problem, I'm building a 5 disk RAID6 array - what I want to have. I will restore 1TB or so of data, and then do a scrub, running only the 2GB factory SODIMM.

Then I'll do the same with the Gskill RAM when it arrives next week.

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Fri Jun 26, 2020 11:50 pm

Ok, it was the RAM. Which is weird, because I was showing no other symptoms of bad RAM. No crashes, VMs ran fine also with no crashes, MemTest86 ran fine for 24 hours.

I built my full 5 disk RAID6 array, and ran a scrub, and it found 0 mismatched blocks.
Then I restored 2TB of data, ran a scrub, and it found 0 mismatched blocks.

This is the first time in the last 3 months that I have not found blocks to repair during a raid scrub.

Now, I just need to find 32G of RAM that works. Crucial won't take my RAM back (didn't buy direct from them), they would replace it if defective, but if it's a compatibility issue, that's not defective. Even though they list it as compatible with the 951X on their memory finder. Ok, so I'm out $120 of Crucial RAM, and done being a customer of theirs. Which **, because they were my go-to for RAM upgrades.

Anyway, hopefully the 32G of GSkill I ordered will work in the 951X and I'll be back to having a reliable NAS.

jacobite1
Easy as a breeze
Posts: 338
Joined: Fri Aug 07, 2015 7:02 pm
Location: London, England

Re: Raid Scrubbing always finding blocks to repair

Post by jacobite1 » Sat Jun 27, 2020 12:12 am

Oh right,
"TVS-951X Compatible Upgrades - These upgrades are guaranteed compatible when you order from Crucial.com"

So, by ordering through a third party you've been stuffed. That's annoying.

gabeyd
Starting out
Posts: 18
Joined: Thu May 07, 2015 1:40 am

Re: Raid Scrubbing always finding blocks to repair

Post by gabeyd » Sun Jul 12, 2020 11:51 pm

Update: gskill RAM is working. 32GB (2x16GB) ripjaws F4-2400C16D-32GRS

No raid scrubbing errors.

Also, I talked Crucial into a replacement set; not sure what I’ll do with them. And I won’t try them in my NAS.

treylo
New here
Posts: 7
Joined: Thu Jul 09, 2020 5:12 am

Re: Raid Scrubbing always finding blocks to repair

Post by treylo » Tue Jul 14, 2020 5:24 am

Wow! This, and the linked thread, are really interesting. For years I've had corrupt/changed files when copying to and from one of my two QNAPs. Now, whenever I copy things, I always do a post-copy binary verification of the files, which takes FOREVER but tells me which files are 'wrong' and allows me to re-copy them. I've never understood what the problem was, but I did notice that it only seemed to happen with ONE of my NASes... and, guess what... it's the one that I upgraded with added RAM... now I wonder if that's been the problem this whole time.
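
A post-copy verification like the one described above could be sketched as follows. This is only an illustration, assuming both the source and the copy are reachable as local or mounted directories; `sha256_of` and `find_mismatches` are hypothetical helper names, not an existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_mismatches(source_dir, copy_dir):
    """Return relative paths whose copy is missing or differs from the source."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    bad = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            bad.append(rel)
    return bad
```

Calling `find_mismatches("/path/to/source", "/path/to/copy")` (placeholder paths) returns the files to re-copy. One caveat: a file read back immediately after copying may be served from a cache rather than from the disks, so a clean result right after the copy is weaker evidence than one obtained later.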
gabeyd wrote:
Thu Jun 25, 2020 4:08 am

I found this thread: viewtopic.php?t=153717

I am testing that theory now, I put back the factory 2G DIMM, and removed my 32G upgrade. I am now running a RAID 5, 4 disk build and scrubbing it. Same test I did yesterday that failed with 184 blocks repaired.

This, even though I ran 24 hours of MemTest86 and it found ZERO errors on my 32G RAM.

We'll see...

Moogle Stiltzkin
Ask me anything
Posts: 9077
Joined: Thu Dec 04, 2008 12:21 am
Location: Around the world....

Re: Raid Scrubbing always finding blocks to repair

Post by Moogle Stiltzkin » Tue Jul 14, 2020 12:43 pm

treylo wrote:
Tue Jul 14, 2020 5:24 am
but I did notice that it only seemed to happen with ONE of my NAS's.... and, guess, what... it's the one that I upgraded and added RAM.... now I wonder if that's been the problem this whole time.
So when you switched the RAM back to the original, the problem went away?
NAS
[Main Server] QNAP TS-877 w. 4tb [ 3x HGST Deskstar NAS (HDN724040ALE640) & 1x WD RED NAS ] EXT4 Raid5 & 2 x m.2 SATA Samsung 850 Evo raid1 +16gb ddr4 Crucial+ QWA-AC2600 wireless+QXP PCIE
[Backup] QNAP TS-653A w. 5x 2TB Samsung F3 (HD203WI) EXT4 Raid5
[Backup] QNAP TL-D400S 2x 4TB WD Red Nas (WD40EFRX) single disks.
[^] QNAP TS-659 Pro II
[^] QNAP TS-509 Pro w. 4x 1TB WD RE3 (WD1002FBYS) EXT4 Raid5
[^] QNAP TS-228
[^] QNAP TS-128
[Mobile NAS] TBS-453DX w. 2x Crucial MX500 500gb EXT4 raid1

Network
Asus AC68U Router|100dl/50ul MBPS FTTH Internet | Win10, WC PC-Intel i7 920 Ivy bridge desktop (1x 512gb Samsung 850 Pro SSD + 1x 4tb HGST Ultrastar 7K4000)


Guides & articles
[Review] Moogle's QNAP experience
[Review] Moogle's TS-877 review
https://www.patreon.com/mooglestiltzkin
