
Re: [HOWTO] How to increase raid rebuild speed

Posted: Sun Jan 19, 2014 4:54 am
by schumaku
Most low-power CPU NAS models won't see a reasonable advantage. Far too much hype and wrong expectations in this thread; there are no miracles to be had...

How many times must a RAID be rebuilt in a NAS's lifetime? Case #1: Once. Case #2: One HDD failure in three to five years. Case #3: RAID expansion. Case #4: RAID migration, once in a NAS's lifetime?

Re: [HOWTO] How to increase raid rebuild speed

Posted: Mon Jan 20, 2014 4:18 am
by StardustOne
From what I have experienced, every time a QNAP NAS with a RAID has an issue, you have to power it off by cutting the power. The software is simply too buggy to manage an ordinary shutdown.

I have seen this with the TS-219 with RAID1 and firmware 4.0.x, and I experienced the same with the TS-419 when I moved the two drives over from the TS-219 and failed to get a RAID5 by adding another 2 TB drive to the RAID1 I had just migrated. Why did the RAID5 build fail? The drive I put in was faulty; the QNAP RAID migration tried to add this faulty drive, then decided it was a faulty drive after all, and then had to revert back to the RAID1.

And when I then tried to figure out what was wrong with that drive, I was not even able to format it. After 64% the NAS just got stuck, the log filled up with messages that the drive has an I/O error, the NAS became unresponsive, and I could not stop the format that ran on forever, because of all the mess in that 4.0.5 firmware.

If I power off that RAID system now, I again get a dirty flag, and the RAID1 has to be rebuilt from scratch. How many times? Every time a shutdown is not performed correctly or cannot be performed at all. That is how many, and it is far too often to be stuck with the 5 MB/s rebuild write speed (a silly idea in the first place). What good is a RAID rebuild that takes days or weeks? A rebuild that takes weeks is a high risk: if the other drive in a RAID1 fails in the meantime, the data is gone for good.

No, a RAID rebuild has to happen as fast as possible; the data has to be protected. And I do not understand why the QNAP engineers do not offer a menu setting where I can control the rebuild or migration speed. Every decent RAID controller can do this. Even though the QNAP is software RAID, and even if I have to wait a bit longer for the system to respond while rebuilding, the rebuild has to run as fast as is tolerable.
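For what it is worth, the speed control does exist on the command line; it is the pair of kernel md tunables this HOWTO adjusts. A minimal sketch (the numbers are purely illustrative, not QNAP defaults; values are in KB/s):

Code: Select all

# show the current floor and ceiling for md resync/rebuild throughput (values in KB/s)
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# raise them, e.g. floor ~50 MB/s, ceiling ~200 MB/s
echo 50000 > /proc/sys/dev/raid/speed_limit_min
echo 200000 > /proc/sys/dev/raid/speed_limit_max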

And by the way, a RAID migration is not a once-in-a-lifetime event. I have done this before, migrating from a 2-bay to a 4-bay unit, and I have to do it again now. That happened within two years, not a lifetime, and in another one or two years, when hard drives get cheaper, I may do it again.

We no longer live in the age of diskette drives and tapes. In the digital age, every picture taken even with a cheap camera weighs about 3 to 5 MB.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Mon Jan 20, 2014 4:38 am
by doktornotor
Shocking discovery that a faulty drive is difficult to format and I/O errors cause serious problems... Why are you trying to format a defective drive in the first place? Replace it.

Any numbers set as suggested in the OP of this thread are maximum values. They may never be reached, they are not guaranteed in any way, and actual speeds will be far lower if you are working with files on the NAS while the RAID is being rebuilt.
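A quick sketch for comparing the configured limits with what the array is actually achieving (plain shell, nothing QNAP-specific assumed):

Code: Select all

# configured limits, in KB/s
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# the speed actually being achieved, refreshed every 10 seconds (Ctrl-C to stop)
while true; do grep speed /proc/mdstat; sleep 10; done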

P.S. "Once in a NAS lifetime" is not the same thing like "once in [your] lifetime."

Re: [HOWTO] How to increase raid rebuild speed

Posted: Thu Feb 06, 2014 8:00 pm
by Litakely
Works like a charm! Thank you

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sat Mar 22, 2014 2:26 am
by harmen1001
Works like a charm! Thanks!

Re: [HOWTO] How to increase raid rebuild speed

Posted: Thu May 15, 2014 1:01 am
by doodley
Hi,

Massive problem.

I wanted to expand the capacity of a QNAP Turbo TS-559 Pro 5-bay NAS running RAID5 on 5x 2 TB hard disks.

I marked one drive for replacement and swapped the WD 2 TB for a WD 4 TB... the rebuild started... but got stuck.

I followed your advice to increase the rebuild speed... that worked for a while... but now I get the output below, and the rebuild speed is down to speed=176K/sec, with an estimated finish=94869.0min = 65.88 days...

Code: Select all

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md0 : active raid5 sdb3[1] sda3[5] sde3[4] sdd3[3] sdc3[2]
      7807782400 blocks level 5, 64k chunk, algorithm 2 [5/4] [_UUUU]
      [=========>...........]  recovery = 48.4% (945127008/1951945600) finish=94869.0min speed=176K/sec
      
md5 : active raid1 sde2[2](S) sdd2[3](S) sdc2[4](S) sdb2[1] sda2[0]
      530048 blocks [2/2] [UU]
      
md13 : active raid1 sda4[0] sdd4[4] sdc4[3] sde4[2] sdb4[1]
      458880 blocks [5/5] [UUUUU]
      bitmap: 0/57 pages [0KB], 4KB chunk

md9 : active raid1 sda1[0] sdd1[4] sde1[3] sdc1[2] sdb1[1]
      530048 blocks [5/5] [UUUUU]
      bitmap: 4/65 pages [16KB], 4KB chunk

unused devices: <none>
dmesg gives me the following. What is the problem here?

Code: Select all

[69365.751315] ata2.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x0
[69365.754219] ata2.00: irq_stat 0x40000008
[69365.757105] ata2.00: failed command: READ FPDMA QUEUED
[69365.760214] ata2.00: cmd 60/00:20:f0:3b:cb/04:00:70:00:00/40 tag 4 ncq 524288 in
[69365.760218]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69365.765987] ata2.00: status: { DRDY ERR }
[69365.768872] ata2.00: error: { UNC }
[69365.783978] ata2.00: configured for UDMA/133
[69365.786809] ata2: EH complete
[69372.153900] ata2.00: exception Emask 0x0 SAct 0x1c7ffffc SErr 0x0 action 0x0
[69372.156767] ata2.00: irq_stat 0x40000008
[69372.159584] ata2.00: failed command: READ FPDMA QUEUED
[69372.162401] ata2.00: cmd 60/00:d0:f0:3b:cb/04:00:70:00:00/40 tag 26 ncq 524288 in
[69372.162404]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69372.168121] ata2.00: status: { DRDY ERR }
[69372.171054] ata2.00: error: { UNC }
[69372.186238] ata2.00: configured for UDMA/133
[69372.189221] ata2: EH complete
[69378.523344] ata2.00: exception Emask 0x0 SAct 0xfffffc SErr 0x0 action 0x0
[69378.526352] ata2.00: irq_stat 0x40000008
[69378.529272] ata2.00: failed command: READ FPDMA QUEUED
[69378.532280] ata2.00: cmd 60/00:10:f0:3b:cb/04:00:70:00:00/40 tag 2 ncq 524288 in
[69378.532283]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69378.538558] ata2.00: status: { DRDY ERR }
[69378.541699] ata2.00: error: { UNC }
[69378.557051] ata2.00: configured for UDMA/133
[69378.560120] ata2: EH complete
[69384.572481] ata2.00: exception Emask 0x0 SAct 0x3ffffc SErr 0x0 action 0x0
[69384.575980] ata2.00: irq_stat 0x40000008
[69384.579201] ata2.00: failed command: READ FPDMA QUEUED
[69384.582488] ata2.00: cmd 60/00:a8:f0:3b:cb/04:00:70:00:00/40 tag 21 ncq 524288 in
[69384.582491]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69384.589110] ata2.00: status: { DRDY ERR }
[69384.592457] ata2.00: error: { UNC }
[69384.608115] ata2.00: configured for UDMA/133
[69384.611608] ata2: EH complete
[69390.754466] ata2.00: exception Emask 0x0 SAct 0xfffff SErr 0x0 action 0x0
[69390.757891] ata2.00: irq_stat 0x40000008
[69390.761315] ata2.00: failed command: READ FPDMA QUEUED
[69390.764811] ata2.00: cmd 60/00:00:f0:3b:cb/04:00:70:00:00/40 tag 0 ncq 524288 in
[69390.764815]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69390.771836] ata2.00: status: { DRDY ERR }
[69390.775383] ata2.00: error: { UNC }
[69390.791146] ata2.00: configured for UDMA/133
[69390.794752] ata2: EH complete
[69396.991449] ata2.00: exception Emask 0x0 SAct 0xffffc SErr 0x0 action 0x0
[69396.995260] ata2.00: irq_stat 0x40000008
[69396.998986] ata2.00: failed command: READ FPDMA QUEUED
[69397.002574] ata2.00: cmd 60/00:98:f0:3b:cb/04:00:70:00:00/40 tag 19 ncq 524288 in
[69397.002578]          res 41/40:00:28:3c:cb/00:00:70:00:00/40 Emask 0x409 (media error) <F>
[69397.009711] ata2.00: status: { DRDY ERR }
[69397.013258] ata2.00: error: { UNC }
[69397.029096] ata2.00: configured for UDMA/133
[69397.032943] sd 1:0:0:0: [sde] Unhandled sense code
[69397.036721] sd 1:0:0:0: [sde]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[69397.040571] sd 1:0:0:0: [sde]  Sense Key : Medium Error [current] [descriptor]
[69397.044464] Descriptor sense data with sense descriptors (in hex):
[69397.048341]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[69397.052302]         70 cb 3c 28 
[69397.056150] sd 1:0:0:0: [sde]  Add. Sense: Unrecovered read error - auto reallocate failed
[69397.060104] sd 1:0:0:0: [sde] CDB: Read(10): 28 00 70 cb 3b f0 00 04 00 00
[69397.064104] end_request: I/O error, dev sde, sector 1892367400
[69397.067989] md/raid:md0: read error not correctable (sector 1890246816 on sde3).
[69397.071658] raid5: some error occurred in a active device:4 of md0.
[69397.075427] raid5: Keep the raid device active in degraded mode but set readonly.
[69397.079308] md/raid:md0: read error not correctable (sector 1890246824 on sde3).
[69397.084082] raid5: some error occurred in a active device:4 of md0.
Linux QNAP2 3.4.6 #1 SMP Fri Apr 26 01:53:35 CST 2013 x86_64 unknown

... now it seems to freeze up...

great.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Thu May 15, 2014 1:32 am
by Don
Changing these numbers will not cause errors; the rebuild will simply go as fast as it can. Other services might slow down, but it won't cause errors. It is possible that the new drive is bad.
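A way to check that suspicion, assuming smartctl is available on the firmware (it usually is on QNAP); the device name is only an example, so adjust it to the drive reported in dmesg:

Code: Select all

# overall SMART health verdict for the suspect drive
smartctl -H /dev/sde

# the attributes that matter most for a failing surface
smartctl -A /dev/sde | egrep -i 'Reallocated|Pending|Uncorrect'

# optionally start a long self-test and read the result later with: smartctl -a /dev/sde
smartctl -t long /dev/sde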

Re: [HOWTO] How to increase raid rebuild speed

Posted: Thu May 15, 2014 2:06 am
by doodley
Thanks for the answer.

I will replace the new 4 TB with a 2 TB (not the one I took out... because I suspect it remembers the serial number and then gets confused/does odd things).

So the log says sde is the faulty drive...

md0 : active raid5 sda3[5] sdb3[1] sde3[4] sdd3[3] sdc3[2]

... looking at the QNAP from the front... which one is sde?

OK, hdparm actually gives me the serial number... and I can check via the web interface which bay has which drive...

Great... so I have a faulty drive... (and it is not the new drive).

Maybe QNAP should ask the user to "health check" all sectors of all drives before the box goes operational...

That would be some kind of service.

Otherwise users have to do it manually.

Code: Select all

[/share/external/sdi1] # hdparm -i /dev/sde

/dev/sde:

 Model=SAMSUNG HD204UI                         , FwRev=1AQ10001, SerialNo=S2JGJD1ZA00149      
 Config={ Fixed }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=32767kB, MaxMultSect=16, MultSect=?0?
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4 
 DMA modes:  mdma0 mdma1 mdma2 
 UDMA modes: udma0 udma1 udma2 
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: unknown: 

 * signifies the current active mode
f***!

I made a backup prior to all this expansion stuff...

How does ext3 on RAID5 detect (not at all, I guess) and handle defective sectors?
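For what it is worth, it is the md layer, not ext3, that notices bad sectors, and it only does so when it actually reads them. A sketch of forcing a full read pass ("scrub") so problems surface before a rebuild; md0 is an example (check /proc/mdstat for your data array), and badblocks is only useful if the firmware ships it:

Code: Select all

# make md read every sector of the array; read errors on redundant arrays are fixed by rewriting
echo check > /sys/block/md0/md/sync_action
cat /proc/mdstat                      # the check shows up like a rebuild, with speed and ETA
cat /sys/block/md0/md/mismatch_cnt    # non-zero after the check means inconsistencies were found

# read-only surface scan of a single disk, if badblocks is available on the firmware
badblocks -sv /dev/sde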

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sat Jul 19, 2014 6:24 am
by Dannejaha
I have yet to read every single post in the thread, but I would like to point out that after you have increased the minimum speed, you may want to stop Download Station or any other app/program that is running against the drives. I stopped Download Station (which only had inactive/seeding old torrents), and the expansion speed increased from 16.5 MB/s to 21.5 MB/s, a 30% increase.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sat Jul 19, 2014 6:12 pm
by schumaku
Dannejaha wrote:..., but i would like to point out that after you have increased the minimum speed, you may want to stop Download Station or any other App/Program that is running against the drives.
Yes, absolutely correct. On the other hand, the default limits are set so that ongoing operations are not hurt too badly. Technically, the rebuild only brings the existing storage blocks back up to the higher level of redundancy; any newly written blocks are written to the complete RAID anyway, regardless of whether it is RAID1, RAID5, RAID6, or RAID10. Which raises the question of why one would want to change the defaults at all...
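If you do change them, it is worth noting the firmware's own values first so they can be restored afterwards; a small sketch (the file names under /tmp are just examples):

Code: Select all

# save the values the firmware booted with
cat /proc/sys/dev/raid/speed_limit_min > /tmp/raid_speed_min.orig
cat /proc/sys/dev/raid/speed_limit_max > /tmp/raid_speed_max.orig

# ... raise them for the rebuild, then put the originals back afterwards:
cat /tmp/raid_speed_min.orig > /proc/sys/dev/raid/speed_limit_min
cat /tmp/raid_speed_max.orig > /proc/sys/dev/raid/speed_limit_max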

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sat Jul 19, 2014 8:22 pm
by Dannejaha
I guess it depends on how long you want it to run. It seems to run very slowly from the beginning, and every bit of speed is welcome. But sure, if you need to read from the disks in the meantime, you may want to keep your apps running. Currently I get 25 MB/s and it is still ongoing: RAID6, 5x 3 TB disks, 90% of the space in use, expanding from 4 to 5 disks.

It is for home use in my case, so I can have it "down" for 1-2 days; a week of slow performance is not desirable.

[EDIT]
May I ask: mine has just passed 67% and the speed now seems to be running at 63 MB/s. Why is the speed changing, and why is it about 2-3 times faster now? All disks are of the exact same type; the first four were bought at one point and the fifth just recently.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Mon Oct 13, 2014 11:46 pm
by buster626
I am running a QNAP TS-653 and migrating from RAID5 to RAID6. I have tried everything on this forum and it is still running very slowly.

Code: Select all

md2 : active raid6 sdd3[3] sdc3[0] sda3[2] sdb3[1]
      11701135232 blocks super 1.0 level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
      [>....................]  reshape = 3.9% (233299904/5850567616) finish=13971.0min speed=6700K/sec

It never peaks above 7000K/sec.

I updated the min and max values... hdparm shows 100 MB/sec+.

Cache size is 4096... CPU is at 1%.
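A quick diagnostic sketch for this situation: check whether the per-array knobs actually picked up the new values (md2 is taken from the output above):

Code: Select all

# per-array limits (KB/s); these can differ from the system-wide /proc/sys/dev/raid values
cat /sys/block/md2/md/sync_speed_min
cat /sys/block/md2/md/sync_speed_max

# RAID5/6 stripe cache; a small value can bottleneck a reshape
cat /sys/block/md2/md/stripe_cache_size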

Re: [HOWTO] How to increase raid rebuild speed

Posted: Thu Dec 04, 2014 1:53 pm
by unixvrules
I have a TC-EC1279U-RP. Modifying speed_limit_max and speed_limit_min wasn't effective. Yes, the logs show the max speed being propagated to /sys/block/md1/md/sync_speed_max, but not the min value.

Code: Select all

echo 500000 > /proc/sys/dev/raid/speed_limit_max
echo 200000 > /proc/sys/dev/raid/speed_limit_min

egrep speed /proc/mdstat
[=>...................] recovery = 7.6% (299884224/3897063616) finish=11985.3min speed=5001K/sec

Perform cmd "/bin/echo 500000 > /sys/block/md1/md/sync_speed_max 2>>/dev/null" OK, cmd_rsp=0, reason code:0.

But the min value was not propagated, and mdstat stayed capped at ~5000K/sec.

THIS was magic for me....

Code: Select all

echo 50000 > /sys/block/md1/md/sync_speed_min  
cat /sys/block/md1/md/sync_speed_min           
50000 (local)
egrep speed /proc/mdstat                


[=>...................] recovery = 8.2% (320790336/3897063616) finish=1165.0min speed=51160K/sec


Something to make a note of in the unofficial docs?
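A hedged generalisation of the same trick, for when the /proc/sys value does not reach the array: write the per-array knob directly for every md device (50000 KB/s is just an example value; it is harmless for arrays that are not resyncing):

Code: Select all

for d in /sys/block/md*/md; do
    echo 50000 > "$d/sync_speed_min"   # per-array floor, overrides the system-wide value
done
grep speed /proc/mdstat                # confirm the rebuild picked up the new floor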

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sat Dec 20, 2014 7:33 pm
by fantomas
As I posted some months ago, rebuilding of RAID5/RAID6 can also be sped up by tuning a different parameter:

http://forum.qnap.com/viewtopic.php?f=1 ... d5#p323417
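The linked post is abbreviated here, but if the parameter in question is the RAID5/6 stripe cache (which would match the "Cache size is 4096" mentioned above), the idea looks roughly like this; md2 and the value 8192 are only examples, and the cache costs roughly PAGE_SIZE x stripe_cache_size x number of member disks of RAM:

Code: Select all

cat /sys/block/md2/md/stripe_cache_size           # current value
echo 8192 > /sys/block/md2/md/stripe_cache_size   # a larger stripe cache can speed up RAID5/6 resync and reshape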

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Mar 03, 2015 12:30 pm
by tom4moore
This has helped my synchronization like nothing else. Early in the process I was seeing speeds around 169000K/sec, and now it is at 76% complete and still maintaining 102000K/sec. CPU usage has been about 10-25% on my TS-451 building a RAID5 array with 3x HGST HDN724040ALE640 4 TB 7200 RPM drives. Great tip. :) Thanks

fantomas, is your tip to adjust the stripe size, or am I misunderstanding and it should be the cache size?

Code: Select all

[~] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid5 sda3[0] sdc3[2] sdb3[1]
      7794127232 blocks super 1.0 level 5, 64k chunk, algorithm 2 [3/3] [UUU]
      [===============>.....]  resync = 77.2% (3008933904/3897063616) finish=144.8min speed=102201K/sec

md256 : active raid1 sdc2[2](S) sdb2[1] sda2[0]
      530112 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md13 : active raid1 sda4[0] sdc4[2] sdb4[1]
      458880 blocks super 1.0 [24/3] [UUU_____________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md9 : active raid1 sda1[0] sdc1[2] sdb1[1]
      530048 blocks super 1.0 [24/3] [UUU_____________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>