
Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Dec 20, 2016 11:27 pm
by avvidme
Well, it finally finished. TS-859 Raid 5 to 6 migration plus 1 extra drive (expansion) = 17 days.
All services turned off.

And yet my throughput performance is now back to normal.

Unfreakin believable.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sun Feb 19, 2017 7:54 am
by solomonshv
i can't thank you enough for finding this. I am upgrading my TS-451 from WD 5TB Red drives (that are coming out of warranty) to 8TB Red Pro drives. I was swapping the drives one at a time. I set min to 150,000 and max to 300,000. This increased the rebuild speed THIRTY FOLD!!! As in, 30 times faster. And the CPU load only hovers between 40 and 50%.

i'm blown away. Not by how much this helped or how easy it was, but by how f**king stupid QNAP are for this oversight. The default minimum on my 451 was 5000 (5 MB/s) and that's pretty much the speed it was moving at. Who thought this was a good idea??? At that rate, if a drive failed, there is a pretty good chance a second drive would fall out of the RAID in the WEEKS it would take to rebuild.
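For reference, the change described here boils down to two writes under /proc (a sketch; run as root on the NAS, values are in KB/s, and 150000/300000 are the numbers from this post, not defaults):

```shell
# Raise the md rebuild speed floor and ceiling (values in KB/s).
# 150000/300000 are the values reported in this post; tune for your hardware.
echo 150000 > /proc/sys/dev/raid/speed_limit_min
echo 300000 > /proc/sys/dev/raid/speed_limit_max

# Verify the limits took effect.
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max
```

Note that these values reset on reboot, so they need re-applying if the rebuild spans a restart.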

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sun Feb 19, 2017 9:55 am
by P3R
solomonshv wrote:I am upgrading my TS 451 from WD 5TB Red drives (that are coming out of warranty)...
Well, everything comes out of warranty eventually, but doesn't a 5 TB WD Red have at least 5 months left on its 3-year warranty, even if you were among the very first customers to receive one?

When did you buy yours?
...at this rate, if a drive failed, there is a pretty good chance a second drive will fall out of RAID in the WEEKS it would take to rebuild
I disagree about it being "...a pretty good chance...".

Faster is of course better if your NAS has nothing better to do than devote almost all its resources to a rebuild, but I don't think that hours, days or even weeks for a rebuild will affect the risk of an additional disk failure more than very marginally. The big risk with a rebuild is that one of the other disks already has bad sectors, and the few per cent of additional exposure that even a multi-week rebuild adds is only those few per cent. I don't see how that could be the same as "...a pretty good chance...".
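This point can be put in rough numbers. A back-of-envelope sketch, assuming an illustrative 3% annualized per-disk failure rate and 3 surviving disks (both figures are made up for the example, not measured):

```shell
# Probability that at least one surviving disk fails during the rebuild
# window, for a 1-day vs a 14-day rebuild. Illustrative numbers only.
awk 'BEGIN {
    afr = 0.03    # assumed annualized failure rate per disk
    disks = 3     # surviving disks in the array
    for (days = 1; days <= 14; days += 13) {
        p = 1 - (1 - afr) ^ (disks * days / 365)
        printf "%2d-day rebuild: ~%.2f%% chance of a second failure\n", days, p * 100
    }
}'
```

Even the two-week rebuild stays well under one per cent here, which is why the extra exposure is marginal; the dominant risk remains pre-existing bad sectors surfacing during the read pass.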

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sun Mar 19, 2017 3:26 pm
by rafarataneneces
avvidme wrote:Well, it finally finished. TS-859 Raid 5 to 6 migration plus 1 extra drive (expansion) = 17 days.
All services turned off.

And yet my throughput performance is now back to normal.

Unfreakin believable.


Did this happen with the default values?

Code: Select all

/proc/sys/dev/raid/speed_limit_max
200000
/proc/sys/dev/raid/speed_limit_min
1000

Re: [HOWTO] How to increase raid rebuild speed

Posted: Sun Mar 19, 2017 5:17 pm
by Spider99
This trick only works on a rebuild, not a reshape, so it will not affect a change from RAID 5 to RAID 6.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Jun 13, 2017 6:40 pm
by oyvindo
This tip still works great on my TS-543Mini, 8 years after the first post. :D

Re: [HOWTO] How to increase raid rebuild speed

Posted: Wed Jun 14, 2017 4:55 pm
by JohnSmith755
I'm glad I found this important information, thanks for posting!

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Sep 26, 2017 4:44 am
by Amedee
Don wrote:I found this article on the web. Use it at your own risk, as I have not tested it and make no guarantees. If you do try it, please post your results here so others may benefit.

Increase the speed of Linux Software RAID reconstruction

If you are in a situation where you sit in front of the console (or on a remote ssh connection) waiting for a Linux software RAID to finish rebuilding (either you added a new drive, or you replaced a failed one, etc.) then you might be frustrated by how slow this process is running. You run cat on /proc/mdstat repeatedly (you should really use watch in this case ;)), and it seems to never finish… Obviously there is a logical reason for this ‘slowness‘, and on a production system you should leave it running with the defaults. But in case you want to speed this process up, here is how you can do it. This will place a much higher load on the system, so you should use it with care.

To see your Linux kernel speed limits imposed on the RAID reconstruction use:

Code: Select all

cat /proc/sys/dev/raid/speed_limit_max
200000
cat /proc/sys/dev/raid/speed_limit_min
1000

In the system logs you can see something similar to:

Code: Select all

md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.

This means that the minimum guaranteed speed of the rebuild of the array is approx 1MB/s. The actual speed will be higher and will depend on the system load and what other processes are running at that time.
In case you want to increase this minimum speed you need to enter a higher value in speed_limit_min. For example to set this to approx 50 megabytes per second as minimum use:

Code: Select all

echo 50000 >/proc/sys/dev/raid/speed_limit_min

The results are instant… you can return to the watch window to see it running, and hope that this will finish a little faster (this will really depend on the system you are running, the HDDs, controllers, etc.):

Code: Select all

watch cat /proc/mdstat

There is no watch program on the NAS so that part won't work. You will have to play with the numbers until you get one that works for you.

Article is located here --> http://www.ducea.com/2006/06/25/increas ... struction/
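Since the NAS has no `watch`, a small POSIX-sh loop can stand in for it locally on the box (a sketch; `poorwatch` is a made-up name, not a QNAP utility):

```shell
# Re-run a command COUNT times with INTERVAL seconds between runs.
# Usage: poorwatch COUNT INTERVAL COMMAND [ARGS...]
poorwatch() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        sleep "$interval"
        i=$((i + 1))
    done
}

# On the NAS, something like:
#   poorwatch 9999 2 cat /proc/mdstat
```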


You can still use watch. I suppose that most of y'all are Windows users, so first install Linux-on-Windows, which gives you an Ubuntu inside Windows.
For those of you who are already on Linux, you're almost there.

First you need to install sshpass (in Linux, or in Linux-on-Windows):

Code: Select all

sudo apt install sshpass


Then you type:

Code: Select all

# Export first, so $NAS_IP is already set when the shell expands it below.
export SSHPASS=<YOUR_SUPER_SECRET_PASSWORD>
export NAS_IP=<YOUR_NAS_IP_ADDRESS>
watch sshpass -e ssh -o PubkeyAuthentication=no -o StrictHostKeyChecking=no admin@$NAS_IP "cat /proc/mdstat"


Output:

Code: Select all

Every 2,0s: sshpass -e ssh -o PubkeyAuthentication=no -o StrictHostKeyChecking=no admin@192.168.178.34 cat /proc/mdstat

Personalities : [raid1] [linear] [raid0] [raid10] [raid6] [raid5] [raid4]
md0 : active raid6 sda3[0] sdd3[3] sdc3[2] sdb3[1]
       3903891072 blocks super 1.0 level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
       [==========>..........]  resync = 53.4% (1044127808/1951945536) finish=923.6min speed=16380K/sec
      
md4 : active raid1 sdd2[4](S) sdc2[3](S) sdb2[2] sda2[0]
       530128 blocks super 1.0 [2/2] [UU]
      
md13 : active raid1 sda4[0] sdc4[5] sdd4[4] sdb4[3]
       458880 blocks super 1.0 [4/4] [UUUU]
       bitmap: 0/8 pages [0KB], 32KB chunk

md9 : active raid1 sda1[0] sdc1[6] sdd1[5] sdb1[4]
       530112 blocks super 1.0 [4/4] [UUUU]
       bitmap: 1/9 pages [4KB], 32KB chunk

unused devices: <none>
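The `finish=` field in that output is in minutes; a quick pipeline converts it to something easier to read (the sample line is copied from the output above):

```shell
# Extract the finish estimate from an mdstat resync line and print it in hours.
line='[==========>..........]  resync = 53.4% (1044127808/1951945536) finish=923.6min speed=16380K/sec'
echo "$line" \
    | sed -n 's/.*finish=\([0-9.]*\)min.*/\1/p' \
    | awk '{ printf "%.1f hours\n", $1 / 60 }'
```

For the 923.6-minute estimate above this comes out at about 15.4 hours.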

SECURITY CONSIDERATIONS
First and foremost, users of sshpass should realize that ssh's insistence on only getting the password interactively is not without reason. It is close to impossible to securely store the password, and users of sshpass should consider whether ssh's public key authentication provides the same end-user experience, while involving less hassle and being more secure.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Mon Oct 16, 2017 3:34 pm
by Oscar_I
After reading this thread, I still haven't managed to get my RAID rebuild speed up to an acceptable level.

The problem is that the rebuild speed is only between 1000-1800K/sec (with a time to finish of about 22 days!!??)


I had a QNAP TS-453 Pro configured as RAID 10 with four 4 TB WD Red disks and I am now in the process of replacing them with four 8 TB disks (currently at ~93% disk usage).
The first two disks went fairly fast (although it took a few days for each disk), but now, after I inserted the third, the speed seems to be even slower than before.

All disks are in "good" shape according to the internal disk monitor.
I have upgraded the internal firmware about a month ago.
The NAS is not used for anything else than backup, and I have paused the backup during the raid-rebuild.

I have tried different values for /proc/sys/dev/raid/speed_limit_min, but it doesn't seem to have much impact.
When I run "top", the state for md1_resync is shown as DWN, with almost no CPU usage.

Does the number of files impact the rebuild speed? I have a fair amount of rather small files (~32k each); could that be a reason?
Does anyone have any clue what might be wrong, or any ideas of what I can try to increase the rebuild speed?



Thanks
Oscar

Re: [HOWTO] How to increase raid rebuild speed

Posted: Mon Oct 16, 2017 10:46 pm
by dolbyman
Maybe the RAID-pair drive has issues. A RAID 10 rebuild is a simple copy process; no parity calculations are needed.

And you said the first two disks went OK speed-wise,
so be patient and hope the other drive doesn't bite the dust.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Oct 17, 2017 10:54 pm
by Oscar_I
But wouldn't that show up in the disk monitor?
All disks are in "good shape" according to the QNAP web interface.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Oct 17, 2017 11:30 pm
by P3R
Oscar_I wrote:But wouldn't that be shown in the disk monitor ?
Often but not always.

Re: [HOWTO] How to increase raid rebuild speed

Posted: Wed Oct 25, 2017 3:21 pm
by Oscar_I
P3R wrote:
Oscar_I wrote:But wouldn't that be shown in the disk monitor ?
Often but not always.

Any suggestions on where I can get more info about the status (and possibly determine which drive it is that has issues)?

When I looked at the output from

Code: Select all

>top


it seems the rebuild process is pretty much dormant...

Code: Select all

Mem: 1738052K used, 192488K free, 203784K shrd, 217152K buff, 227064K cached
CPU:  7.6% usr 25.7% sys  0.0% nic 37.7% idle 28.2% io  0.0% irq  0.6% sirq
Load average: 20.59 19.58 16.43 3/569 29681
PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
...
20283     2 admin    DWN      0  0.0   3  0.1 [md1_resync]

it seems the "DWN" means the following (found on a different forum):

Code: Select all

D - uninterruptible sleep (usually IO)
W - paging (not valid since the 2.6.xx kernel)
N - low-priority (nice to other users)
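The first letter can also be read straight from /proc without `top` (a sketch, Linux-specific; the `md1_resync` name comes from the top output above):

```shell
# Print the state letter (3rd field of /proc/<pid>/stat) for a given PID.
# D = uninterruptible sleep, S = sleeping, R = running, etc.
proc_state() {
    awk '{ print $3 }' "/proc/$1/stat"
}

# On the NAS, something like (the pgrep usage is an assumption):
#   proc_state "$(pgrep -f md1_resync)"
```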

Is this a sign of a faulty hard drive?
Or can I force the QNAP TS-453 Pro to increase the priority?
(The NAS is currently not being used, due to the low disk space.)

Thanks!

Re: [HOWTO] How to increase raid rebuild speed

Posted: Wed Oct 25, 2017 4:22 pm
by P3R
Oscar_I wrote:Any suggestion of where I can get more info about the status (and possibly determine what drive that is that has issues)
I would begin with comparing the raw values of the detailed SMART data (Storage Manager, Storage, Disks/VJBOD, [select the disk in question], Disk Health, SMART Information) between all the disks to see if something is significantly different with this disk.
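From the shell, the same comparison can be scripted. A sketch assuming `smartctl` (smartmontools) is available on the NAS; the attribute names are the standard ATA ones and the device paths are examples:

```shell
# Keep only the SMART attributes that most often flag a dying disk.
smart_filter() {
    grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count'
}

# On the NAS (device names are examples; adjust to your bays):
#   for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
#       echo "== $d =="
#       smartctl -A "$d" | smart_filter
#   done
```

Any disk whose raw Reallocated or Pending counts stand out from its siblings is the prime suspect.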

Other than that, unfortunately the only suggestion I have is to run the disk manufacturer's diagnostic software on the disk, but that would have to be done in a different computer, so it isn't ideal right now, in the middle of a storage expansion... :cry:

I would probably contact the proper channel for a possible warranty replacement of the disk (reseller or disk manufacturer depending on where in the world you're located) and ask for their advice.

Are these 8 TB disks also WD Red? If not, what brand and model?

Re: [HOWTO] How to increase raid rebuild speed

Posted: Tue Mar 06, 2018 6:13 am
by gandalf67
Hello,

Just purchased a TS-431P, updated it to the latest firmware, installed 4 x 8 TB Seagate drives and configured a RAID 5 volume. The initial sync is awfully slow, and the speed does not change when I change the sync priority:

Code: Select all

cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md1 : active raid5 sdd3[3] sdc3[2] sdb3[1] sda3[0]
23412214272 blocks super 1.0 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
[>....................] resync = 2.1% (166559964/7804071424) finish=103654.7min speed=1228K/sec

That's 71 days!
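For the record, converting mdstat's minute figure confirms the estimate (it is actually nearer 72 days):

```shell
# mdstat's finish= field is in minutes; 1440 minutes per day.
awk 'BEGIN { printf "%.1f days\n", 103654.7 / 1440 }'
```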

I tried to increase the min and max speed as described here, with no result. The CPU is almost idle, with the md1_raid5 process using between 0.5 and 3% CPU.

The web UI is slow too; every time I commit a change it takes a long time, or even times out/fails, e.g. when trying to change the admin password or trying to reboot. I think I'm going to try a hard reset.

Edit: I finally managed to reboot the NAS and it restarted the rebuild, this time at between 15 and 30 Mbytes/s, which cut the estimate from more or less 71 days to 71 hours. Much better, even if still a bit slow; but on my previous NAS (a Lenovo ix4-300D) it took 2 days for a 4x4TB RAID 5, so this seems consistent.

Thanks.