Rebuilding stuck at 70.0%

E-Vix
New here
Posts: 4
Joined: Mon Oct 05, 2015 3:09 pm

Rebuilding stuck at 70.0%

Post by E-Vix »

I have a QNAP TS-419p with 4x 3TB disks in RAID 5.
All was working well until, a couple of months ago, disk 3 started giving warnings about bad sectors. I was planning to replace it when disk 1 failed entirely (that disk is completely dead), and before I could get disk 1 replaced, disk 3 failed as well due to too many bad sectors; luckily the array was automatically put into read-only mode. After giving disk 3 a complete scan it was healthy enough to get the array back online. At this point it seems like no data was lost.
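(For what it's worth, that kind of full scan can also be done from the command line with smartctl, assuming it's available on the box and disk 3 shows up as /dev/sdc here; just a rough sketch:)

Code: Select all

[~] # smartctl -t long /dev/sdc       # start an extended (full-surface) self-test on the suspect disk
[~] # smartctl -l selftest /dev/sdc   # check the self-test log once it has finished
[~] # smartctl -A /dev/sdc            # watch Reallocated_Sector_Ct and Current_Pending_Sector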
I then realized I had a spare 3TB disk in an old unused NAS, so I used that disk to replace the failed disk 1. I tried doing it the QNAP way (take the old disk out, wait a couple of minutes until the NAS beeps, put the replacement disk in), but it would not start rebuilding the array. After trying this numerous times (removing all created partitions from the disk in between to make it clean again) I realized it wasn't going to work. I then tried rebuilding from the terminal, and after numerous failed attempts I finally got that working.
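(Roughly what the terminal rebuild looked like; treat this as a sketch rather than the exact sequence I ran, with device names taken from the mdadm output further down:)

Code: Select all

[~] # mdadm --zero-superblock /dev/sda3   # wipe any stale RAID metadata from the replacement disk's data partition
[~] # mdadm /dev/md0 --add /dev/sda3      # add it back into the degraded array
[~] # cat /proc/mdstat                    # recovery should start showing up here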
Rebuilding was slow, averaging 3000 KB/s, and after a week the NAS got really sluggish; 1 or 2 days later it spontaneously rebooted, killing the rebuild...
Disk 1 now had a read/write error. I checked the disk, which turned out to be healthy, put it back in the NAS and restarted the rebuild.
After about a week the rebuild reached 70.0% and then slowed down. 4 days later it was still at 70.0%. Seeing no progress at all, I aborted, rebooted the NAS (it was terribly laggy in the terminal) and restarted the rebuild once again.
Now, a week later, it has reached 70.0% again and is stuck at exactly the same spot.
Does anybody have an idea what is going on and how I might be able to fix this? It's stuck at exactly the same place (not just the percentage, but the byte/block count shown after it). The speed shown is meaningless: even though it claims 1 KB/s, it has been at the same block for a few days now.
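(In case it helps with the diagnosis, this is roughly how I've been watching it, assuming the firmware exposes the standard md sysfs files:)

Code: Select all

[~] # cat /sys/block/md0/md/sync_completed   # sectors done / total; stops moving when the rebuild stalls
[~] # dmesg | tail -n 30                     # look for ATA/read errors from the remaining disks
[~] # smartctl -A /dev/sdb                   # repeat for sdc and sdd; watch Current_Pending_Sector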

The easy answer would be "Rebuild the NAS and restore your data from a backup", but I don't mind losing the data. The data is there and it is accessible; I could get an extra disk and move it all there, but right now I just want to get the array into a healthy state again. Learning mdadm along the way is worth more to me than the data on the disks. Getting it working with all data intact is the ultimate goal; if that proves impossible, then I'll rebuild it.

Code: Select all

[~] # cat /proc/mdstat
Personalities : [raid1] [linear] [raid0] [raid10] [raid6] [raid5] [raid4]
md0 : active raid5 sda3[4] sdb3[1] sdd3[3] sdc3[2]
                 8786092608 blocks super 1.0 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
                 [==============>......]  recovery = 70.0% (2051751688/2928697536) finish=8262465.3min speed=1K/sec

md4 : active raid1 sdd2[2] sdb2[0]
                 530128 blocks super 1.0 [2/2] [UU]

md13 : active raid1 sda4[3] sdb4[0] sdd4[2] sdc4[1]
                 458880 blocks [4/4] [UUUU]
                 bitmap: 0/57 pages [0KB], 4KB chunk

md9 : active raid1 sda1[0] sdb1[1] sdd1[3] sdc1[2]
                 530048 blocks [4/4] [UUUU]
                 bitmap: 4/65 pages [16KB], 4KB chunk

unused devices: <none>
[~] #

Code: Select all

[~] # mdadm --detail /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Fri Mar  6 22:15:02 2020
     Raid Level : raid5
     Array Size : 8786092608 (8379.07 GiB 8996.96 GB)
  Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Sep 13 18:48:00 2021
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 70% complete

           Name : 0
           UUID : e1cf6884:9a15f175:c3587901:63ddd9fc
         Events : 3040525

    Number   Major   Minor   RaidDevice State
       4       8        3        0      spare rebuilding   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
[~] #
holger_kuehn
Easy as a breeze
Posts: 413
Joined: Sun Oct 20, 2013 11:45 pm
Location: Premnitz, Germany

Re: Rebuilding stuck at 70.0%

Post by holger_kuehn »

Usually this points to a problem reading from one of the remaining disks. There might be a problem with a single sector on one of them; if that's the case you will not be able to rebuild. This is what RAID scrubbing is for. Did it run beforehand on your RAID 5?

If it's stuck there for days, you could try a reboot; the rebuild should continue afterwards. Otherwise I'd suspect a complete rebuild is in order.
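If the firmware exposes the standard md sysfs interface, a scrub can also be triggered by hand on a healthy array; a rough sketch, not QNAP-specific advice:

Code: Select all

[~] # echo check > /sys/block/md0/md/sync_action   # read-only scrub: reads every block and counts mismatches
[~] # cat /proc/mdstat                             # progress shows up as "check = x%"
[~] # cat /sys/block/md0/md/mismatch_cnt           # non-zero afterwards means inconsistent stripes were found

It's a preventive measure to run while the array is still healthy, not something that helps mid-rebuild on a degraded array.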
NAS (production): TS-1635AX FW: QTS 5.1.4.2596 build 20231128
NAS (backup): TS-1635AX FW: QTS 5.1.4.2596 build 20231128
QTS (SSD): [RAID-1] 2 x 2TB Samsung Evo 860 M.2-Sata
Data (QTier): [RAID-6] 4 x 4TB Samsung 870 QVO Sata
Data (HDD): [RAID-6] 7 x 18TB Exos
RAM: 8 GB (QNAP shipped)
UPS: CyberPower CP900EPFCLCD
BACKUP: 10x4TB WD Red using a USB 3.0 Dock
Usage: SMB with rclone (encrypted)

NAS: TS-873U-RP FW: QTS 5.1.4.2596 build 20231128
Data (SSD): [RAID-10] 4 x 1TB Samsung Evo 860 Sata
RAM: 8 GB (QNAP shipped)
UPS: CyberPower PR2200ELCDRT2U
BACKUP: 4TB Synology DS214 FW: DSM 7.0.41890
Usage: SMB, Backup Domain Controller
dolbyman
Guru
Posts: 35024
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Rebuilding stuck at 70.0%

Post by dolbyman »

Old cat1 units do not support scrubbing afaik
