After the motherboard of my TS-870U-RP suddenly died, I bought a new TS-877XU and moved all 8 hard disks over: all the data was successfully restored (such a relief... for a short time).
Unfortunately, after a few days, the disks (which were rather old) began to fail one after another, and I am now in a very bad situation:
My main volume was a RAID6 using the disks in enclosures 2-3-5-6-7-8 (6 disks), so, theoretically, I could lose 2 disks, and that is exactly what happened:
First, enclosure 5 failed. I replaced the disk with a new one and it started to rebuild. Meanwhile, I configured enclosure 4 as a Spare Disk (a big mistake, I think, as I will explain later...).
During the rebuild, enclosure 7 failed as well (bad sectors) and everything hung (no access to the web interface). After waiting two days, I decided to perform a hard reboot of the NAS: still hung.
After replacing the disk in enclosure 7, the web interface came back to life, but the RAID (/dev/md1) and the associated volume had disappeared.
The web interface just says "RAID device is inactive", and in Storage -> Disks, RAID Group 1 looks wrong: it shows enclosures 2-4-5-6-8 in blue, and Capacity, RAID group name, RAID type and disk members are all blank.
On the CLI, md_checker gave me another strange result:
Code:
RAID metadata found!
UUID: 73d3c616:09ac1b87:4f9d29a4:fa00100d
Level: raid6
Devices: 7
Name: md1
Chunk Size: 64K
md Version: 1.0
Creation Time: Apr 7 18:34:39 2014
Status: OFFLINE
===============================================================================
Disk | Device | # | Status | Last Update Time | Events | Array State
===============================================================================
   5   /dev/sdg3    0   Rebuild   Dec 24 14:55:44 2019   71778   AAAA.AA
   2   /dev/sda3    1   Active    Dec 24 14:55:44 2019   71778   AAAA.AA
   8   /dev/sde3    2   Active    Dec 24 14:55:44 2019   71778   AAAA.AA
   3   /dev/sdd3    3   Active    Dec 24 14:55:44 2019   71778   AAAA.AA
   4   /dev/sdc3    3   Active    Dec 20 16:05:31 2019   23170   AAAAAAA
 --------------     4   Missing   --------------------------------------
   6   /dev/sdf3    5   Active    Dec 24 14:55:44 2019   71778   AAAA.AA
 --------------     6   Missing   --------------------------------------
===============================================================================
WARNING: Duplicate device detected for #(3)!
The "duplicate device detected" is a mystery to me...
Now, if I try to re-assemble the RAID with mdadm, giving it the 4 disks that should still be in good condition (enclosures 2-3-6-8), it fails:
Code:
[~] # mdadm --assemble /dev/md1 /dev/sda3 /dev/sdd3 /dev/sdf3 /dev/sde3
mdadm: failed to get exclusive lock on mapfile - continue anyway...
mdadm: /dev/md1 assembled from 4 drives - not enough to start the array.
If I add the spare disk sdc3 to the command, the error is exactly the same:
Code:
[~] # mdadm --assemble /dev/md1 /dev/sda3 /dev/sdd3 /dev/sdf3 /dev/sde3 /dev/sdc3
mdadm: failed to get exclusive lock on mapfile - continue anyway...
mdadm: /dev/md1 assembled from 4 drives - not enough to start the array.
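One thing I notice: md_checker reports Devices: 7, and the Array State strings have 7 slots, so mdadm apparently considers this a 7-device RAID6, which cannot start with fewer than 5 members; that would explain why 4 drives are "not enough". If that is right, a forced assembly that also includes the half-rebuilt sdg3 (same Events count as the others) might reach the 5-member minimum. I have not dared to run it yet, and I don't know whether mdadm will accept a member that was still rebuilding:

Code:
# NOT RUN YET - forced, degraded assembly including the half-rebuilt
# enclosure-5 disk (sdg3) to reach the 5 members a 7-device RAID6 needs.
mdadm --assemble --force --run /dev/md1 \
      /dev/sdg3 /dev/sda3 /dev/sde3 /dev/sdd3 /dev/sdf3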
So my questions are:
- Is it possible to remove the spare disk from the RAID metadata found by md_checker, so that it considers only 6 devices? (Then I could possibly try --assemble again.)
- How can I be sure of the order of the /dev/sdX3 devices in the mdadm --create command? (See the sketch after this list for what I have in mind.)
- How could things have turned so bad...?
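For the record, here is the last-resort re-create I have in mind; I have NOT run it, because a wrong device order, chunk size or device count here destroys the data for good. The geometry is copied from the md_checker output above (RAID6, 7 devices, 64K chunks, metadata 1.0), and the device order is only my guess from the slot (#) column, with "missing" for the two absent slots:

Code:
# LAST RESORT, not executed: --create rewrites the superblocks, so wrong
# parameters mean permanent data loss. --assume-clean avoids an immediate
# resync; order guessed from md_checker slots 0,1,2,3,(4),5,(6).
mdadm --create /dev/md1 --assume-clean \
      --level=6 --raid-devices=7 --chunk=64 --metadata=1.0 \
      /dev/sdg3 /dev/sda3 /dev/sde3 /dev/sdd3 missing /dev/sdf3 missing

Does the slot order reported by md_checker really match the order expected by --create, and would mounting read-only afterwards be enough to check the result safely?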
Thanks for your valuable help!