A little help needed for the new guy

Questions about SNMP, Power, System, Logs, disk, & RAID.
BlueWave
New here
Posts: 6
Joined: Thu Jul 27, 2017 2:39 am

A little help needed for the new guy

Post by BlueWave »

Hello,

I have a TVS-671 running QTS 5.0.1.2145 Build 20220903 with six Seagate SATA drives in a RAID 5 array. I'm having problems getting my RAID back online and restoring access to almost 40TB of data.

Short story, shorter: I wanted to set up a Cloud File Mount with Google Drive, but since I had allocated 100% of my storage pool to the RAID 5 volume, it said I had no available storage. Knowing I had about 8TB free in my only active pool, I thought I could shrink it by 1GB and thus let the Cloud File app mount my Google Drive... Obviously I'm here because things didn't go well.

The process (resizing the volume and converting it from thick to thin) appeared to complete without error. However, that evening when I went to watch Plex I realized it wasn't working. I logged into the QNAP and Plex appeared to be frozen; when restarting the app didn't fix the issue, I figured I'd restart the NAS, wait 5-10 minutes, and be back up and running. The NAS did restart, except on restart it got stuck on the "Booting system please wait" screen for over an hour.

I tried the 10-second reset trick with a paperclip and was able to boot the NAS only with the drives removed (I never pulled them all the way out, so I wouldn't forget their order). After booting I was careful never to initialize the system, knowing that would erase everything. By getting back into the shell and reinserting the drives one at a time, I was able to run some commands. Over the past 15+ hours of trying to fix this (beginner skill level, so lots of Google searches) I followed the flash reset procedure using a spare USB drive, and after that I was able to boot into the dashboard, but Storage & Snapshots reported an error in the storage pool.



Based on the info I've seen others ask for and post in similar threads, I hope this helps:

cat /proc/mdstat...

Code: Select all

[~] # cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md1 : active raid5 sda3[0] sdg3[5] sdf3[4] sde3[3] sdd3[2] sdc3[1]
      48782416000 blocks super 1.0 level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]
      bitmap: 0/73 pages [0KB], 65536KB chunk

md322 : active raid1 sdg5[6](S) sdf5[5](S) sde5[4](S) sdd5[3](S) sdc5[2] sda5[0]
      7235136 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md256 : active raid1 sdg2[6](S) sdf2[5](S) sde2[4](S) sdd2[3](S) sdc2[2] sda2[0]
      530112 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md13 : active raid1 sdg4[24] sdf4[4] sde4[3] sdd4[2] sdc4[1] sda4[25]
      458880 blocks super 1.0 [24/6] [UUUUUU__________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md9 : active raid1 sdg1[24] sdf1[4] sde1[3] sdd1[2] sdc1[1] sda1[25]
      530048 blocks super 1.0 [24/6] [UUUUUU__________________]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md_checker...

Code: Select all

[~] # md_checker                                                                                                   

Welcome to MD superblock checker (v2.0) - have a nice day~

Scanning system...


RAID metadata found!
UUID:		cd633af7:e155912f:991eaa92:7860bbb5
Level:		raid5
Devices:	6
Name:		md1
Chunk Size:	64K
md Version:	1.0
Creation Time:	Sep 24 00:08:54 2022
Status:         ONLINE (md1) [UUUUUU]
===============================================================================================
 Enclosure | Port | Block Dev Name | # | Status |   Last Update Time   | Events | Array State
===============================================================================================
 NAS_HOST       1        /dev/sda3   0   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
 NAS_HOST       2        /dev/sdc3   1   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
 NAS_HOST       3        /dev/sdd3   2   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
 NAS_HOST       4        /dev/sde3   3   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
 NAS_HOST       5        /dev/sdf3   4   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
 NAS_HOST       6        /dev/sdg3   5   Active   Sep 26 21:56:52 2022        6   AAAAAA                   
===============================================================================================

mdadm --detail /dev/md1....

Code: Select all

[~] # mdadm --detail /dev/md1
/dev/md1:
        Version : 1.0
  Creation Time : Sat Sep 24 00:08:54 2022
     Raid Level : raid5
     Array Size : 48782416000 (46522.54 GiB 49953.19 GB)
  Used Dev Size : 9756483200 (9304.51 GiB 9990.64 GB)
   Raid Devices : 6
  Total Devices : 6
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Mon Sep 26 22:02:50 2022
          State : clean 
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : 1
           UUID : cd633af7:e155912f:991eaa92:7860bbb5
         Events : 6

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       35        1      active sync   /dev/sdc3
       2       8       51        2      active sync   /dev/sdd3
       3       8       67        3      active sync   /dev/sde3
       4       8       83        4      active sync   /dev/sdf3
       5       8       99        5      active sync   /dev/sdg3

df -aTh...

Code: Select all

[~] # df -aTh
Filesystem           Type            Size      Used Available Use% Mounted on
none                 tmpfs         400.0M    291.4M    108.6M  73% /
devtmpfs             devtmpfs        7.7G      4.0K      7.7G   0% /dev
/proc                proc               0         0         0   0% /proc
/proc                proc               0         0         0   0% /proc
devpts               devpts             0         0         0   0% /dev/pts
sysfs                sysfs              0         0         0   0% /sys
tmpfs                tmpfs          64.0M    444.0K     63.6M   1% /tmp
tmpfs                tmpfs           7.8G    136.0K      7.8G   0% /dev/shm
tmpfs                tmpfs          16.0M     72.0K     15.9M   0% /share
/dev/sdb5            ext2            7.8M     28.0K      7.8M   0% /mnt/boot_config
tmpfs                tmpfs          16.0M         0     16.0M   0% /mnt/snapshot/export
/dev/md9             ext3          493.5M    174.8M    318.7M  35% /mnt/HDA_ROOT
cgroup_root          tmpfs           7.8G         0      7.8G   0% /sys/fs/cgroup
none                 cgroup             0         0         0   0% /sys/fs/cgroup/memory
cpu                  cgroup             0         0         0   0% /sys/fs/cgroup/cpu
/dev/md13            ext4          417.0M    406.3M     10.6M  97% /mnt/ext
tmpfs                tmpfs          32.0M     27.2M      4.8M  85% /samba_third_party
none                 configfs           0         0         0   0% /sys/kernel/config
/dev/ram2            ext2          433.9M      2.3M    431.6M   1% /mnt/update
tmpfs                tmpfs          64.0M      3.1M     60.9M   5% /samba
tmpfs                tmpfs          48.0M     60.0K     47.9M   0% /samba/.samba/lock/msg.lock
tmpfs                tmpfs          16.0M         0     16.0M   0% /mnt/ext/opt/samba/private/msg.sock
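
For what it's worth, the df output shows the data volume (cachedev1 / ce_cachedev1) isn't mounted at all; only the system partitions are. This is roughly how I've been peeking at the LVM / device-mapper layers that QTS builds on top of md1 (standard lvm2 and dmsetup commands as far as I know; I'm a beginner, so I'm not sure how much of this QNAP's build exposes):

Code: Select all

# Show the LVM stack that QTS builds on top of the RAID device:
pvs         # physical volumes (should list /dev/md1)
vgs         # volume groups (typically vg1 for the first pool)
lvs         # logical volumes, including the thin pool and data LV

# Show which device-mapper targets actually exist right now:
dmsetup ls

# Confirm nothing from the pool is currently mounted:
mount | grep -i cachedev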


I tried running this...

Code: Select all

/etc/init.d/services.sh stop
/etc/init.d/opentftp.sh stop
/etc/init.d/Qthttpd.sh stop
umount /dev/mapper/ce_cachedev1
e2fsck_64 -f -v -C 0 /dev/mapper/ce_cachedev1
reboot
However, I believe e2fsck_64 got stuck around 60%, and after 12+ hours of zero activity (the hard drives also went silent) I eventually power cycled the NAS and was back to square one. At one point, when md_checker said the RAID array was offline, I followed another post and ran some mdadm commands to reassemble the array; I believe it worked and showed online, but e2fsck_64 never seemed to reach 100%.
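
I didn't save my exact shell history, but the reassemble was along these lines, using the six data partitions from the md_checker output above (reconstructed from memory, so the exact flags may have differed):

Code: Select all

# Stop the stale array, then reassemble it from the six data partitions
# listed by md_checker. Some guides add --force when the event counts
# differ between members; I don't recall whether I needed it.
mdadm --stop /dev/md1
mdadm --assemble /dev/md1 /dev/sda3 /dev/sdc3 /dev/sdd3 /dev/sde3 /dev/sdf3 /dev/sdg3

# Verify the result before doing anything else:
cat /proc/mdstat
md_checker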

When I run "/etc/init.d/init_lvm.sh" it seems to run but then eventually gets stuck...

Code: Select all

[/] # /etc/init.d/init_lvm.sh
Changing old config name...
mv: can't rename '/etc/config/qdrbd.conf': No such file or directory
Reinitialing...
Detect disk(8, 80)...
dev_count ++ = 0Detect disk(8, 48)...
dev_count ++ = 1Detect disk(8, 16)...
ignore non-root enclosure disk(8, 16).
Detect disk(8, 96)...
dev_count ++ = 2Detect disk(253, 0)...
ignore non-root enclosure disk(253, 0).
Detect disk(8, 64)...
dev_count ++ = 3Detect disk(8, 32)...
dev_count ++ = 4Detect disk(8, 0)...
dev_count ++ = 5Detect disk(8, 80)...
Detect disk(8, 48)...
Detect disk(8, 16)...
ignore non-root enclosure disk(8, 16).
Detect disk(8, 96)...
Detect disk(253, 0)...
ignore non-root enclosure disk(253, 0).
Detect disk(8, 64)...
Detect disk(8, 32)...
Detect disk(8, 0)...
sys_startup_p2:got called count = -1
I'm not sure how long I should leave it running, but it's been a few hours this round and the drives have gone silent (the NAS is sitting on my desk while I try to figure this out). I guess I'll leave it running tonight and check tomorrow morning when I return to work. I do notice that this time it says my storage pool is online and ready, but File Station is showing nothing so far (Screenshot: https://i.gyazo.com/848e15a37d3f3ff519b ... 67134d.png), so hopefully that's a sign of progress. Still, I hear very little disk activity and there are no updates on the shell command that's sitting at "sys_startup_p2:got called count = -1". Assuming this fails, any other ideas for troubleshooting?
BlueWave
New here
Posts: 6
Joined: Thu Jul 27, 2017 2:39 am

Re: A little help needed for the new guy

Post by BlueWave »

On another computer I was able to SSH into the QNAP and simply run "top" to see what's running, and I do see "/bin/e2fsck_64 -q 1 -n /dev/mapper/ce_cachedev1" running at 25% CPU. That percentage keeps changing, so I guess it's not frozen and I just need to be more patient.

I do see under VSZ that it shows 16880m, which I'm guessing means it's using 100% of the 16GB of RAM in the NAS... Is this normal behavior?
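
(Edit: partially answering my own question after some reading: VSZ is apparently the virtual address space, not physical RAM, so it can exceed what the process actually occupies. If I've understood correctly, the resident figure is the one to compare against the 16GB; replace <pid> below with the PID of e2fsck_64 from top:)

Code: Select all

# VmSize is the virtual allocation (matches VSZ in top);
# VmRSS is what the process actually holds in physical RAM.
grep -E 'VmSize|VmRSS' /proc/<pid>/status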

Also, is there a command I can run that will show me the progress WITHOUT terminating what's currently running?
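
(Edit: I may have found my own answer while searching: the e2fsck man page says it toggles a progress display when it receives SIGUSR1, which shouldn't disturb the running check. Something like this, if I've read it right:)

Code: Select all

# Find the PID of the running check, then ask it to display progress.
# SIGUSR1 toggles e2fsck's progress bar on; SIGUSR2 toggles it off.
ps | grep '[e]2fsck_64'
kill -USR1 <pid>

One catch: the progress bar apparently prints to whatever terminal e2fsck was started from, so it may show up on the console session rather than the one running kill.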
FSC830
Experience counts
Posts: 2043
Joined: Thu Mar 03, 2016 1:11 am

Re: A little help needed for the new guy

Post by FSC830 »

As long as the filesystem check is running, no data access is possible.
My guess is you need to wait until the filesystem check is done and see what the outcome is.

Using the 10s reset puts all network settings back to default and removes all user shares/permissions (the data itself is untouched).
A better way would have been to delete the volume, create a new, smaller volume, and restore the data from backup.
No backup? Prepare to lose data, if it is not already lost by the actions taken so far, somewhere in the future; the possibilities for data loss are numerous...

Regards
BlueWave
New here
Posts: 6
Joined: Thu Jul 27, 2017 2:39 am

Re: A little help needed for the new guy

Post by BlueWave »

So after about 18 hours "/etc/init.d/init_lvm.sh" finally completed successfully and I was able to boot back into the NAS normally. Just when I thought this 5+ day journey through hell was over, I noticed my storage pool was suddenly encrypted and asking for an encryption key to unlock. What encryption key? I've NEVER set one. I tried all my admin passwords, the MAC address, everything I could think of... nothing is accepted.

I've rebooted this NAS countless times in the past and it has never suddenly encrypted my drives. Heck, I have screenshots of my Storage & Snapshots page from yesterday clearly showing the volume was unlocked. Now when I try to run a disk check I get...

Code: Select all

[~] # e2fsck_64 -c -f -v -p -C 0 /dev/mapper/cachedev1   
e2fsck_64: Bad magic number in super-block while trying to open /dev/mapper/cachedev1
/dev/mapper/cachedev1: 
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>

/dev/mapper/cachedev1 contains a crypt_LUKS file system
Any ideas how to move forward with this? I've put in a ticket with QNAP, but I understand it's about 24 hours for a response.
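
(Edit: following up on that crypt_LUKS line, I believe the standard read-only way to confirm the LUKS layer and inspect its header is something like the following, assuming cryptsetup is available on QTS; I haven't tried anything beyond these non-destructive checks:)

Code: Select all

# Read-only checks: confirm the device really is a LUKS container and
# dump its header (key slots, cipher) without modifying anything.
cryptsetup isLuks /dev/mapper/cachedev1 && echo "LUKS container confirmed"
cryptsetup luksDump /dev/mapper/cachedev1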
FSC830
Experience counts
Posts: 2043
Joined: Thu Mar 03, 2016 1:11 am

Re: A little help needed for the new guy

Post by FSC830 »

Was your NAS exposed to the internet, e.g. by using myQNAPcloud?
Can you rule out that you have been hacked?

If QNAP replies within 24h you are lucky; sometimes it takes days!

Regards
BlueWave
New here
Posts: 6
Joined: Thu Jul 27, 2017 2:39 am

Re: A little help needed for the new guy

Post by BlueWave »

Yeah, my NAS was exposed to the internet, and I'd be lying if I said I had it 100% locked down properly; however, I did not use port forwarding and I did use 2FA. I'm acutely aware of the ransomware attacks, but I never read up on them in depth, as I believe I was never a victim. This volume lock appeared suddenly after days of running various commands to restore the RAID 5 array, and, admittedly, probably after running some commands I didn't fully understand.

Anyway, as a LastPass user I was able to view some old admin passwords I'd previously used, and one of them appears to have worked: it now says "Unlocking...11%", and running top over SSH shows "/bin/e2fsck_64 -q 1 -n /dev/mapper/ce_cachedev1" as the top process. There are no errors in the QNAP GUI about a wrong decryption password (all the other attempts gave me an error message), so my guess is that it's working. However, it's been stuck at 11% for the past 30 minutes now... Is this normal? The last time e2fsck_64 ran (yesterday, lol) it took almost 18 hours to complete, so I hope it's not going to do all of that again. Can you or anyone confirm whether it's normal for decryption to take this long (6 x 10TB drives in RAID 5)?
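
(Edit: to keep an eye on things without touching the running check, I'm just re-listing the process once a minute with a simple loop; Ctrl-C stops only the loop, not e2fsck:)

Code: Select all

# Print a timestamp and the e2fsck process line once a minute.
# The [e] bracket trick keeps grep from matching its own command line.
while sleep 60; do
    date
    ps | grep '[e]2fsck_64'
done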