A possible solution for constant HDD activity problem

Discussion about hard drive spin down (standby) feature of NAS.
jozsef.l.kiss
New here
Posts: 3
Joined: Wed Feb 10, 2021 3:08 pm

A possible solution for constant HDD activity problem

Post by jozsef.l.kiss »

Hi All,

I have been struggling with this problem for years and have repeatedly tried to solve it: I spent quite a few days each year reading articles and forums and investigating the apps, the QTS kernel, etc., without success. Currently my QNAP NAS writes something to the HDDs periodically (every 5 seconds). This has become very frustrating, so I decided to take a deep breath and try to solve the issue once and for all.
Here I would like to summarize what I found and how I took one step forward on this topic.

Finding the root of the problem
You need to find the problematic process that reads or writes files on the HDD. I tried everything (iotop, inotify-tools, etc.); none of these was the right tool for me, so I used block_dump. It is a low-level I/O dumper whose output is written to the Linux kernel ring buffer. Don't worry: as far as I know, it is just a buffer, and no files are written during the dump.

1.
dmesg -c
Clear the kernel ring buffer

2.
echo 1 > /proc/sys/vm/block_dump
Start the dump and wait a few seconds

3.
dmesg -c
List the result; you will get something like this:
[42942.425628] kworker/u4:0(20597): WRITE block 1377711520 on dm-0 (8 sectors)
kworker/u4:0(20597) is the process name and process ID; 1377711520 is the block of the drive that was written (let's call it X)

4.
echo $(( `sudo blockdev --getbsz /dev/dm-0` / 512 ))
This returns the ratio between the block size used by debugfs and the 512-byte sectors reported in dmesg, usually 8. For a better understanding, please see:
https://stackoverflow.com/questions/520 ... -from-vm-b

5. Calculate the debugfs block number: Y = X / 8, e.g. 1377711520 / 8 = 172213940

6.
debugfs
Start debugfs and type "open /dev/dm-0". Then type "icheck 172213940", where 172213940 is the block number calculated in step 5. This returns:
Block Inode number
172213940 117965700
117965700 is the inode number. Then type "ncheck 117965700". Result:
Inode Pathname
117965700 /.qpkg/QVPN/qbelt_log/rated.log
Yes, we have the file name.

7.
df
Filesystem Size Used Available Use% Mounted on
none 400.0M 302.2M 97.8M 76% /
...
cgroup_root 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/mapper/cachedev1
3.5T 1.9T 1.6T 54% /share/CACHEDEV1_DATA
...
The dm-0 logical drive is mounted at /share/CACHEDEV1_DATA, so the full path of the file is: /share/CACHEDEV1_DATA/.qpkg/QVPN/qbelt_log/rated.log.

8.
stat /share/CACHEDEV1_DATA/.qpkg/QVPN/qbelt_log/rated.log
File: /share/CACHEDEV1_DATA/.qpkg/QVPN/qbelt_log/rated.log
Size: 740875 Blocks: 1456 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 117965700 Links: 1
Access: (0666/-rw-rw-rw-) Uid: ( 0/ admin) Gid: ( 0/administrators)
Access: 2022-02-18 12:13:13.000000000
Modify: 2022-02-18 12:16:06.000000000
Change: 2022-02-18 12:16:06.000000000
By checking the Access/Modify/Change fields you can see when the file was last touched. In my case, it was written every 5 seconds.

9. Because this file is related to QVPN, I stopped the QVPN service and started again from step 1.
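
Steps 3 to 5 can be sketched as a small shell helper. This is only a sketch under assumptions: the function names parse_block_dump and to_fs_block are made up for illustration, the line format is the one shown in step 3, and the factor defaults to 8 (the value step 4 usually returns), so pass your own value if it differs.

```shell
#!/bin/sh
# Sketch: turn one block_dump line from dmesg into a debugfs block number.
# Both function names are hypothetical helpers, not part of QTS.

# parse_block_dump LINE -> prints "process pid operation block device"
parse_block_dump() {
    printf '%s\n' "$1" | sed -n \
        's/^.*\] \(.*\)(\([0-9][0-9]*\)): \([A-Z][A-Z]*\) block \([0-9][0-9]*\) on \([^ ]*\) .*$/\1 \2 \3 \4 \5/p'
}

# to_fs_block DMESG_BLOCK [FACTOR] -> prints the block number for icheck
# FACTOR is the result of step 4; 8 is assumed when it is omitted.
to_fs_block() {
    echo $(( $1 / ${2:-8} ))
}

# Example with the line from step 3.
line='[42942.425628] kworker/u4:0(20597): WRITE block 1377711520 on dm-0 (8 sectors)'
set -- $(parse_block_dump "$line")
echo "process=$1 pid=$2 op=$3 device=/dev/$5"
echo "icheck $(to_fs_block "$4")"
```

The last line prints the icheck command to type into debugfs after opening the device named in the first line. Process names containing spaces would confuse the simple word splitting used here.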

Using this method I found some problems, e.g.:
Service, File
dhclient, /mnt/HDA_ROOT/.config/dhclient/wlan0.leases
Codecs Pack, /share/CACHEDEV1_DATA/.qpkg/CodexPack/sys/bus
ProxyServer, /share/CACHEDEV1_DATA/.qpkg/ProxyServer/opt/sbin/squid
CloudConnector, /share/CACHEDEV1_DATA/.qpkg/HybridBackup/CloudConnector3/lib/libsqlite3.so.0.8.6
Media Library, /share/CACHEDEV1_DATA/.system/log/myidbserver.log
...

After switching off the services above, I more or less managed to get my HDDs into standby mode.
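
When several services write at the same time, a quick tally of the WRITE lines per process can show where to look first. The following is a minimal sketch, assuming the dmesg line format shown in step 3; count_writers is a hypothetical helper name, not a QTS tool, and all sample lines except the first are made up in the same format.

```shell
#!/bin/sh
# count_writers: reads block_dump lines on stdin and prints
# "count process device" for WRITE lines, busiest writer first.
count_writers() {
    sed -n 's/^.*\] \(.*\)([0-9][0-9]*): WRITE block [0-9][0-9]* on \([^ ]*\) .*$/\1 \2/p' \
        | sort | uniq -c | sort -rn
}

# Illustrative input: the first line is from step 3, the other two are
# invented samples. READ lines are ignored by the filter above.
count_writers <<'EOF'
[42942.425628] kworker/u4:0(20597): WRITE block 1377711520 on dm-0 (8 sectors)
[42947.431112] kworker/u4:0(20597): WRITE block 1377711528 on dm-0 (8 sectors)
[42948.002345] smbd(1234): READ block 2048 on dm-0 (8 sectors)
EOF
```

On the NAS itself the input would come from the kernel ring buffer while block_dump is enabled, e.g. "dmesg | count_writers".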



Solving the issues one by one

Of course, we still need these services, so my idea is to move the problematic files (usually log files) to a ramdisk and create a symbolic link from the original location to the copy on the ramdisk. Using this method I managed to solve the dhclient issue (it periodically writes the /mnt/HDA_ROOT/.config/dhclient/wlan0.leases file). This is the method:

1.
mkdir /mnt/ramdisk
mount -t tmpfs -o size=1m tmpfs /mnt/ramdisk/
Create a tmpfs ramdisk and mount it

2.
mkdir -p /mnt/ramdisk/mnt/HDA_ROOT/.config/dhclient/
cd /mnt/ramdisk/mnt/HDA_ROOT/.config/dhclient/
cp /mnt/HDA_ROOT/.config/dhclient/wlan0.leases .
Copy the file to the ramdisk

3.
rm /mnt/HDA_ROOT/.config/dhclient/wlan0.leases
ln -s /mnt/ramdisk/mnt/HDA_ROOT/.config/dhclient/wlan0.leases /mnt/HDA_ROOT/.config/dhclient/wlan0.leases
Remove the original file and create the symbolic link.

Now dhclient writes the file every 4 minutes, but it writes the copy on the ramdisk rather than the original file, so the write does not cause the HDD to spin up.
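
The copy-and-link steps could be wrapped in one reusable function. This is a sketch only: relocate_to_ramdisk is a hypothetical name, the ramdisk is assumed to be mounted already, and the file path must be absolute. The demonstration at the end runs in a temporary directory so it can be tried without touching the NAS configuration.

```shell
#!/bin/sh
# relocate_to_ramdisk FILE RAMDISK
# Copies FILE under RAMDISK (mirroring its full path), then replaces FILE
# with a symbolic link to the copy. Hypothetical helper, not a QTS tool.
relocate_to_ramdisk() {
    file=$1
    ramdisk=$2
    target="$ramdisk$file"
    mkdir -p "$(dirname "$target")" || return 1
    # Already a link (e.g. after a reboot): the directory above is enough.
    [ -L "$file" ] && return 0
    # Only handle existing regular files.
    [ -f "$file" ] || return 1
    cp -p "$file" "$target" || return 1
    rm "$file" && ln -s "$target" "$file"
}

# Harmless demonstration in a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/disk" "$demo/ramdisk"
echo "sample lease data" > "$demo/disk/wlan0.leases"
relocate_to_ramdisk "$demo/disk/wlan0.leases" "$demo/ramdisk"
ls -l "$demo/disk/wlan0.leases"
rm -rf "$demo"
```

On the NAS the call would be "relocate_to_ramdisk /mnt/HDA_ROOT/.config/dhclient/wlan0.leases /mnt/ramdisk". Because tmpfs lives in RAM, the copy disappears on reboot while the symbolic link stays on disk, so the ramdisk and its directories would have to be recreated at every startup.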


Before going further, I wanted to share my experience with you, and I'm really interested in your opinion. Could this be a solution for the HDD standby problem in general? Can we, as a community, collect our experiences and write a script that automatically creates the ramdisk at startup and moves all the problematic files to it, replacing them with links?
jrfbal
New here
Posts: 9
Joined: Wed Jun 12, 2019 7:10 pm

Re: A possible solution for constant HDD activity problem

Post by jrfbal »

Great work.

I'll be following this closely. Even though my disks spin down sometimes, they do randomly wake up for no apparent reason.

When I have some time I'll follow the steps you described here to see if I can also find the reason.
tyler_durden_83
Know my way around
Posts: 171
Joined: Fri Jul 30, 2010 3:56 pm

Re: A possible solution for constant HDD activity problem

Post by tyler_durden_83 »

When I do this: Start debugfs, and type "open /dev/dm-0"
the prompt seems to hang, is it just super slow?
Thank you!
QNAP TS-364 w/ 2 * Western Digital RED 4TB Raid 1
QTS 5.0.0.1986
jozsef.l.kiss
New here
Posts: 3
Joined: Wed Feb 10, 2021 3:08 pm

Re: A possible solution for constant HDD activity problem

Post by jozsef.l.kiss »

It takes some time, but no more than 1-2 minutes on my machine (2x4TB HDD).
googleg
Getting the hang of things
Posts: 90
Joined: Sun Mar 14, 2010 2:20 pm

Re: A possible solution for constant HDD activity problem

Post by googleg »

Interesting... Why is your NAS running dhclient? Is it not configured with a static IP?
jozsef.l.kiss
New here
Posts: 3
Joined: Wed Feb 10, 2021 3:08 pm

Re: A possible solution for constant HDD activity problem

Post by jozsef.l.kiss »

For this test, I moved the NAS close to me, where the only reachable network was another subnet accessible via Wi-Fi (and DHCP).
tyler_durden_83
Know my way around
Posts: 171
Joined: Fri Jul 30, 2010 3:56 pm

Re: A possible solution for constant HDD activity problem

Post by tyler_durden_83 »

I've tried the above but I can't seem to understand the results... This is what I've discovered:

[roberto@QNAPTS364 Public]$ sudo ./blkdevMonitor_20151225.sh 3
Password:
===== Welcome to use blkdevMonitor_v2 on Thu May 26 17:28:51 CEST 2022 =====
Stop klogd.sh daemon... Done
Turn off/on VM block_dump & Clean dmesg
Countdown: 3 2 1
Start...
============= 0/3 test, Thu May 26 17:28:59 CEST 2022 ===============
liv<<<<<<<<<<7>[2508486.519536] md9_raid1(3359): WRITE block 1060216

============= 1/3 test, Thu May 26 17:32:41 CEST 2022 ===============
<<<<<<<<<<<<<<<<<<<<<<7>[2508549.664416] kworker/u8:0(20132): WRITE block 3025154168 on dm-0 (8 sectors)

============= 2/3 test, Thu May 26 17:33:44 CEST 2022 ===============
<7>[2508552.272587] debugfs(375): READ block 159383688 on dm-0 (8 sectors)
<7>[2508552.272658] debugfs(375): READ block 159383568 on dm-0 (24 sectors)
<7>[2508552.272596] debugfs(375): READ block 561288 on dm-2 (8 sectors)

Turn off block_dump
Start klogd.sh daemon

[roberto@QNAPTS364 ~]$ sudo debugfs
Password:
debugfs 1.45.5 (07-Jan-2020)
debugfs:  open /dev/dm-0
debugfs:  icheck 378144271
Block   Inode number
378144271       61603868
debugfs:  ncheck 61603868
Inode   Pathname
61603868        /.samba/lock/smbXsrv_tcon_global.tdb
debugfs:  icheck 132527
Block   Inode number
132527  133824520
debugfs:  ncheck 133824520
Inode   Pathname
133824520       /Videos/Tv Shows/House of Cards/Season 3/House.of.Cards.S03E07.720p.WEBRip.DD5.1.x264-FGT.mkv
debugfs:  quit
[roberto@QNAPTS364 ~]$ stat "/share/Videos/Tv Shows/House of Cards/Season 3/House.of.Cards.S03E07.720p.WEBRip.DD5.1.x264-FGT.mkv"
  File: /share/Videos/Tv Shows/House of Cards/Season 3/House.of.Cards.S03E07.720p.WEBRip.DD5.1.x264-FGT.mkv
  Size: 923683954       Blocks: 1804088    IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 133824520   Links: 1
Access: (0770/-rwxrwx---)  Uid: ( 1000/ roberto)   Gid: (  100/everyone)
Access: 2022-03-30 04:40:13.000000000
Modify: 2015-03-02 23:15:21.000000000
Change: 2022-05-11 14:00:25.000000000
[roberto@QNAPTS364 ~]$ date
Thu May 26 17:46:06 CEST 2022
So I've seen that there were 2 WRITEs: one for Samba, and another that led me to a file that I suppose should only be accessed by Plex. However, the process accessing it wasn't Plex but md9_raid1, and it was doing a WRITE, yet stat shows that the file was not changed recently...
tyler_durden_83
Know my way around
Posts: 171
Joined: Fri Jul 30, 2010 3:56 pm

Re: A possible solution for constant HDD activity problem

Post by tyler_durden_83 »

The other very weird thing is that the files in that path are still causing the disks to not spin down, but Plex isn't active anymore!
graemev
Know my way around
Posts: 199
Joined: Sun Feb 12, 2012 10:17 pm

Re: A possible solution for constant HDD activity problem

Post by graemev »

tyler_durden_83 wrote: Thu May 26, 2022 11:50 pm So I've seen that there were 2 WRITEs: one for Samba, and another that led me to a file that I suppose should only be accessed by Plex. However, the process accessing it wasn't Plex but md9_raid1, and it was doing a WRITE, yet stat shows that the file was not changed recently...
Updating the atime? (I used to turn off atime on early SSDs to reduce wear :-) )

If it is atime, that seems a reasonable use, i.e. you did read the disk (it might need to spin up anyhow if the data was not paged in)
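
To see whether access-time updates are even enabled, the mount options can be inspected. A minimal sketch, assuming a Linux system that exposes /proc/mounts; atime_mode is a hypothetical helper name, and "default" only means that neither noatime nor relatime is listed.

```shell
#!/bin/sh
# atime_mode OPTIONS -> prints the atime policy implied by a
# comma-separated mount option string (hypothetical helper).
atime_mode() {
    case ",$1," in
        *,noatime,*)  echo noatime ;;
        *,relatime,*) echo relatime ;;
        *)            echo default ;;
    esac
}

# Print the policy for every mounted filesystem, if /proc/mounts exists.
if [ -r /proc/mounts ]; then
    while read -r dev mnt fstype opts rest; do
        echo "$mnt: $(atime_mode "$opts")"
    done < /proc/mounts
fi
```

Note that relatime still updates the access time occasionally, so only noatime rules atime writes out completely. Whether a remount with noatime survives a reboot on QTS is not something this thread establishes.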
jmv
Know my way around
Posts: 111
Joined: Sun Dec 26, 2010 5:06 am

Re: A possible solution for constant HDD activity problem

Post by jmv »

I get this:
debugfs: Bad magic number in super-block while trying to open /dev/dm-0
Any idea?

Thanks
6. debugfs
Start debugfs, and type "open /dev/dm-0".