Page 2 of 3

Re: Crash when writing more than 200GB in one batch

Posted: Fri Nov 30, 2018 9:36 pm
by Trexx
storageman wrote:Also not sure why you'd want to use Reds in a top end box like this.
3-year warranty, no vibration control and 5400-5900 rpm.

Not the right drive in my view.


Sent from my iPad using Tapatalk

Re: Crash when writing more than 200GB in one batch

Posted: Fri Nov 30, 2018 10:01 pm
by mr-auh
storageman wrote: Fri Nov 30, 2018 7:38 pm Also not sure why you'd want to use Reds in a top end box like this.
3-year warranty, no vibration control and 5400-5900 rpm.

Not the right drive in my view.
I am not entirely sure why the WD Reds were chosen, but so far I have nothing to complain about. What would be the correct drive in your opinion?
The box is "only" used to transfer data from customers to our datacenter, or as a temporary third backup target. Normally we would do that with Dell EMC Unity or NetApp, but we thought we'd give QNAP a try, seeing that these "only" cost about 4,000€ in this configuration. Aside from my current problem I really do like these boxes, and in my eyes they are quite a nice solution for this task at this price point.

Re: Crash when writing more than 200GB in one batch

Posted: Fri Nov 30, 2018 10:53 pm
by storageman
Hmm, the spec of the box seems a luxury for a backup - but if you can afford it!
Reds are OK for backup jobs, but don't expect great read or write speeds.
Also, I don't see the sense in using Qtier for a backup target; most of the data won't be in tier 1.

Re: Crash when writing more than 200GB in one batch

Posted: Sat Dec 01, 2018 8:18 am
by Bob Zelin
Make sure that there are no special characters in the file names, as these will cause issues.
This means no extra . and no /

For example, if you are copying mr-auh.mov, that will work. But mr.auh.mov will crash the copy, and mr/auh.mov will stop the copy.
On a Mac (I know you are using Win 10), it is critical to disable .DS_Store files with this command: defaults write com.apple.desktopservices DSDontWriteNetworkStores true
The Mac OS puts hidden .DS_Store files in every folder, which drives the QNAP crazy when doing large data transfers. The simple fix is to run that command on the Mac.
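For the Mac clients, here is a small sketch of that suggestion, wrapped in a guard so it is safe to paste anywhere (the `defaults` command only exists on macOS; on other systems the function just prints a note):

```shell
#!/bin/sh
# Sketch (macOS only): stop Finder writing .DS_Store files to network shares.
disable_ds_store() {
    if command -v defaults >/dev/null 2>&1; then
        defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
        # Print the current value to confirm (1 means writing is disabled)
        defaults read com.apple.desktopservices DSDontWriteNetworkStores
    else
        echo "skipped: 'defaults' not available (not macOS)"
    fi
}

disable_ds_store
```

The setting takes effect for newly mounted shares; existing .DS_Store files on the NAS still need to be deleted once by hand.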

I have clients transferring terabytes at a time without issue (and I use WD RED Pro drives all the time).

Bob Zelin

Re: Crash when writing more than 200GB in one batch

Posted: Tue Dec 04, 2018 2:25 am
by mr-auh
Bob Zelin wrote: Sat Dec 01, 2018 8:18 am Make sure that there are no special characters in the file names, as these will cause issues.
This means no extra . and no /

For example, if you are copying mr-auh.mov, that will work. But mr.auh.mov will crash the copy, and mr/auh.mov will stop the copy.
On a Mac (I know you are using Win 10), it is critical to disable .DS_Store files with this command: defaults write com.apple.desktopservices DSDontWriteNetworkStores true
The Mac OS puts hidden .DS_Store files in every folder, which drives the QNAP crazy when doing large data transfers. The simple fix is to run that command on the Mac.

I have clients transferring terabytes at a time without issue (and I use WD RED Pro drives all the time).

Bob Zelin
Hey Bob,
thank you for your comment; however, I don't really have control over the data being written to the QNAP in its function as an additional backup target. Nevertheless, I tried copying a file named "test.vbk" of roughly 2 TB to a share named "Storage" on the QNAP, and the behaviour described in my first post can still be observed. In summary, here is what I have done so far, to no avail:
  • Access the share using iSCSI
  • Access the share using CIFS
  • Update to latest firmware
  • Rebuild the RAID50 without Qtier (thus leaving out the Samsung SSDs)
  • Swap the RAM back to the original one delivered with the QNAP
  • Use simple file and share names (no special chars, short paths)
Screenshot shows the last update before crash.
So far I still haven't gotten any reaction from QNAP and am slowly running out of time and patience. Really poor customer service from QNAP...
If any of you have any additional ideas, please let me know. Thanks for your help so far! (Not including you, QNAP customer support...)

Re: Crash when writing more than 200GB in one batch

Posted: Tue Dec 04, 2018 3:09 am
by Trexx
Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?

Re: Crash when writing more than 200GB in one batch

Posted: Tue Dec 04, 2018 4:41 am
by mr-auh
Trexx wrote: Tue Dec 04, 2018 3:09 am Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?
Snapshots are not enabled.
Ticket #: LVV-409-10491

Re: Crash when writing more than 200GB in one batch

Posted: Tue Dec 04, 2018 6:26 am
by Trexx
mr-auh wrote: Tue Dec 04, 2018 4:41 am
Trexx wrote: Tue Dec 04, 2018 3:09 am Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?
Snapshots are not enabled.
Ticket #: LVV-409-10491
If you go into Storage & Snapshots > Select your Data Volume > Manage > Actions > Set Caching Storage ... is that enabled?

Also go into processes under Resource Monitor, unselect group by app, and sort memory descending. See what is at the top while you are doing the copy and if it changes.
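Since SSH access is available, the same check can be approximated from the shell. This is a sketch; QTS ships a BusyBox userland, so the `ps` column layout may differ from stock Linux, where column 4 of `ps aux` is %MEM:

```shell
#!/bin/sh
# List the five processes using the most memory, highest first.
# Assumes stock `ps aux` output: column 4 is %MEM, so sort on it numerically.
top_mem() {
    ps aux 2>/dev/null | sort -rn -k4 | head -n 5
}

top_mem
```

Running this before, during, and after a large copy should show whether one process (or none, pointing at kernel memory) is eating the RAM.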

Re: Crash when writing more than 200GB in one batch

Posted: Tue Dec 04, 2018 5:54 pm
by storageman
Can you rebuild it in RAID 6? Very few people are using RAID 50, and you probably don't need it for a backup target.
Do not use iSCSI for now, and test on a simple 1GbE connection (if you're currently running on 10GbE).
I would also SMART test all the drives.
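A sketch of what that SMART test could look like over SSH. Assumptions: `smartctl` is available on the box and the disks appear as `/dev/sda`, `/dev/sdb`, and so on (adjust for your enclosure; the Storage & Snapshots UI can also run the same test per disk):

```shell
#!/bin/sh
# Sketch: kick off a short SMART self-test on every /dev/sd? device found.
# Errors from individual drives are tolerated so the loop always completes.
smart_test_all() {
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not installed"
        return 0
    fi
    for dev in /dev/sd[a-z]; do
        [ -e "$dev" ] || continue
        smartctl -t short "$dev" || true
    done
}

smart_test_all
```

A short self-test typically finishes within a couple of minutes; the results can then be read back per drive with `smartctl -a`.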

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 12:58 am
by mr-auh
Trexx wrote: Tue Dec 04, 2018 6:26 am
mr-auh wrote: Tue Dec 04, 2018 4:41 am
Trexx wrote: Tue Dec 04, 2018 3:09 am Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?
Snapshots are not enabled.
Ticket #: LVV-409-10491
If you go into Storage & Snapshots > Select your Data Volume > Manage > Actions > Set Caching Storage ... is that enabled?

Also go into processes under Resource Monitor, unselect group by app, and sort memory descending. See what is at the top while you are doing the copy and if it changes.
Hi,
Caching is not enabled. The Resource Monitor in that view does not show anything at all (other than some SQL processes using 60MB of RAM), as it is not counting the "Cache" RAM towards used RAM.
storageman wrote: Tue Dec 04, 2018 5:54 pm Can you rebuild it in RAID 6? Very few people are using RAID 50, and you probably don't need it for a backup target.
Do not use iSCSI for now, and test on a simple 1GbE connection (if you're currently running on 10GbE).
I would also SMART test all the drives.
I could try RAID 6, but that kind of misses the point: why offer RAID 50 if it does not work? I still think this is caused by a memory leak, and QNAP has to investigate it. I will let the RAID 6 rebuild overnight and report back.
The problem occurs using iSCSI as well as CIFS, over both the internal 1G NICs and the 10G NICs of the PCIe card.
SMART values are all good, nothing even slightly negative. All drives are brand new, running for less than 20 days.

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 1:05 am
by storageman
So the only way to prove it's a memory leak is to try everything!
Is RAID 50 reliable? No idea.
Does it crash when copying over 200GB of data internally, from one folder to another via File Station?
This is another useful test.
If you can't resolve it, I'd say the box is not fit for purpose and QNAP has to fix or refund it.

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 8:13 am
by mr-auh
storageman wrote: Wed Dec 05, 2018 1:05 am So the only way to prove it's a memory leak is to try everything!
Is RAID 50 reliable? No idea.
Does it crash when copying over 200GB of data internally, from one folder to another via File Station?
This is another useful test.
If you can't resolve it, I'd say the box is not fit for purpose and QNAP has to fix or refund it.
I tried to rule out all the "maybe it's the network, maybe it's Maybelline" stuff and went straight for the system.
I logged in with SSH and performed the following operation:

Code: Select all

dd if=/dev/zero of=output.dat  bs=1G  count=200
As you can see in the attached screenshot, the issue starts to occur right away. I aborted the command before the QNAP crashed and looked into clearing the caches in various ways.
Clearing the page cache alone didn't do anything, and neither did clearing only dentries and inodes. However, clearing all three did the trick (as you can see in the screenshot).

Code: Select all

sync; echo 3 > /proc/sys/vm/drop_caches
2018-12-05_00_48_26.png
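For reference, `drop_caches` accepts three values, which matches the combinations tested above. A quick sketch (the writes need root; only clean, already-written cache is freed, so no data is lost, but running `sync` first makes more pages droppable):

```shell
#!/bin/sh
# The three drop_caches modes. Flush dirty pages to disk first so as
# much of the page cache as possible becomes clean and droppable.
sync
# echo 1 > /proc/sys/vm/drop_caches   # free page cache only
# echo 2 > /proc/sys/vm/drop_caches   # free dentries and inodes only
# echo 3 > /proc/sys/vm/drop_caches   # free both (the one that worked here)
```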
Now I can copy files without interruption or slowdown, just by clearing the caches every once in a while.
2018-12-05_01_06_49.png
I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.
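Until the underlying leak is fixed, the flushing could be automated. A rough workaround sketch (assumptions: run as root over SSH while the copy is in progress; the 5-minute interval and 2-hour duration are arbitrary, tune them to your write rate):

```shell
#!/bin/sh
# Flush page cache, dentries and inodes. Falls back to a message when
# not run as root, so the function itself never fails.
flush_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches 2>/dev/null \
        || echo "need root to write /proc/sys/vm/drop_caches"
}

# Example: flush every 5 minutes for roughly 2 hours during a big copy:
# for i in $(seq 1 24); do flush_caches; sleep 300; done

flush_caches
```

This is a band-aid, not a fix; it just keeps the cache from growing until it starves the system.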

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 8:26 am
by dolbyman
mr-auh wrote: Wed Dec 05, 2018 8:13 am I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.

Did you forward these results to QNAP via support ticket? That is the only way they will know about it, as they do not read or visit this forum.

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 4:45 pm
by storageman
mr-auh wrote: Wed Dec 05, 2018 8:13 am
storageman wrote: Wed Dec 05, 2018 1:05 am So the only way to prove it's a memory leak is to try everything!
Is RAID 50 reliable? No idea.
Does it crash when copying over 200GB of data internally, from one folder to another via File Station?
This is another useful test.
If you can't resolve it, I'd say the box is not fit for purpose and QNAP has to fix or refund it.
I tried to rule out all the "maybe it's the network, maybe it's Maybelline" stuff and went straight for the system.
I logged in with SSH and performed the following operation:

Code: Select all

dd if=/dev/zero of=output.dat  bs=1G  count=200
As you can see in the attached screenshot, the issue starts to occur right away. I aborted the command before the QNAP crashed and looked into clearing the caches in various ways.
Clearing the page cache alone didn't do anything, and neither did clearing only dentries and inodes. However, clearing all three did the trick (as you can see in the screenshot).

Code: Select all

sync; echo 3 > /proc/sys/vm/drop_caches
2018-12-05_00_48_26.png
Now I can copy files without interruption or slowdown, just by clearing the caches every once in a while.
2018-12-05_01_06_49.png
I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.
Good work!
Would be interested to hear what QNAP says.
Stick around, you could help other people!

Re: Crash when writing more than 200GB in one batch

Posted: Wed Dec 05, 2018 7:52 pm
by mr-auh
dolbyman wrote: Wed Dec 05, 2018 8:26 am
mr-auh wrote: Wed Dec 05, 2018 8:13 am I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.

Did you forward these results to QNAP via support ticket ? Only that way will they know about it, as they do not read in or visit this forum
They magically contacted me this morning. I am sending them this link and additional logs; then we will see what happens.
storageman wrote: Wed Dec 05, 2018 4:45 pm Good work!
Would be interested to hear what QNAP says.
Stick around, you could help other people!
Thank you! I will keep you all updated on the issue, and I will surely stick around to see what's going on here.