Crash when writing more than 200GB in one batch

iSCSI related applications
Trexx
Ask me anything
Posts: 5393
Joined: Sat Oct 01, 2011 7:50 am
Location: Minnesota

Re: Crash when writing more than 200GB in one batch

Post by Trexx »

storageman wrote:Also not sure why you'd want to use Reds in a top end box like this.
3Yr warranty, No vibration control and 54-5900rpm.

Not the right drive in my view.
Image


Paul

Model: TS-877-1600 FW: 4.5.3.x
QTS (SSD): [RAID-1] 2 x 1TB WD Blue m.2's
Data (HDD): [RAID-5] 6 x 3TB HGST DeskStar
VMs (SSD): [RAID-1] 2 x 1TB SK Hynix Gold
Ext. (HDD): TR-004 [RAID-5] 4 x 4TB HGST Ultrastar
RAM: Kingston HyperX Fury 64GB DDR4-2666
UPS: CP AVR1350

Model: TVS-673 32GB & TS-228a Offline
-----------------------------------------------------------------------------------------------------------------------------------------
2018 Plex NAS Compatibility Guide | QNAP Plex FAQ | Moogle's QNAP Faq
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

storageman wrote: Fri Nov 30, 2018 7:38 pm Also not sure why you'd want to use Reds in a top end box like this.
3Yr warranty, No vibration control and 54-5900rpm.

Not the right drive in my view.
I am not entirely sure why the WD Reds were chosen, but so far I have nothing to complain about. What would be the correct drive in your opinion?
The box is "only" used to transfer data from customers to our datacenter or as temporary third backup target. Normally we would do that with Dell EMC Unity or NetApp but thought we'd give QNAP a try seeing that these "only" cost about 4000€ in this configuration. Aside from my current problem I really do like these boxes and in my eyes they are quite a nice solution for this task and price point.
storageman
Ask me anything
Posts: 5507
Joined: Thu Sep 22, 2011 10:57 pm

Re: Crash when writing more than 200GB in one batch

Post by storageman »

Hmm, spec of box seems a luxury for a backup - but if you can afford it!
Reds are ok for backup jobs but don't expect great read or write speeds.
Also I don't see the sense in using Qtier for a backup target, most of the data won't be in tier 1.
Bob Zelin
Experience counts
Posts: 1375
Joined: Mon Nov 21, 2016 12:55 am
Location: Orlando, FL.
Contact:

Re: Crash when writing more than 200GB in one batch

Post by Bob Zelin »

Make sure there are no special characters in the file names, as these will cause issues.
This means no . and no /

For example, copying mr-auh.mov will work, but mr.auh.mov will crash the copy, and mr/auh.mov will stop the copy.
On a Mac (I know you are using Win 10), it is critical to stop .DS_Store files from being written to network shares with this command: defaults write com.apple.desktopservices DSDontWriteNetworkStores true
Mac OS puts a hidden ./ in front of folders, which drives the QNAP crazy during large data transfers. The simple fix is to run that command on the Mac.

I have clients transferring terabytes at a time without issue (and I use WD RED Pro drives all the time).

Bob Zelin
Bob Zelin / Rescue 1, Inc.
http://www.bobzelin.com
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

Bob Zelin wrote: Sat Dec 01, 2018 8:18 am make sure that there are no special characters in the file names, as this will cause issues.
Hey Bob,
thank you for your comment, however I don't really have control over the data being written to the QNAP in its function as additional backup target. Nevertheless I tried copying a file named "test.vbk" with roughly 2TB size to the QNAP to a share named "Storage" and still the behaviour as described in my first post can be observed. So as a conclusion, what I have done so far to no avail:
  • Access the share using iSCSI
  • Access the share using CIFS
  • Update to latest firmware
  • Rebuild the RAID50 without Qtier (thus leaving out the Samsung SSDs)
  • Swap the RAM back to the original one delivered with the QNAP
  • Use simple file and share names (no special chars, short paths)
Screenshot shows the last update before crash.
So far I still haven't gotten any reaction from QNAP and am slowly running out of time and patience. Really poor customer service of QNAP...
If any of you have any additional ideas, please let me know. Thanks for your help so far! (Not including you, QNAP customer support...)
Trexx
Ask me anything
Posts: 5393
Joined: Sat Oct 01, 2011 7:50 am
Location: Minnesota

Re: Crash when writing more than 200GB in one batch

Post by Trexx »

Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

Trexx wrote: Tue Dec 04, 2018 3:09 am Do you have snapshots enabled? If so, disable them and see if the system behaves differently.

Also what is your helpdesk ticket #?
Snapshots are not enabled.
Ticket #: LVV-409-10491
Trexx
Ask me anything
Posts: 5393
Joined: Sat Oct 01, 2011 7:50 am
Location: Minnesota

Re: Crash when writing more than 200GB in one batch

Post by Trexx »

mr-auh wrote: Tue Dec 04, 2018 4:41 am
Snapshots are not enabled.
Ticket #: LVV-409-10491
If you go into Storage & Snapshots > Select your Data Volume > Manage > Actions > Set Caching Storage ... is that enabled?

Also go into processes under Resource Monitor, unselect group by app, and sort memory descending. See what is at the top while you are doing the copy and if it changes.
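If the Resource Monitor view is not detailed enough, roughly the same check can be done over SSH. This is a minimal sketch, assuming the NAS exposes the standard Linux /proc/meminfo and has awk available; the function names meminfo_kb and mem_snapshot are made up for illustration and are not part of QTS.

```shell
#!/bin/sh
# Print the value (in kB) of one field from a meminfo-style file.
# Usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print a one-line snapshot of the fields most relevant to a large write.
# Usage: mem_snapshot [FILE]
mem_snapshot() {
    snapshot_file="${1:-/proc/meminfo}"
    printf 'MemFree=%s kB Cached=%s kB Dirty=%s kB Slab=%s kB\n' \
        "$(meminfo_kb MemFree "$snapshot_file")" \
        "$(meminfo_kb Cached "$snapshot_file")" \
        "$(meminfo_kb Dirty "$snapshot_file")" \
        "$(meminfo_kb Slab "$snapshot_file")"
}
```

Running something like `while true; do mem_snapshot; sleep 5; done` in a second SSH session during the copy should show whether Cached or Slab keeps growing while MemFree shrinks.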
storageman
Ask me anything
Posts: 5507
Joined: Thu Sep 22, 2011 10:57 pm

Re: Crash when writing more than 200GB in one batch

Post by storageman »

Can you rebuild it in RAID 6? Very few people are using RAID 50, and you probably don't need it for a backup target.
Do not use iSCSI for now and test on a simple 1GbE connection (if you're currently running on 10GbE).
I would also SMART test all the drives.
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

Trexx wrote: Tue Dec 04, 2018 6:26 am
If you go into Storage & Snapshots > Select your Data Volume > Manage > Actions > Set Caching Storage ... is that enabled?

Also go into processes under Resource Monitor, unselect group by app, and sort memory descending. See what is at the top while you are doing the copy and if it changes.
Hi,
Caching is not enabled. The Resource Monitor in that view does not show anything at all (other than some SQL processes using 60MB of RAM), as it is not counting the "Cache" RAM towards used RAM.
storageman wrote: Tue Dec 04, 2018 5:54 pm Can you rebuild it in RAID 6? Very few people are using RAID 50, and you probably don't need it for a backup target.
Do not use iSCSI for now and test on a simple 1GbE connection (if you're currently running on 10GbE).
I would also SMART test all the drives.
I could try RAID 6, however that kind of misses the point. Why offer RAID 50 if it does not work? I still think that this is caused by a memory leak and QNAP has to investigate this. I will let the RAID 6 rebuild overnight and report back.
The problem occurs using iSCSI or CIFS, over both the internal 1G NICs and the 10G NICs of the PCIe card.
SMART values are all good, nothing even slightly in the negatives. All drives are brand new, running for less than 20 days.
storageman
Ask me anything
Posts: 5507
Joined: Thu Sep 22, 2011 10:57 pm

Re: Crash when writing more than 200GB in one batch

Post by storageman »

So the only way to prove it's a memory leak is to try everything!
Is RAID 50 reliable? - no idea.
Does it crash copying internally from one folder to another via Filestation over 200GB of data?
This is another useful test.
If you can't resolve I'd say box not fit for purpose and Qnap have to fix or refund.
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

storageman wrote: Wed Dec 05, 2018 1:05 am So the only way to prove it's a memory leak is to try everything!
Is RAID 50 reliable? - no idea.
Does it crash copying internally from one folder to another via Filestation over 200GB of data?
This is another useful test.
If you can't resolve I'd say box not fit for purpose and Qnap have to fix or refund.
I tried to cancel out all the "maybe it's network, maybe it's maybelline" stuff and went straight for the system.
I logged in with SSH and performed the following operation:

Code:

dd if=/dev/zero of=output.dat  bs=1G  count=200
As you can see in the attached screenshot, the issue starts to occur right away. I aborted the command before the QNAP crashed and looked into clearing the cache in various ways.
Clearing only the page cache didn't do anything, and neither did clearing only dentries and inodes. However, clearing all three did the trick (as you can see in the screenshot).

Code:

sync; echo 3 > /proc/sys/vm/drop_caches
2018-12-05_00_48_26.png
Now I can copy files without interruption or slowing down by just clearing the caches every once in a while.
2018-12-05_01_06_49.png
I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.
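The workaround of clearing the caches every once in a while could be automated along these lines. This is only a sketch of a stopgap, not a fix for the underlying issue: the function names, the 1 GiB threshold and the 30-second interval are assumptions for illustration, and it has to run as root on a system with the standard /proc/meminfo and /proc/sys/vm/drop_caches.

```shell
#!/bin/sh
# Read MemFree (in kB) from a meminfo-style file; defaults to /proc/meminfo.
free_kb() {
    awk '$1 == "MemFree:" { print $2 }' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) when free memory is below THRESHOLD_KB.
# Usage: low_on_memory THRESHOLD_KB [FILE]
low_on_memory() {
    [ "$(free_kb "${2:-}")" -lt "$1" ]
}

# Same command as above: flush dirty pages to disk, then drop the page
# cache, dentries and inodes. Requires root.
drop_all_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}

# Poll free memory and drop the caches whenever it runs low.
# Both defaults are illustrative values, not tuned ones.
# Usage: cache_watchdog [THRESHOLD_KB] [INTERVAL_SECONDS]
cache_watchdog() {
    threshold_kb="${1:-1048576}"
    interval_s="${2:-30}"
    while true; do
        if low_on_memory "$threshold_kb"; then
            drop_all_caches
        fi
        sleep "$interval_s"
    done
}
```

Calling `cache_watchdog 1048576 30` from a separate SSH session while the copy runs would drop the caches whenever less than about 1 GiB is free. Nothing starts until the function is called, so the threshold can be adjusted first.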
dolbyman
Guru
Posts: 35248
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Crash when writing more than 200GB in one batch

Post by dolbyman »

mr-auh wrote: Wed Dec 05, 2018 8:13 am I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.

Did you forward these results to QNAP via support ticket? Only that way will they know about it, as they do not read or visit this forum.
storageman
Ask me anything
Posts: 5507
Joined: Thu Sep 22, 2011 10:57 pm

Re: Crash when writing more than 200GB in one batch

Post by storageman »

Good work!
Would be interested to hear what Qnap say.
Stick around you could help other people!
mr-auh
Starting out
Posts: 17
Joined: Thu Nov 29, 2018 5:44 pm

Re: Crash when writing more than 200GB in one batch

Post by mr-auh »

dolbyman wrote: Wed Dec 05, 2018 8:26 am
mr-auh wrote: Wed Dec 05, 2018 8:13 am I hope I have done enough investigation now for QNAP and that they can finally go ahead and fix the underlying issue.

Did you forward these results to QNAP via support ticket? Only that way will they know about it, as they do not read or visit this forum.
They magically contacted me this morning. I am sending them this link and additional logs, then we will see what happens.
storageman wrote: Wed Dec 05, 2018 4:45 pm Good work!
Would be interested to hear what Qnap say.
Stick around you could help other people!
Thank you! I will keep you all updated on the issue and surely will stick around to see what's going on here.