TS-1263U freezes during iSCSI traffic: dm-kcopyd track job out of array

martijnatlico
New here
Posts: 6
Joined: Thu May 26, 2016 4:17 pm

Post by martijnatlico » Thu May 26, 2016 4:34 pm

We've set up a new TS-1263U-RP with 10x WD Re4 4TB disks in RAID 10 and 2x Intel DC S3600 SSDs as read-write cache. It serves as an iSCSI data store for three vSphere hosts. The server has a single storage pool with three iSCSI data stores (one thin provisioned and two thick provisioned). No other services are enabled on the device and there are no snapshots for the iSCSI volumes.

Performance is excellent, but every so often (about once every two hours) the device becomes completely unresponsive: iSCSI traffic stalls and the NAS can no longer be reached over SSH (logging in is not possible) or the web interface. After three to four minutes the device becomes available again. As soon as I can log in via SSH again, it shows a 5-minute load average of 15, while the normal load average hovers around 3. So it was definitely busy with something during the stall.
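
A minimal sketch for recording the load spike even while SSH is unreachable, assuming Python is available on the box (it may not be on a stock QTS install) and that /proc/loadavg is readable as on any Linux system. The function names and the log file argument are made up for illustration. The idea is to append a timestamped load sample to a file every few seconds; a gap between samples afterwards marks a period when the machine was too busy to run even the logger.

```python
import time


def read_loadavg(path="/proc/loadavg"):
    """Return the 1-, 5- and 15-minute load averages as floats."""
    with open(path) as f:
        fields = f.read().split()
    return tuple(float(x) for x in fields[:3])


def log_load(logfile, interval=5.0, samples=12):
    """Append 'epoch load1 load5 load15' lines to logfile, one per interval."""
    for _ in range(samples):
        load1, load5, load15 = read_loadavg()
        with open(logfile, "a") as out:
            out.write(f"{time.time():.0f} {load1} {load5} {load15}\n")
        time.sleep(interval)


def find_gaps(timestamps, interval=5.0, tolerance=2.0):
    """Return (previous, current) pairs spaced more than interval * tolerance apart."""
    return [
        (prev, cur)
        for prev, cur in zip(timestamps, timestamps[1:])
        if cur - prev > interval * tolerance
    ]
```

Running log_load in the background during a load test and passing the first column of the resulting file to find_gaps should show when the stall started and how long it lasted, without relying on an interactive session staying alive.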

Today I set up some terminals tailing /var/log/*log and running top, hoping to see which process causes this behaviour. I started an I/O load test from a VM and, once that stalled, watched the logs and the top output, but there was nothing there. I did notice dmesg flooding with this repeated message:

[ 6709.871826] dm-kcopyd track job out of array
[ 6709.871827] dm-kcopyd track job out of array
[ 6709.871828] dm-kcopyd track job out of array
[ 6709.871829] dm-kcopyd track job out of array


Literally thousands of messages per second, as you can tell from the timestamps. This seems to be related to a kernel function used by dm-mirror. Nothing showed up in the logs during the downtime, and afterwards everything continued as normal. Attached is the output from tail -f /var/log/*log from the point the array became responsive again.
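
To put a number on the message rate rather than eyeballing the timestamps, something like the following could be used. It is only a sketch: it assumes the dmesg output uses the bracketed seconds-since-boot prefix shown above, and the function name is made up. It counts matching lines per whole second of uptime.

```python
import re
from collections import Counter

# Matches a kernel log line such as "[ 6709.871826] dm-kcopyd track job out of array".
LINE_RE = re.compile(r"^\[\s*(\d+)\.(\d+)\]\s*(.*)$")


def messages_per_second(lines, needle="dm-kcopyd track job out of array"):
    """Count lines containing needle, grouped by whole second of kernel uptime."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and needle in match.group(3):
            counts[int(match.group(1))] += 1
    return counts


# The four lines quoted above, used as sample input.
sample = [
    "[ 6709.871826] dm-kcopyd track job out of array",
    "[ 6709.871827] dm-kcopyd track job out of array",
    "[ 6709.871828] dm-kcopyd track job out of array",
    "[ 6709.871829] dm-kcopyd track job out of array",
]
print(messages_per_second(sample))
```

Fed the lines of a saved dmesg dump instead of the sample list, the resulting counter gives one count per second, which makes it easy to compare the rate during a stall with the rate while the NAS is idle.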

The dm-kcopyd messages are still scrolling by as I write this, 15 minutes after the incident. After a reboot they're gone.

Additional details:

  • Model name: TS-1263U-RP
  • Firmware version and Build Number: 4.2.1 build 20160221
  • Operating System (OS): ESXi 6.0U2
  • Services enabled: iSCSI
  • External devices: none
  • NAS connection speed/ MTU: 10GbE / 9000

martijnatlico
New here
Posts: 6
Joined: Thu May 26, 2016 4:17 pm

Post by martijnatlico » Thu May 26, 2016 4:50 pm

As soon as I start some iSCSI traffic, dmesg immediately starts firing off the dm-kcopyd messages. There's definitely something going on here. After a while (with no I/O load from the servers) the NAS rebooted itself and started resynchronizing the RAID groups after coming back up.

I did notice we're running an outdated firmware, so I'll be upgrading that soon.

storageman
Ask me anything
Posts: 5488
Joined: Thu Sep 22, 2011 10:57 pm

Post by storageman » Thu May 26, 2016 8:12 pm

martijnatlico wrote:As soon as I start some iSCSI traffic the dmesg immediately starts firing off the dm-kcopyd messages.. there's definitely something going on here. After a while (with no I/O load from the servers) the NAS rebooted itself and started resynchronizing the RAID groups after coming back up.

I did notice we're running an outdated firmware, so I'll be upgrading that soon.


Yes, try that. Does the problem persist if you disable the cache?

martijnatlico
New here
Posts: 6
Joined: Thu May 26, 2016 4:17 pm

Post by martijnatlico » Thu May 26, 2016 10:45 pm

I haven't been able to reproduce the stall yet, but the RAID is still resyncing, so that might be a factor. There are still plenty of dm-kcopyd warnings, though.

What's weird is that QTS kept insisting I was running the latest build even though there have been three new releases since.

martijnatlico
New here
Posts: 6
Joined: Thu May 26, 2016 4:17 pm

Post by martijnatlico » Mon May 30, 2016 7:19 pm

OK, I just resumed testing today and I'm immediately facing stalls, albeit shorter this time (around a minute). The symptoms were the same as described above, except that this time only the VM I/O stopped while the NAS itself remained available.

I noticed that even though the TS-1263U-RP has the 'VMware Ready' badge on the product page, it's not actually listed in the VMware HCL. :shock: That's kind of worrying, so I've asked QNAP whether certification is pending or not.

sekim
New here
Posts: 2
Joined: Tue Jun 14, 2016 4:44 pm

Post by sekim » Tue Jun 14, 2016 5:00 pm

Hi - I am having a similar issue with a TVS-871U-RP.

4 x Seagate ST6000VN0001-1SF17Z in RAID 10 as a single datastore for 3 vSphere hosts with approx 20 VMs total.
2 x Internal cache ports in read/write mode.
10GbE network with 9000 MTU
Firmware 4.2.1 Build 20160601

My system doesn't become unresponsive, but I'm seeing huge latency spikes, frequently over 100ms and often over 1,000ms. Given that our usage is really low (below 1,000 IOPS), this is totally unacceptable given the spec.

Checking the logs, I see the same error message: "dm-kcopyd track job out of array". I am currently talking with QNAP support about the issue, but I'm kinda wishing I'd built a FreeNAS box right now...

sekim
New here
Posts: 2
Joined: Tue Jun 14, 2016 4:44 pm

Post by sekim » Fri Jun 17, 2016 5:06 am

QNAP have acknowledged it's a bug related to having the cache enabled. Hoping for a fix soon.

pokrakam
Starting out
Posts: 43
Joined: Wed Sep 17, 2008 5:27 am

Post by pokrakam » Wed Oct 26, 2016 4:23 pm

Has this been fixed? I'm seeing massive slowdowns in certain cases with SSD cache enabled and plenty of these messages.
E.g. if I copy large files from one standalone drive to another from the console, I get 15-20 MB/s. If I disable SSD cache on the two drives, it shoots up to a more reasonable 80 MB/s straight away. I know this is not a typical use case, but it does reproduce the issue nicely.
Edit: Specs: TS-879 Pro with 6 SATA drives and SSD cache on 2 Crucial MX300 SSDs

storageman
Ask me anything
Posts: 5488
Joined: Thu Sep 22, 2011 10:57 pm

Post by storageman » Wed Oct 26, 2016 5:21 pm

pokrakam wrote:Has this been fixed? I'm seeing massive slowdowns in certain cases with SSD cache enabled and plenty of these messages.
e.g. If I copy large files from one standalone drive to another from the console I get 15-20MB/s. If I disable SSD cache on the two drives it shoots up to a more reasonable 80 MB/s straight away. I know this is not a typical use case, but does reproduce the issue nicely.
Edit: Specs: TS-879 pro with 6 SATA drives and SSD cache on 2 Crucial MX300 SSDs


What kind of LUN is it?
Does it matter whether it's a block LUN or a file LUN (created inside a volume)?
Does this only affect iSCSI? What performance do you get on shares with the cache enabled?

pokrakam
Starting out
Posts: 43
Joined: Wed Sep 17, 2008 5:27 am

Post by pokrakam » Wed Oct 26, 2016 9:49 pm

No LUN for the example above. I asked on this thread because it's the only one with this error message and my symptoms seem to indicate the same issue.
After some very good and some disappointing results with the SSD cache I went back to basics. The test was a simple copy via SSH from one drive to another. I used rsync with large files (50GB+) and the --progress option. I switched off the cache during the copy and it bounced straight up to 4x the speed. I see latency above 1000ms, but no reboots or other drastic things like that; everything else seems to work OK.

I see these errors even when I'm not doing anything on the system. I disabled the SSD cache and haven't seen the error since.
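
For anyone who wants to repeat the drive-to-drive copy comparison without rsync, here is a rough sketch of a timed copy. It is not what was run for the numbers above; the function name and chunk size are arbitrary choices, and it assumes Python is available on the NAS. It copies a file in chunks, forces the data to disk, and reports the average throughput.

```python
import os
import time


def timed_copy(src, dst, chunk_size=4 * 1024 * 1024):
    """Copy src to dst and return (bytes_copied, seconds, megabytes_per_second)."""
    copied = 0
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        fout.flush()
        os.fsync(fout.fileno())  # count the time needed to reach the disk
    elapsed = time.monotonic() - start
    mb_per_s = copied / 1e6 / elapsed if elapsed > 0 else float("inf")
    return copied, elapsed, mb_per_s
```

Running it once with the SSD cache enabled and once with it disabled, on the same large file, should give two directly comparable MB/s figures. The source file needs to be much larger than RAM, otherwise the page cache will distort the read side.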

xrstokes
New here
Posts: 2
Joined: Wed Dec 14, 2016 2:30 pm

Post by xrstokes » Wed Dec 14, 2016 2:51 pm

Has this been solved?

pokrakam
Starting out
Posts: 43
Joined: Wed Sep 17, 2008 5:27 am

Post by pokrakam » Thu Dec 22, 2016 7:38 am

xrstokes wrote:Has this been solved?


The original issue or the one I experienced?

For mine, I've been in discussions with QNAP support, who I must say have been very helpful so far. It's still ongoing; I can provide more info if anyone is interested.

cryptochrome
Been there, done that
Posts: 545
Joined: Sat Mar 09, 2013 7:26 pm

Post by cryptochrome » Thu Dec 22, 2016 6:17 pm

Just leaving a comment to subscribe to this thread. Sorry. Please disregard.
