Slow transfer speed due to SSD cache acceleration

Questions about SNMP, Power, System, Logs, disk, & RAID.
noodles_gb
Getting the hang of things
Posts: 59
Joined: Sun Apr 10, 2016 6:00 pm

Re: Slow transfer speed due to SSD cache acceleration

Post by noodles_gb »

That is what it was set to when I had poor read performance from the NAS. It has been rebuilt now and I am using QTier. All appears to be okay so far but I need to do more testing.
VukDjordjevic
Getting the hang of things
Posts: 60
Joined: Wed Dec 07, 2016 12:45 am

Re: Slow transfer speed due to SSD cache acceleration

Post by VukDjordjevic »

Sorry if I am kind of hijacking the thread, but not long ago I got the TS-877 and went for 4 x 1TB Samsung 860 EVOs as a static volume in RAID 5 as my fast storage, plus 3 x 6TB IronWolfs in RAID 5 as backup. I mainly use the NAS for work; a few of us in the office access it for network rendering, opening Photoshop files and so on. We are mostly doing intensive read operations, sometimes during rendering with 500+ files ranging from 1 KB to 200 MB in size. I am kind of disappointed with the 10GbE performance versus the 1GbE performance of my old server, even though I am getting 1100 MB/s read and 900 MB/s write when copying files. I am not even close to that performance when doing things like opening a Photoshop file from the server, saving a Photoshop file, opening or saving a 3ds Max file, or waiting for render nodes to join network rendering. Sometimes it is faster, but not 10 times faster like the file copy; in a lot of cases it is pretty much the same performance as the 1GbE network, or 5-20% faster... Anyway, long story short, I have an offer to sell all my SSDs and was thinking of trying NVMe SSDs for caching. I would add 3 more 6TB IronWolfs and cache them with 2 x NVMe drives of 512GB or 1TB in RAID 1. Does anyone think this would make things faster, or have an opinion on it?
On the other hand, I noticed in CrystalDiskMark that the 4 SATA SSDs in RAID 5 have faster sequential performance, but the random performance of the array is actually lower than the random performance of a single SSD, so maybe RAID isn't the right setup for me at all, since I need low latency and fast random access due to the large number of files and the wide range of file sizes. If anyone could help with a tip it would be much appreciated.
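
For what it's worth, the back-of-the-envelope model I keep coming back to for that CrystalDiskMark result is the classic RAID 5 write penalty: every small random write turns into four member I/Os (read old data, read old parity, write new data, write new parity), so random-write throughput is no better than a single drive and per-operation latency is worse; low-queue-depth random reads don't gain much either. A minimal sketch with assumed IOPS figures, not measurements:

```python
# Very rough model of why RAID 5 random writes can look worse than a
# single SSD: each small write becomes read-old-data, read-old-parity,
# write-new-data, write-new-parity (the "RAID 5 write penalty" of 4).
# The IOPS numbers below are illustrative, not measured.

def raid5_random_write_iops(n_drives, per_drive_iops, penalty=4):
    # Total member IOPS available, divided by the 4 member I/Os that
    # each logical write costs.
    return n_drives * per_drive_iops / penalty

single_ssd_iops = 90_000  # assumed 4K random-write IOPS of one SATA SSD
array_iops = raid5_random_write_iops(4, single_ssd_iops)

print(f"single SSD    : ~{single_ssd_iops:,} IOPS")
print(f"4-disk RAID 5 : ~{array_iops:,.0f} IOPS")  # no better than one drive,
# and each write now also waits on four member I/Os, so latency is worse
```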
Don
Guru
Posts: 12289
Joined: Thu Jan 03, 2008 4:56 am
Location: Long Island, New York

Re: Slow transfer speed due to SSD cache acceleration

Post by Don »

There are already many topics about system performance and configurations for video rendering. Please search for topics where Bob Zelin has posted.
Use the forum search feature before posting.

Use RAID and external backups. RAID protects you from disk failure and keeps your system running and your data accessible while the failed disk is replaced and the RAID is rebuilt. Backups allow you to recover data that is lost or corrupted, or to recover from a system failure. One does not replace the other.

NAS: TVS-882BR | F/W: 5.0.1.2346 | 40GB | 2 x 1TB M.2 SATA RAID 1 (System/VMs) | 3 x 1TB M.2 NVMe QM2-4P-384A RAID 5 (cache) | 5 x 14TB Exos HDD RAID 6 (Data) | 1 x Blu-ray
NAS: TVS-h674 | F/W: 5.0.1.2376 | 16GB | 3 x 18TB RAID 5
Apps: DNSMasq, PLEX, iDrive, QVPN, QLMS, MP3fs, HBS3, Entware, DLstation, VS, +
Thisisnotmyname
Easy as a breeze
Posts: 447
Joined: Mon Nov 19, 2018 1:21 am

Re: Slow transfer speed due to SSD cache acceleration

Post by Thisisnotmyname »

VukDjordjevic wrote: Wed Dec 05, 2018 4:23 am ...not long ago I got the TS-877 and went for 4 x 1TB Samsung 860 EVOs as a static volume in RAID 5 as my fast storage, plus 3 x 6TB IronWolfs in RAID 5 as backup. ... I am kind of disappointed with the 10GbE performance versus the 1GbE performance of my old server, even though I am getting 1100 MB/s read and 900 MB/s write when copying files. ... I have an offer to sell all my SSDs and was thinking of trying NVMe SSDs for caching. ... Does anyone think this would make things faster, or have an opinion on it?
I remember responding to someone here recently who was unhappy with the performance of file loads in Photoshop, and it turned out that most of the time was taken up by whatever processing Photoshop applies while opening the file (e.g. compositing, filters, etc.), which is CPU/GPU intensive and NOT the physical access of files on disk. You may want to look into that a bit before you make hardware changes to your NAS, as the bottleneck for perceived speed may actually be elsewhere.
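
One quick way to separate the two is to time a raw read of the same file from the share and compare that with how long Photoshop takes to open it; if the raw read is fast and the open is still slow, the time is going into the application, not the NAS. A minimal sketch (the UNC path is just a placeholder, not a real share):

```python
# Times a plain sequential read of one file from the NAS share, so the
# network/disk portion can be compared against the application's total
# open time. The path below is a made-up example.

import time

SRC = r"\\NAS\projects\example.psd"   # hypothetical path to the test file

start = time.perf_counter()
with open(SRC, "rb") as f:
    data = f.read()                   # raw read of the whole file
elapsed = time.perf_counter() - start

mb = len(data) / 1e6
print(f"read {mb:.1f} MB in {elapsed:.2f} s -> {mb / elapsed:.0f} MB/s")
```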
VukDjordjevic
Getting the hang of things
Posts: 60
Joined: Wed Dec 07, 2016 12:45 am

Re: Slow transfer speed due to SSD cache acceleration

Post by VukDjordjevic »

@Don

I actually don't do video rendering. I mainly do 3D rendering, so much smaller files in the 20 MB-3 GB range (3ds Max work files), plus the assets, which are usually 10 to 500 textures depending on the project and can run from 1 KB to 200-300 MB per texture.

@Thisisnotmyname

Yes, I saw the answer you gave me. I was planning to reply, but since you are posting here I'll do it here. You were actually kind of right: I changed computers and tried it on a better workstation with a more powerful CPU, and performance was generally better. I am pretty sure there are no bottlenecks on the new PC, since it is a 14-core i9 at 4.2 GHz all-core / 64 GB RAM / GTX 1070 Ti. Still, the results are much the same. Photoshop is the newest version, and some operations over 10GbE are faster than over 1GbE, but some are identical, and for the faster ones we are talking about 5-15%. The bigger the file, the better this seems to get, but not nearly as much as I expected. I am willing to drop the whole SATA SSD RAID 5 of 4 x 1TB 860 EVOs I am using now as fast storage and try NVMe SSD caching of 6 x 6TB IronWolfs in RAID 5 on my TS-877 box. So I was just hoping to get some opinions on that, thanks.
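
In case it helps frame things, a simple per-file-overhead model seems to explain why the gap shrinks so much with lots of small files: every file pays a fixed protocol cost (open, stat, close round trips) on top of its transfer time, and once per-file application processing is added as well, the link speed stops mattering much. A sketch with assumed numbers (the 5 ms per-file figure and the file sizes are guesses, not measurements):

```python
# Back-of-the-envelope model: total time = n_files * (fixed per-file
# overhead + size / link throughput). All numbers are assumptions.

def transfer_time(n_files, avg_mb, link_mb_s, per_file_overhead_s):
    return n_files * (per_file_overhead_s + avg_mb / link_mb_s)

scenario = dict(n_files=500, avg_mb=2.0, per_file_overhead_s=0.005)

t_1g  = transfer_time(link_mb_s=110,  **scenario)   # ~1 GbE
t_10g = transfer_time(link_mb_s=1100, **scenario)   # ~10 GbE

print(f"1 GbE : {t_1g:.1f} s")    # ~11.6 s
print(f"10 GbE: {t_10g:.1f} s")   # ~3.4 s, roughly 3x rather than 10x,
# and the gap keeps shrinking as the files get smaller
```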
sohouser1
New here
Posts: 6
Joined: Wed Jan 16, 2019 1:12 am

Re: Slow transfer speed due to SSD cache acceleration

Post by sohouser1 »

Had to register just to post my own SSD cache journey here! So, the background: I've got a TS-963X running 4.3.5.0756 with 5 HGST enterprise 3.0 TB drives in RAID 5, 4 Kingston 250 GB SSDs, and 16 GB of memory. I have about a dozen VMs booting off the QNAP via NFS, and I also do a fair amount of large file transfers. All home-use type stuff. My ESXi server is directly connected to the QNAP via 10GbE; the rest is on 1GbE.

My first experience was with Qtier, with all the drives in one big pool. It worked great until I hit the point where the SSD tier was full and it had to start migrating data down. Transfer speeds took a nosedive, to the point that VMs started to get unhappy. With Qtier being selectable per folder, I could flip it on or off for a single folder and watch the performance drop almost in real time. I opened a ticket, didn't really get much of a response, so I ended up tearing it all down and rebuilding without Qtier but with SSD cache (a pain in the ** since you have to destroy the pool and start over).


Fast forward through a week or two of great performance, and the problem comes back! I tried about 10 different permutations of cache settings and sizes, and even went down to a single cache SSD in read-only mode, no RAID, just in case it was the RAID software. All had the EXACT same results. Once the SSD is full, performance goes in the toilet.

At this point, Qtier and SSD cache are fundamentally broken on QNAP from my perspective. Once the SSD is full, it kills performance for anything that is even remotely demanding. It is absolutely repeatable; I just turn on SSD caching and wait.
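
For anyone who wants to reproduce it, the test doesn't need to be fancy: keep writing large files to a share backed by the cached volume and log per-file throughput, and it should fall off a cliff once roughly a cache's worth of data has gone through. A rough sketch; the share path, file size and count are placeholders, so adjust them to the size of your cache:

```python
# Writes a series of multi-GB files to the NAS share and prints the
# throughput of each one, to spot the point where the SSD cache fills
# and write speed collapses. Path and sizes below are just examples.

import os, time

TARGET  = r"\\NAS\share\cache_test"    # hypothetical share path
CHUNK   = b"\0" * (8 * 1024 * 1024)    # 8 MiB per write call
FILE_GB = 4                            # size of each test file in GiB
N_FILES = 40                           # ~160 GiB total, more than the cache

os.makedirs(TARGET, exist_ok=True)
for i in range(N_FILES):
    path = os.path.join(TARGET, f"test_{i:03d}.bin")
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(FILE_GB * 1024 // 8):   # FILE_GB worth of 8 MiB chunks
            f.write(CHUNK)
    elapsed = time.perf_counter() - start
    print(f"file {i:03d}: {FILE_GB * 1024 / elapsed:.0f} MB/s")
```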
dolbyman
Guru
Posts: 35005
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Slow transfer speed due to SSD cache acceleration

Post by dolbyman »

not sure if the 963X is the right machine to serve storage for a dozen VM's... and I guess your test confirms it
sohouser1
New here
Posts: 6
Joined: Wed Jan 16, 2019 1:12 am

Re: Slow transfer speed due to SSD cache acceleration

Post by sohouser1 »

dolbyman wrote: Wed Jan 16, 2019 2:46 am not sure if the 963X is the right machine to serve storage for a dozen VM's... and I guess your test confirms it
So, I should have clarified: the 963 without caching rocks! It handles the 12 VMs over 10GbE extremely well, even with rotational drives. Once I turn on caching, the load behind it makes zero difference. I can have all the VMs turned off, and a single Windows 10 machine transferring a video file will get horrible results if the SSD cache is enabled and full.
dolbyman
Guru
Posts: 35005
Joined: Sat Feb 12, 2011 2:11 am
Location: Vancouver BC , Canada

Re: Slow transfer speed due to SSD cache acceleration

Post by dolbyman »

gotcha

did you open a ticket with QNAP?
Bob Zelin
Experience counts
Posts: 1374
Joined: Mon Nov 21, 2016 12:55 am
Location: Orlando, FL.

Re: Slow transfer speed due to SSD cache acceleration

Post by Bob Zelin »

well, I only clicked on this thread because I wanted to see new 10G related questions, but I see the comment that your QNAP rocks without caching. Welcome to the club. For many applications, SSD caching turns out to be a big negative. The QNAP is still great (I am not sure about the 963X) - but QNAP caching - "blechhh".

Bob Zelin
Bob Zelin / Rescue 1, Inc.
http://www.bobzelin.com
sohouser1
New here
Posts: 6
Joined: Wed Jan 16, 2019 1:12 am

Re: Slow transfer speed due to SSD cache acceleration

Post by sohouser1 »

Bob Zelin wrote: Wed Jan 16, 2019 9:21 pm well, I only clicked on this thread because I wanted to see new 10G related questions, but I see the comment that your QNAP rocks without caching. Welcome to the club. For many applications, SSD caching turns out to be a big negative. The QNAP is still great (I am not sure about the 963X) - but QNAP caching - "blechhh".
I had a 451 previously; I sold that off and bought the 963X specifically for the SSD slots to run Qtier and for the dedicated 10GbE port. Really happy with the 963, other than the obvious cache issue. I ended up settling on a RAID 5 array of the 5 rotational drives for mass storage, with my 4 SSDs in a RAID 0 array that I use for some of the VMs that need faster drives, and as swap space for most of the VMs. (And yes, I know, 4 SSDs in RAID 0... I also have really good automated backups!)
DiMora
Starting out
Posts: 36
Joined: Tue Aug 20, 2013 8:52 am

Re: Slow transfer speed due to SSD cache acceleration

Post by DiMora »

sohouser1 wrote: Wed Jan 16, 2019 1:28 am ...My first experience was with Qtier, with all the drives in one big pool. It worked great until I hit the point where the SSD tier was full and it had to start migrating data down. ... At this point, Qtier and SSD cache are fundamentally broken on QNAP from my perspective. Once the SSD is full, it kills performance for anything that is even remotely demanding. It is absolutely repeatable; I just turn on SSD caching and wait.
Which QTier were you using, original or 2.0?

Thanks,

Shane
GyrusDyne
New here
Posts: 9
Joined: Fri Nov 13, 2009 4:41 pm

Re: Slow transfer speed due to SSD cache acceleration

Post by GyrusDyne »

Logged in just to mention that I am facing the same problem while downloading 10+ GB files from a TS-670 Pro, "accelerated" by a 250 GB SSD, to a PC. Speed starts at 90 MB/s with a 100% cache hit rate, but then (maybe 10 GB in) drops to 10-20 MB/s with a cache hit rate of around 75%. I disabled caching during the copy operation and the speed went up to 60 MB/s.

A month or so ago I benchmarked the SSD with some QNAP tool (I don't remember the name) and its performance was comparable to the HDD, or lower. I don't know how to interpret this...
datahrdr
New here
Posts: 3
Joined: Mon Sep 16, 2019 2:07 pm

Re: Slow transfer speed due to SSD cache acceleration

Post by datahrdr »

Hello all,

I have been struggling with this issue for a few months now (well, it's only in the past few days that I really started looking into it). The craziest thing to me was that even with a read-only cache, I would see writes drop to 50 MB/s (when writes to the HDD RAID would be around 350 MB/s without an SSD cache). I have had to re-think several of my assumptions. This was especially annoying because I picked the device I did specifically because of the 10 GbE support and four 2.5" slots for SSDs, so that I could enjoy quick transfers. I'll go into details below. (Running a TVS-951x with 5 x 8 TB Seagate Ironwolf drives in RAID6, encrypted, and various configurations of Samsung SSDs (mix of 830, 840 EVO, 860 EVO))

Assumption #1: An SSD cache is always worth it. SSDs are much quicker than any HDD.
The truth: This isn't really the case when you work with a lot of data and the SSD is "full". (Given that the cache layer is actually not very smart)

Assumption #2: Surely a read-only cache cannot affect write speeds?
The truth: Because of the way dm-cache works (and this is just my hypothesis, I did not verify this but I have fair reason to believe this is why), all writes are also stored to the cache, so that they can provide an immediate benefit to any subsequent reads. If writes to the cache are slow for some reason or other, writes to a RAID setup, even with a "read-only" SSD cache, will be slow.

Assumption #3: The CPU in my NAS supports AES-NI instructions, surely there shouldn't be any speed impact from using encryption.
The truth: Well there is, at least with the dog of a CPU that they put in the TVS-951x (Celeron 3865U dual-core 1.8 GHz processor, Kaby Lake architecture).


Let's get into it.

Any modern SSD has what's called a Flash Translation Layer (FTL) between the controller chip and the NAND flash chips. In flash, you can't rewrite a sector in place (a sector can only be written if it hasn't been used since the erase block it belongs to was last erased), and sectors are grouped into erase blocks, which are the smallest unit the flash can erase. This means the controller has to do extra work to keep track of where each OS-visible ("LBA") sector was actually written, and to periodically coalesce valid data as the demand for new writes forces it to erase blocks so that more data can be written to the disk. This is called garbage collection (GC). There are more tricks that SSD makers use on top of this, such as treating part of the TLC flash as SLC to speed up small writes (up to a few, or a few tens of, gigabytes).
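
Just to make the shape of that behaviour concrete, a toy model: writes run at full speed while there is erased (or pseudo-SLC) space to land in, then drop to whatever rate garbage collection can free blocks. All of the figures here are assumed for illustration, not measured from any particular drive:

```python
# Toy model of sustained sequential write speed on an SSD: fast until
# the pre-erased / pseudo-SLC region is exhausted, then limited by how
# quickly garbage collection can erase blocks. Figures are assumptions.

def sustained_write_mb_s(gb_written, fast_region_gb=20,
                         burst_mb_s=500, gc_limited_mb_s=55):
    if gb_written < fast_region_gb:
        return burst_mb_s        # writes land directly in erased flash
    return gc_limited_mb_s       # writes now wait on garbage collection

for gb in (5, 15, 25, 100):
    print(f"{gb:>3} GB written -> ~{sustained_write_mb_s(gb)} MB/s")
```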

Almost as soon as I started using my 128 GB Samsung 830 SSD for the SSD cache layer, I would see writes drop to 50 MB/s after a measly few gigabytes of data written sequentially (yes, I chose to use the SSD cache for all I/O, because I eventually wanted to take advantage of quick writes of even big files to the NAS). I always kind of knew that writes to a "full" SSD can be slow while it performs garbage collection, but I thought that overprovisioning would help with this. Well, it really doesn't. Garbage collection is something an SSD has to do anyway, and if the GC process is slow (for example, if erasing a block takes a lot of time), it doesn't matter how much you overprovision the SSD; it's going to be slow anyway.

For some sort of evidence, refer to this graph from Anandtech's review of the 830 SSD: https://images.anandtech.com/reviews/st ... 60mins.png. You can clearly see how in a "full" state, the write speeds drop down to a little over 50 MB/s after a few gigabytes or so. Keep in mind that Samsung is actually one of the better SSD makers in this regard (the 830 is already over 5 years old, but the problem manifests to some degree even with newer/bigger drives); if you use cheaper SSDs with crappy controllers, you're still going to see the same problem.

Yes, I tried secure erase, yes I tried overprovisioning (up to 20%), but nothing helped.

I also tried using 4 x 250 GB SSDs (one 840 EVO, three 860 EVO) in various configurations (raid5 and raid10 as a read+write cache, and also raid0 as a read-only cache). I could get the "choke" write speeds up to 250 MB/s (depending on the RAID setup for the SSDs, it would choke between 10 and 50 gigabytes written), but that still wasn't as good as writing directly to the HDDs without an SSD cache. The cache layer definitely "kind of" worked as long as it wasn't hitting this fundamental wall with SSDs: after a secure erase of the SSDs and setting them up in a raid0 readonly cache, the first read of a big file would be around 350 MB/s (limited by the HDDs+encryption, I guess), and the second read would hit almost 800 MB/s over the network.

So let's forget about writes. I do do a lot of reads as well. I would have loved to just use the SSDs as a very quick read-only cache, but it turns out that because of the way the cache layer works, it very much affects write speeds even in a read-only configuration.

I must stress that I only deduce this empirically; I have not verified that this is how dm-cache actually works or if it's configurable.

It appears that dm-cache operates under the very naive assumption that the cache layer is always faster than the backing disk(s). Because of this, writes to the disk are also routed to the cache device(s), so that they can provide an immediate benefit to any reads of the file that was just written to. Now, because of what I described above, as soon as you've written 5 or 10 or 50 gigabytes to an SSD, it will start blocking because of the GC operation. The end result is that even with a read-only cache, writes will choke when the SSDs choke.
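
If that hypothesis is right, the arithmetic is brutally simple: with a write-through style cache, a write is only done when both the SSDs and the HDDs have it, so the effective write speed is whichever path is slower. A sketch of that, with illustrative numbers (I have not confirmed that this is exactly how QNAP configures dm-cache):

```python
# If every write has to complete on both the cache SSDs and the backing
# HDD RAID, effective throughput is the minimum of the two paths.
# The speeds below are illustrative, not measured.

def writethrough_write_speed(ssd_mb_s, hdd_mb_s):
    return min(ssd_mb_s, hdd_mb_s)   # acknowledged only when both are done

hdd_raid = 350                 # HDD RAID write speed without any cache
ssd_fresh, ssd_full = 450, 50  # SSD before and after its GC cliff

print(f"cache fresh: ~{writethrough_write_speed(ssd_fresh, hdd_raid)} MB/s")  # HDD-limited
print(f"cache full : ~{writethrough_write_speed(ssd_full,  hdd_raid)} MB/s")  # SSD-limited
```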

You'd think that dm-cache would have the smarts to bypass the cache layer as soon as it detects that the cache layer is slower than the backing disks (regardless of if it's a read or write operation), but that doesn't appear to be the case. If it did, using a cache would never be slower than reading/writing directly to the disk.

The conclusion is: an SSD cache is just not worth it. It doesn't make a difference whether it's read-only or read/write. It might be worth it if you really only use it for truly random I/O (e.g. if you run VMs off of it), or if the SSDs are truly fast (say, some PCIe NVMe SSDs; I can't test this because the TVS-951x doesn't seem to have a PCIe slot) or never suffer from performance degradation (like maybe the Optane devices?). But for the general case, with SATA SSDs: Just Don't Bother.


Now to my other annoyance: writes to an encrypted RAID seem to be limited by the CPU. This is something I did not expect. Modern CPUs can use the AES-NI accelerated instructions and achieve multiple gigabytes per second of encryption or decryption. There's of course some overhead arising from the fact that an encryption mode such as CBC (in the case of QNAP) or XTS has to be used on top of that, but it shouldn't be too bad.

Well, it is. I don't know if there's some inefficiency in how dm-crypt works or if it's because the CPU also has to do raid6 parity calculations (when writing) at the same time, but my CPU is pegged at 99% (per the dashboard) when writing to my NAS.

This is something I should have noticed, though: https://www.qnap.com/en/product/tvs-951x the part shown in orange quite clearly indicates that with an encrypted volume you can't expect much more than around 350 MB/s for writes. Without encryption you should get at least around 700 MB/s (assuming the HDDs aren't the bottleneck), and I did in fact try 4 x 250 GB SSDs in RAID 0 (without encryption) and got almost 800 MB/s in writes. But of course that isn't usable, because you don't get any of the safety that you want.

Am I dissatisfied? Not really. I should have checked the specifications more carefully, or chosen not to use encryption (I have stored the encryption key in the NAS so that it unlocks automatically - this basically defeats the purpose of the encryption anyway! ... but even if I deleted that, I would still have the password and the key file stored on my PC, which isn't encrypted). On the other hand, for me the NAS is just for storing files - 350 MB/s isn't bad at all compared to the bunch of spinning disks I had in my PC (which would only do 100-150 MB/s) previously! If I were to upgrade my NAS, I would look for a device that has a beefier CPU, but for now this'll do, I guess.


So I guess my takeaways are: Manage your expectations. Do not assume that something is, because you think it is. If something is measurably worse, then it is, and there's probably a reason why. If you care enough, you can figure out why.

You live and you learn.


Addendum 9/17: It appears that the slow writes to an encrypted volume are due to the fact that QNAP uses aes-256-cbc for encrypted volumes. Using aes-256-xts would be a much better choice (openssl speed benchmark indicates ~450 MB/s for aes-256-cbc encryption, ~1800 MB/s for aes-256-xts encryption on my device - the difference from the CBC benchmark to the 350 MB/s write speed limit I am hitting is reasonable due to parity calculation and other overhead). I'll make a separate post about this and see if I can find some way to force the use of XTS instead.
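
For anyone who wants to reproduce the comparison without openssl, the same gap shows up with the Python cryptography package; the snippet below is a rough single-threaded sketch, so absolute numbers will differ from the openssl figures, but the CBC vs XTS ratio should be similar. The underlying reason is that CBC encryption is inherently serial (each block depends on the previous ciphertext block), while XTS blocks are independent and can keep the AES-NI pipeline full:

```python
# Rough throughput comparison of AES-256-CBC vs AES-256-XTS using the
# Python "cryptography" package. Single-threaded, so absolute numbers
# will differ from openssl speed, but the ratio illustrates the point.

import os, time
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def bench(cipher, label, total_mb=256):
    buf = os.urandom(1024 * 1024)          # 1 MiB work unit
    enc = cipher.encryptor()
    start = time.perf_counter()
    for _ in range(total_mb):
        enc.update(buf)
    elapsed = time.perf_counter() - start
    print(f"{label}: ~{total_mb / elapsed:.0f} MB/s")

# AES-256-CBC: 32-byte key, 16-byte IV
bench(Cipher(algorithms.AES(os.urandom(32)), modes.CBC(os.urandom(16))), "aes-256-cbc")
# AES-256-XTS: 64-byte key (two 256-bit halves), 16-byte tweak
bench(Cipher(algorithms.AES(os.urandom(64)), modes.XTS(os.urandom(16))), "aes-256-xts")
```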
Bob Zelin
Experience counts
Posts: 1374
Joined: Mon Nov 21, 2016 12:55 am
Location: Orlando, FL.

Re: Slow transfer speed due to SSD cache acceleration

Post by Bob Zelin »

hello datahrdr -
this is a GREAT post, and you have learned (in much more detail) what many of us have known for a long time. Unless you are dealing with lots of tiny files, and not much data at that, SSD caching is just not worth it. For my applications, which involve hundreds of terabytes of data, all large blocks, all SSD caching does is slow down the system. And I love your closing comments - "do not assume that something is, because you think it is" and "you live and you learn". That's how I usually find out about assorted products (including, of course, QNAP and their various applications and models). Some work great, others not so great. After getting burned (I installed SSDs for caching when the feature first appeared with the TVS-1282T), you realize that it's only hurting your performance. I assumed, because I thought it was almost common knowledge: it should be faster. Well, I was wrong, and I am often wrong.

Thank you again for a great detailed post.
Bob Zelin
Bob Zelin / Rescue 1, Inc.
http://www.bobzelin.com