QNAP firmware issues : Major file system bug

schumaku
Guru
Posts: 43579
Joined: Mon Jan 21, 2008 4:41 pm
Location: Kloten (Zurich), Switzerland -- Skype: schumaku

Re: QNAP firmware issues : Major file system bug

Post by schumaku »

The potential risk of data loss or corruption has been obvious for some time, hasn't it?

As a result of that experience and those reports, we asked our system managers a long time ago to reduce the storage load on the NAS devices - by expanding storage, or by temporarily moving data to an archive - to well below the level at which the issues showed up. Continuing operations in this state must be considered **, **, and **. Fill in your own words there. Fire your system manager for not taking action in time.

Or, as good ol' VMS put it:
-[whateverfacility]-I-SYSMGR, please see your system manager
SuperMario
Getting the hang of things
Posts: 75
Joined: Tue May 24, 2011 5:01 pm

Re: QNAP firmware issues : Major file system bug

Post by SuperMario »

Schumaku,

My point is that we asked QNAP several times to confirm if there was any potential for data loss or corruption from this issue.

We asked straight out if we should continue using the units as we didn't want to risk any kind of data loss, as some of these problematic units are also supposed to perform various backup duties (which they're obviously not doing for some months now).

They are the ones who said "No, your data is safe". That's my point in the above post - it obviously wasn't. The main issue is that we were assured that our data was safe and told to continue using the NASes. So much for trusting the vendor to know their own product. If they had recommended otherwise, we would have taken appropriate action then.

Mario.
rprengel
Starting out
Posts: 44
Joined: Mon Nov 17, 2008 10:36 pm

Re: QNAP firmware issues : Major file system bug

Post by rprengel »

Hello,

I have similar problems using Samba or NFS. Here it helps to delete some directories in the target directory, but deleting parts of my backups can't be a solution.
QNAP Germany is very friendly, but they have no idea what could cause the problem.
It seems QNAP has a very serious problem.

ralf
schumaku
Guru
Posts: 43579
Joined: Mon Jan 21, 2008 4:41 pm
Location: Kloten (Zurich), Switzerland -- Skype: schumaku

Re: QNAP firmware issues : Major file system bug

Post by schumaku »

@ralf, talking of TS-409/409 Pro as shown in your forum profile?
rprengel
Starting out
Posts: 44
Joined: Mon Nov 17, 2008 10:36 pm

Re: QNAP firmware issues : Major file system bug

Post by rprengel »

schumaku wrote:@ralf, talking of TS-409/409 Pro as shown in your forum profile?
No, an 859U.
I will change my profile.

Ralf
watterman
Starting out
Posts: 23
Joined: Sat Mar 20, 2010 6:41 am

Re: QNAP firmware issues : Major file system bug

Post by watterman »

Is there any update on this issue? I have been experiencing similar problems on a TS-219P+ for some time now.
I just read the release notes for firmware v3.5 and was disappointed that it does not seem to have been addressed.

My HDDs are SAMSUNG HD154UI 1AG0 (1.5TB, 7200rpm)

My real concern is whether this starts to become a problem on my TS-439P at work. That would really bring business to a grinding halt quicker than the Gillard gov!
So far, no signs of trouble on the TS-439.

cheers,
a.w.
AlexKe
Experience counts
Posts: 1820
Joined: Wed Jan 06, 2010 2:49 pm

Re: QNAP firmware issues : Major file system bug

Post by AlexKe »

Hi watterman

The issue is still under investigation. However, the slow response when reading a large number of files is believed to occur only on larger disk volumes, not on a 1.5TB single-disk or RAID 1 volume. I will post an update on the progress later.

In your case, please run a file system check and a bad block scan on each HDD.
//Support Resource//
Online User Manual: http://docs.qnap.com
Tech Support Form: http://www.qnap.com/en/index.php?lang=en&sn=4574
Download Center: http://www.qnap.com/en/index.php?lang=en&sn=848
dannydk
New here
Posts: 4
Joined: Mon Aug 22, 2011 9:09 pm

Re: QNAP firmware issues : Major file system bug

Post by dannydk »

After reading through most of this thread, hoping for some indication of a solution or positive steps towards one, I would also like to add to the information given by SuperMario and schumaku: I see this hanging smbd several times a day too.

I can think of many reasons why files get "busy" and data gets "corrupted" when this issue arises. It is probably not that the fault in the Samba subsystem or the journaling part corrupts data directly. Rather, when a file is being altered or written and another system or service triggers the lock-up, the corruption hits the other process, which actually did nothing "wrong". So in my mind it is more that using a system with this faulty condition is not safe - and QNAP had better get this fixed NOW!

My 809 has 8 2TB disks in one RAID 5 and holds several million files (large and small) in many thousands of folders (which QNAP can't have a copy of either :wink: ). I also have a 409 Pro myself - and just recommended that two others buy a 219P+ and a 659 Pro+ respectively. I really hope they don't have to experience the same situation, as the latter will be used in a business scenario and for customer upload access etc. (not via SAMBA, though - but local access will be affected).

And this leads to my info regarding all of this: it seems to me that only SAMBA access is affected, whilst you can still access the Admin Web GUI and other services even if SAMBA access is hung. I can trigger the condition as follows:
  • Let the NAS sit idle for some time; perhaps 5 minutes is enough, but perhaps more... say 15 minutes
  • Read access via SAMBA is fine at this point
  • Create a folder in some SAMBA share via SAMBA (from Windows)
  • At that point I can't name the new folder (which Windows usually initially names "New folder") but the parent folder view becomes unresponsive
  • The admin interface and FTP access still works - and I think other non-SAMBA subsystems probably do too
  • As said, either the folder view locks up for some seconds (20-40, someone mentions) or times out with an error to follow
  • The operation then completes (if the app performing the operation didn't time out itself) but all SAMBA access is unresponsive during the black-out period.
  • I have seen this being triggered by creating files too (not just directories)
The bottom line is that it was good to read SuperMario's clear description of the condition, and that I finally got a diagnosis that fits what I saw. I just hope this has the proper priority at QNAP - because the SAMBA service is probably the most vital function and must be absolutely rock stable!!! :!:
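
For anyone who wants to put numbers on the trigger steps above, here is a minimal Python sketch that times a single folder creation on a mounted share. It is only an illustration: the function name `time_mkdir`, the `stall-probe-` folder prefix and the 5-second threshold are assumptions chosen for this example, not anything QNAP or the posters provided.

```python
import os
import time
import uuid


def time_mkdir(parent, stall_threshold=5.0, cleanup=True):
    """Create one uniquely named folder under `parent` and time the call.

    Returns (elapsed_seconds, stalled). `stall_threshold` is an arbitrary
    cut-off picked for this sketch; the thread reports stalls of 20-40 s.
    """
    path = os.path.join(parent, "stall-probe-" + uuid.uuid4().hex)
    start = time.monotonic()
    os.mkdir(path)  # the operation reported to trigger the lock-up
    elapsed = time.monotonic() - start
    if cleanup:
        # Remove the probe folder again so the share is left unchanged.
        os.rmdir(path)
    return elapsed, elapsed >= stall_threshold
```

Point `parent` at the locally mounted share (a mapped drive letter on Windows, a mount point on Linux) after the NAS has been idle for a while, and compare the result with a second call made right afterwards.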
AlexKe
Experience counts
Posts: 1820
Joined: Wed Jan 06, 2010 2:49 pm

Re: QNAP firmware issues : Major file system bug

Post by AlexKe »

Hi dannydk,

Regarding files getting "busy" and data getting "corrupted" when running out of disk space, we will check whether any other system or service is triggering the lock-up. Thanks for your clear description and advice here.

We have already reproduced the issue in our lab. Our R&D team is investigating and needs more time to find a solution. The issue is related to the RAID volume, not to system services. When the issue occurs, all writes become slow or respond late.
AdrianW
Know my way around
Posts: 249
Joined: Thu Jul 10, 2008 6:17 pm

Re: QNAP firmware issues : Major file system bug

Post by AdrianW »

QNAPAlex wrote: When the issue occurs, all writes become slow or respond late.
It stalls all READ processes as well.
TS-853 Pro; TS-859 Pro; TS-409
SuperMario
Getting the hang of things
Posts: 75
Joined: Tue May 24, 2011 5:01 pm

Re: QNAP firmware issues : Major file system bug

Post by SuperMario »

Hello all,

I have had a few PMs and emails asking me to update this thread. Sadly, there is not much to say or add, except that, as discussed above, QNAP is aware of the issue, has replicated it, and is trying to fix it.

We got another NAS (a TS-659 Pro) and have backed up the most important 20% of our systems to it, but we are forced to continue using the other units for day-to-day operations; we can't just stop all work whilst QNAP is trying to fix it. We have tried to minimise all writes to the NAS units, but that isn't really a workable solution.

What I can tell you is that as you continue to use the system, the problem gets worse. Now we have the stall issue being triggered on simple file creation (not just directories) and the stall lasts for ~2 minutes now. So as the volume grows, so does the length of the lockup of the guest SAMBA processes. As someone else posted above, when the stall occurs, all SAMBA processes are stalled, even reads.

Given the nature of the issue and how it happens, my gut feeling is that it's a file system issue (I'm not sure whether it is RAID related, and I'm not willing to go and build a large JBOD to confirm that). It is very obviously affected by the size of the volume: as the volume grows, so does the severity of the problem, so the two are clearly linked.

My best guess is that the filesystem handler can't cache/buffer enough of the directory tables and has to start paging for extra memory or something like that. As I said earlier, we don't run any other services on our units, so they normally sit with unloaded CPU (~2% usage, normalized) with more than 66% of their memory free and unused all of the time, so I'd have expected that the low level filesystem would be able to grab as much RAM as it needs, unless there is some artificial limit stopping that from happening and forcing disc swapping to virtual memory?

Does anyone know enough about the Linux variant used by QNAP to say whether it's possible to alter the amount of system RAM that can be allocated to the filesystem for journaling and housekeeping backend processes? It would be a good test to give it a lot more memory and see whether that affects the issue.
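
One way to check the memory and swap question from a shell on the NAS is to look at the standard Linux procfs counters. The sketch below is a hedged example: it assumes the firmware exposes `/proc/meminfo` and `/proc/sys/vm/vfs_cache_pressure` as a stock Linux kernel does, which has not been confirmed for QNAP units in this thread, and the helper names are made up for the example. It only reads values and changes nothing.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])  # most fields are in kB
    return values


def swap_used_kb(values):
    """Swap currently in use, derived from SwapTotal and SwapFree."""
    return values.get("SwapTotal", 0) - values.get("SwapFree", 0)


def read_proc(path):
    """Return the file content, or None if the file is missing or unreadable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


if __name__ == "__main__":
    meminfo = read_proc("/proc/meminfo")
    if meminfo is not None:
        info = parse_meminfo(meminfo)
        for key in ("MemFree", "Buffers", "Cached", "Slab"):
            print(key, info.get(key), "kB")
        print("Swap used", swap_used_kb(info), "kB")
    # vfs_cache_pressure influences how eagerly the kernel reclaims the
    # directory and inode caches; assumed to exist on the NAS kernel.
    pressure = read_proc("/proc/sys/vm/vfs_cache_pressure")
    print("vfs_cache_pressure:", pressure.strip() if pressure else "not available")
```

If swap usage stays at zero while the stall happens, the "forced to use virtual memory" guess becomes less likely; if it climbs, that would support it.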

Mario.
SuperMario
Getting the hang of things
Posts: 75
Joined: Tue May 24, 2011 5:01 pm

Re: QNAP firmware issues : Major file system bug

Post by SuperMario »

Quickly to add to my previous post:

I just did a test on a hunch about the filesystem, and I can confirm that it's definitely the file system - not RAID at fault.

There is a definite and provable effect on the issue depending on where the files/directories are on the volume.

I.e.: if I try to create a file/directory in a very old directory (as in, one made a long time ago when the volume was new, such as the root directory) and the last access was in a new directory (one created in the last few days), then the issue is triggered every time.

So the issue correlates with the position of the directory in time (when it was created), and therefore with its position in the volume, ergo on disk. This to me proves that the issue is with the filesystem and the way that the file system is being maintained. It also explains why, when an older directory is removed and a new one is immediately created, the next few write accesses occur without triggering the problem.

Quite obviously, there is a problem with how much of the file system housekeeping and journals and whatever other data is required by EXT3/4 can be kept in system RAM at once.

Question for QNAP: What is the size of the buffer/s allocated to the file system backend processes before they are forced to use VM?

Mario.
SuperMario
Getting the hang of things
Posts: 75
Joined: Tue May 24, 2011 5:01 pm

Re: QNAP firmware issues : Major file system bug

Post by SuperMario »

Addition:

To further clarify that this is absolutely a file system problem, I just performed a number of tests. The result:

On the unit that has the largest volume, and therefore exhibits the problem worst, if I simply attempt to create a file on it (i.e. copy a file to it), the stall now always occurs.

However, if I go back up one directory (i.e. to its parent) and create a new directory there, I get the stall. If I then, after the timeout period, delete the directory I just created and attempt a file create (i.e. another file copy) in the original sub-directory, there is *no* stall.

This is most definitely a file system issue.
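
The test sequence described above can be scripted so that others can compare timings on their own units. This is a rough Python sketch under stated assumptions: the names `run_sequence`, `create_file` and `timed` and the `probe-` file names are invented for this example, and it has to be run against a path on the mounted share to mean anything.

```python
import os
import time
import uuid


def timed(action):
    """Run `action` and return how long it took, in seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def create_file(directory):
    """Create a one-byte probe file in `directory` and return its path."""
    path = os.path.join(directory, "probe-" + uuid.uuid4().hex + ".tmp")
    with open(path, "wb") as handle:
        handle.write(b"x")
    return path


def run_sequence(sub_dir):
    """File create, mkdir in the parent, rmdir, then file create again."""
    parent = os.path.dirname(os.path.abspath(sub_dir))
    new_dir = os.path.join(parent, "probe-dir-" + uuid.uuid4().hex)
    created = []
    results = {}
    results["file_before"] = timed(lambda: created.append(create_file(sub_dir)))
    results["mkdir_in_parent"] = timed(lambda: os.mkdir(new_dir))
    results["rmdir_in_parent"] = timed(lambda: os.rmdir(new_dir))
    results["file_after"] = timed(lambda: created.append(create_file(sub_dir)))
    for path in created:
        os.remove(path)  # clean up the probe files
    return results
```

If the behaviour matches the description, `file_before` and `mkdir_in_parent` should show the stall while `file_after` should not.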

Mario.
sh4d0wt3ch
Starting out
Posts: 17
Joined: Wed May 25, 2011 5:19 am

Re: QNAP firmware issues : Major file system bug

Post by sh4d0wt3ch »

I had the same problem and I had posted my problem on a similar thread:

http://forum.qnap.com/viewtopic.php?f=189&t=47849

I ran some tests, and the last thing I remember before everything slowed down is that I copied a bunch of small files onto the NAS. I had very little free disk space (300GB out of 6TB, that's really low) but I didn't have any slowness problem. It appeared when I started copying about 20GB of pictures in around 12,500 files and 60GB of music in 25,000 files. Deleting both of those folders brought the NAS back to its previous, fast performance. To make sure that I hadn't just hit a critical space issue, I then copied about the same amount of data, roughly 80GB, made up of big files (4GB-12GB each), and it was still fast.

So I can conclude that the performance hit can be attributed to the number of files and not to the space they occupy on the NAS. How many files it takes, I'm not sure.
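
To compare file count against occupied space when reporting numbers here, a short Python sketch like the following can be run against a share or folder. The function name `tree_stats` is just a label for this example, and walking several million files over SAMBA will itself take a long time.

```python
import os


def tree_stats(root):
    """Return (number of regular files, total size in bytes) below `root`."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.islink(full_path):
                continue  # skip symlinks so nothing is counted twice
            file_count += 1
            total_bytes += os.path.getsize(full_path)
    return file_count, total_bytes
```

Running it on the pictures and music folders and on the folder with the big files would give the two figures side by side.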
sh4d0wt3ch
Starting out
Posts: 17
Joined: Wed May 25, 2011 5:19 am

Re: QNAP firmware issues : Major file system bug

Post by sh4d0wt3ch »

Update:

The NAS went back to being slow again after copying a few more big files. Waiting for a solution. :(