TS-253A Self-shutdown caused by segfault error

Posted: Sat Dec 16, 2017 5:00 pm
by sergey_taurus
My almost brand-new (2 weeks uptime) QNAP TS-253A has shut itself down twice in the past few days.

System configuration details:
Firmware version Build 20171213
RAID1, full volume encryption

The system seems to shut itself down gracefully, but I see no errors in the system log, just a shutdown event:
Information 2017/12/15 21:16:45 System localhost System was shut down on Fri Dec 15 21:16:45 PST 2017.
The system beeps twice, blinks the power button and then gracefully shuts itself down. Upon boot there are no filesystem errors reported, which means it was not an unexpected hardware or power failure (those usually cause a 'bad filesystem' state to be reported). It is clearly some kind of silent, watchdog-triggered graceful shutdown.

I've looked through /mnt/HDA_ROOT/.logs as well.

[/mnt/HDA_ROOT/.logs] # ls -l | grep kmsg
-rw-r--r-- 1 admin administ 73843 Dec 15 23:20 kmsg
-rw-r--r-- 1 admin administ 937136 Dec 15 23:15 kmsg.1
-rw-r--r-- 1 admin administ 93131 Nov 11 00:34 kmsg.2
-rw-r--r-- 1 admin administ 50 Dec 15 23:15 kmsg_time

kmsg.1 has a timestamp close to the shutdown on 12/15/2017, with nothing suspicious except a Chrome segfault followed by what looks like a system shutdown:

<6>[92807.414820] Chrome_ChildThr[20276]: segfault at 0 ip 00007f0cb3566585 sp 00007f0cb14f3b80 error 6 in[7f0cb28ad000+3f80000]
<4>[92854.834526] iscsi target qlog cleanup successfully.
<6>[92854.965488] fbdisk: module unloaded.
<4>[92875.875247] nfsd: last server has exited, flushing export cache
<6>[92898.260509] device eth0 left promiscuous mode
<6>[92898.265171] br0: port 1(eth0) entered disabled state
<6>[92898.569951] 8021q: adding VLAN 0 to HW filter on device bond0
<6>[92900.530538] bond0 (unregistering): Released all slaves
<6>[92904.643495] usbcore: deregistering interface driver cdc_ether
<6>[92904.660456] usbcore: registered new interface driver cdc_ether
<4>[92904.662522] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
<6>[92904.662536] NFSD: starting 90-second grace period (net ffffffff81cfb080)

In fact, both times the NAS shut itself down, HD Station was in use with Firefox open. Nothing crazy, just light browsing.
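
For anyone who wants to pull the timing out of kmsg programmatically, here is a minimal Python sketch. It assumes the usual kernel log layout: the number in angle brackets is the log level (6 = info, 4 = warning) and the bracketed number is seconds since boot. The function names `parse_kmsg` and `parse_segfault` are made up for illustration and are not part of any QNAP tool.

```python
import re

# Generic kmsg line: "<level>[ seconds.micros] message".
KMSG_RE = re.compile(r"^<(?P<level>\d)>\[\s*(?P<uptime>\d+\.\d+)\]\s*(?P<msg>.*)$")

# Segfault message body as it appears in this thread. The process name is
# optional because some dumped lines carry only "[pid]".
SEGV_RE = re.compile(
    r"^(?P<proc>[^\[\s]*)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>\d+)"
)


def parse_kmsg(line):
    """Return (level, uptime_seconds, message), or None if the line does not match."""
    m = KMSG_RE.match(line.strip())
    if m is None:
        return None
    return int(m["level"]), float(m["uptime"]), m["msg"]


def parse_segfault(line):
    """Return a dict describing a segfault line, or None for any other line."""
    parsed = parse_kmsg(line)
    if parsed is None:
        return None
    _level, uptime, msg = parsed
    m = SEGV_RE.match(msg)
    if m is None:
        return None
    return {
        "uptime": uptime,
        "proc": m["proc"],
        "pid": int(m["pid"]),
        "addr": int(m["addr"], 16),
        "ip": int(m["ip"], 16),
        "sp": int(m["sp"], 16),
        "error": int(m["error"]),
    }


# First two lines of the kmsg.1 excerpt above.
excerpt = [
    "<6>[92807.414820] Chrome_ChildThr[20276]: segfault at 0 ip 00007f0cb3566585 "
    "sp 00007f0cb14f3b80 error 6 in[7f0cb28ad000+3f80000]",
    "<4>[92854.834526] iscsi target qlog cleanup successfully.",
]

crash = parse_segfault(excerpt[0])
next_event = parse_kmsg(excerpt[1])
gap = next_event[1] - crash["uptime"]
print(f"{crash['proc']} (pid {crash['pid']}) crashed {gap:.1f} s before: {next_event[2]}")
```

On the excerpt above this puts the Chrome segfault roughly 47 seconds before the first service teardown message. That is suggestive of a link between the two, but it does not prove one.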

Just wondering if anyone has seen this before and/or has any ideas which logs I can check to root-cause this issue.
I did open a support request with QNAP but have had no response yet.


Re: TS-253A Self-shutdown

Posted: Sat Dec 16, 2017 5:17 pm
by OneCD
Hi and welcome to the forum. :)

sergey_taurus wrote: error 6 in[7f0cb28ad000+3f80000]

Wow - QNAP are still using XULRunner? :S

Re: TS-253A Self-shutdown

Posted: Mon Dec 18, 2017 7:59 am
by sergey_taurus
Happened again today. Again, coincidentally, while running Firefox in HD Station. As last time, no system events were reported.
After I heard the beep, I was able to quickly jump into the WebUI and check the system status before the NAS shut itself down.
No software/hardware issues, CPU temp 55°C, system temp 41°C, disk temp 40°C.

Installed Chrome in HD Station, will see if running it instead of Firefox will cause the same effect.
No response from QNAP support for 2+ days.

Does anybody have any idea what logs to check if WebUI reports "all clear"?


Re: TS-253A Self-shutdown

Posted: Wed Dec 20, 2017 8:43 am
by sergey_taurus
Got a response back from QNAP support. After reviewing the log dumps I provided, the support engineer suspects the 3rd-party RAM is at fault, referencing the following segfault events:

<6>[13929.204243][3937]: segfault at 26cb614 ip 00007f61df6fe342 sp 00007ffe41c38420 error 4 in[7f61df6ef000+1f000]
<6>[52161.581635][29186]: segfault at 253ae22 ip 00007f7cb711d342 sp 00007ffee62e9e60 error 4 in[7f7cb710e000+1f000]
<6>[ 380.850163] sysRequest.cgi[29202]: segfault at 0 ip 00007efd019c2a06 sp 00007ffdb2f44708 error 4 in[7efd0189b000+199000]
<6>[81124.020822] Chrome_ChildThr[29332]: segfault at 0 ip 00007f1e75266585 sp 00007f1e731f3b80 error 6 in[7f1e745ad000+3f80000]
<6>[ 227.075973] sysRequest.cgi[15851]: segfault at 7ffdcf9a0008 ip 00007fdb76b51321 sp 00007ffdcf965820 error 4 in[7fdb76b4f000+4000]
<6>[181620.943316][31895]: segfault at 2639f9f ip 00007f65b89b6342 sp 00007fff9391e1e0 error 4 in[7f65b89a7000+1f000]
<6>[217652.833407][2123]: segfault at 190fc30 ip 00007f2b4c18a342 sp 00007fffdf0b0150 error 4 in[7f2b4c17b000+1f000]
<6>[219492.336685][30143]: segfault at 4 ip 00007ffbaf5390d2 sp 00007ffc4ece4c50 error 6 in[7ffbaf485000+199000]
<6>[247677.314334][2488]: segfault at 1c28873 ip 00007f1e8d1e8342 sp 00007ffc0e4f2580 error 4 in[7f1e8d1d9000+1f000]
<6>[92807.414820] Chrome_ChildThr[20276]: segfault at 0 ip 00007f0cb3566585 sp 00007f0cb14f3b80 error 6 in[7f0cb28ad000+3f80000]
<6>[135885.538414] Chrome_ChildThr[14929]: segfault at 0 ip 00007f9c1f066585 sp 00007f9c1cff3b80 error 6 in[7f9c1e3ad000+3f80000]

The RAM was in fact upgraded. I am planning to put the stock 2GB+2GB modules back in to see if it helps.
Still, it is very weird that the unexpected self-shutdown happens while running Firefox in HD Station... Apart from that, the unit runs just fine for days.
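
As a side note on reading those lines: on x86, the number after "error" in a kernel segfault message is the page-fault error code, a small bit field. The sketch below decodes it. The function name `decode_page_fault_error` is hypothetical, and the bit meanings are the standard x86 ones (bit 0 = protection violation vs. page not present, bit 1 = write vs. read, bit 2 = user vs. kernel mode, bit 3 = reserved bit set, bit 4 = instruction fetch).

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the x86 page-fault error code printed after "error" in a segfault line."""
    return {
        "cause": "protection violation" if code & 0x01 else "page not present",
        "access": "write" if code & 0x02 else "read",
        "mode": "user" if code & 0x04 else "kernel",
        "reserved_bit_set": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


# The two codes that appear in the list above.
for code in (4, 6):
    info = decode_page_fault_error(code)
    print(f"error {code}: {info['mode']}-mode {info['access']}, {info['cause']}")
```

So "error 4" is a user-mode read of an unmapped address and "error 6" is a user-mode write to one. Combined with "segfault at 0", the Chrome_ChildThr entries look like writes through a null pointer, which is a common software bug pattern, although on its own this does not rule out a hardware cause.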

Re: TS-253A Self-shutdown

Posted: Wed Dec 20, 2017 7:13 pm
by storageman
Good you finally told us about the RAM. What RAM did you use?

Re: TS-253A Self-shutdown

Posted: Thu Dec 21, 2017 1:59 am
by sergey_taurus
Yeah, sorry, I totally forgot to mention the RAM upgrade from the stock 4GB (2GBx2) Transcend :)

I happened to have a Crucial 16GB Kit (8GBx2) DDR3/DDR3L 1600 MT/s (PC3-12800) Unbuffered SODIMM 204-Pin Memory - CT2KIT102464BF160B and decided to use it. The specs for the TS-253A say 8GB max, as does the Intel Celeron N3160 datasheet, but based on this post 16GB worked fine for multiple customers: viewtopic.php?f=12&t=96171#p440150

Re: TS-253A Self-shutdown

Posted: Thu Dec 21, 2017 2:24 am
by Trexx
That is the same Crucial kit I used in my TS-453Pro when I had it, but that was a different processor than the x53a series uses, so it may or may not be compatible. It could also be a bad DIMM.

I would try creating a memtest+ boot USB and letting that run for a while on the memory to see if it flags one DIMM or not.

Re: TS-253A Self-shutdown

Posted: Mon Jan 01, 2018 7:13 am
by sergey_taurus
Ran memtest+ overnight on the Crucial 16GB (8GBx2) kit. No issues as far as I can tell from the results (attached).

The fact is that I have not had any more self-shutdown events since putting back the stock 4GB (2GBx2) Transcend RAM.

I guess it could well be that the TS-253A and/or its specific chipset does not in fact fully support 16GB of RAM, unlike the TS-251/TS-253, which were reported working just fine in viewtopic.php?f=12&t=96171#p440150

Re: TS-253A Self-shutdown

Posted: Wed Jan 03, 2018 1:09 pm
by sergey_taurus
The issue reappeared today with the NAS running on the stock 4GB (2GBx2) Transcend RAM. Clearly, the RAM is not the root cause of the self-shutdowns and segfault errors here.
Symptoms are the same: the device shut itself down while running Firefox in HD Station (which was upgraded from 3.2.1 to 3.2.3 with the recent firmware upgrade).
Back to QNAP support it goes.


Re: TS-253A Self-shutdown

Posted: Thu Jan 04, 2018 4:46 am
by sergey_taurus
It feels like QNAP support is shooting in the dark: I was asked today to switch AC power from the UPS to a direct feed...

The NAS had this issue (segfault errors) when it was connected to AC power directly and running on upgraded RAM.
The NAS still has this issue (segfault errors) when connected to a brand-new pure sine-wave UPS and running on stock RAM.

The issue is present on UPS and direct power, on upgraded and stock RAM, on old and new firmware. To me it is something else, and we are going the wrong way with troubleshooting.
It sounds more like some general hardware issue or a software bug in HD Station and/or the Firefox app it is running. I will unplug the NAS from the UPS just to make QNAP support happy, but I am more than certain that it won't help.


Re: TS-253A Self-shutdown caused by segfault error

Posted: Tue Apr 03, 2018 1:21 pm
by sergey_taurus
Switching from the pure sine-wave UPS to direct AC power indeed did not help. The issue came back and the NAS has shut itself down again in exactly the same fashion, running on stock RAM, the latest and greatest firmware, direct AC power: everything QNAP support asked for.

System log (immediately preceding the NAS shutdown):

Information 2018/04/02 20:31:18 System localhost System was shut down on Mon Apr 2 20:31:18 PDT 2018.
Information 2018/04/02 20:30:21 System localhost [Network & Virtual Switch]Stop Network & Virtual Switch.
Information 2018/04/02 20:30:11 System localhost Stopping Plex Media Server...
Information 2018/04/02 00:32:03 System localhost [Disk S.M.A.R.T.] Host: Disk 2 Rapid Test result: Completed without error.
Information 2018/04/02 00:30:00 System localhost [Disk S.M.A.R.T.] Host: Disk 2 Rapid Test started.
Information 2018/04/02 00:17:00 System localhost [Disk S.M.A.R.T.] Host: Disk 1 Rapid Test result: Completed without error.
Information 2018/04/02 00:15:00 System localhost [Disk S.M.A.R.T.] Host: Disk 1 Rapid Test started.

KLOG entries:

ata2: hard resetting link -- Count:8
ata1: hard resetting link -- Count:4
[3937]: segfault at 26cb614 ip007f61df6fe342 sp007ffe41c38420 error 4 in[7f61df6e0+10] -- Count:0
[29186]: segfault at 253ae22 ip007f7cb711d342 sp007ffee62e9e60 error 4 in[7f7cb7100+10] -- Count:0
sysRequest.cgi[29202]: segfault at 0 ip007efd019c2a06 sp007ffdb2f44708 error 4 in[7efd01890+190] -- Count:0
Chrome_ChildThr[29332]: segfault at 0 ip007f1e75266585 sp007f1e731f3b80 error 6 in[7f1e745a0+3f00] -- Count:0
sysRequest.cgi[15851]: segfault at 7ffdcf908 ip007fdb76b51321 sp007ffdcf965820 error 4 in[7fdb76b40+0] -- Count:0
[31895]: segfault at 2639f9f ip007f65b89b6342 sp007fff9391e1e0 error 4 in[7f65b89a0+10] -- Count:0
[2123]: segfault at 190fc30 ip007f2b4c18a342 sp007fffdf0b0150 error 4 in[7f2b4c170+10] -- Count:0
[30143]: segfault at 4 ip007ffbaf5390d2 sp007ffc4ece4c50 error 6 in[7ffbaf480+190] -- Count:0
[2488]: segfault at 1c28873 ip007f1e8d1e8342 sp007ffc0e4f2580 error 4 in[7f1e8d1d0+10] -- Count:0
Chrome_ChildThr[20276]: segfault at 0 ip007f0cb3566585 sp007f0cb14f3b80 error 6 in[7f0cb28a0+3f00] -- Count:0
Chrome_ChildThr[14929]: segfault at 0 ip007f9c1f066585 sp007f9c1cff3b80 error 6 in[7f9c1e3a0+3f00] -- Count:0
[32095]: segfault at 4 ip007f7363b670d2 sp007ffd382710c0 error 6 in[7f7363ab0+190] -- Count:0
Chrome_ChildThr[9915]: segfault at 0 ip007f5f85d66585 sp007f5f83cf3b80 error 6 in[7f5f850a0+3f00] -- Count:0
Chrome_ChildThr[31]: segfault at 0 ip007fd217a66585 sp007fd2159f3b80 error 6 in[7fd216da0+3f00] -- Count:0
Chrome_ChildThr[9352]: segfault at 0 ip007f08fc466585 sp007f08fa3f3b80 error 6 in[7f08fb7a0+3f00] -- Count:0
Chrome_ChildThr[29309]: segfault at 0 ip007f8fdbf66585 sp007f8fd9ef3b80 error 6 in[7f8fdb2a0+3f00] -- Count:0
exception -- Count:6

I am no Linux expert: are those segfaults more likely hardware-related or software-related? The non-stock RAM tests above did not reveal any issues, and the issue is present on stock RAM as well.
To me, this is something else, either some other hardware issue (motherboard?) or some firmware issue causing incorrect RAM addressing.
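
One rough way to probe the hardware-vs-software question is to check whether the crashes land at the same place in the same code. The sketch below takes the segfault lines QNAP support listed earlier in this thread and subtracts the mapping base (the first number inside `in[...]`) from the instruction pointer. The absolute addresses differ from crash to crash, as expected when code is loaded at randomized addresses, but the offset within the mapping should stay the same if the same instruction faults each time. The helper name `crash_offsets` is made up for illustration, and the interpretation is a heuristic, not a diagnosis.

```python
import re
from collections import Counter

# Segfault lines exactly as listed by QNAP support earlier in this thread.
LINES = """\
<6>[13929.204243][3937]: segfault at 26cb614 ip 00007f61df6fe342 sp 00007ffe41c38420 error 4 in[7f61df6ef000+1f000]
<6>[52161.581635][29186]: segfault at 253ae22 ip 00007f7cb711d342 sp 00007ffee62e9e60 error 4 in[7f7cb710e000+1f000]
<6>[ 380.850163] sysRequest.cgi[29202]: segfault at 0 ip 00007efd019c2a06 sp 00007ffdb2f44708 error 4 in[7efd0189b000+199000]
<6>[81124.020822] Chrome_ChildThr[29332]: segfault at 0 ip 00007f1e75266585 sp 00007f1e731f3b80 error 6 in[7f1e745ad000+3f80000]
<6>[ 227.075973] sysRequest.cgi[15851]: segfault at 7ffdcf9a0008 ip 00007fdb76b51321 sp 00007ffdcf965820 error 4 in[7fdb76b4f000+4000]
<6>[181620.943316][31895]: segfault at 2639f9f ip 00007f65b89b6342 sp 00007fff9391e1e0 error 4 in[7f65b89a7000+1f000]
<6>[217652.833407][2123]: segfault at 190fc30 ip 00007f2b4c18a342 sp 00007fffdf0b0150 error 4 in[7f2b4c17b000+1f000]
<6>[219492.336685][30143]: segfault at 4 ip 00007ffbaf5390d2 sp 00007ffc4ece4c50 error 6 in[7ffbaf485000+199000]
<6>[247677.314334][2488]: segfault at 1c28873 ip 00007f1e8d1e8342 sp 00007ffc0e4f2580 error 4 in[7f1e8d1d9000+1f000]
<6>[92807.414820] Chrome_ChildThr[20276]: segfault at 0 ip 00007f0cb3566585 sp 00007f0cb14f3b80 error 6 in[7f0cb28ad000+3f80000]
<6>[135885.538414] Chrome_ChildThr[14929]: segfault at 0 ip 00007f9c1f066585 sp 00007f9c1cff3b80 error 6 in[7f9c1e3ad000+3f80000]
""".splitlines()

# Process name is optional: several lines in the dump carry only "[pid]".
SEGV_RE = re.compile(
    r"(?P<proc>[^\[\]\s]*)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+ "
    r"ip (?P<ip>[0-9a-f]+) .* in\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def crash_offsets(lines):
    """Count crashes per (process name, instruction offset inside the mapping)."""
    counts = Counter()
    for line in lines:
        m = SEGV_RE.search(line)
        if m is None:
            continue  # not a segfault line
        offset = int(m["ip"], 16) - int(m["base"], 16)
        counts[(m["proc"] or "(unnamed)", offset)] += 1
    return counts


for (proc, offset), n in crash_offsets(LINES).most_common():
    print(f"{proc:16} offset {offset:#x}  x{n}")
```

On these lines, all three Chrome_ChildThr crashes fall at offset 0xcb9585 and all five unnamed "+1f000" crashes fall at offset 0xf342. Repeated faults at an identical code location are more typical of a reproducible software bug than of random memory corruption, which would be expected to hit varying locations. That is only a rule of thumb, though, and it says nothing about why the NAS then shuts down.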

Please advise if anyone has seen this before on QNAP and knows how to get to the root cause, as QNAP support seems to be going in circles, shooting in the dark and buying time by asking for every possible combination of components and hardware. I expect that now, with all options exhausted, I will be asked to swap the drives for some specific ones and then wait again...


Re: TS-253A Self-shutdown caused by segfault error

Posted: Tue Sep 04, 2018 10:08 pm
by barrie.kenyon
I have a TS-453A, 2 x 2TB disks, which has been experiencing similar shutdowns, seemingly at random but regularly. This is with various releases of QTS over the past 6 months, since I upgraded the memory.

I considered temperature, as it has been hotter than usual here in the UK, but struggled to spot an obvious pattern. As above, it seemed to just stall, often in the middle of downloading/uploading files, with the web GUI remaining active for some minutes before also stalling. It then needed a hard power-off from the front button, followed by a power-on and a disk check.

Eventually I loaded QNAP's Diagnostic Tools and downloaded the logs immediately after another stall. They were sent off to QNAP Support, and very quickly I got a reply that the memory was non-standard: replace it with standard QNAP memory and see if it continues to occur.

The memory installed was a single 8GB SK Hynix HMT41GS6AFR8A-PB, 8GB 2RX8 PC3L 12800S-11-13-F3

Although not easy to find, there are several threads on the forum that discuss memory problems, specifically regarding upgrading the TS-x53A to more than the recommended 8GB total. These discussions indicate that the NAS is sensitive to the make of RAM installed, with several people experiencing similar random stalls. They seem mostly to reach the conclusion that major manufacturer memory is safer and more stable, with Kingston and Crucial being preferred solutions.

I have now replaced my single 8GB stick with a 2 x Crucial 4GB CT51264BF160BJ kit.

So far (only 2 days) there has been no interruption, everything is working perfectly. I have also just upgraded to the latest 4.3.5 beta without issue (I know, only ever change one thing at a time when researching a problem).

I will post back in a week or so and let you know if it has fixed the stability problem.