"Disk Failed", then "Disk Unplugged" errors

shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

My model is the TS-853A - 8 bay

I am also about to attempt this same solder joint. I have high confidence in the soldering part of it and picking the right pins.

Would it be bad to just jump all 8 mosfets on mine, so any other bays are fixed then and there?
Also, will the drives still spin down with power saving enabled if I do this fix?
----
Regardless of the questions above, I decided to do all the mosfets; if power saving didn't work well, I would still have a server...
After my 4th rebuild I got home to find HDD 1 had been ejected. This is the first time this one has been an issue, so now bays 1, 3, 4 and 5 have all had issues. I will be attempting to jump the mosfets tomorrow. I had also completed a full SMART scan, which took about as long as the entire RAID rebuild. No errors at all on any of the 5 disks plus 1 SSD.
----
Just found 6 mosfets, which was confusing for an 8-bay NAS. Anyway, after soldering the 6 freakishly tiny joints in various ways, it's back online and again doing a rebuild... 30 hours until I know if it was successful. At a minimum the drives are online, so that's a plus. I had an issue with some of the mosfets, so I had to remove 2.
KDallas74
First post
Posts: 1
Joined: Thu Nov 14, 2019 8:31 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by KDallas74 »

I have an old trusty model TS-412 that has been running for 6 or 7 years with 4x 3TB Toshiba drives. Never had an issue until I decided to replace the drives with WD Reds.

It's a long sordid tale, but suffice to say after reading through the last 20 pages or so I'm pretty convinced the changing of the drives has killed them. I'm seeing all the symptoms of a back-plane issue: the Abnormal SMART reading, the "drive unplugged" or "no disk" error... particularly when first inserting the drives and moments after the rebuild is started.

I live in a hot climate, so I wonder if it's a case of a dry, rigid PCB combined with dry solder joints. Perhaps after years of being untouched, the brutality of swapping out the drives and applying pressure along the SATA connectors tips it over the edge!

Kudos to the people who posted their attempts at soldering their boards. Anyway I needed to register on this forum just so I could see the attached images. I wanted to collate the results for myself, so I hacked together a quick page with the posts from those helpful souls. Figured if it'll help someone else (you could make it into a PDF for yourself or whatever), I've shared the posts here: https://www.weblabsperth.com.au/qnap_nas_backplane_fix/
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

Well, unfortunately, I got the same error within about 4 hours of the rebuild finishing.

4191, 2,2019-11-16,09:41:08,System,127.0.0.1,localhost,[Hardware Status] Failed to access the installed drive. Check the drive for errors., 41,1573857668,A049,Hardware Status,C009,I/O Ports,
4192, 2,2019-11-16,09:43:11,System,127.0.0.1,localhost,[Hardware Status] Failed to access the installed drive. Check the drive for errors., 41,1573857791,A049,Hardware Status,C009,I/O Ports,
4193, 0,2019-11-16,09:43:33,System,127.0.0.1,localhost,[Storage & Snapshots] Finished hot-removing disk "Host: Disk 4".,521,1573857813,A002,Storage & Snapshots,C001,Volume,
4194, 1,2019-11-16,09:43:34,System,127.0.0.1,localhost,[Storage & Snapshots] RAID group "1" is degraded. Storage pool: 1.,103,1573857814,A002,Storage & Snapshots,C003,RAID Group,
4195, 2,2019-11-16,09:43:34,System,127.0.0.1,localhost,[Hardware Status] "Host: Disk 4": Disconnected., 37,1573857814,A049,Hardware Status,C008,Drives,
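To line entries like these up against a rebuild, it can help to split each exported line into fields and read the 10-digit number as a Unix timestamp. What follows is only a minimal sketch: the field names (`event_id`, `severity`, `epoch` and so on) are guesses from the five lines above, not a documented QNAP format, and `parse_event` is a hypothetical helper.

```python
from datetime import datetime, timezone

# Two of the lines from the log excerpt above, copied as-is.
SAMPLE = [
    '4191, 2,2019-11-16,09:41:08,System,127.0.0.1,localhost,[Hardware Status] Failed to access the installed drive. Check the drive for errors., 41,1573857668,A049,Hardware Status,C009,I/O Ports,',
    '4195, 2,2019-11-16,09:43:34,System,127.0.0.1,localhost,[Hardware Status] "Host: Disk 4": Disconnected., 37,1573857814,A049,Hardware Status,C008,Drives,',
]


def parse_event(line):
    """Split one exported event-log line into a dict.

    The column meanings are assumptions inferred from the sample lines.
    """
    # Drop the trailing comma, then peel fixed fields off both ends so a
    # comma inside the message text cannot shift the columns.
    head = line.strip().rstrip(",").split(",", 7)
    message, number, epoch, app_code, app, cat_code, cat = head[7].rsplit(",", 6)
    return {
        "event_id": int(head[0]),
        "severity": int(head[1]),      # meaning of 0/1/2 is not confirmed
        "local_date": head[2],
        "local_time": head[3],
        "message": message,
        "epoch": int(epoch),
        "when_utc": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "app_code": app_code,
        "category_code": cat_code,
        "category": cat,
    }


if __name__ == "__main__":
    events = [parse_event(line) for line in SAMPLE]
    gap = events[1]["when_utc"] - events[0]["when_utc"]
    print(f"{gap.total_seconds():.0f} s between first failure and disconnect")
```

Peeling fields from both ends is a deliberate choice here, since the message text in the middle is the only part that might contain commas of its own.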

I am almost out of ideas now and am having to contemplate a new system...
stuartsm
New here
Posts: 3
Joined: Sun Nov 17, 2019 2:37 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by stuartsm »

Just wondering, do they use the same backplanes in the 653 Pro? I haven't seen this issue so far, but I won't be surprised if I do. Mine has been running continuously since original purchase. I have no idea when I bought it, but it was current at the time.


Incidentally, I have seen the same issue on a Thecus NAS and wonder if it may be a similar problem. It was always the same drive, which makes me wonder if it was one MOSFET failing. I eventually gave up on it and bought the 653 Pro, which has been fantastic, but I am getting to the point where I will have to upgrade all the disks, as it has 6 x 4TB installed in RAID5. The next upgrade would seem to be a move to 8TB disks, but the price is a bit scary, as is the thought of doing 6 rebuilds to replace them one at a time.

Stuart
stuartsm
New here
Posts: 3
Joined: Sun Nov 17, 2019 2:37 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by stuartsm »

Shirec, if you compare where you have soldered the wires, with https://www.weblabsperth.com.au/qnap_nas_backplane_fix/, I think you will find that you have soldered one of the wires to the wrong pins. It appears that you have soldered a gate to a drain, rather than a source to a drain. Check the photos and let us know what you find.

Stuart
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

Thanks for the reply. I checked the posts, and I confirmed that it's the 4957agm. I have done D1 to S1 and D2 to S2. I metered them out and they all have a direct connection back to the board's main 5v and 12v pins from the power supply. I tried removing the mosfet entirely for hdd4 a few days ago. Last night hdd1 and hdd3 both threw the error again.

Also, I am not sure if it's relevant to the cause, but I only identified 6 mosfets and it's an 8-bay unit. It appeared that bays 1, 2 and 3 were run off the same mosfet, yet just then only hard drives 1 and 3 failed. I am sure there is some connection here but I don't know enough about the way this runs. Hopefully someone can decode my message. :)

Very annoying considering the issue is identical to this post. I am concerned now that there might be a power supply issue. I get the same kernel log error too, so I checked that first.

I appreciate the help so far. I might try and remove the mosfets entirely on the rest to see what that does. Honestly at this point I am flat out of ideas.

I am probably missing something entirely. I appreciate your help.

I would also really like to know what these error codes in this log message are. This error came up within 1 hour of the rebuild finishing which took 2 days to finish.

[Hardware Status] Failed to access the installed drive. Check the drive for errors., 41,1574003799,A049,Hardware Status,C009,I/O Ports

<2>[127466.103318] md/raid:md1: Disk failure on sda3, disabling device.
<2>[127466.103318] md/raid:md1: Operation continuing on 4 devices.
<6>[127466.115155] md/raid:md1: report qnap hal event: type = HAL_EVENT_RAID, action = SET_RAID_PD_ERROR
<6>[127466.124177] md/raid:md1: report qnap hal event: raid_id=1, pd_name=/dev/sda3, spare=/dev/(null), pd_repair_sector=0
<4>[127466.135143] blk_partition_remap: fail for partition 3
<3>[127466.140394] md: super_written gets error=10
<2>[127466.144167] md/raid:md1: Disk failure on sdc3, disabling device
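On the question of what the codes mean: the number in angle brackets at the start of each kernel line is the standard Linux kernel log level (2 is critical, 3 is error, 4 is warning, 6 is info). The short sketch below pulls out the level, the timestamp and any md "Disk failure" lines. It is only an illustration; `parse_kmsg` and `failed_members` are made-up helper names, and it does not attempt to interpret the `A049`/`C009` fields from the QNAP event log.

```python
import re

# Standard Linux kernel log levels; the <n> prefix on each line is this number.
KERNEL_LEVELS = {
    0: "emerg", 1: "alert", 2: "crit", 3: "err",
    4: "warning", 5: "notice", 6: "info", 7: "debug",
}

LINE_RE = re.compile(r"^<(\d)>\[\s*(\d+\.\d+)\]\s*(.*)$")
FAILURE_RE = re.compile(r"md/raid:(\w+): Disk failure on (\w+)")

# Three of the lines from the excerpt above, copied as-is.
SAMPLE = [
    "<2>[127466.103318] md/raid:md1: Disk failure on sda3, disabling device.",
    "<3>[127466.140394] md: super_written gets error=10",
    "<2>[127466.144167] md/raid:md1: Disk failure on sdc3, disabling device",
]


def parse_kmsg(line):
    """Return (level_name, seconds_since_boot, text), or None if no match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    level, seconds, text = match.groups()
    return KERNEL_LEVELS[int(level)], float(seconds), text


def failed_members(lines):
    """List (array, member, seconds_since_boot) for each md failure line."""
    failures = []
    for line in lines:
        parsed = parse_kmsg(line)
        if parsed is None:
            continue
        match = FAILURE_RE.search(parsed[2])
        if match:
            failures.append((match.group(1), match.group(2), parsed[1]))
    return failures


if __name__ == "__main__":
    for array, member, when in failed_members(SAMPLE):
        print(f"{array}: {member} failed at {when:.6f} s after boot")
```

Run against the excerpt, this reports sda3 and sdc3 dropping out of md1 roughly 0.04 seconds apart.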
kommisar
Starting out
Posts: 10
Joined: Thu Nov 01, 2018 4:35 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by kommisar »

shirec wrote: Mon Nov 18, 2019 1:09 am Thanks for the reply. I checked the posts, and I confirmed that it's the 4957agm. I have done D1 to S1 and D2 to S2. I metered them out and they all have a direct connection back to the board's main 5v and 12v pins from the power supply. I tried removing the mosfet entirely for hdd4 a few days ago. Last night hdd1 and hdd3 both threw the error again.

It is exactly the reason I initially decided not to provide the fix instructions along with issue analysis. People will try to use it where it doesn't belong.

I WOULD LIKE TO EMPHASIZE: THE MOSFET FIX I PROPOSE IS FOR A VERY SPECIFIC LINE OF QNAP DEVICES, THE TVS-x73. IN THE ORIGINAL POST I PROVIDE INSTRUCTIONS ON HOW TO MAKE SURE THE SLOT POWER IS THE ISSUE. IF YOU HAVE A DIFFERENT NAS AND BELIEVE IT IS A MOSFET ISSUE, YOU HAVE TO FOLLOW THOSE INSTRUCTIONS FIRST AND CHECK WHETHER IT HELPS IN YOUR CASE. A GREAT EXAMPLE IS THE FIX BY MagicStuff FOR HIS TS-431.

Shirec, your issue looks different. Your backplane has a different construction, and I don't believe anyone has tried to fix it before. Bottom line: at this stage you're on your own in this battle. Good luck.
Last edited by kommisar on Mon Nov 18, 2019 5:17 am, edited 1 time in total.
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

I totally understand. However, at the point I am at, I did not have much to lose. I will be attempting one last thing tonight and then I will have to bite the bullet and buy a new NAS. I'll be going with the extended warranty this time for sure. It has been an ordeal.
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

On the link with those photos, I had soldered them exactly like the 4th photo. I assume the pins in the photo are 2x D1 pins and 2x D2 pins, which cross over to the gate and source pins. This afternoon I will be running jumpers from the D1 and D2 pins directly to the power supply.

This is what I have done with HDD4 which is one that didn't fail this time.

I will be jumping the first 5 drives, both 3v and 12v, back to their original source. The only issue I see with this is that there might have been some other sort of protection between the hard drives and the power supply.

In this situation the hard drives will be directly connected to the power supply. This will be the last option that I can take as it will bypass the entire board basically.

Please, someone let me know if this sounds like a horrendous idea. Cheers. I will upload a photo of my shoddy idea when I get home.
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

https://imgur.com/a/dkEwYSR

As it turns out, I had already soldered a direct line from the supply to hdd3. Now I am out of ideas.
oyvindo
Experience counts
Posts: 1399
Joined: Tue May 19, 2009 2:08 am
Location: Norway, Oslo

Re: "Disk Failed", then "Disk Unplugged" errors

Post by oyvindo »

I have exactly the same issue with my TS-453Mini. The HDD in slot 1 stopped running. I can hear that it doesn't spin up after power on. The NAS is out-of-warranty.
I contacted QNAP support, and they offered me a replacement backplane at a cost almost identical to a brand new NAS!!
They know about this design flaw, and still they refuse to give customer support. I don't mind paying, but a reasonable amount please!
I'm ready to try the soldering method now, but first I would like to verify by applying external power.
Can someone please explain in some more detail how to do that - with better pictures? Where can I buy the right cables? Links please.
shirec
New here
Posts: 8
Joined: Wed Nov 13, 2019 12:24 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by shirec »

That might be a different issue again. From the previous comments it looks like many of them did spin up again, rebuild, and then fail hours to days later, requiring another rebuild. I also requested a backplane and they said they won't send that part out as a replacement, which was unfortunate. In this post someone has already shown how to do it. Basically, just run the hard drives external to the unit and use items like these to run them off a separate power supply.

Use this to plug the HDDs into their own power supply.
https://www.ebay.com.au/itm/SATA-15pin- ... 2483328567
https://www.ebay.com.au/itm/500W-ATX-Po ... 3858270813

Use this to plug the data cable into the NAS.
https://www.ebay.com.au/itm/SATA-7pin-H ... 2461429742?

This is my own idea of how to test it and maybe someone can correct me if this is not the way.
oyvindo
Experience counts
Posts: 1399
Joined: Tue May 19, 2009 2:08 am
Location: Norway, Oslo

Re: "Disk Failed", then "Disk Unplugged" errors

Post by oyvindo »

@shirec, thanks for the links. They helped me understand exactly what type of cables you were referring to. I ended up buying all of them (and some more) on Aliexpress.com at a very low price.
I too asked QNAP if I could just purchase the backplane as a spare part, but the answer was NO!
I wonder if schematics for these NAS units can be found somewhere, e.g. on the net?
Shmal
New here
Posts: 9
Joined: Wed Aug 13, 2014 1:03 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by Shmal »

shirec wrote: Mon Nov 18, 2019 1:52 pm https://imgur.com/a/dkEwYSR

As it turns out, I had already soldered a direct line from the supply to hdd3. Now I am out of ideas.
It is hard to tell from your pictures what the orientation of the mosfet chips is. If you can post a clearer picture of the top of one of them (labels, and the divot/key in the corner of the chip), it could help to verify the correct pinouts. My board had fewer mosfets than drives too, so there is some sharing in the circuit like you are seeing on yours.

As kommisar says, it looks like a more complex circuit than we had on our 473 so you may have to do a lot of tracing of the circuit to figure out the error. If you are going to continue down the path, try to verify the connectivity of D1/D2 to each drive bay 5v/12v pins.

Good luck!
ngardiner
First post
Posts: 1
Joined: Sat Dec 21, 2019 1:36 pm

Re: "Disk Failed", then "Disk Unplugged" errors

Post by ngardiner »

Uh oh, looks like I have this issue too. My TS-653A with 6 x WD Red 8TB has had drive bay 1 drop out about 6 times now. It just happened again (causing another 48-hour rebuild session), and I've replaced the disk twice, so I'm confident it is not a disk issue.

I am seeing some of the logs mentioned in the thread:
[19077132.729359] ata1: irq_stat 0x00400040, connection status changed
[19077132.735787] ata1: SError: { HostInt PHYRdyChg 10B8B DevExch }
[19077132.742029] ata1: hard resetting link
[19077132.742039] ata1: SERR_PHYRDY_CHG SERR_DEV_XCHG. Clear the bit/burst error record
[19077133.471575] ata1: SATA link down (SStatus 0 SControl 330)
[19077133.798050] ata1: hard resetting link
[19077133.798062] ata1: SERR_PHYRDY_CHG SERR_DEV_XCHG. Clear the bit/burst error record
[19077133.798067] ata1: SERR_PHYRDY_CHG SERR_DEV_XCHG. Clear the bit/burst error record
[19077139.114891] ata1: link is slow to respond, please be patient (ready=0)
[19077143.203245] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 330)
[19077143.203888] ata1.00: both IDENTIFYs aborted, assuming NODEV
[19077143.203894] ata1.00: revalidation failed (errno=-2)
[19077144.882367] ata1: hard resetting link
[19077144.882378] ata1: SERR_PHYRDY_CHG SERR_DEV_XCHG. Clear the bit/burst error record
[19077145.188270] ata1: SATA link down (SStatus 0 SControl 330)
[19077145.188287] ata1.00: disabled
[19077145.189241] sd 0:0:0:0: rejecting I/O to offline device
[19077145.194739] sd 0:0:0:0: [sda] killing request
[19077145.194772] sd 0:0:0:0: [sda] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[19077145.194779] sd 0:0:0:0: [sda] CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
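For reading the "SATA link up/down" lines, the `SStatus` value can be broken into three 4-bit fields as laid out in the SATA specification: DET (device detection) in bits 0-3, SPD (negotiated speed) in bits 4-7 and IPM (power state) in bits 8-11. The sketch below assumes the kernel prints the value in hexadecimal, so `133` is read as 0x133; `decode_sstatus` and `link_events` are hypothetical helper names.

```python
import re

# Field meanings follow the SATA SStatus register layout.
DET = {
    0: "no device detected",
    1: "device present, PHY link not established",
    3: "device present, PHY link established",
    4: "PHY offline",
}
SPD = {0: "no speed negotiated", 1: "Gen1 (1.5 Gbps)",
       2: "Gen2 (3.0 Gbps)", 3: "Gen3 (6.0 Gbps)"}
IPM = {0: "not present", 1: "active", 2: "partial", 6: "slumber"}

LINK_RE = re.compile(
    r"(ata\d+): SATA link (up|down).*\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)"
)

# Two of the lines from the excerpt above, copied as-is.
SAMPLE = [
    "[19077133.471575] ata1: SATA link down (SStatus 0 SControl 330)",
    "[19077143.203245] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 330)",
]


def decode_sstatus(value):
    """Split an SStatus register value into its DET, SPD and IPM fields."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": DET.get(det, f"reserved ({det})"),
        "spd": SPD.get(spd, f"reserved ({spd})"),
        "ipm": IPM.get(ipm, f"reserved ({ipm})"),
    }


def link_events(lines):
    """Yield (port, state, decoded SStatus) for each link up/down line."""
    for line in lines:
        match = LINK_RE.search(line)
        if match:
            port, state, sstatus, _scontrol = match.groups()
            # Assumption: the value is printed in hex.
            yield port, state, decode_sstatus(int(sstatus, 16))


if __name__ == "__main__":
    for port, state, fields in link_events(SAMPLE):
        print(port, state, fields)
```

Read this way, `SStatus 0` means the port saw no device at all, while `SStatus 133` means a device was detected with an active 6.0 Gbps link, which is consistent with the drive disappearing and reappearing on the port.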
I'm running firmware version 4.3.6.0895. Sadly there doesn't seem to be a hardware fix for us TS-X53A owners outside of an RMA which I won't be paying for. Would like to keep an eye on the thread at least to see what my options are.