Disk Read/Write errors

rainjam
Getting the hang of things
Posts: 88
Joined: Thu Sep 24, 2009 11:40 pm

Re: Disk Read/Write errors

Post by rainjam »

Thanks, that's brilliant. I'll give it a go when I'm back in the office tomorrow.

Nick
TS-EC1279, 120TB, 8GB RAM (main) and TS-809, 32TB, 2GB RAM (backup)
rainjam
Getting the hang of things
Posts: 88
Joined: Thu Sep 24, 2009 11:40 pm

Re: Disk Read/Write errors

Post by rainjam »

I actually (since my office isn't far from my house) came back in and just tried it. Same result as before :(

I've checked the drive and it's physically the same as the others, i.e. there are no jumper settings I can see that might be making it somehow read-only.
TS-EC1279, 120TB, 8GB RAM (main) and TS-809, 32TB, 2GB RAM (backup)
dnz16
Getting the hang of things
Posts: 52
Joined: Mon Sep 28, 2009 8:57 pm

Re: Disk Read/Write errors

Post by dnz16 »

rainjam wrote:Hi dnz

...I tried updating the firmware on the disk using the latest version from WD. The attached shows the error message I got,
1) Did you get this error message with all disks, or just with disk 3 (since you included the bin file in the same directory)?
If just with disk 3: is it possible to run the NAS in degraded mode (disks 1, 2 and 4)?
If the NAS runs in degraded mode with disks 1, 2 and 4: try to format disk 3 (or delete all partitions, create a new one, e.g. ext3, and write it to the drive; this last action will probably reset the drive fault status), then try the fw update again.
If the update went OK: put the disk back into the NAS (maybe do a Bad Blocks Scan in "Home >> Disk Management >> Volume Management") and let the NAS rebuild your RAID setup.
rainjam wrote:...including "admin" and "password" and...
2) user: admin
pwd: admin
??

Regards, dnz
[update]
Explained my solution a bit more, as in point 1. I also experienced this after an HDD dropout with fw 04.05G04. I deleted all partitions from the drive with GParted, with the drive connected to my laptop through a USB-SATA adapter. I then connected it to my desktop PC, booted into DR-DOS (1) (2) and installed the firmware.
Last edited by dnz16 on Fri Jan 29, 2010 6:54 am, edited 3 times in total.
TS-509 Version 3.8.4 build 20130816 / WD2002FYPS Firmware Revision 04.05G05 / CPU Core2 Duo 3.06GHz (E7600) / UPS APC Smart 750VA USB
QNAPJason
QNAP Staff
Posts: 5398
Joined: Thu May 21, 2009 2:14 pm
Location: Taipei

Re: Disk Read/Write errors

Post by QNAPJason »

Hi rainjam,
The .exe file we provided to you already contained the 2002FYPS firmware.
Have you enabled any encryption on the HDD volumes?


Jason
schumaku
Guru
Posts: 43579
Joined: Mon Jan 21, 2008 4:41 pm
Location: Kloten (Zurich), Switzerland -- Skype: schumaku

Re: Disk Read/Write errors

Post by schumaku »

"drive security is frozen" ... this can be caused by some BIOSes attempting to protect the HDD firmware and configuration from being overwritten or modified. We had a similar case a while back (it was a Maxtor at that time); we plugged it into a PC with the ability to remove the ATA security bit. I do not think the QNAP BIOS (on the Intel-based units; the Marvell ARM units use U-Boot) can handle this, but I have never checked whether there are options to change this setting.

I suspect this drive has been used in a different computer before, or is a replacement unit. ATA security (sometimes called disk password) should be disabled when a drive comes from the factory or from repair.

-Kurt.
michaelrode
Starting out
Posts: 13
Joined: Sun Nov 23, 2008 8:47 am

Re: Disk Read/Write errors

Post by michaelrode »

I have a 509 Pro with two Seagate ST31500341AS SD1A (RAID 1) and one WD15EADS (single volume). I upgraded from 3.1.0 Build 0815T to 3.2.1 over the weekend. After that I put two new Samsung HD154UI (both single volume) into the system and formatted them. That worked fine in general (only one of the Samsungs took some time to be recognised by SMART).

When I then started copying 100 GB to one of the new Samsungs, after 20-30 minutes the error message "Drive ejected" popped up and the server rebooted. I was about to send the two Samsungs back to the seller, but it really seems to be a QNAP firmware problem. I will downgrade to 3.1.0 tonight and let you know if it worked. Has anybody tested whether the issue disappears with the new 3.2.2?

@QNAP: Why do you stay silent about this issue? With so many customers complaining, you should at least confirm that you are working on a fix!
Last edited by michaelrode on Mon Feb 01, 2010 6:49 pm, edited 3 times in total.
scubajwd
Getting the hang of things
Posts: 56
Joined: Wed Apr 08, 2009 11:10 pm
Location: Las Vegas NV

Re: Disk Read/Write errors

Post by scubajwd »

Michaelrode, I elected to fall back to 3.1.2 and now my system has
stabilized again; (see the other thread below on disk read/write errors);
3.2.1 caused two of my disks to fall out and not be recognized which
pushed my raid-5 system into degraded mode; I didn't lose data but
I went thru a rebuild w/o a spare so I could have lost everything..
I'm going to sit tight now and NOT go to 3.2.2..I've had a rough
couple of days
TS639
RAID5
3.1.2
Seagate 1.5TB Scorpions (6) CC1H
TS-269 Pro
TS-653 Pro
rainjam
Getting the hang of things
Posts: 88
Joined: Thu Sep 24, 2009 11:40 pm

Re: Disk Read/Write errors

Post by rainjam »

I'm not ignoring this BTW - my system is now with the UK qnap distributor so is hopefully being sorted....
TS-EC1279, 120TB, 8GB RAM (main) and TS-809, 32TB, 2GB RAM (backup)
dnz16
Getting the hang of things
Posts: 52
Joined: Mon Sep 28, 2009 8:57 pm

Re: Disk Read/Write errors

Post by dnz16 »

After updating the fw from 3.1.2 to 3.2.2, my drive 3 dropped again and pushed me back into RAID 5 degraded mode.
As I predicted earlier, or rather feared, the UDMA_CRC_Error_Count for drive 3, which went from 10 to 16 over 20 days under 3.1.2, climbed to 35, then 79, then 90 within 1.5 hours under 3.2.2. :(

(Re: Disk Read/Write Error - Postby dnz16 » Sat Oct 24, 2009 9:39 am)
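
To put the two firmware versions on the same scale, the counter readings quoted in this post can be turned into a growth rate. The Python sketch below is only a back-of-the-envelope illustration: it assumes that the readings 35 and 90 mark the start and end of the 1.5-hour window under 3.2.2, which the post does not state explicitly, and the function name is made up for this example.

```python
# Rough comparison of UDMA_CRC_Error_Count growth under two firmware versions.
# Assumption: 35 and 90 are the first and last readings of the 1.5-hour window.

def errors_per_day(start_count, end_count, days):
    """Average increase of a counter per day over the given period."""
    return (end_count - start_count) / days

# 3.1.2: the counter went from 10 to 16 over 20 days.
rate_312 = errors_per_day(10, 16, 20)

# 3.2.2: the counter went from 35 to 90 within 1.5 hours (1.5 / 24 days).
rate_322 = errors_per_day(35, 90, 1.5 / 24)

if __name__ == "__main__":
    print(f"3.1.2: {rate_312:.1f} CRC errors per day")
    print(f"3.2.2: {rate_322:.0f} CRC errors per day")
```

Under these assumptions the counter grows by well under one error per day on 3.1.2 and by several hundred per day on 3.2.2, which is why the absolute numbers look so alarming even though both are small.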
Last edited by dnz16 on Wed Sep 08, 2010 4:10 am, edited 1 time in total.
TS-509 Version 3.8.4 build 20130816 / WD2002FYPS Firmware Revision 04.05G05 / CPU Core2 Duo 3.06GHz (E7600) / UPS APC Smart 750VA USB
michaelrode
Starting out
Posts: 13
Joined: Sun Nov 23, 2008 8:47 am

Re: Disk Read/Write errors

Post by michaelrode »

I downgraded from 3.2.1 to 3.1.0. Thankfully, the downgrade went completely smoothly. I then decided to do a stress test from Friday evening to Sunday evening. I moved 200 GB back and forth for 48 hours between my Samsung HD154UI 1AG0 and my WD15EADS-00P8B0 01.0. There was no problem at all!!! With the 3.2.1 firmware in the same setup, copying large files did not work and caused the "drive ejected" error for the Samsung drive after around 20-30 minutes (this error can be reproduced). For me that is proof enough that the "drive ejected" error is a problem with the new firmware.

@QNAP: could you please advise when you will fix the problem? I would really like to use Twonky 5, but cannot update to the latest firmware due to the issue described.
conandrum
Getting the hang of things
Posts: 99
Joined: Mon Dec 22, 2008 9:46 pm

Re: Disk Read/Write errors

Post by conandrum »

Hello again,

Update to my previous posts:

I have finally received the replacement TS-509 Pro from QNAP yesterday.
I popped my hard disks into bays 1, 2, 3, 4 (bay 4 being the new hard disk I bought when I thought my old one was busted) and 5.
The above array was rebuilt and working BEFORE I sent the 509 for RMA.
Sure enough the array was up and running in minutes with NO problems and all my old settings came back.

Current firmware version: 3.1.2 Build 1014T
While waiting for the 509, I ran some checks on the old failed #4 HD, using WD Data Lifeguard and HDAT2.
I did this because I wanted to place the HD back in the array if it was healthy. The reason was because the new WD Caviar HD I bought did not support WDTLER modifications and I wanted to change TLER on ALL array members when the new 509 arrived.

Test Option: QUICK TEST
Model Number: WDC WD1001FALS-00J7B0
Unit Serial Number: WD-WMATV0455514
Firmware Number: 05.00K05
Capacity: 1000.20 GB
SMART Status: PASS
Test Result: PASS
Test Time: 20:48:49, January 31, 2010

Test Option: EXTENDED TEST
Model Number: WDC WD1001FALS-00J7B0
Unit Serial Number: WD-WMATV0455514
Firmware Number: 05.00K05
Capacity: 1000.20 GB
SMART Status: PASS
Test Result: PASS
Test Time: 00:06:47, February 01, 2010

The WD DLG tests proved the drive to be worthy.
The SMART output from HDAT2 showed also the drive to be healthy with one WARNING.

ID   Description               Value  Worst  Threshold  Raw value  Status
1    Raw_Read_Error_Rate       200    200    51         0          OK
3    Spin_Up_Time              253    253    21         5983       OK
4    Start_Stop_Count          100    100    0          247        OK
5    Reallocated_Sector_Ct     200    200    140        0          OK
7    Seek_Error_Rate           200    200    51         0          OK
9    Power_On_Hours            89     89     0          8374       OK
10   Spin_Retry_Count          100    100    51         0          OK
11   Calibration_Retry_Count   100    100    51         0          OK
12   Power_Cycle_Count         100    100    0          121        OK
192  Power-Off_Retract_Count   200    200    0          91         OK
193  Load_Cycle_Count          200    200    0          247        OK
194  Temperature_Celsius       109    98     0          41         OK
196  Reallocated_Event_Count   200    200    0          0          OK
197  Current_Pending_Sector    200    200    0          0          OK
198  Offline_Uncorrectable     200    200    0          0          OK
199  UDMA_CRC_Error_Count      200    200    0          18         OK
200  Multi_Zone_Error_Rate     200    200    51         0          OK

http://wiki.lunarsoft.net/wiki/Data_Recovery SAYS:
199 - UDMA_CRC_Error_Count
This is generally the literal count of how many times the controller encountered an error while processing an ATA command in UDMA mode. It also counts how many times a CRC checksum has mismatched during operations. Usually, this indicates a problem with the cabling or drive electronics. These errors can also be triggered by incorrect IDE device chaining or simply a faulty IDE or SATA cable, or when device firmware bug match some chipset detection problem (SATA II at 3.0Gb/s drive misdetecting SATA I at 1.5Gb/s bus controller).
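
As a rough illustration of how a listing like the HDAT2 output above can be checked automatically, here is a minimal Python sketch. It assumes the seven-column layout of that listing (ID, description, value, worst, threshold, raw value, status); the watch list of attribute IDs and the function names are illustrative choices for this example, not part of HDAT2 or any QNAP tool.

```python
# Sketch: flag SMART counters with a non-zero raw value.
# The sample rows are copied from the HDAT2 listing in this post.

SMART_TEXT = """\
5 Reallocated_Sector_Ct 200 200 140 0 OK
196 Reallocated_Event_Count 200 200 0 0 OK
197 Current_Pending_Sector 200 200 0 0 OK
198 Offline_Uncorrectable 200 200 0 0 OK
199 UDMA_CRC_Error_Count 200 200 0 18 OK
"""

# Attributes whose raw value is a plain event counter, so anything above
# zero is worth a second look. This selection is an assumption of the sketch.
WATCHED_IDS = {5, 196, 197, 198, 199}


def parse_smart(text):
    """Turn whitespace-separated SMART rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and anything that does not have exactly 7 columns.
        if len(parts) != 7 or not parts[0].isdigit():
            continue
        attr_id, name, value, worst, threshold, raw, status = parts
        rows.append({
            "id": int(attr_id),
            "name": name,
            "value": int(value),
            "worst": int(worst),
            "threshold": int(threshold),
            "raw": int(raw),
            "status": status,
        })
    return rows


def attributes_to_watch(rows):
    """Names of watched attributes whose raw counter is above zero."""
    return [r["name"] for r in rows if r["id"] in WATCHED_IDS and r["raw"] > 0]


if __name__ == "__main__":
    print(attributes_to_watch(parse_smart(SMART_TEXT)))
```

A non-zero UDMA_CRC_Error_Count does not mean the drive itself is failing; as the quoted description says, it usually points at cabling or the link between drive and controller, so the trend over time matters more than a single reading.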

The UDMA_CRC_Error_Count was not a FATAL error with the drive and I decided to place it back in the array, so I shut down the 509 and rebooted without the newer #4 drive. When fully started, I inserted the old #4 drive and the array was rebuilt in a few hours.

I then shut down the 509 and proceeded to enable TLER on all HDs with WDtler -r7 -w0, as per the Wikipedia recommendations.

Model                          TLER Default (Read / Write)   Stand-Alone Recommendation   RAID Recommendation
Caviar, SE, SE16, GP, Raptor   Disabled (0s / 0s)            Default                      Enabled (7s / 0s)
Caviar RE2, RE2-GP             Enabled (7s / 0s)             Disabled (0s / 0s)           Default

(I did not mess with WDIDLE3 as some have suggested as Start_Stop_Count and Load_Cycle_Count increment at a slow rate. Maybe I need to though since my HDs never sleep? I don't know yet.)

On my Vista64 computer I have a 2x Velociraptor RAID0, which in the past has ejected one HD. At the time I knew nothing about TLER and such, so I just wiped the offending drive and rebuilt the array. Since then it happened a second time, so I decided to look deeper now that I knew more. HDAT2 revealed SMART errors on the drive even though WD DLG was showing that the SMART status was OK. I then ran the DLG short test and sure enough the drive did not pass it.
Test Error Code: 06-Quick Test on drive 3 did not complete! Status code = 07 (Failed read test element), Failure Checkpoint = 65 (Error Log Test) SMART self-test did not complete on drive 3!
3 out of 4 times, the test failed. Once in a while it would pass!!! WTF???
I then ran the extended test, which found bad sectors and corrected them. Since then, I have run the short test dozens of times and it passed every time. I then ran the extended test again and it passed without any bad sectors or warnings. I therefore felt confident that the HD was healthy again and that there had been a legitimate reason for it being dropped in this case.
So I rebuilt the array and set TLER on both Velociraptors to WDtler -r7 -w7. It ran like a dream for a few weeks.

So yesterday, while updating the 509 drives' TLER, I also updated the Velociraptors to WDtler -r7 -w0 and as soon as I went into windows,
IASTOR: The device, \Device\Ide\iaStor0, did not respond within the timeout period,
disk: An error was detected on device \Device\Harddisk0\DR0 during a paging operation.
IAANTMON: Status for hard drive 'WD-WXL508081812' changed from 'Normal' to 'Error occurred'.
IAANTMON: Status for hard drive 'WD-WXL508075547' changed from 'Normal' to 'Error occurred'.
ESENT: A request to write to the file "C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb" succeeded, but took an abnormally long time (127 seconds) to be serviced by the OS.
Both array members were shown to have problems and the OS suggested I run chkdsk. I shut down, changed the TLER settings back to WDtler -r7 -w7 and ran chkdsk, which fixed the filesystem.

I then proceeded to stress test both arrays. I ran some benchmarks with Sandra and HDTune, and started to transfer 50GB of files from the 509 to the Velociraptors. No problems!
Today I transferred 150GB of data from the 509 while transferring the Windows folder to the 509. Both arrays were crunching away for an hour with no problems.

I hope this will be the last of it.

IMPORTANT: I believe, from my experience, that we should run the SMART tests on a regular basis on every drive, otherwise the SMART information alone is NOT enough indication of the health of the drives. The short test may indicate a fault, while the extended test can actually fix it.

QUESTIONS TO OTHER USERS:
1. Should the CAVIAR BLACKS be set with "WDtler -r7 -w7" or "WDtler -r7 -w0" ?
2. Why don't the Velociraptors like "WDtler -r7 -w0"? (is it an issue with ICH10R 8.9.0.1023 as read in other forums)
3. Should I use the WDIDLE3 utility on the Caviars and Velociraptors?
4. Can the QNAP extended SMART test actually fix drive problems (e.g. if bad sectors are found), or will it just report the problems in the log?

Take care!
TS509 firmware v4.2.6
RAID 5 10TB (single disk volume 5x Hitachi HUA723020ALA640 MK7O)
Windows 10.
thanatos74
Starting out
Posts: 46
Joined: Wed Jan 21, 2009 5:46 pm
Location: Munich

Re: Disk Read/Write errors

Post by thanatos74 »

Hey Guys,

I would like to give you a short update on what's going on here regarding the disk dropouts.
A few days ago QNAP support told me to update my NAS to the latest firmware, 3.2.3. Because of the disk dropout problem I had been running 3.1.1 0815 before.

The good news: It seems that the disk dropout problem was solved within the last release!!! :D :D :D :D
After upgrading I did intensive testing and moved a lot of data to and from the nas. Additionally I copied data from a direct attached usb drive.

There were no dropouts at all!!

I really hope that the dropouts don't come back.
I would like to encourage everyone experiencing the disk dropouts to try out the new release - hopefully it will also work for you guys!

Oh, and of course I would be interested in your feedback.

Fingers crossed :D

thanatos74
~ Two hours of trial and error can save ten minutes of RTFM ~
michaelrode
Starting out
Posts: 13
Joined: Sun Nov 23, 2008 8:47 am

Re: Disk Read/Write errors

Post by michaelrode »

Guys,

I also upgraded to the latest 3.2.3 firmware release. No more dropouts (fingers crossed). 3.2.3 really seems to be the solution to the problem. I hope it works as well for everybody else.

Cheers
Michael
conandrum
Getting the hang of things
Posts: 99
Joined: Mon Dec 22, 2008 9:46 pm

Re: Disk Read/Write errors

Post by conandrum »

Hi guys,

Let's not jinx it, but I do hope the firmware will solve the problem for you.

All I can add is that BOTH my #4 drives suffered from 199 UDMA_CRC_Error_Count which I believe is a side-effect of what caused my 509 to drop them. I think that in my case it was hardware related. At the moment all is fine on my QNAP for the past 2 weeks.


CAN SOMEONE PLEASE ANSWER/CONFIRM SOME OF THE FOLLOWING QUESTIONS?

1. Should the CAVIAR BLACKS be set with "WDtler -r7 -w7" or "WDtler -r7 -w0" ?
2. Should I use the WDIDLE3 utility on the Caviars and Velociraptors?
3. Can the QNAP extended SMART test actually fix drive problems (e.g. if bad sectors are found), or will it just report the problems in the log?

Take care!
TS509 firmware v4.2.6
RAID 5 10TB (single disk volume 5x Hitachi HUA723020ALA640 MK7O)
Windows 10.
ECKEROTH
Getting the hang of things
Posts: 55
Joined: Wed Dec 30, 2009 8:38 pm

Re: Disk Read/Write errors

Post by ECKEROTH »

I also have this disk dropout problem.

I have a TS-659 Pro with 6 x 'Seagate ST32000542AS CC34' 2TB disks
Firmware is 3.2.3 Build 0209T

TS claims that the disk was removed???

2010-02-19 448 18:38:08 System 127.0.0.1 localhost [Drive 2] Start scanning bad blocks.
2010-02-19 447 18:35:41 System 127.0.0.1 localhost Drive 2 plugged out.
2010-02-17 446 06:13:28 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Drive 2 failed.
2010-02-17 445 06:07:09 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] RAID device in degraded mode.
2010-02-17 444 06:07:08 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Drive 2 removed. <-- :?:
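
For anyone wanting to pull the dropout sequence out of a longer system log, a minimal Python sketch follows. It assumes the log lines have the same fields as the excerpt above (date, sequence number, time, type, IP, host, message); the keyword list and the function names are hypothetical choices for this example and not part of any QNAP tool.

```python
# Sketch: list drive dropout events from a QNAP-style event log in
# chronological order. The sample lines are copied from the excerpt above.
from datetime import datetime

LOG_TEXT = """\
2010-02-19 448 18:38:08 System 127.0.0.1 localhost [Drive 2] Start scanning bad blocks.
2010-02-19 447 18:35:41 System 127.0.0.1 localhost Drive 2 plugged out.
2010-02-17 446 06:13:28 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Drive 2 failed.
2010-02-17 445 06:07:09 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] RAID device in degraded mode.
2010-02-17 444 06:07:08 System 127.0.0.1 localhost [RAID5 Disk Volume: Drive 1 2 3 4 5 6] Drive 2 removed.
"""

# Phrases treated as dropout-related; an assumption based on this excerpt.
KEYWORDS = ("removed", "failed", "degraded", "plugged out")


def parse_log(text):
    """Return (sequence, timestamp, message) tuples sorted by sequence number."""
    events = []
    for line in text.splitlines():
        parts = line.split(maxsplit=6)
        if len(parts) < 7:
            continue  # not a complete log line
        date, seq, time, _type, _ip, _host, message = parts
        when = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        events.append((int(seq), when, message))
    return sorted(events)


def dropout_events(events):
    """Keep only events whose message mentions one of the keywords."""
    return [e for e in events if any(k in e[2].lower() for k in KEYWORDS)]


if __name__ == "__main__":
    for seq, when, message in dropout_events(parse_log(LOG_TEXT)):
        print(seq, when.isoformat(sep=" "), message)
```

Read oldest first, the excerpt shows "Drive 2 removed" logged one second before the array went degraded, with the "plugged out" entry only appearing two days later.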

How do I fix this problem?
TS-209, TS-659 PRO+, TS-670, TVS-882