Disk Errors causing machine to freeze

Ben Whyte [ben at whyte-systems.co.uk]


Sat, 07 Feb 2009 11:55:03 +0000

Hi

Every now and again, when I am writing to my disk, the machine stops responding for a period. When I look in syslog, I see messages like this:

Feb  7 11:18:02 thor kernel: [ 1515.415879] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb  7 11:18:02 thor kernel: [ 1515.415896] ata1.00: cmd 35/00:e0:57:5c:8e/00:03:08:00:00/e0 tag 0 dma 507904 out
Feb  7 11:18:02 thor kernel: [ 1515.415899]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  7 11:18:02 thor kernel: [ 1515.415905] ata1.00: status: { DRDY }
Feb  7 11:18:02 thor kernel: [ 1515.415923] ata1: soft resetting link
Feb  7 11:18:03 thor kernel: [ 1515.964911] ata1.00: configured for UDMA/133
Feb  7 11:18:03 thor kernel: [ 1515.964911] ata1: EH complete
Feb  7 11:18:03 thor kernel: [ 1515.964911] sd 0:0:0:0: [sda] 1953523055 512-byte hardware sectors (1000204 MB)
Feb  7 11:18:03 thor kernel: [ 1515.964911] sd 0:0:0:0: [sda] Write Protect is off
Feb  7 11:18:03 thor kernel: [ 1515.964911] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb  7 11:18:03 thor kernel: [ 1515.964911] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb  7 11:21:07 thor kernel: [ 1709.741765] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb  7 11:21:07 thor kernel: [ 1709.741780] ata1.00: cmd 35/00:e8:4f:76:ad/00:01:0a:00:00/e0 tag 0 dma 249856 out
Feb  7 11:21:07 thor kernel: [ 1709.741782]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb  7 11:21:07 thor kernel: [ 1709.741785] ata1.00: status: { DRDY }
Feb  7 11:21:07 thor kernel: [ 1709.741799] ata1: soft resetting link
Feb  7 11:21:08 thor kernel: [ 1710.444091] ata1.00: configured for UDMA/133
Feb  7 11:21:08 thor kernel: [ 1710.444091] ata1: EH complete
Feb  7 11:21:08 thor kernel: [ 1710.444091] sd 0:0:0:0: [sda] 1953523055 512-byte hardware sectors (1000204 MB)
Feb  7 11:21:08 thor kernel: [ 1710.444091] sd 0:0:0:0: [sda] Write Protect is off
Feb  7 11:21:08 thor kernel: [ 1710.444091] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Feb  7 11:21:08 thor kernel: [ 1710.459700] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

So far, this appears to cause the hard drive to accumulate errors and, eventually, filesystem corruption.

Has anyone seen this sort of thing before, and can you shed any light on what might be happening?

This has been ongoing for a while now, and I have tried different drives and different cables.

Although it may not be immediately obvious, it's using SATA.

Thanks

Ben
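
[ Editorial aside: the "cmd" line in the log above can be decoded by hand, which helps confirm what the kernel was doing when the timeout hit. The following is a minimal Python sketch, not part of the original exchange. It assumes the field order libata uses in its error report (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the function name `decode_taskfile` and the one-entry opcode table are made up for illustration. ]

```python
# Decode the "cmd" taskfile string printed by the libata error handler.
# Assumed field order (from libata's report format):
#   command/feature:nsect:lbal:lbam:lbah/
#   hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device

SECTOR_SIZE = 512  # the log reports 512-byte hardware sectors

# Only the opcode seen in this thread; deliberately not a complete table.
OPCODES = {0x35: "WRITE DMA EXT"}


def decode_taskfile(taskfile):
    """Return the opcode name, 48-bit LBA, sector count and byte count."""
    command_hex, low, high, _device = taskfile.split("/")
    command = int(command_hex, 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # The "hob" (high order byte) registers carry the upper halves
    # of the 16-bit sector count and the 48-bit LBA.
    sectors = (hob_nsect << 8) | nsect
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "opcode": OPCODES.get(command, hex(command)),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_SIZE,
    }


if __name__ == "__main__":
    for tf in ("35/00:e0:57:5c:8e/00:03:08:00:00/e0",
               "35/00:e8:4f:76:ad/00:01:0a:00:00/e0"):
        print(decode_taskfile(tf))
```

[ For the two commands in the log, the decoded byte counts come out as 507904 and 249856, matching the "dma ... out" figures the kernel printed. Both are large DMA writes that timed out, which fits the report that the freezes happen while writing. ]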




Thomas Adam [thomas.adam22 at gmail.com]


Sat, 7 Feb 2009 12:01:38 +0000

2009/2/7 Ben Whyte <ben@whyte-systems.co.uk>:

> Hi
>
> When I am writing to my disk every now and again, the machine stops
> responding for a period.  When I look in syslog I see messages like this
>
> Feb  7 11:18:02 thor kernel: [ 1515.415879] ata1.00: exception Emask 0x0
> SAct 0x0 SErr 0x0 action 0x6 frozen

https://lists.debian.org/debian-powerpc/2007/08/msg00155.html

-- Thomas Adam




Ben Whyte [ben at whyte-systems.co.uk]


Sat, 07 Feb 2009 12:07:36 +0000

> https://lists.debian.org/debian-powerpc/2007/08/msg00155.html
>
> -- Thomas Adam
>   

Thomas

Thanks for your reply. I am not sure what the relevant section of that link is.

The ide-core module shows up when I run lsmod.

I am 99.9% certain that this is not a problem with the disk: it is brand new, and the same thing has now happened with four different hard disks.

Ben
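
[ Editorial aside: seeing ide-core in lsmod does not by itself show which driver stack is handling the disk. The "ata1" and "sda" names in the log suggest the disk is being driven through libata. A quick way to see both stacks side by side is to summarise the loaded modules. The sketch below is a hedged illustration, not part of the original exchange: the helper names are hypothetical, the module-name prefixes are an assumption about what is worth listing, and the sample text is invented for the demo. Note that lsmod and /proc/modules show names with underscores, so ide-core appears as ide_core. ]

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    Each line starts with the module name, followed by size, use count,
    dependants, state and load address.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def summarize(modules):
    """Report whether the legacy IDE core and libata are both loaded."""
    # The prefixes below are an illustrative guess at controller drivers.
    drivers = sorted(
        m for m in modules
        if m == "ahci" or m.startswith(("sata_", "pata_", "ata_"))
    )
    return {
        "legacy_ide_loaded": "ide_core" in modules,
        "libata_loaded": "libata" in modules,
        "ata_controller_drivers": drivers,
    }


if __name__ == "__main__":
    # Invented sample; on a real system, read the text from /proc/modules.
    sample = (
        "libata 140000 2 ahci,ata_piix, Live 0xf8a00000\n"
        "ahci 23596 3 - Live 0xf8880000\n"
        "ata_piix 14180 0 - Live 0xf8870000\n"
        "ide_core 96168 1 ide_pci_generic, Live 0xf8900000\n"
    )
    print(summarize(loaded_modules(sample)))
```

[ If both ide_core and libata are loaded, it may be worth checking which one has actually claimed the controller before drawing conclusions from the module list alone. ]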




Thomas Adam [thomas.adam22 at gmail.com]


Sat, 7 Feb 2009 12:16:03 +0000

2009/2/7 Ben Whyte <ben@whyte-systems.co.uk>:

> Thanks for your reply, I am not sure what the relevant section of that link
> is.

Read the entire thread. It suggests that either your disk is on the way out or, more likely, the initrd (and initfs) is incompatible with the kernel you're running.

-- Thomas Adam




Ben Whyte [ben at whyte-systems.co.uk]


Sat, 07 Feb 2009 12:20:40 +0000

> Read the entire thread, it suggests that your problem is either that
> your disk is on the way out, or more likely the initrd (and initfs) is
> incompatible with the kernel you're running.
>
> -- Thomas Adam
>   

How can I find out whether the initrd and initfs are incompatible? This is an installation of Debian unstable using kernel 2.6.26-1.

As I said, this is the fourth drive in a row to report this error.

I am starting to think that I may have a problem with the motherboard.

Ben
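
[ Editorial aside: when several drives show the same fault, it can help to tally the exceptions by ATA port. If they all land on one port regardless of the drive attached, that would point away from the disks and toward the port, controller, cable or driver. The sketch below is a hedged illustration, not part of the original exchange. The function name `count_exceptions` is hypothetical, and the regular expression is an assumption based only on the message format shown in this thread. ]

```python
import re
from collections import Counter

# Matches e.g. "ata1.00: exception Emask 0x0 ..." and captures "ata1".
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions(lines):
    """Count libata exception reports per port in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Feb  7 11:18:02 thor kernel: [ 1515.415879] ata1.00: exception Emask 0x0",
        "Feb  7 11:18:02 thor kernel: [ 1515.415923] ata1: soft resetting link",
        "Feb  7 11:21:07 thor kernel: [ 1709.741765] ata1.00: exception Emask 0x0",
    ]
    for port, n in count_exceptions(sample).items():
        print(port, n)
```

[ On a real system the same function can be fed an open log file, for example the syslog file the messages above were taken from; the exact path depends on the syslog configuration. ]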




Chris Bannister [mockingbird at earthlight.co.nz]


Sat, 14 Feb 2009 23:20:29 +1300

On Sat, Feb 07, 2009 at 12:20:40PM +0000, Ben Whyte wrote:

> 
> > Read the entire thread, it suggests that your problem is either that
> > your disk is on the way out, or more likely the initrd (and initfs) is
> > incompatible with the kernel you're running.
> >
> > -- Thomas Adam
> >   
> 
> How can I try and identify if the initrd and initfs are incompatible, 
> this is an installation of debian unstable using the kernel 2.6.26-1.

"dpkg-reconfigure linux-image-2.6.26-1-686" should do it.

> As I said this is the 4th drive in a row to report this error.

Same type of drives? Maybe the driver is too generic?

> I am starting to think that I  may have a problem with the motherboard.

I would suggest subscribing to debian-user@lists.debian.org and asking your question there. You could also try a later kernel and see if that fixes it, particularly if your hardware is fairly new.

-- 
Chris.
======
I contend that we are both atheists. I just believe in one fewer god
than you do. When you understand why you dismiss all the other
possible gods, you will understand why I dismiss yours.
                                           -- Stephen F Roberts

