scsi time outs and sense data errors w/ tape drive

Chris Harwell charwell at digitalpulp.com
Tue Jun 12 10:42:00 CDT 2001


greetings,

i've been seeing scsi timeouts followed by
sense data errors with a dell poweredge and sony tape drive (more specs below).

i've already moved the tape drive to a different box (going from redhat 6.2 to 7.1)
and changed the cable, but i get the same problem. i only see it periodically,
seemingly when a lot of data is going to the tape drive and perhaps overwhelming it.

any suggestions? is there an FAQ i should look at?

do you know how to increase the timeout period or buffering?

i had fooled myself into believing the scsi resets had gone away
when i turned off datcompression ( su - -c "mt -f /dev/nst0 datcompression 0" ),
which had been on for some strange reason, but they didn't - though i did notice
compression was reported as back on after the problem manifested again.

thanks :>

-- 
chris
charwell at digitalpulp.com



i'm using the dump, tar and mt-st programs with the st, aic7xxx, sd_mod and
scsi_mod modules on a redhat 7.1 install w/ all redhat updates ( 2.4.2-2
kernel ), on a DELL poweredge 300 with a Sony SDT9000 tape drive.

tar (GNU tar) 1.13.19
dump 0.4b21
mt-st v. 0.5b

cat /proc/scsi/aic7xxx/1
Adaptec AIC7xxx driver version: 5.2.4/5.2.0
Compile Options:
  TCQ Enabled By Default : Enabled
  AIC7XXX_PROC_STATS     : Enabled

Adapter Configuration:
           SCSI Adapter: Adaptec AIC-7850 SCSI host adapter
                           Narrow Controller at PCI 0/16/0
    PCI MMAPed I/O Base: 0xfe100000
 Adapter SEEPROM Config: SEEPROM found and used.
      Adaptec SCSI BIOS: Disabled
                    IRQ: 14
                   SCBs: Active 0, Max Active 1,
                         Allocated 31, HW 3, Page 255
             Interrupts: 4757640
      BIOS Control Word: 0x0000
   Adapter Control Word: 0x0055
   Extended Translation: Disabled
Disconnect Enable Flags: 0x00ff
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 32
    Tagged Queue By Device array for aic7xxx host instance 1:
      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
    Actual queue depth per device for aic7xxx host instance 1:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi1:0:0:0)
  Device using Narrow/Sync transfers at 10.0 MByte/sec, offset 15
  Transinfo settings: current(25/15/0/0), goal(25/15/0/0), user(25/15/0/0)
  Total transfers 4754750 (139 reads and 4754611 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:       0       0       0       0       0     139       0       0
  Writes:       0       0       0 4638228  116383       0       0       0
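
a quick sanity check on the transfer-size table above, as a minimal python sketch. the row text is copied from the /proc output; the helper name parse_row and the bucket labels are just illustrative, and reading "8K+" as "8K up to just under 16K" is an assumption about how the driver bins transfers.

```python
# Bucket labels and the Writes row, copied from the /proc/scsi/aic7xxx/1 table.
buckets = ["<2K", "2K+", "4K+", "8K+", "16K+", "32K+", "64K+", "128K+"]
writes_row = "Writes:       0       0       0 4638228  116383       0       0       0"


def parse_row(row: str) -> dict:
    """Map each size bucket to its count for one 'Reads:' / 'Writes:' row."""
    label, *counts = row.split()
    if len(counts) != len(buckets):
        raise ValueError(f"expected {len(buckets)} counts in {label!r} row")
    return dict(zip(buckets, map(int, counts)))


writes = parse_row(writes_row)
total_writes = sum(writes.values())        # should match the "4754611 writes" line
share_8k = writes["8K+"] / total_writes    # fraction of writes in the 8K+ bucket

print(f"total writes: {total_writes}")
print(f"share in 8K+ bucket: {share_8k:.1%}")
```

the bucket counts add up to the 4754611 writes reported, and nearly all of them land in the 8K+ bucket, so the drive is being fed a very large number of fairly small writes.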

cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: QUANTUM  Model: ATLAS V  9 WLS   Rev: 0201
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: QUANTUM  Model: ATLAS V  9 WLS   Rev: 0201
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: SONY     Model: SDT-9000         Rev: 0400
  Type:   Sequential-Access                ANSI SCSI revision: 02

/var/log/messages:
Jun 12 03:46:47  kernel: scsi : aborting command due to timeout : pid 0, scsi1, channel 0, id 0, lun 0 Write (6) 01 00 00 14 00
Jun 12 03:46:47  kernel: (scsi1:0:0:0) SCSISIGI 0xe6, SEQADDR 0x11b, SSTAT0 0x2, SSTAT1 0x13
Jun 12 03:46:47  kernel: (scsi1:0:0:0) SG_CACHEPTR 0x0, SSTAT2 0x10, STCNT 0x1ede
Jun 12 03:46:49  kernel: SCSI host 1 abort (pid 0) timed out - resetting
Jun 12 03:46:49  kernel: SCSI bus is being reset for host 1 channel 0.
Jun 12 04:01:49  kernel: SCSI host 1 abort (pid 0) timed out - resetting
Jun 12 04:01:49  kernel: SCSI bus is being reset for host 1 channel 0.
Jun 12 04:01:52  kernel: SCSI host 1 channel 0 reset (pid 0) timed out - trying harder
Jun 12 04:01:52  kernel: SCSI bus is being reset for host 1 channel 0.
Jun 12 04:01:52  kernel: (scsi1:0:0:0) Synchronous at 10.0 Mbyte/sec, offset 15.
Jun 12 04:01:52  kernel: st0: Error with sense data: Info fld=0x14, Current st09:00: sense key Unit Attention
Jun 12 04:01:52  kernel: Additional sense indicates Power on,reset,or bus device reset occurred
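
for anyone reading the log above, here is a minimal python sketch that decodes the bytes printed after "Write (6)" and measures the gap between the two "abort ... timed out" messages. the field layout is the standard sequential-access WRITE(6) one (fixed bit in byte 1, 3-byte transfer length, control byte); the function names decode_write6 and seconds_between are made up for illustration. what the numbers mean for this particular drive is a guess, not something the log states.

```python
from datetime import datetime

# Bytes the kernel printed after the "Write (6)" opcode name in the
# 03:46:47 abort message (the opcode byte itself is shown as the name).
cdb_tail = bytes.fromhex("01 00 00 14 00")


def decode_write6(tail: bytes) -> dict:
    """Decode bytes 1..5 of a sequential-access WRITE(6) CDB."""
    if len(tail) != 5:
        raise ValueError("WRITE(6) has 5 bytes after the opcode")
    return {
        "fixed": bool(tail[0] & 0x01),                      # fixed-block mode bit
        "transfer_length": int.from_bytes(tail[1:4], "big"),
        "control": tail[4],
    }


def seconds_between(start: str, end: str) -> int:
    """Seconds between two same-day HH:MM:SS syslog timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


fields = decode_write6(cdb_tail)
gap = seconds_between("03:46:49", "04:01:49")

print(fields)
print(f"gap between abort timeouts: {gap} s")
```

the decoded transfer length is 0x14 (20 blocks, fixed-block mode), the same value as the "Info fld=0x14" in the later sense data. the two abort timeouts are exactly 900 seconds apart, which looks like a fixed driver timeout expiring rather than a random stall, though that is an assumption.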

More information about the Linux-PowerEdge mailing list