[parisc-linux] Re: 53c700 (LASI SCSI 53c700) hang
Carlos O'Donell Jr.
carlos@baldric.uwo.ca
Mon, 4 Feb 2002 20:27:53 -0500
> What these errors tell me is that your HD accepted more tags than it could
> cope with and then choked. Linux error handler isn't very good at handling
> this situation. Also, your disc:
>
> deller@gmx.de said:
> > Vendor: QUANTUM Model: FIREBALL_TM3200S Rev: 300X
>
> Is a known trouble causer with tag command queueing. Initially, try taking
> the #define NCR_700_MAX_TAGS in drivers/scsi/53c700.h down to 4 or 2 and
> recompiling the driver. Alternatively, turn off tagged command queueing
> altogether by commenting out this block of code:
>
> I am getting around to adding the code changes to make this able to be done as
> module/kernel command line options.
>
> James
>
I've been having problems with the driver for quite some time now.
SCSI subsystem driver Revision: 1.00
53c700: consistent memory allocation failed
53c700: Version 2.6 By James.Bottomley@HansenPartnership.com
scsi0: 53c700 rev 0
scsi0 : LASI SCSI 53c700
Vendor: FUJITSU Model: M2694ES-512 Rev: 8134
Type: Direct-Access ANSI SCSI revision: 02
Attached scsi disk sda at scsi0, channel 0, id 6, lun 0
SCSI device sda: 2117025 512-byte hdwr sectors (1084 MB)
Partition check:
sda: sda1 sda2
Compiled kernel with tag queue code _always_ disabled (2.4.17-pa18 from CVS).
#ifdef NEVERCOMIPLE
	if(SCp->device->tagged_supported && !SCp->device->tagged_queue
	   && (hostdata->tag_negotiated &(1<<SCp->target)) == 0
	   && NCR_700_is_flag_clear(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING)) {
		/* upper layer has indicated tags are supported.  We don't
		 * necessarily believe it yet.
		 *
		 * NOTE: There is a danger here: the mid layer supports
		 * tag queuing per LUN.  We only support it per PUN because
		 * of potential reselection issues */
		printk(KERN_INFO "scsi%d: (%d:%d) Enabling Tag Command Queuing\n", SCp->device->host->host_no, SCp->target, SCp->lun);
		hostdata->tag_negotiated |= (1<<SCp->target);
		NCR_700_set_flag(SCp->device, NCR_700_DEV_BEGIN_TAG_QUEUEING);
		SCp->device->tagged_queue = 1;
	}
#endif
in drivers/scsi/53c700.c at about line 1891.
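For anyone not familiar with that block, the per-target gating it does can be modelled in user space roughly as follows. This is only a sketch: `struct host_model` and `model_enable_tags` are hypothetical names, not the driver's, and the real code also checks `tagged_queue` and a per-device flag. It only shows that one bit per target (PUN) is kept, so every LUN on a target shares the same decision.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's host data; only the bitmask matters. */
struct host_model {
	uint32_t tag_negotiated;	/* bit N set => tags enabled for target N */
};

/* Returns 1 the first time tags get enabled for `target`, 0 otherwise.
 * Assumption: targets are numbered 0..31 so they fit in one 32-bit mask. */
static int model_enable_tags(struct host_model *h, unsigned int target,
			     int tagged_supported)
{
	if (!tagged_supported || target >= 32)
		return 0;
	if (h->tag_negotiated & (UINT32_C(1) << target))
		return 0;	/* already enabled for this target */
	h->tag_negotiated |= UINT32_C(1) << target;
	return 1;
}
```

Compiling the real block out with the #ifdef means the equivalent of the bit-set line never runs, so no target is ever switched to tagged commands.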
Start up one of those real-world scripts :}
#!/bin/tcsh
while ( 1 )
find /bin | xargs cat > /dev/null
find /boot | xargs cat > /dev/null
find /etc | xargs cat > /dev/null
find /root | xargs cat > /dev/null
find /sbin | xargs cat > /dev/null
find /tmp | xargs cat > /dev/null
find /usr | xargs cat > /dev/null
find /var | xargs cat > /dev/null
end
root@node44:/proc/scsi/lasi700# cat 0
Total commands outstanding: 1
Target  Depth  Active  Next Tag
======  =====  ======  ========
 6: 0      16       1         0
About 10 minutes into the run, both find _and_ cat are in state D
(uninterruptible sleep) on the process list.
The drive is effectively unresponsive around this point... maybe it was
just cat and find, you say?
Soon after, kupdated goes into D as well. From there on in, the box is
locking up left, right and center. I wish I had kdb so I could see what's
going on.
I've repeated this lockup 3 times.
Most interesting is that when I re-enable the tag queueing code but change
the tag depth to 2 (instead of 16), the machine doesn't seem to hang.
I have a box that has currently been running well past the 10 minute mark,
and I will leave it running until tomorrow.
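To make "depth" concrete, here is a minimal user-space model of a per-device tag pool capped at a fixed depth. It is a sketch under assumptions, not the driver's allocator: `MODEL_MAX_TAGS`, `struct tag_pool`, `tag_alloc` and `tag_free` are made-up names, and the cap of 2 simply mirrors the lowered depth described above.

```c
#include <stdint.h>

#define MODEL_MAX_TAGS 2	/* hypothetical cap, mirrors depth 2 instead of 16 */

struct tag_pool {
	uint32_t in_use;	/* bit N set => tag N is outstanding on the device */
};

/* Returns a free tag number, or -1 if MODEL_MAX_TAGS commands are
 * already outstanding and the caller has to wait. */
static int tag_alloc(struct tag_pool *p)
{
	int t;

	for (t = 0; t < MODEL_MAX_TAGS; t++) {
		if (!(p->in_use & (UINT32_C(1) << t))) {
			p->in_use |= UINT32_C(1) << t;
			return t;
		}
	}
	return -1;
}

/* Releases a tag once its command has completed. */
static void tag_free(struct tag_pool *p, int t)
{
	if (t >= 0 && t < MODEL_MAX_TAGS)
		p->in_use &= ~(UINT32_C(1) << t);
}
```

With the cap at 2 the device never sees more than two commands in flight, whatever the upper layers queue up.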
The sim700 driver runs poorly, but happily, for days... generating heat :)
Sadly, the sim700 driver is currently only functional with the older kernels.
I'm using 2.4.9-pa25 to run the 715/50s in our cluster (diskless boxes run
the latest kernel with no problems).
Any thoughts?
Is the issue as simple as:
Leave tag queueing in, but set the depth to something low (2 or 4)?
Good: Tag Queue, Depth = 2
Bad:  No Tag Queue
      Tag Queue, Depth = 16
c.