[parisc-linux] SMP (in)stability
Richard Hirst
rhirst@linuxcare.com
Wed, 10 Jul 2002 15:23:18 +0100
On Wed, Jul 10, 2002 at 08:50:14AM -0600, Grant Grundler wrote:
> Thibaut VARENE wrote:
> > hangs occurred (SysRq 't', Ryan modified), take a look at:
> > http://pateam.esiee.fr/archive/mails/
> >
> > and read the *SMPHangReport* files...
>
> ah - thanks for saving those.
> Here's another crash we just got last night on the A500-6X.
>
> grant
>
>
> -pa52 kernel panic'd at 22:07
> running gcc1 test in background
> ran two cvs updates on the kernel.
>
> Kernel Fault: Code=26 regs=0000000012a2cd40 (Addr=0000000010112738)
>
> YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 00001000000001000000000100001110 Not tainted
> r00-03 0000000000000000 0000000010421c10 00000000101869c4 0000000030c14010
> r04-07 0000000030c14000 0000000010415410 0000000030c14010 0000000000000000
> r08-11 0000000012a2ca48 000000000000000b 0000000000000000 0000000010415410
> r12-15 000000000000000b 0000000000000000 0000000012a2ca80 0000000000000000
> r16-19 0000000000000000 000000000000004a 0000000010490000 0000000000000001
> r20-23 0000000010112738 0000000012a1b550 000000000800000f 000000000800000f
> r24-27 0000000000000000 0000000030c14018 0000000012a1b540 0000000010415410
> r28-31 0000000000000104 0000000012a2cd30 0000000012a2cd40 0000000010398840
> sr0-3 0000000000005700 0000000000009780 0000000000000000 0000000000000080
> sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>
> IASQ: 0000000000000000 0000000000000000 IAOQ: 000000001013078c 0000000010130790
> IIR: 0e9512c0 ISR: 0000000000000000 IOR: 0000000010112738
> CPU: 0 CR30: 0000000012a2c000 CR31: 0000000010498000
> ORIG_R28: 000000001015ed00
>
> GR02 0x101869c4 poll_freewait+3c
> IAOQ 0x1013078c remove_wait_queue+1c
0000000000000000 <remove_wait_queue>:
0: 00 01 0e 76 rsm 1,r22
4: 0f 40 11 d3 ldcw 0(sr0,r26),r19
8: 86 60 20 3a cmpib,=,n 0,r19,2c <remove_wait_queue+0x2c>
c: 53 35 00 20 ldd 10(r25),r21
10: 53 34 00 30 ldd 18(r25),r20
14: 34 13 00 02 ldi 1,r19
18: 0e b4 12 d0 std r20,8(sr0,r21)
1c: 0e 95 12 c0 std r21,0(sr0,r20)
20: 0f 53 12 80 stw r19,0(sr0,r26)
24: 00 16 18 60 mtsm r22
28: e8 40 d0 02 bve,n (rp)
The faulting instruction is the std at +1c (it matches IIR 0e9512c0).
The address it is trying to store to is 0x10112738 (r20, same as IOR),
which is kernel _code_ space.
void remove_wait_queue(wait_queue_head_t *q, wait_queue_t * wait)
r20 was loaded from 18(r25), so either r25 (= wait) is wrong, or the
wait_queue_t it points at is corrupt. r25 is 0x30c14018; I don't know
what is up there. vmalloc'ed memory?
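
To make the mapping between the disassembly and the C source explicit,
here is a minimal sketch. It assumes the 2.4-era wait_queue_t layout
(flags, task, then a struct list_head) on a 64-bit kernel, which is what
the ldd offsets 0x10 and 0x18 suggest. The *_sketch names are made up
for illustration and are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct list_head and wait_queue_t.
 * Assumed layout on a 64-bit build: flags at 0x0 (padded), task at 0x8,
 * task_list.next at 0x10, task_list.prev at 0x18. */
struct list_head_sketch {
    struct list_head_sketch *next;
    struct list_head_sketch *prev;
};

struct wait_queue_sketch {
    unsigned int flags;
    void *task;
    struct list_head_sketch task_list;
};

/* Returns 1 if the assumed offsets hold on this build (LP64 only). */
static int layout_matches_sketch(void)
{
    size_t base = offsetof(struct wait_queue_sketch, task_list);

    return base + offsetof(struct list_head_sketch, next) == 0x10 &&
           base + offsetof(struct list_head_sketch, prev) == 0x18;
}

/* Unlink 'wait' from its list, annotated with the instructions
 * from the remove_wait_queue disassembly above. */
static void unlink_wait_sketch(struct wait_queue_sketch *wait)
{
    struct list_head_sketch *next;
    struct list_head_sketch *prev;

    assert(wait != NULL);
    next = wait->task_list.next; /* ldd 10(r25),r21 */
    prev = wait->task_list.prev; /* ldd 18(r25),r20 */

    next->prev = prev; /* std r20,8(sr0,r21) */
    prev->next = next; /* std r21,0(sr0,r20): the store that trapped */
}
```

With the values from the dump, prev (r20) is 0x10112738, so the second
store is the one aimed at kernel text. The first store, through r21,
had already completed by then.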
Richard