[parisc-linux] 2.4.18 SMP instability
Grant Grundler
grundler@dsl2.external.hp.com
Tue, 28 May 2002 11:07:57 -0600
Jeremy Drake wrote:
> I'll try. BTW, the HPMC only happens sometimes. Most of the time it just
> hangs. But HPMC starts if I hit the button on the back and let it boot.
ok. This is an interesting symptom.
...
> General Registers 0 - 31
> 00-03 0000000000000000 0000000a44b3921e 0000000000019bf0 00000000f4004000
GR02 is the return pointer - but it's not a kernel address.
Possibly PDC or something else.
...
> IIA Space = 0x0000000000000000
> IIA Offset = 0x0000000000019bf8
IIA is the instruction pointer. Also not a valid kernel address.
It's possible we are getting a "double fault" and the first
one is overwriting the original HPMC.
> Check Type = 0x20000000
> CPU State = 0x9e000004
> Cache Check = 0x00000000
> TLB Check = 0x00000000
> Bus Check = 0x0030103b
> Assists Check = 0x00000000
> Assist State = 0x00000000
> Path Info = 0x00000000
> System Responder Address = 0x000000fff4004014
> System Requestor Address = 0xfffffffffffa0000
This is useful. The system *probably* died trying to access 0xf4004014.
I could try to look up CPU State but I'm out of time.
Here are the next steps:
1) figure out who is touching 0xf4004014.
I didn't see anything in the console output.
(http://lists.parisc-linux.org/pipermail/parisc-linux/2002-May/016342.html)
Can you look in /proc/iomem?
My C3000 has:
f4000000-f4ffffff : LBA PCI LMMIO
f4007000-f4007fff : usb-ohci
f4008000-f40083ff : tulip
2) figure out if the access is because of bad DMA killing the IOMMU
or just the chip not responding.
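For step 1, here is a rough sketch of the lookup, not a finished tool. It assumes the low 32 bits of the System Responder Address are the MMIO address (that is how 0xf4004014 was read off above). The find_owners helper and the SAMPLE text are made up for illustration; SAMPLE just repeats the C3000 listing, and on the failing box you would feed it the contents of /proc/iomem instead.

```python
import re

# Matches one /proc/iomem line such as "  f4007000-f4007fff : usb-ohci".
IOMEM_LINE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.*)$")


def find_owners(iomem_text, addr):
    """Return (start, end, name) for every listed range that contains addr.

    Results keep file order, so a parent range comes before its children.
    """
    owners = []
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if not m:
            continue  # skip anything that is not a range line
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        if start <= addr <= end:
            owners.append((start, end, m.group(3).strip()))
    return owners


# Hypothetical sample copied from the C3000 listing; replace with
# open("/proc/iomem").read() on the machine being debugged.
SAMPLE = """\
f4000000-f4ffffff : LBA PCI LMMIO
  f4007000-f4007fff : usb-ohci
  f4008000-f40083ff : tulip
"""

# Assumption: only the low 32 bits of the responder address matter here.
responder = 0x000000fff4004014
addr = responder & 0xFFFFFFFF

for start, end, name in find_owners(SAMPLE, addr):
    print(f"{start:08x}-{end:08x} : {name}")
```

Against the C3000 sample only the LBA PCI LMMIO window matches, i.e. no driver on that box claims the address. The listing on the failing machine may well differ.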
It's remotely possible the latest commit I made will affect this problem.
Can you retry with -pa28 (or -pa29)?
grant