[parisc-linux] Debugging 64-bit kernel crashes involving
James Bottomley
James.Bottomley at SteelEye.com
Tue Feb 27 17:02:10 MST 2007
On Sun, 2007-02-25 at 23:19 -0600, James Bottomley wrote:
> OK, I have a theory. It has to do with the way we do flush_tlb_mm by
> incrementing the spaceid. This works in a single space per process
> model. However, a process with multiple threads has >1 scheduling
> context which share spaces. So, the theory goes that when we fork from
> a thread, we execute flush_tlb_mm which bumps the context (space). Then
> we schedule another thread in the same process. However, this picks up
> its space registers from the task rather than the mm->context, so it
> uses the old mm. Now, the load context has updated %cr8, the protection
> ID. However %cr8 isn't part of the task context, so we end up executing
> in the old context with the protection of the new one ... resulting in a
> protection ID trap.
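The sequence in the quoted theory can be modelled in user space. The sketch below is only an illustration, not kernel code: every `model_*` name is hypothetical, the space ids are made up, and it assumes the protection ID simply equals the space id (the real kernel derives %cr8 from the context rather than copying it). It shows a second thread resuming with the space saved in its task while the protection ID comes from the bumped mm context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_mm {
	unsigned long context;		/* current space id of the address space */
};

struct model_task {
	struct model_mm *mm;		/* address space shared by all threads */
	unsigned long saved_sr3;	/* space id saved in the task's registers */
};

struct model_cpu {
	unsigned long sr3;		/* space register used for user accesses */
	unsigned long cr8;		/* protection ID (assumed equal to a space id) */
};

static unsigned long model_next_space = 100;

/* Models flush_tlb_mm as described in the theory: bump the mm to a new space. */
static void model_flush_tlb_mm(struct model_mm *mm)
{
	mm->context = model_next_space++;
}

/* Models resuming a thread: space from the task, protection ID from the mm. */
static void model_resume(struct model_cpu *cpu, const struct model_task *t)
{
	cpu->sr3 = t->saved_sr3;
	cpu->cr8 = t->mm->context;
}

/* In this model a mismatch stands in for the protection ID trap. */
static bool model_protection_trap(const struct model_cpu *cpu)
{
	return cpu->sr3 != cpu->cr8;
}

/* Thread A forks (bumping the space), then thread B is scheduled. */
static bool model_fork_from_thread_traps(void)
{
	struct model_mm mm = { .context = 42 };
	struct model_task b = { .mm = &mm, .saved_sr3 = 42 };
	struct model_cpu cpu;

	model_resume(&cpu, &b);
	assert(!model_protection_trap(&cpu));	/* consistent before the fork */

	model_flush_tlb_mm(&mm);		/* fork from thread A */
	model_resume(&cpu, &b);			/* B still carries the old space */
	return model_protection_trap(&cpu);
}
```

Calling `model_fork_from_thread_traps()` reports a mismatch in this model, which is the situation the theory blames for the trap.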
Based on this theory, I managed to reproduce the problem on ioz (you just
have to make N much greater than the number of CPUs you have) and tried
a small fix, which seems to work on ioz. Could you try this out on your
a500?
Thanks,
James
Index: BUILD-2.6/arch/parisc/kernel/process.c
===================================================================
--- BUILD-2.6.orig/arch/parisc/kernel/process.c 2007-02-27 15:52:54.000000000 -0800
+++ BUILD-2.6/arch/parisc/kernel/process.c 2007-02-27 15:57:24.000000000 -0800
@@ -395,3 +395,30 @@ get_wchan(struct task_struct *p)
 	} while (count++ < 16);
 	return 0;
 }
+
+struct task_struct *__switch_to(struct task_struct *prev,
+				struct task_struct *next)
+{
+	unsigned long sr3;
+	unsigned long newsr3 = mfsp(3);
+	struct pt_regs *regs = &next->thread.regs;
+
+	/* need to be executing in user context */
+	if (regs->iasq[0] != 0 || regs->iasq[1] != 0) {
+		sr3 = regs->sr[7];
+
+		/* need our current space to be different from our
+		 * new one.  Note, this trips a lot if we're in a
+		 * syscall not an interrupt from userspace, but in the
+		 * syscall case, this is a nop since the space is
+		 * explicitly reconstructed on return from syscall */
+		if (unlikely(sr3 != 0 && sr3 != newsr3)) {
+			int i;
+
+			for (i = 0; i < 8; i++)
+				if (regs->sr[i] == sr3)
+					regs->sr[i] = newsr3;
+		}
+	}
+	return _switch_to(prev, next);
+}
Index: BUILD-2.6/include/asm-parisc/system.h
===================================================================
--- BUILD-2.6.orig/include/asm-parisc/system.h 2007-02-27 15:53:12.000000000 -0800
+++ BUILD-2.6/include/asm-parisc/system.h 2007-02-27 15:54:33.000000000 -0800
@@ -43,9 +43,10 @@ struct pa_psw {
 struct task_struct;
 extern struct task_struct *_switch_to(struct task_struct *, struct task_struct *);
+extern struct task_struct *__switch_to(struct task_struct *, struct task_struct *);
 #define switch_to(prev, next, last) do { \
-	(last) = _switch_to(prev, next); \
+	(last) = __switch_to(prev, next); \
 } while(0)
 /*
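
The register fix-up in the patch can be tried in isolation from the kernel. This is a minimal sketch under assumptions: `remap_stale_space` and `NUM_SPACE_REGS` are hypothetical names, a plain array stands in for `regs->sr[]`, and the `iasq` user-context check from the patch is left out. It follows the same rule as the loop in `__switch_to`: take the saved user space from entry 7 and, if it is non-zero and differs from the live space, rewrite every saved register that still holds it.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SPACE_REGS 8	/* hypothetical name; the patch loops over 8 entries */

/*
 * Rewrite every saved space register that still holds the stale user
 * space so that it holds the live one. Returns the number of registers
 * changed. Entries holding other values (such as 0) are left alone.
 */
static int remap_stale_space(unsigned long sr[NUM_SPACE_REGS],
			     unsigned long live_space)
{
	unsigned long stale;
	int changed = 0;
	int i;

	assert(sr != NULL);
	stale = sr[7];		/* the patch reads the saved user space from sr[7] */

	/* Nothing to do for a zero space or one that is already current. */
	if (stale == 0 || stale == live_space)
		return 0;

	for (i = 0; i < NUM_SPACE_REGS; i++) {
		if (sr[i] == stale) {
			sr[i] = live_space;
			changed++;
		}
	}
	return changed;
}
```

With saved registers `{0, 7, 0, 7, 0, 0, 0, 7}` and a live space of 9, the three entries holding 7 are rewritten to 9 and the zero entries are untouched.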