[parisc-linux] sshd triggers Protection id trap
John David Anglin
dave at hiauly1.hia.nrc.ca
Mon Jan 8 20:15:01 MST 2007
> On 1/8/07, John David Anglin <dave at hiauly1.hia.nrc.ca> wrote:
> > I managed to get gdb backtraces for the hung java processes yesterday.
> > It looks like the hang could be a result of the clone bug in glibc.
> > clone is used in thread creation.
>
> The clone bug is a failure to restore r19 correctly. This results in
> an immediate crash when attempting to access a variable via the PIC
> register. It doesn't matter in 99% of the cases where the $$dyncall
> sets r19 and we jump to the target plabel.
>
> I argue that this is not the clone bug.
>
> > Of course, this doesn't explain why killing the processes crashes the
> > system.
>
> It doesn't explain it ... because it's not that bug.
Another example:
dave 22128 1 0 13:03 ? 00:00:00 /home/dave/gnu/gcc-4.3/objdir/hppa-linux/libjava/testsuite/Process_2.exe
dave 22129 1 85 13:03 ? 07:33:07 /home/dave/gnu/gcc-4.3/objdir/hppa-linux/libjava/testsuite/Process_2.exe
dave 22130 22129 0 13:03 ? 00:00:00 [sh] <defunct>
From top:
22128 dave 15 0 50948 28m 20m S 0.0 2.9 0:00.00 Process_2.exe
22129 dave 25 0 50948 28m 20m T 0.0 2.9 457:14.08 Process_2.exe
22130 dave 20 0 0 0 0 Z 0.0 0.0 0:00.01 sh <defunct>
$ gdb Process_2.exe 22129
...
(gdb) bt
#0 0x40601a30 in __pthread_manager () from /lib/libpthread.so.0
#1 0x41a50498 in _Jv_CondWait (cv=0x40b4e000, mu=0xeae0c29,
millis=<value optimized out>, nanos=208)
at ../../../gcc/libjava/posix-threads.cc:179
#2 0x41a31794 in gnu::gcj::runtime::FinalizerThread::run (
this=<value optimized out>)
at ../../../gcc/libjava/gnu/gcj/runtime/natFinalizerThread.cc:57
#3 0x41a45c50 in _Jv_ThreadRun (thread=0x4052fd70)
at ../../../gcc/libjava/java/lang/natThread.cc:302
#4 0x41a4fdd4 in really_start (x=0x400c38a0)
at ../../../gcc/libjava/posix-threads.cc:445
#5 0x42738714 in GC_start_routine (arg=0x400e2f80)
at ../../../gcc/boehm-gc/pthread_support.c:1294
#6 0x4060128c in pthread_start_thread () from /lib/libpthread.so.0
#7 0x409ff780 in clone () from /lib/libc.so.6
#8 0x409ff780 in clone () from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
As far as I can tell, this process is "stuck" at 0x40601a30, although
I don't think setting breakpoints on a running process works. The
process isn't doing anything special:
(gdb) disass 0x40601a20 0x40601a40
Dump of assembler code from 0x40601a20 to 0x40601a40:
0x40601a20 <__pthread_manager+1012>: ldw 0(r22),r20
0x40601a24 <__pthread_manager+1016>: add,l r20,ret0,r20
0x40601a28 <__pthread_manager+1020>: ldi 1,ret0
0x40601a2c <__pthread_manager+1024>: stw r20,0(r22)
0x40601a30 <__pthread_manager+1028>: stw ret0,c(r5)
0x40601a34 <__pthread_manager+1032>: ldo 1bc(r5),ret0
0x40601a38 <__pthread_manager+1036>: add,l r20,r7,r3
0x40601a3c <__pthread_manager+1040>: depw,z r7,30,31
$ gdb Process_2.exe 22129
...
(gdb) bt
#0 0x409e68b4 in sched_yield () from /lib/libc.so.6
#1 0x406055f0 in __pthread_acquire () from /lib/libpthread.so.0
#2 0x4060585c in __pthread_alt_unlock () from /lib/libpthread.so.0
#3 0x406024d4 in pthread_mutex_unlock () from /lib/libpthread.so.0
#4 0x41ebcea0 in java.lang.ConcreteProcess$ProcessManager.run()void (
this=0x40542e40) at java/lang/ConcreteProcess.java:35
#5 0x41a45c50 in _Jv_ThreadRun (thread=0x40542e40)
at ../../../gcc/libjava/java/lang/natThread.cc:302
#6 0x41a4fdd4 in really_start (x=0x400c3800)
at ../../../gcc/libjava/posix-threads.cc:445
#7 0x42738714 in GC_start_routine (arg=0x400e2ec0)
at ../../../gcc/boehm-gc/pthread_support.c:1294
#8 0x4060128c in pthread_start_thread () from /lib/libpthread.so.0
#9 0x409ff780 in clone () from /lib/libc.so.6
#10 0x409ff780 in clone () from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
This process appears to be trying to acquire a lock in __pthread_acquire
that is always held. So, it spins, calling sched_yield and/or
nanosleep.
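For anyone unfamiliar with this kind of lock, here is a rough sketch in C
of a yield-then-sleep spin acquire. It is not the libpthread source: the
names spin_acquire_bounded and spin_release, the two constants, and the
use of C11 atomic_flag are all assumptions made for illustration. I also
added an attempt limit so the sketch can give up; the loop seen in the
backtrace has no such limit, which would explain why it never leaves
sched_yield if the lock is never released.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Both values are assumptions for this sketch, not taken from glibc. */
#define MAX_SPIN_COUNT 50        /* yields before falling back to sleep */
#define SPIN_SLEEP_NS  2000001L  /* sleep length once yielding is exhausted */

/*
 * Try to take the lock, yielding the CPU between attempts and sleeping
 * briefly after MAX_SPIN_COUNT consecutive yields. Returns true once the
 * lock is acquired, or false after max_attempts failed attempts.
 */
static bool spin_acquire_bounded(atomic_flag *lock, int max_attempts)
{
    assert(lock != NULL);
    int cnt = 0;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        /* test-and-set returns the previous state: false means we got it */
        if (!atomic_flag_test_and_set(lock))
            return true;

        if (cnt < MAX_SPIN_COUNT) {
            sched_yield();               /* let the lock holder run */
            cnt++;
        } else {
            struct timespec ts = { .tv_sec = 0, .tv_nsec = SPIN_SLEEP_NS };
            nanosleep(&ts, NULL);        /* back off, then start yielding again */
            cnt = 0;
        }
    }
    return false;                        /* lock was held on every attempt */
}

/* Release the lock so that a later acquire can succeed. */
static void spin_release(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}
```

If the holder never calls the release side (or the lock word is left in
the held state for some other reason), every attempt fails and a thread
in the unbounded version of this loop just burns CPU, which is consistent
with the 85% CPU shown for 22129 above.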
When I quit gdb, I see:
(gdb) quit
The program is running. Quit anyway (and detach it)? (y or n) y
Quitting: Can't detach LWP 22125: No such process
kill -9 22128 22129
just crashed the system.
Dave
--
J. David Anglin dave.anglin at nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)