[parisc-linux] Progress
Philipp Rumpf
Philipp.H.Rumpf@mathe.stud.uni-erlangen.de
Mon, 22 Nov 1999 09:24:12 +0100
> __delay() in delay.h is ok except for ".balignl". The .balignl inserts
> a bunch of "ldi 1a,%r0" instructions which do nothing. I just didn't specify
> enough cycles before.
Yup. They are intended to do nothing except get the following code nicely aligned.
Actually I wonder now whether the best way to implement __delay(x) is:
mfctl 16, %0 ; current interval timer value
addl %0, %1, %1 ; interval timer value we want to reach
subl %1, %0, %0 ; want-is
comb,> %0, 0, .-4 ; while((want-is)>0)
mfctl 16, %0 ; current interval timer value
I actually like this quite a lot:
- should be shorter than the old loop (5 instructions instead of 3
instructions plus alignment)
- should work well for low values (mfctl is quite fast and the rest
is just arithmetic operations - and we don't have any nops in there)
- more exact than other __delays (interrupts, cache effects,
alignment, and, at least in theory, power-saving modes can make
other __delays inexact)
- more exact wrt our timer source (as CR16 actually _is_ our timer
source). This might be a bad thing as it means we don't have a sanity
check for our timer anymore.
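For illustration, here is the same idea as a minimal C sketch. It is not the
proposed kernel code: read_interval_timer() is a hypothetical stand-in for
"mfctl 16", simulated here with a counter that advances on every read so the
sketch runs anywhere, and the names and the 32-bit counter width are assumptions.
The point it shows is that comparing the signed difference (want - is) keeps
working when the counter wraps.

```c
#include <stdint.h>

/* Hypothetical stand-in for "mfctl 16": a free-running 32-bit counter.
 * It is simulated here and advances by fake_step ticks on every read.
 * The start value is deliberately close to wraparound. */
static uint32_t fake_cr16 = 0xfffffff0u;
static uint32_t fake_step = 3;

static uint32_t read_interval_timer(void)
{
    fake_cr16 += fake_step;
    return fake_cr16;
}

/* Busy-wait until at least `ticks` timer ticks have elapsed.
 * Returns how many times the loop body ran, for illustration only. */
static unsigned long timer_delay(uint32_t ticks)
{
    /* interval timer value we want to reach */
    uint32_t want = read_interval_timer() + ticks;
    unsigned long polls = 0;

    /* while ((want - is) > 0): the unsigned subtraction wraps modulo 2^32,
     * and reading the result as signed (two's complement assumed) gives the
     * right answer across wraparound as long as ticks < 2^31. */
    while ((int32_t)(want - read_interval_timer()) > 0)
        polls++;

    return polls;
}
```

Calling timer_delay(30) with the simulated counter above returns only after the
counter has advanced by at least 30 ticks, even though it wraps through zero on
the way.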
> extern __inline__ void __delay(unsigned long loops) {
> asm volatile(
> - " .balignl 64,0x34000034
> - addib,UV,n -1,%0,.
> + " addib,UV,n -1,%0,.
> addib,NUV,n -1,%0,.+8
> nop"
> : "=r" (loops) : "0" (loops));
Just to scare you a bit, have a look at the PCXL ERS, Section 6.4 "Instruction
Lookaside Buffer". This is basically a one-entry TLB that gets set from the
real TLB and takes some time to do so.
Now picture a page boundary that happens to fall between the two addibs. This
loop will execute at about a third of the speed of a normal delay loop. The code
is inlined, so only one copy of the loop gives you bogus results - but if it is
the BogoMIPS calibration loop, udelay(N) will actually only delay for N/3 us,
which can have unexpected effects on the hardware we use udelay() for.
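To spell out the N/3 arithmetic, here is a small C sketch. The function names
and the numbers in the usage note are made up for illustration, and the factor
of three is just the estimate above, not a measured value. It assumes that only
the calibration copy of the loop is slow and that every other inlined copy runs
at normal speed.

```c
/* Loops per microsecond that calibration would record if the calibration
 * copy of the loop runs `slowdown` times slower than a normal copy. */
static unsigned long calibrated_loops_per_us(unsigned long true_loops_per_us,
                                             unsigned long slowdown)
{
    return true_loops_per_us / slowdown;
}

/* Microseconds actually spent in udelay(n) when udelay() uses the
 * miscalibrated value but its own copy of the loop runs at full speed. */
static unsigned long actual_delay_us(unsigned long n,
                                     unsigned long true_loops_per_us,
                                     unsigned long slowdown)
{
    unsigned long loops = n * calibrated_loops_per_us(true_loops_per_us,
                                                      slowdown);
    return loops / true_loops_per_us;
}
```

With a (made-up) true rate of 90 loops per microsecond and a slowdown of 3,
actual_delay_us(300, 90, 3) gives 100, i.e. a requested 300 us delay only lasts
about 100 us.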
Philipp Rumpf