[parisc-linux-cvs] Re: 2.6.0-test4-pa12 __fls()
Matthew Wilcox
willy@debian.org
Sat, 6 Sep 2003 20:31:00 +0100
On Fri, Sep 05, 2003 at 11:59:14PM -0600, Grant Grundler wrote:
> > kudos to Joel Soete for mangling lamont's __ffs() to produce __fls().
> > Again, I added 64-bit support. Booted on a500.
> > Didn't test 32-bit but fls_test.c included in above URL works in user space.
> > (162 cycles per loop iteration, 00:29:01 to complete on 400Mhz PA8500).
Sorry, but it's clearly broken.
> -#define fls(x) generic_fls(x)
> +static __inline__ unsigned long __fls(unsigned long x)
> +{
> + unsigned long ret;
> +
> + __asm__( " ldi 1,%1\n"
> +#if BITS_PER_LONG > 32
> + " extrd,u,*<> %0,63,32,%%r0\n"
if any of the bottom 32 bits are set ...
> + " depd,*TR %0,31,32,%0\n"
move the bottom 32 bits up into the top 32 bits
> + " addi 32,%1,%1\n"
otherwise add 32
> +#endif
... and then do things that can't see the top 32 bits.
> + " extru,<> %0,15,16,%%r0\n"
> + " zdep,TR %0,15,16,%0\n"
> + " addi 16,%1,%1\n"
think you could put in comments similar to the endian swapping? it makes
the intent clearer to see. as far as i can tell, the basic principle
here is..
if (top N bits clear)
shift N bits right
else
add N
right?
seems to me this whole thing should be done as
if (any top N bits set)
shift N bits left
else
subtract N
> + " extru,<> %0,7,8,%%r19\n"
r19? That's not mentioned as being clobbered. I think you mean r0.
> +static __inline__ int fls(int x)
> +{
> + return x ? (__fls((unsigned long)x) + 1) : 0;
> +}
we don't need an __fls, so implementing an fls that works directly would
be better.
--
"It's not Hollywood. War is real, war is primarily not about defeat or
victory, it is about death. I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk