Mail Archives: djgpp/1998/08/27/20:08:22
George Foot wrote:
> > inline void setdata (int selector, int offset, int value, int num_bytes)
> > {
> > asm (
> > "pushl %%es;"
> > "movw %%dx, %%es;"
> > "cld;"
> > "movb %%al, %%ah;"
> > "rorl $8, %%eax;"
> > "movb %%al, %%ah;"
> > "rorl $8, %%eax;"
> > "movb %%al, %%ah;"
>
> I think it's quicker to do:
>
> movb %%al, %%ah
> movl %%eax, %%edx
> shll $16, %%eax
> orl %%edx, %%eax
>
> (assuming the high parts of EAX were zero initially)
>
> > "shrl $1, %%ecx;"
> > "jnc NoByte;"
> > "stosb;"
> > "NoByte: ;"
> > "shrl $1, %%ecx;"
> > "jnc NoWord;"
> > "stosw;"
> > "NoWord: ;"
> > "rep; stosl;"
> > "popl %%es "
>
> In fact I think it's better to get EDI aligned by doing a few stosbs
> at the start, then do as many stosls as necessary, then stosb the
> remainder.
You're right. I tested this code; it seems to work for any length, and because it aligns itself it is 40% faster than
the non-aligned version. I ran it with the Pentium cycle counter, setting 899 bytes to zero: it took 1700 cycles (vs.
memset, which took 1701 without setting the selector), although memset is faster for much smaller sizes (less than
50 bytes).

inline void setdata (int selector, lword offset, int value, int num_bytes)
{
asm (
"pushl %%es;"
"movw %%dx, %%es;"
"movb %%al, %%ah;"
"movl %%eax, %%edx;"
"shll $16, %%eax;"
"orl %%edx, %%eax;"
"cld;"
"test $0xFFFFFFFC, %%ecx;"
"jz 1f;"
"test $1, %%edi;"
"jz 0f;"
"stosb;"
"dec %%ecx;"
"0: "
"test $2, %%edi;"
"jz 1f;"
"stosw;"
"sub $2, %%ecx;"
"1: "
"movl %%ecx, %%edx;"
"shrl $2, %%ecx;"
"rep; stosl;"
"test $1, %%edx;"
"jz 2f;"
"stosb;"
"2: "
"test $2, %%edx;"
"jz 3f;"
"stosw;"
"3: "
"popl %%es;"
: : "c" (num_bytes), "a" (value), "d" (selector), "D" (offset)
: "%ecx", "%edi", "%edx", "%eax");
}
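
For anyone who wants to check the logic without a selector or inline asm, here is a rough plain-C sketch of the same strategy: replicate the byte into a dword, store single bytes until the destination is 4-byte aligned, store whole dwords, then store the leftover bytes. The names replicate_byte and fill_aligned are made up for illustration, and this version writes through an ordinary pointer rather than through ES, so it is only a model of the asm above, not a replacement for it. Unlike the asm, it masks the value to its low byte instead of assuming the high part of EAX is zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the low byte of value into all four bytes of a 32-bit word.
   Same idea as the movb/movl/shll/orl sequence in the asm. */
static uint32_t replicate_byte(int value)
{
    uint32_t v = (uint32_t)value & 0xFFu; /* keep only the low byte */
    v |= v << 8;                          /* byte -> 16-bit pattern  */
    v |= v << 16;                         /* 16-bit -> 32-bit pattern */
    return v;
}

/* Hypothetical helper: fill num_bytes bytes at dst with the low byte of value. */
static void fill_aligned(void *dst, int value, size_t num_bytes)
{
    unsigned char *p = dst;
    uint32_t pattern = replicate_byte(value);

    /* Head: align the destination, but only if at least one dword follows. */
    if (num_bytes >= 4) {
        while (((uintptr_t)p & 3u) != 0) {
            *p++ = (unsigned char)value;
            num_bytes--;
        }
    }

    /* Body: aligned 32-bit stores (the "rep; stosl" part). */
    while (num_bytes >= 4) {
        memcpy(p, &pattern, 4);
        p += 4;
        num_bytes -= 4;
    }

    /* Tail: the remaining 0 to 3 bytes. */
    while (num_bytes > 0) {
        *p++ = (unsigned char)value;
        num_bytes--;
    }
}
```

Calling fill_aligned(buf + 1, 0, 899) on an ordinary buffer exercises all three phases: the head stores 1 to 3 bytes depending on where buf sits, the body does the dword stores, and the tail mops up whatever is left.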
--
(\/) Endlisnis (\/)
s257m AT unb DOT ca
Endlisnis AT GeoCities DOT com
Endlis AT nbnet DOT nb DOT ca