Mail Archives: djgpp/1997/02/10/08:06:13
> 1: doubles are sometimes slower for one main reason: they are twice as big
> and moving twice as many bytes usually takes longer! On a 387 or 486
> moving a 64-bit value across a 32-bit bus explicitly takes more clocks.
> On a P5 there *may* be delays caused by cache filling.
Yep... :)
> 2: (With 1 exception) there is *NO* cost to 'converting' any float
> format during reads or writes from the fpu. None, Zero clocks. Is there
> any other way I can say it? All ops end up as long double during
> calculations, so only load/store actions have any difference anyway.
> It really does come down to how many bytes get shifted.
> Loading and storing long doubles is particularly expensive because it
> needs 3x32-bit accesses on a 486 or 2x64-bit ones on a P5. It's slower
> even though *no* bit format conversion occurs.
I think a long double load takes 3 cycles, anything else takes 1...
> If I tell you I just spent the last 3 months optimising P5 fpu code (for
> a 3D geometry pipeline) will you start believing me?
Hey, I believe you, it's just that some people don't seem to understand... :)
You should see the mail I have been getting on the topic...
Leathal.