Chronic Logic

Gish => General Discussion => Topic started by: Jonathan_NL on October 02, 2004, 01:59:57 AM

Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on October 02, 2004, 01:59:57 AM
I know, sqrt returns a double. I was actually confused because I didn't see how things fit together. Have you already tried ((float)sqrt((double)somefloat))?
Title: Mac/PC physics/timing difference
Post by: woody on September 30, 2004, 09:36:42 AM
Quote (Jonathan_NL @ Sep. 29 2004,11:42)
so there must be some difference in the physics or timing? ???

Where do you see these differences?
Title: Mac/PC physics/timing difference
Post by: woody on October 02, 2004, 02:38:19 AM
Quote (Chronic Logic - Alex @ Oct. 01 2004,8:44)
Actually the accuracies for the sqrt functions are the same, but the different compilers convert the double back to a float at different times.  The main problem is Visual C seems to ignore casting a double back to a float, even with optimizations off.  I'm not sure why the Linux and Mac versions are different yet.

If you did the final build on the PC using GCC, I assume the results would be the same. I also assume the final executable would be slower, so I guess that isn't a good plan!
Presumably Linux and the Mac should be the same.

However, in this case my Mac and Jonathan's Mac produce different results (I assume).
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on September 30, 2004, 06:23:41 PM
Just try a PC replay on the Mac or a Mac replay on the PC. There is a PC replay included with Gish (also the Mac version). Mac replay: Jonathan-C12.gre.gz
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on October 02, 2004, 03:05:59 AM
That has to do with the latest patch, it seems.

Edit: Made one with the new version.
Title: Mac/PC physics/timing difference
Post by: woody on October 01, 2004, 01:26:42 PM
I think it is more than a Mac/PC difference then, because when I play that back on my Mac you storm round the left, over the top, break the right-hand side platforms and end up trapped at the right-middle until it times out.

Unless that is actually what happened, I guess there is a processor timing issue in there somewhere!
Title: Mac/PC physics/timing difference
Post by: woody on October 02, 2004, 12:35:09 PM
Well, that worked, but then I would assume it is a less timing-critical level.
Still a damn sight better than I can do!!
Title: Mac/PC physics/timing difference
Post by: Chronic Logic - Alex on October 01, 2004, 10:22:05 PM
The problem is a difference in accuracies for the square root function that is used in the physics.  Unfortunately it's different for all 3 versions, which makes it difficult to fix.
Title: Mac/PC physics/timing difference
Post by: woody on October 02, 2004, 03:11:12 PM
If all of the maths is done in floats rather than doubles, it might be worth using vsqrtf(vFloat val) instead of sqrt. It is a lot faster, and that's where Gish seems to spend most of its time.
It is part of the vector libraries (AltiVec processor unit).
For a real performance increase, note that if you can feed the AltiVec unit fast enough you can do >= 4 vsqrt()s simultaneously with no overhead on the processor.
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on October 01, 2004, 10:51:48 PM
I already thought that it was something like this, at the root of the physics. ;-)

Have you ever done tests on how fast/accurate the different sqrt()s actually are?
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on October 07, 2004, 12:55:38 AM
I quickly hacked vsqrtf into a for(i=0; i<10000000; i++) loop without much AltiVec knowledge (looked everything up). It is twice as fast as sqrt() in the same loop, and remember that it does 4 at a time!

Edit: My first successful asm is an AltiVec instruction. :laugh: If you're interested btw:
Code Sample
vUInt32 in = (vUInt32)('siht', ' si ', 'brag', '!ega');
vUInt32 perm = (vUInt32)(0x03020100, 0x07060504, 0x0B0A0908, 0x0F0E0D0C); // the permute value for byte-swapping 4 32-bit values. note that you can get data from a second vector (unused in this case. it's the second %1 below, and can be reached with 0x1*) to put in the result.
vUInt8 out;
asm("vperm %0,%1,%1,%2":"=v"(out):"v"(in),"v"(perm)); // couldn't find this in the altivec c extensions?
printf("%vc\n", out);

A vector is just 128 bits, so 4 floats can simultaneously be 16 chars, or nothing in particular: just 128 bits.

Edit again: can't find the vperm C extension? Search in the right place!
Code Sample
vUInt8 out = vec_perm((vUInt8)in,(vUInt8)in,(vUInt8)perm); // I'm forced to almost-cast because the compiler doesn't like it, even though it makes no difference!

I expected this name, but I couldn't find the function where I searched for it at first.

Edit agaaaain: Condensed version:
Code Sample
vUInt32 in = (vUInt32)('siht',' si ','brag','!ega');
printf("%vc\n",vec_perm((vUInt8)in,(vUInt8)in,(vUInt8)(vUInt32)(0x03020100,0x07060504,0x0B0A0908,0x0F0E0D0C)));


http://developer.apple.com/hardware/ve/ (ve=velocity engine)
Title: Mac/PC physics/timing difference
Post by: Chronic Logic - Alex on October 01, 2004, 11:44:18 PM
Actually the accuracies for the sqrt functions are the same, but the different compilers convert the double back to a float at different times.  The main problem is Visual C seems to ignore casting a double back to a float, even with optimizations off.  I'm not sure why the Linux and Mac versions are different yet.
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on October 02, 2004, 01:18:13 AM
Can you explain that in a little more detail?
Title: Mac/PC physics/timing difference
Post by: Chronic Logic - Alex on October 02, 2004, 01:45:02 AM
The sqrt function returns a double, but in Gish the physics all use floats, so the compiler should convert the double back to a float on the line where sqrt is called.  For some reason the Visual C compiler doesn't convert it back to a float on the line that it should, so it does some other calculations in double precision instead of float.
Title: Mac/PC physics/timing difference
Post by: Jonathan_NL on September 30, 2004, 02:42:26 AM
The replay format is the same:
int magic;
int level;
int length;
char actions[length];

so there must be some difference in the physics or timing? ???
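
For reference, a rough sketch of reading that layout from a buffer in memory. Several things here are assumptions and not confirmed anywhere in this thread: that each int is 32 bits, that the values are stored in the byte order of the machine reading them, and that the .gre.gz file has already been decompressed into buf. The names ReplayView and replay_parse are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a replay: three ints followed by the actions. */
typedef struct {
    int32_t magic;
    int32_t level;
    int32_t length;
    const unsigned char *actions;  /* points into the caller's buffer */
} ReplayView;

/* Returns 0 on success, -1 if the buffer is too short or length is invalid.
   Assumes 32-bit ints in host byte order (an assumption, see above). */
int replay_parse(const unsigned char *buf, size_t size, ReplayView *out)
{
    const size_t header = 3 * sizeof(int32_t);

    if (buf == NULL || out == NULL || size < header) {
        return -1;
    }
    /* memcpy avoids unaligned reads from the byte buffer */
    memcpy(&out->magic,  buf,                       sizeof(int32_t));
    memcpy(&out->level,  buf + sizeof(int32_t),     sizeof(int32_t));
    memcpy(&out->length, buf + 2 * sizeof(int32_t), sizeof(int32_t));

    /* the actions must fit in what is left of the buffer */
    if (out->length < 0 || (size_t)out->length > size - header) {
        return -1;
    }
    out->actions = buf + header;
    return 0;
}
```

The sketch only checks that the header and the actions fit in the buffer. It does not check the magic value, since that value is not given here.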