i don’t think we can make the jump to making everything 64-bit all the time (nor should we). tbh, the vector proposal was mainly just a way to surface some sort of “boxed” number where precision could be maintained while doing repeated calculations (like in the physics engine) while also avoiding overflow issues in multiplication. if those boxed numbers were accessed in non-native code, we would convert them back into the usual 32 bits.
…and of course, i’d need to do testing to make sure this sort of thing didn’t tank perf. i actually have no idea what the relative performance of 64-bit and 32-bit integers is on the hardware we target.
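To make the multiplication overflow concrete, here is a minimal sketch in plain TypeScript (not the actual runtime code). It assumes an Fx8 value is just an integer scaled by 256. The helper names are made up for illustration, and the "wide" version uses a JS double as a stand-in for a 64-bit intermediate product.

```typescript
// Assumption: an Fx8 value is an integer scaled by 2^8. All names here are hypothetical.
const FX8_SHIFT = 8;

function toFx8(x: number): number {
    return Math.round(x * (1 << FX8_SHIFT)) | 0;
}

function fromFx8(f: number): number {
    return f / (1 << FX8_SHIFT);
}

// 32-bit intermediate product: Math.imul wraps around once the raw product
// no longer fits in a signed 32-bit integer.
function mulFx8Narrow(a: number, b: number): number {
    return Math.imul(a, b) >> FX8_SHIFT;
}

// Wide intermediate product: the double is exact as long as the raw product
// stays below 2^53, which stands in for a 64-bit multiply here.
function mulFx8Wide(a: number, b: number): number {
    return Math.floor((a * b) / (1 << FX8_SHIFT)) | 0;
}

const a = toFx8(300); // raw 76800
const b = toFx8(200); // raw 51200

console.log(fromFx8(mulFx8Wide(a, b)));   // 60000
console.log(fromFx8(mulFx8Narrow(a, b))); // -5536 (the raw product wrapped)
```

The final result (60000) fits comfortably in Fx8; only the intermediate product overflows, which is the case a wider boxed type would cover.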
The F4 should have the FPU enabled - all the chips we use have an FPU. I’ll do that tomorrow.
It would be nice to somehow expose the FPU, but I’m not sure how exactly. We need a uniform data representation, so the floats would have to be boxed, probably negating most of the perf advantage.
I do intend to optimize the fx8 math though.
Instead of internally dealing with fixed-point math and its annoying overflow behavior, would it be worth revisiting using native single-precision float math on hardware targets? Especially if there would be a reasonably efficient way to turn them into fixed point on input or output.
Having a “boxed” representation for transform matrices and for vectors (or even better, arrays of vectors) would help for 3D rendering.
Some useful operators would be:
matrix * vector => boxed vector
perspective transform
vector length (assuming there’s an efficient float square root available on hardware)
dot product
squared vector length (that’s just dot product of a vector with itself)
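The scalar operators in that list could look roughly like the sketch below. It is plain TypeScript and uses a `Float32Array` as a stand-in for a boxed vector. The function names are hypothetical, and whether `Math.sqrt` maps to an efficient hardware square root is exactly the open question noted above.

```typescript
// Assumption: a "boxed" 3D vector is modeled as a Float32Array of length 3.
function makeVec3(x: number, y: number, z: number): Float32Array {
    return new Float32Array([x, y, z]);
}

// Dot product of two 3D vectors.
function vecDot(a: Float32Array, b: Float32Array): number {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Squared length is the dot product of a vector with itself (no square root needed).
function vecSquaredLength(a: Float32Array): number {
    return vecDot(a, a);
}

// Length needs one square root on top of the squared length.
function vecLength(a: Float32Array): number {
    return Math.sqrt(vecSquaredLength(a));
}

const v = makeVec3(3, 4, 12);
console.log(vecSquaredLength(v)); // 169
console.log(vecLength(v));        // 13
```

Comparing squared lengths is usually enough for distance checks, so the square root can often be skipped entirely.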
The scalar-valued functions could all return fixed-point numbers, provided the implementation can do its internal math in a way that avoids overflow. The Fx8 22.8 bit number range for results is probably fine for typical use cases - I don’t remember how significant the precision issues were that led me to use Fx14 instead for the kart demo. If it is a problem, maybe add a shift as a function parameter?
dot(a, b) => return integer part
dot(a, b, 8) => scaled to Fx8
dot(a, b, 14) => scaled to Fx14
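A rough sketch of the shift idea in plain TypeScript: the math is done in floating point and only the result is scaled to fixed point. The name `dotFx` is hypothetical. Two assumptions are baked in: "integer part" is read as rounding toward negative infinity, and out-of-range results wrap to 32 bits rather than saturating.

```typescript
// Hypothetical dot product that returns a fixed-point integer with `shift` fractional bits.
// shift = 0 gives the integer part, 8 gives Fx8, 14 gives Fx14.
function dotFx(a: Float32Array, b: Float32Array, shift: number = 0): number {
    // Internal math is done in floating point, so the intermediate
    // products cannot overflow the way 32-bit fixed-point products can.
    const d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    // Scale by 2^shift, round down, then wrap to a 32-bit integer.
    return Math.floor(d * (1 << shift)) | 0;
}

const p = new Float32Array([1.5, 2, 0]);
const q = new Float32Array([2, 0.25, 0]);

// The exact dot product is 3.5.
console.log(dotFx(p, q));     // 3
console.log(dotFx(p, q, 8));  // 896   (3.5 * 256)
console.log(dotFx(p, q, 14)); // 57344 (3.5 * 16384)
```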
If I have time I could try to go back to my renderer to see what other operations I’m doing with numbers. I’m not sure offhand which other operations could turn into performance bottlenecks - I suspect it’s likely the clipping stage that happens between applying the transform matrix and doing the perspective divide. That involves a lot of comparisons and calculating intersections.
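To give a sense of the comparisons and intersection math involved, here is a minimal sketch of clipping one line segment against the near plane in homogeneous clip space, before the perspective divide. It assumes the OpenGL-style convention that a point is inside when z + w >= 0. It is not taken from my renderer, and all names are made up.

```typescript
type Vec4 = [number, number, number, number]; // x, y, z, w in clip space

// Signed distance to the near plane; assumes "inside" means z + w >= 0.
function nearDistance(p: Vec4): number {
    return p[2] + p[3];
}

// Linear interpolation between two clip-space points.
function lerp4(a: Vec4, b: Vec4, t: number): Vec4 {
    return [
        a[0] + (b[0] - a[0]) * t,
        a[1] + (b[1] - a[1]) * t,
        a[2] + (b[2] - a[2]) * t,
        a[3] + (b[3] - a[3]) * t,
    ];
}

// Returns the visible part of segment a-b, or null if it is entirely
// behind the near plane.
function clipSegmentNear(a: Vec4, b: Vec4): [Vec4, Vec4] | null {
    const da = nearDistance(a);
    const db = nearDistance(b);
    if (da < 0 && db < 0) return null;     // both endpoints outside
    if (da >= 0 && db >= 0) return [a, b]; // both inside, nothing to cut
    const t = da / (da - db);              // where the segment crosses the plane
    const hit = lerp4(a, b, t);
    return da >= 0 ? [a, hit] : [hit, b];
}

const clipped = clipSegmentNear([0, 0, 1, 1], [0, 0, -3, 1]);
console.log(clipped); // [[0, 0, 1, 1], [0, 0, -1, 1]]
```

A full clipper repeats this for each frustum plane and for polygon edges rather than single segments, which is where the comparison count adds up.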
For inspiration, I’d recommend checking out Brandon Jones’s JS library glMatrix (https://glmatrix.net/). It’s very efficient, and its data types could be a good match for the boxed types that would be useful to support here.
The API is very focused on performance. For example, it generally uses an out parameter instead of returning new values, which avoids allocating objects. I think it would be good to follow that.
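Here is a sketch of what the out-parameter style could look like for the matrix * vector operator, including the perspective divide. It is modeled on the glMatrix convention (column-major 4x4 matrix, result written into `out`), but it is a standalone illustration, not the library’s code, and `transformPoint` is a made-up name.

```typescript
// Transforms a 3D point by a column-major 4x4 matrix and divides by w.
// The result is written into `out`, so no new object is allocated per call.
function transformPoint(out: Float32Array, m: Float32Array, v: Float32Array): Float32Array {
    const x = v[0], y = v[1], z = v[2];
    // Fall back to 1 when w is 0 to avoid dividing by zero.
    const w = (m[3] * x + m[7] * y + m[11] * z + m[15]) || 1;
    out[0] = (m[0] * x + m[4] * y + m[8] * z + m[12]) / w;
    out[1] = (m[1] * x + m[5] * y + m[9] * z + m[13]) / w;
    out[2] = (m[2] * x + m[6] * y + m[10] * z + m[14]) / w;
    return out;
}

// Translation by (10, 20, 30), stored column by column.
const translate = new Float32Array([
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    10, 20, 30, 1,
]);

// One scratch vector can be reused for every point in a mesh.
const scratch = new Float32Array(3);
transformPoint(scratch, translate, new Float32Array([1, 2, 3]));
console.log(scratch[0], scratch[1], scratch[2]); // 11 22 33
```

Because inputs are read into locals before `out` is written, passing the same array as both `out` and `v` also works, which allows in-place transforms.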
Ooh that sounds very nice, textured polys could be huge! I just skimmed this thread but from what I’ve seen one of the major issues with 3d is collision and depth sorting, which require in-depth knowledge of 3d math. These aren’t easy problems to solve (especially since clips still happen in modern games), but I believe these are the major roadblocks for “full 3d games” by competent-but-not-insanely-experienced-or-educated developers.
Just to add to the complication, “clipping” and “culling” are two distinct concepts. Clipping in the sense of cleanly cutting off geometry at the sides of the view frustum is comparatively straightforward, but doesn’t always have the effect you’d want. For example, clipping includes cutting off objects at the “near” view plane intersection, so getting too close to objects can result in being able to see inside them. This gets very disturbing if you then see the inside of an NPC’s head. In this case clipping is working as intended, but the game is missing a collision check to keep you from intersecting the object.
It gets even more complicated in VR - you can’t stop the user’s head from moving, and decoupling the viewpoint movement from head movement is nauseating so you can’t prevent the view from moving along with the head. So apps need to use alternate methods such as fading to black when your head is inside objects or outside a wall.
Culling means discarding geometry before even trying to draw it, for example because an object is entirely outside the field of view (pretty easy) or hidden by a solid wall (much trickier). Games have had issues where poor culling has a big performance impact due to trying to draw objects that end up being covered up. Getting it wrong in the other direction can lead to objects suddenly popping into view.
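The “entirely outside the field of view” case can be sketched as a bounding-sphere test against the frustum planes. This is a common textbook approach, not something taken from a specific engine. The sketch assumes each plane stores an inward-pointing unit normal plus an offset, and the type and function names are made up.

```typescript
// Assumption: plane equation is nx*x + ny*y + nz*z + d >= 0 for points inside,
// with (nx, ny, nz) a unit normal pointing into the frustum.
interface Plane { nx: number; ny: number; nz: number; d: number; }
interface Sphere { x: number; y: number; z: number; radius: number; }

// True when the sphere lies completely on the outside of at least one plane,
// so the object can be skipped without drawing anything.
function isSphereOutside(planes: Plane[], s: Sphere): boolean {
    for (const p of planes) {
        const dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return true;
    }
    return false;
}

// A single near plane at z = -1 for a camera looking down -z.
const nearOnly: Plane[] = [{ nx: 0, ny: 0, nz: -1, d: -1 }];

console.log(isSphereOutside(nearOnly, { x: 0, y: 0, z: -5, radius: 1 }));   // false: in front of the camera
console.log(isSphereOutside(nearOnly, { x: 0, y: 0, z: 3, radius: 1 }));    // true: fully behind the camera
console.log(isSphereOutside(nearOnly, { x: 0, y: 0, z: -0.5, radius: 1 })); // false: straddles the plane
```

The test is conservative: it can keep a sphere that sits just outside a frustum corner, but it never discards one that is actually visible, so it cannot cause objects to pop. Occlusion by walls needs a different and much more involved technique.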