I’m mostly finished with a new and improved register allocator for fragment shaders in the R300 compiler. I still need to clean up the code and add comments, but otherwise it is ready for testing. The new allocator uses a register allocation algorithm for irregular architectures described in a paper by Johan Runeson and Sven-Olof Nyström. Eric Anholt implemented this algorithm and added it to Mesa so that all drivers can make use of it.
The new allocator can pack scalar values into separate components of the same register, so code like this:
ADD TEMP[0].x, CONST[0].x, CONST[0].x
MUL TEMP[1].x, TEMP[0].x, TEMP[0].x
MUL TEMP[2].x, TEMP[1].x, TEMP[1].x
will now be transformed to this:
ADD TEMP[0].x, CONST[0].x, CONST[0].x
MUL TEMP[0].y, TEMP[0].x, TEMP[0].x
MUL TEMP[0].z, TEMP[0].y, TEMP[0].y
This will have a big impact on shaders that use a lot of scalar values. Some of the bigger shaders in Lightsmark use 30-50% fewer registers with the new register allocator on my RV515. I also get an improvement in fps from ~4.75 to ~5.30, which is roughly 12%, but with fps that low I’m not sure the difference is really significant. I’d be interested to see the results on other cards with different games and benchmarks. If anyone wants to test it out, the code is in the new-register-allocator branch here.
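To make the scalar packing in the example above more concrete, here is a minimal Python sketch of the idea. It is not the R300 compiler code and not the Runeson/Nyström algorithm as implemented in Mesa. It is a plain greedy assignment in which each scalar temporary gets the first free (register, component) slot that none of its interfering neighbours already holds. The names `pack_scalars` and `rewrite` and the `interference` table are hypothetical. The table also assumes, conservatively, that all three temporaries are live at the same time, which is what reproduces the output shown earlier.

```python
import re

COMPONENTS = "xyzw"  # four scalar slots per vec4 register


def pack_scalars(interference):
    """Greedily give each scalar temp a (register, component) slot.

    `interference` maps a temp index to the set of temps that are live
    at the same time and therefore must not share a slot.
    """
    slots = {}
    for temp in sorted(interference):
        taken = {slots[n] for n in interference[temp] if n in slots}
        slot = 0
        while slot in taken:  # first slot not used by a neighbour
            slot += 1
        slots[temp] = slot
    # Slot k lives in register k // 4, component k % 4.
    return {t: (s // 4, COMPONENTS[s % 4]) for t, s in slots.items()}


def rewrite(program, assignment):
    """Rename every scalar TEMP[n].x operand in a single regex pass."""
    def repl(match):
        reg, comp = assignment[int(match.group(1))]
        return f"TEMP[{reg}].{comp}"
    return [re.sub(r"TEMP\[(\d+)\]\.x", repl, line) for line in program]


program = [
    "ADD TEMP[0].x, CONST[0].x, CONST[0].x",
    "MUL TEMP[1].x, TEMP[0].x, TEMP[0].x",
    "MUL TEMP[2].x, TEMP[1].x, TEMP[1].x",
]
# Assumption: every temp interferes with every other one.
interference = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}

assignment = pack_scalars(interference)
packed = rewrite(program, assignment)
registers_used = len({reg for reg, _ in assignment.values()})

for line in packed:
    print(line)
print("registers used:", registers_used)
```

With these inputs the three scalars end up in the x, y and z components of TEMP[0], so one register is used instead of three. A real allocator derives the interference information from liveness analysis and also has to handle vector values, which this sketch ignores.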