For the last week I’ve been trying to get presubtract operations working for the r300 compiler. Presubtract operations are basically “free” instructions that modify source values before they are sent to the ALU. The four presubtract operations for r300 cards are (1 - src0), (src1 + src0), (src1 - src0), and (1 - 2 * src0). At this point the compiler only uses (1 - src0), but now that I have one working, adding the others shouldn’t be too hard. I had to make some major changes to the compiler to get this working, so I am going to let it sit in its own branch (presub branch at http://cgit.freedesktop.org/~tstellar/mesa/) and test it out for a while before I merge it into the master branch.
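To make the four operations concrete, here is a minimal C sketch of what each one computes on a single float component. The enum and function names are made up for illustration and are not the identifiers used in the r300 compiler; on the hardware this happens on the source operands before the ALU sees them, not in a separate function call.

```c
/* Hypothetical names for illustration only; these are not the
 * identifiers used in the r300 compiler or the hardware docs. */
enum presub_op {
    PRESUB_INV,   /* 1 - src0 */
    PRESUB_ADD,   /* src1 + src0 */
    PRESUB_SUB,   /* src1 - src0 */
    PRESUB_BIAS   /* 1 - 2 * src0 */
};

/* Returns the value the ALU would receive for one component
 * after the chosen presubtract operation has been applied. */
static float presub_apply(enum presub_op op, float src0, float src1)
{
    switch (op) {
    case PRESUB_INV:  return 1.0f - src0;
    case PRESUB_ADD:  return src1 + src0;
    case PRESUB_SUB:  return src1 - src0;
    case PRESUB_BIAS: return 1.0f - 2.0f * src0;
    }
    return 0.0f; /* not reached for valid enum values */
}
```

For example, `presub_apply(PRESUB_INV, 0.25f, 0.0f)` gives 0.75, which is the (1 - src0) case the compiler currently uses.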
I just pushed commit 3724a2e65f5b3aa6e123889342a3e9c4d05903f5 to the mesa master branch that fixes this bug. I filed this bug 8 months ago as a user without knowing anything about mesa or the r300 driver, and today I fixed it! How cool is that?
A few weeks ago I began working on using the hardware loop capabilities for fragment shaders on R500 cards. My original plan was to use the specialized loop instructions provided by the graphics card, but as it turned out, the documentation for these instructions was a little confusing (or so I thought), and I could never get them to work the way I wanted. So, instead I ended up using JUMP instructions to execute loops the same way you would if you were generating code for a CPU. This is an OK solution, but it makes it very difficult to generate code for loops that have continue or break statements.
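As a rough illustration of the JUMP approach, here is a toy sketch in C. The instruction set below is entirely hypothetical (it is not R500 machine code and not how the compiler represents programs); it only shows the general idea of a loop turning into a loop body followed by a conditional backward jump.

```c
#include <stddef.h>

/* Hypothetical toy instruction set; this is not R500 machine code. */
enum toy_opcode { TOY_ADD_IMM, TOY_JUMP_IF_LT, TOY_END };

struct toy_inst {
    enum toy_opcode op;
    int reg;     /* register operand */
    int imm;     /* value added (ADD_IMM) or compared against (JUMP_IF_LT) */
    int target;  /* jump destination, as an index into the program */
};

/* Runs the program and returns the final value of register 0. */
static int toy_run(const struct toy_inst *prog, int *regs)
{
    size_t pc = 0;
    for (;;) {
        const struct toy_inst *inst = &prog[pc];
        switch (inst->op) {
        case TOY_ADD_IMM:
            regs[inst->reg] += inst->imm;
            pc++;
            break;
        case TOY_JUMP_IF_LT:
            /* Jump to the target if the condition holds, else fall through. */
            if (regs[inst->reg] < inst->imm)
                pc = (size_t)inst->target;
            else
                pc++;
            break;
        case TOY_END:
            return regs[0];
        }
    }
}

/* Equivalent of: do { acc += 3; i++; } while (i < 4); */
static int toy_loop_example(void)
{
    static const struct toy_inst prog[] = {
        { TOY_ADD_IMM,    0, 3, 0 },  /* 0: acc += 3 (loop body) */
        { TOY_ADD_IMM,    1, 1, 0 },  /* 1: i += 1 */
        { TOY_JUMP_IF_LT, 1, 4, 0 },  /* 2: if (i < 4) jump back to 0 */
        { TOY_END,        0, 0, 0 },  /* 3: done */
    };
    int regs[2] = { 0, 0 };  /* regs[0] = acc, regs[1] = i */
    return toy_run(prog, regs);
}
```

In this sketch the only jump goes backward to an address that is already known. A break or continue inside the body would need additional forward jumps whose targets are not known until the rest of the loop has been emitted, which is presumably part of what makes those statements awkward with plain jumps.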
After taking a few days off from loops, I decided to give the specialized loop instructions another try. I went back and reviewed the documentation and it still did not make sense to me, so I decided to ask Alex Deucher, who works at AMD, for some clarification. As it turns out, the documentation was fine: Alex pointed out a short but very important part of it that I had overlooked. I’ve probably read the documentation one hundred times, but I always missed that one crucial part!!! Thanks, Alex.
I will start working on hardware loop instructions again soon, but first I am going to take a little detour to fix a bug in the compiler’s instruction scheduler that is preventing me from playing civ4 and causing problems with Compiz for some people.