Alright, I know you already know this, but I can't help saying it anyway:
Not sure, but this could be very forward-thinking. As silicon miniaturization approaches its physical limits, the only way left to increase CPU performance is to multiply the cores. A software-rasterizer standard may seem silly now, a throwback to Quake 1 technology. But as more CPU cores become available to dedicate to the graphics load, and engines are adapted accordingly, it may end up rivaling the GPUs.
Won't happen for two reasons.
The fastest system RAM you can expect to buy these days is PC2-8500; with two channels that gives you roughly 17 GB/sec (just under 16 GiB/sec) of theoretical peak bandwidth, which is still less than the Radeon 9700 had (and that card came out in 2002!). Currently the slowest ATI cards have 8 GB/sec of bandwidth, and the fastest have 115(!!).
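For what it's worth, the back-of-the-envelope math (a quick Python sketch; the transfer rate and channel width are the standard DDR2 figures):

```python
# Theoretical peak bandwidth of dual-channel PC2-8500 (DDR2-1066):
# each 64-bit channel moves 8 bytes per transfer.
transfers_per_sec = 1066.67e6   # DDR2-1066 effective transfer rate
bytes_per_transfer = 8          # 64-bit channel width
channels = 2

peak = transfers_per_sec * bytes_per_transfer * channels
print(f"{peak / 1e9:.1f} GB/s")     # ~17.1 GB/s (decimal)
print(f"{peak / 2**30:.1f} GiB/s")  # ~15.9 GiB/s (binary)
```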
The best CPUs today can run four concurrent threads (or eight if you count HyperThreading, which you shouldn't here, since HT threads share the core's execution units), whereas the cheapest GPU ATI currently manufactures can run 80 (albeit at a lower clock speed, but also with a shorter pipeline).
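To put rough numbers on the parallelism gap (an illustrative sketch only; I'm assuming a ~3 GHz quad-core and a ~600 MHz shader clock for the low-end GPU, and ignoring SIMD width on both sides):

```python
# Raw parallelism comparison: hardware threads x clock, nothing more.
cpu_threads = 4           # quad-core, HT not counted
cpu_clock_ghz = 3.0       # assumed high-end CPU clock
gpu_processors = 80       # stream processors on ATI's cheapest part
gpu_clock_ghz = 0.6       # assumed low-end shader clock

cpu_rate = cpu_threads * cpu_clock_ghz      # ~12 G thread-cycles/sec
gpu_rate = gpu_processors * gpu_clock_ghz   # ~48 G thread-cycles/sec
print(f"GPU advantage: ~{gpu_rate / cpu_rate:.0f}x")  # ~4x before SIMD/VLIW
```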
The bus bottleneck would have to be solved, though: video cards scan out straight to the display, whereas a software renderer has to push every frame across the bus. Just a thought.
Out of all the things holding back software GPUs, I think this is the least of them. 1080p60 with 32-bit color works out to only about 470 MB/sec, whereas HyperTransport is about 50 times that fast and PCIe x16 is about 8 times that fast.
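The arithmetic, for anyone who wants to check it (a quick sketch; the interconnect figures assume HyperTransport 3.0 at 25.6 GB/sec and first-generation PCIe x16 at 4 GB/sec per direction):

```python
# Bandwidth needed to push a 1080p, 32-bit, 60 Hz framebuffer over the bus.
width, height = 1920, 1080
bytes_per_pixel = 4   # 32-bit color
fps = 60

framebuffer_bw = width * height * bytes_per_pixel * fps
print(f"{framebuffer_bw / 2**20:.0f} MiB/s")  # ~475 MiB/s

# Headroom on the interconnects (assumed: HT 3.0, PCIe 1.x x16).
ht_bw = 25.6e9
pcie_bw = 4e9
print(f"HyperTransport: ~{ht_bw / framebuffer_bw:.0f}x headroom")   # ~51x
print(f"PCIe x16:       ~{pcie_bw / framebuffer_bw:.0f}x headroom") # ~8x
```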