xiphmont

Backstory: Theora is often attacked for having relatively low encoded image quality. The Doom9 2005 multi-codec shootout is the reference most commonly cited by detractors. I'm not going to criticize the Doom9 shootout because I think it was entirely fair and showed Theora legitimately falling short. I wrote http://web.mit.edu/xiphmont/Public/theora/demo.html well before the recent W3C/Theora discussion stole the limelight. Although I originally intended that document for internal consumption, it escaped into the public and is a useful look at how we at Xiph view Theora internally.

There's an important distinction to be made between the capabilities of the Theora format and the quality of the Theora encoder. Of course, users don't really care about speculation over theoretical performance. To them, the software either works well or it doesn't. It doesn't matter if that's due to design or implementation. Sucking is sucking and it's hard to dispute that the Theora encoder, as inherited from On2, is downright sorry by modern standards. Believe me-- we're more intimately familiar with the level of suck than most.

However, if the software sucks but the format design does not, the software can be improved. That's why we've been working on the Thusnelda encoder for the past few months.

Another problem that comes up regularly is one of perception, and it's very simple. Theora is built from VP3 (true). The current generation of On2 codec is VP6/VP7 (true). '3' is a smaller number than '6' or '7' (true). It is half as big or less in fact (true)! VP6/7 must be about twice as advanced as VP3 (uhhhh). Thus, why would anyone use an outdated throwback like Theora, which is really VP3, which is less than half as good as a current codec (huh, wait, what)?

It's stunning how pervasive this argument actually is. For that reason, we've decided in the next release to rename 'Theora' to 'TV8'. Juuuust kidding.

First off, MPEG2 video, MPEG4 video, VP3, VP5 and VP6 are all in the same codec family. They're all block-DCT codecs with motion-compensated inter-frame block prediction. Exactly how they lay the bits into the stream differs, but the foundation math is all the same. They all use the same battle strategy. In fact, if you generalize 'DCT' to 'transform', even Dirac and Snow are typical members of this video codec family. (Tarkin differed significantly and was not a technical success).
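To make the shared strategy concrete, here is a minimal Python sketch of "block DCT plus motion-compensated prediction": predict a block from a displaced block in the previous frame, then transform whatever the prediction missed. Everything in it is an illustrative assumption (the function names, the toy frames, the floating-point DCT); real codecs in this family, Theora included, specify their own integer transforms, block layouts and motion search, none of which is reproduced here.

```python
import math

N = 8  # block size; 8x8 is the classic choice, used here purely for illustration

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block (a list of rows)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def get_block(frame, top, left):
    """Cut the N x N block whose top-left corner is (top, left)."""
    return [row[left:left + N] for row in frame[top:top + N]]

def residual_coeffs(cur, ref, top, left, mv):
    """Predict a block of `cur` from `ref` displaced by mv=(dy, dx),
    then transform the prediction error (the residual)."""
    dy, dx = mv
    pred = get_block(ref, top + dy, left + dx)
    actual = get_block(cur, top, left)
    resid = [[actual[y][x] - pred[y][x] for x in range(N)] for y in range(N)]
    return dct_2d(resid)

def energy(coeffs):
    """Sum of squared coefficients: a rough stand-in for 'bits left to code'."""
    return sum(c * c for row in coeffs for c in row)

# Toy frames (hypothetical data): the current frame is the reference
# frame shifted two pixels to the right.
W, H = 24, 8
ref = [[(7 * x + 13 * y) % 256 for x in range(W)] for y in range(H)]
cur = [[ref[y][max(x - 2, 0)] for x in range(W)] for y in range(H)]

good = energy(residual_coeffs(cur, ref, 0, 8, (0, -2)))  # vector matches the shift
none = energy(residual_coeffs(cur, ref, 0, 8, (0, 0)))   # no motion compensation
print(good, none)
```

With the motion vector that matches the shift, the residual is all zeros and there is nothing left to code; with a zero vector the transform has real work to do. The codecs named above differ in how they search for vectors, quantize the coefficients and pack the bits, not in this basic loop.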

Digging down lower, you begin to see differences in how the strategy is implemented tactically. Examples: At the last, lowest-level entropy coding stage, VP3 uses Huffman coding and VP6 uses range coding. VP3 uses Hilbert ordering for blocks and macroblocks. VP6 decided the extra complexity of Hilbert ordering isn't worth the gain and just uses raster order. In both cases, they're doing the same thing-- but doing it in a slightly different way.
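For a feel of what the ordering choice means, here is a small Python sketch comparing a generic Hilbert curve traversal with plain raster order on a square grid of blocks. It uses the textbook index-to-coordinate conversion for a Hilbert curve; it is an assumption-laden illustration and not the exact traversal the Theora specification defines. The function names are made up for this example.

```python
def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.
    n must be a power of two. Textbook conversion, not a codec's own table."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # flip the quadrant before rotating it
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_order(n):
    """Block coordinates in Hilbert curve order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]

def raster_order(n):
    """Block coordinates left to right, top to bottom."""
    return [(x, y) for y in range(n) for x in range(n)]

def max_step(order):
    """Largest Manhattan distance between consecutive blocks in a traversal."""
    return max(abs(ax - bx) + abs(ay - by)
               for (ax, ay), (bx, by) in zip(order, order[1:]))

print(hilbert_order(4))
print(max_step(hilbert_order(4)), max_step(raster_order(4)))
```

Both traversals visit every block exactly once. The Hilbert order always steps to an adjacent block, while raster order jumps back across the whole row at each line end. That locality is the gain; the bookkeeping in `hilbert_d2xy` versus the one-line raster loop is the complexity cost.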

In short, VP6/7 is more an exercise in altering details without needing to worry about backwards compatibility than it is an invalidation of VP3. We've done and are doing the same thing with Theora, except that we're maintaining backward compatibility as we move forward. Of course, that glosses over several metric tons of details (and that's why we're not done yet). We have a finite details-per-hour throughput.

(Note that the compatibility we're maintaining is across Theora versions, not with VP3. Even the original Theora alphas had several changes that were incompatible with VP3, although the encoder could be directed to output in a VP3 compatibility mode. Our Thusnelda encoder no longer offers VP3-compatible output.)

Blogging Theora