xiphmont: (Default)
[personal profile] xiphmont

It's amazing how software sidetracks can plow through years of time. I'm finally getting back to Vorbis (and other audio development) after a minor two-year video distraction that saw the completion of a new experimental Theora encoder (Thusnelda) last year, and the beginning of the next experimental Theora encoder, which Tim has named Ptalarbvorm. Ptalarbvorm is already showing further large improvements over Thusnelda (honestly, I think he's already doubled again on Thusnelda).

But this post is about Vorbis.

One thing on the original Vorbis bullet list was surround encoding. Vorbis was always surround capable and the software is happy to encode as many channels as you like, but once you're past stereo, everything is encoded as entirely discrete channels. This hasn't been so bad really; despite discrete channels, Vorbis's coding efficiency compares favorably to other surround-capable formats. That said, Vorbis can do better.

When beginning surround optimization work on Vorbis, I found the encoder handled remapping the channel order of input formats, but it didn't always do it properly (FLAC and some WAV inputs were frotzed). I also found that ogg123 needed updating to handle output order. These improvements will be released in vorbis-tools 1.4.0 sometime soon.
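As a rough illustration of what remapping channel order involves (a minimal sketch, not the actual vorbis-tools code): each format stores the channels of a 5.1 stream in its own fixed order, so the encoder has to permute every sample frame on the way in. The two layouts below are assumptions stated from memory of the WAVE_FORMAT_EXTENSIBLE and Vorbis I conventions, and the function names are hypothetical.

```python
# Channel names in the order each format stores them for 5.1 audio.
# Assumed layouts (not taken from the post): WAVE_FORMAT_EXTENSIBLE order
# versus the Vorbis I order.
WAV_5_1 = ["FL", "FR", "FC", "LFE", "BL", "BR"]
VORBIS_5_1 = ["FL", "FC", "FR", "BL", "BR", "LFE"]

def permutation(src_order, dst_order):
    """For each destination slot, the index of that channel in the source."""
    return [src_order.index(name) for name in dst_order]

def remap_frames(frames, perm):
    """Reorder every interleaved sample frame according to perm."""
    return [[frame[i] for i in perm] for frame in frames]

perm = permutation(WAV_5_1, VORBIS_5_1)
frames = [[10, 20, 30, 40, 50, 60]]  # one frame in WAV order
remapped = remap_frames(frames, perm)
```

Getting a single entry of the permutation wrong is enough to send, say, the LFE signal to a rear speaker, which is the kind of failure a wrong mapping for one input format would produce.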

The hard part of the tool updates turned out to be libao, which had no concept of channel ordering and had also bitrotted substantially while maintainerless. The idea had been that Lennart Poettering's SydneyAudio was going to replace libao, so there was no sense continuing to develop libao. We'd just use libao while we waited and not put any more resources into it. Unfortunately, Lennart has been entirely occupied fighting his PulseAudio battles, and this is frankly a more important use of his time. SydneyAudio is no closer to completion than it was a few years ago. We can't continue to wait for it.

So, I'm announcing the resumption of active development and maintenance of libao until such time as there's something better to replace it.

I've made up a surround demo page (like the ones I made during Thusnelda development) covering vorbis-tools, libao, and Vorbis's in-progress surround hacking. It goes into a lot more detail about Vorbis's surround and channel coupling system, with diagrams and samples. I hope folks like it!

Date: 2010-02-23 06:52 pm (UTC)
From: (Anonymous)
Welcome back!

Can't wait for a new multichannel vorbis and merging of latest aoTuV optimizations to libvorbis. :)

Another Theora ?

Date: 2010-02-28 04:22 pm (UTC)
From: (Anonymous)
Whoa, I was not aware of Ptalarbvorm. I'm going to compare the newly encoded files against my old ones!

And soon a new Vorbis that will help me enjoy my DVDs encoded in Theora even more ^_^

Date: 2010-03-01 10:40 am (UTC)
From: [identity profile] kurdakov.livejournal.com
Yes, thanks for the page :)

Ghost, Ptalar and demos

Date: 2010-03-04 12:24 pm (UTC)
From: [identity profile] xiphmont.livejournal.com
Several of our most active volunteers turned out to be folks who joined up after reading the Thusnelda demo pages. Many, many people have requested demo pages about Ptalarbvorm progress, and that's likely to be the next thing I do after a release of the Vorbis surround update with libao and vorbis-tools. That might also leave time for gmaxwell to get some of his temporal RDO improvements into demoable condition, in addition to the work already done on low-contrast texture improvements.

Ghost went on the shelf when Thusnelda work started up a few years ago. It was a conscious decision that improving the very unimpressive Theora encoder was far more important than work on a next-generation audio codec when Vorbis was still holding its own well. Since then Tim has been able to take over Theora dev and I've been able to get back to audio-- first dealing with a few years of neglect, then getting Ghost down off the shelf again.

The Ghost project already had one successfully delivered spin-off (Celt). I'm not quite back to Ghost work, but I'm already thinking about it. I can't wait to dig back in, but it will probably be another month or two. Until then, debugging Vorbis surround work, collecting coupling data, and blogging.

Re: Ghost, Ptalar and demos

Date: 2010-03-08 03:55 am (UTC)
From: (Anonymous)
> temporal RDO improvements
Hmm, interesting. Is this "standard" curve compression based on whole-frame complexity (which 1.1 doesn't have AFAICT), or something more sophisticated like per-block tracking? It doesn't seem like anything related is in SVN yet...

Re: Ghost, Ptalar and demos

Date: 2010-03-15 06:36 pm (UTC)
From: [identity profile] xiphmont.livejournal.com
curve compression? Not sure what you mean...

temporal RDO is just looking at how long a given change matters so that changes that have an effect for a long time are weighted favorably against changes with a very brief contribution. The classic test case is a fixed-background 'shooter' game with a few moving sprites and a completely static background. The x264 people especially love this test case because B-frame analysis has a built-in temporal RDO component and so the x264 encoder realizes that the static background is incredibly important and devotes a ton of bits to it. Theora at present analyzes only a frame at a time and has no way of knowing the background is important over a long period... and so it doesn't. And the difference is very striking.

Theora doesn't have B-frames, so the temporal (multi-frame) analysis has to be done explicitly and it isn't yet, but will be hopefully soon.
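A toy sketch of the weighting idea, for readers who want something concrete. This is not Theora's or x264's algorithm; the cost function, the block representation, and every name below are assumptions made up for illustration. The point is only that multiplying a block's distortion by how long the block stays on screen shifts the rate-distortion choice toward spending more bits on long-lived content.

```python
def persistence(frames, t, block):
    """Count consecutive frames, starting at t, in which `block` is unchanged."""
    n = 1
    while t + n < len(frames) and frames[t + n][block] == frames[t][block]:
        n += 1
    return n

def choose_option(options, lifetime, lam):
    """Pick the (distortion, bits) pair minimizing lifetime*distortion + lam*bits."""
    return min(options, key=lambda o: lifetime * o[0] + lam * o[1])

# Each frame is a list of block values; block 0 is static, block 1 changes.
frames = [[7, 1], [7, 2], [7, 3]]
options = [(10.0, 100), (4.0, 400), (1.0, 900)]  # hypothetical (distortion, bits)

static_choice = choose_option(options, persistence(frames, 0, 0), lam=0.01)
moving_choice = choose_option(options, persistence(frames, 0, 1), lam=0.01)
```

With these made-up numbers the static block (lifetime 3) gets the 900-bit option and the changing block (lifetime 1) gets the 400-bit one; a frame-at-a-time encoder would treat both blocks identically.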

Re: Ghost, Ptalar and demos

Date: 2010-03-15 10:37 pm (UTC)
From: (Anonymous)
"Curve compression" just means moving bits from high-bitrate scenes to low-motion ones. It's more or less like what you describe, but on whole scenes instead of individual blocks. It's called that because if you look at a graph of bitrate over time, the curve gets compressed (taken to the extreme, it would become a flat line: at that point you are doing constant bitrate). Its advantage is that it's very simple, and therefore it's implemented in a lot of codecs, but it wouldn't help in the example you mention.

I'm glad you're going for the per-block route, it can be quite the gain. I assume there's no estimation on when it'll be ready, right?

If I read your post right, I think you got some details about x264 wrong (not that I'm an expert either). x264 has a lookahead (--rc-lookahead) much bigger than the B-frame detector (--bframes). For example, by default you have at most 3 consecutive B-frames, but the lookahead is 40 frames. To decide frame type it just considers 3 frames at once and uses either heuristics or, at high settings (--b-adapt 2), an exhaustive search, which is very slow for more than 6 or so frames. Separate from that, the lookahead (a low-quality, reduced-resolution encode) aids both in ratecontrol (especially useful with buffer constraints) and in computing block value (what you are doing). It certainly doesn't need B-frames to work; it works fine in Baseline profile, which doesn't support B-frames (in that case the B-frame detection code doesn't run). Additionally, in two-pass mode, it writes the block values to disk, getting effectively infinite lookahead. Were it not for the (large) lookahead, it would only work effectively in two-pass mode.
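The "curve compression" described in the first paragraph of this comment can be sketched in a few lines. This is only an illustration under stated assumptions: the power-law form, the `strength` parameter, and the function name are hypothetical and not taken from any particular encoder. Per-scene complexity values are assumed to be positive.

```python
def compress_curve(complexity, total_bits, strength):
    """Split total_bits across scenes.

    strength 0.0 keeps bits proportional to complexity (uncompressed curve);
    strength 1.0 gives every scene the same share (flat line, constant bitrate).
    """
    weights = [c ** (1.0 - strength) for c in complexity]
    scale = total_bits / sum(weights)
    return [w * scale for w in weights]

uncompressed = compress_curve([1.0, 4.0], 100, 0.0)
halfway = compress_curve([1.0, 4.0], 100, 0.5)
flat = compress_curve([1.0, 4.0], 100, 1.0)
```

Because the allocation only looks at one complexity number per scene, it cannot tell a static background from a moving sprite inside the same frame, which is why it would not help in the shooter example.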

Hey, I think I know that one.

Date: 2010-03-25 06:29 pm (UTC)
From: [identity profile] https://login.launchpad.net/+id/tbzQdGG (from livejournal.com)
The classic test case is a fixed-background 'shooter' game with a few moving sprites and a completely static background. The x264 people especially love this test case because B-frame analysis has a built-in temporal RDO component and so the x264 encoder realizes that the static background is incredibly important and devotes a ton of bits to it.

Is that the reason Theora does so terribly on this comparison (http://saintdevelopment.com/media/)? And does that mean that it'll do better when Ptalarbvorm is in place?

Re: Hey, I think I know that one.

Date: 2010-03-26 04:15 am (UTC)
From: [identity profile] xiphmont.livejournal.com
The short answer to both is yes.

BBC R&D on Ambisonics

Date: 2010-03-12 09:50 pm (UTC)
From: (Anonymous)
Regarding your Vorbis work, you may be interested in some info about work being done by BBC Research and Development on Ambisonics:

http://www.bbc.co.uk/blogs/researchanddevelopment/2010/03/audio-in-the-north-part-1.shtml

Interesting stuff, though I'm a bit dubious of the critique of compressed audio given, in passing, in the video.

Re: BBC R&D on Ambisonics

Date: 2010-03-15 06:41 pm (UTC)
From: [identity profile] xiphmont.livejournal.com
Ambisonics is unrelated to the current work, though Ambisonics figures prominently on my radar. The phase munging that Vorbis coupling does would be fatal to Ambisonics; Ambisonics requires uncoupled (or losslessly coupled) channels.

Sadly that link states 'not available in your area.'
