A Fabulous Daala Holiday Update
Dec. 23rd, 2014 04:09 pm
Before we get into the update itself, yes, the level of magenta in that banner image got away from me just a bit. Then it was just begging for inappropriate abuse of a font...
Ahem.
Hey everyone! I just posted a Daala update that mostly has to do with still image performance improvements (yes, still images in a video codec. Go read it to find out why!). The update includes plots showing our improvement on objective metrics over the past year and relative to other codecs. Since objective metrics are only of limited use, there are also side-by-side interactive image comparisons against JPEG, VP8, VP9, x264 and x265.
The update text (and demo code) was originally written for a July update, as the still image work mostly happened at the beginning of the year. That update got held up and was never officially released, though it had been discovered and discussed on forums like doom9. I regenerated the metrics and image runs to use the latest versions of all the codecs involved (only Daala and x265 improved) for this official better-late-than-never progress report!
Re: New way to represent images
Date: 2015-02-05 11:31 am (UTC)
The hard part is making it work better than preexisting techniques.
Re: New way to represent images
Date: 2015-02-05 07:40 pm (UTC)
Motivation: Ghost (audio codec) splits audio into tone + noise, applying different techniques to each part.
Motivation 2: DCT-related transforms do not deal with hard edges very well (especially after quantization).
My idea came after reading this research:
http://www.cse.cuhk.edu.hk/~leojia/projects/L0smoothing/index.html
The idea was:
1 - Vectorize the 'L0 smoothed' version of the image.
2 - Apply a DCT-related (or any other frequency-based) transform to the 'difference' (texture?).
Maybe this idea can be used in a future codec, not in the (nearly finished) Daala.
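The two steps above can be sketched in one dimension. This is only an illustration under loud assumptions, not a worked-out design: `piecewise_constant` is a hypothetical and much cruder stand-in for L0 smoothing (it splits at large jumps and averages each segment), the "vectorized" layer is simply kept as exact sample values, and the helper names and the `threshold` and `step` parameters are made up for the example.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def piecewise_constant(x, threshold):
    """ASSUMPTION: crude stand-in for L0 smoothing, not the real algorithm.
    A new flat segment starts wherever neighbouring samples differ by more
    than `threshold`; each segment is replaced by its mean."""
    segments = [[x[0]]]
    for prev, cur in zip(x, x[1:]):
        if abs(cur - prev) > threshold:
            segments.append([])
        segments[-1].append(cur)
    out = []
    for seg in segments:
        out.extend([sum(seg) / len(seg)] * len(seg))
    return out

def encode(x, threshold, step):
    """Step 1: edge-preserving 'structure' layer (kept losslessly here).
    Step 2: DCT of the residual 'texture', then uniform quantization."""
    structure = piecewise_constant(x, threshold)
    residual = [a - b for a, b in zip(x, structure)]
    quantized = [round(c / step) for c in dct2(residual)]
    return structure, quantized

def decode(structure, quantized, step):
    """Dequantize the texture coefficients and add the structure back."""
    residual = idct2([q * step for q in quantized])
    return [s + r for s, r in zip(structure, residual)]

if __name__ == "__main__":
    signal = [10, 11, 10, 11, 100, 101, 100, 101]  # hard edge in the middle
    structure, quantized = encode(signal, threshold=20, step=0.25)
    print(structure)
    print(decode(structure, quantized, step=0.25))
```

Because the hard edge lives entirely in the structure layer, the DCT only sees the small residual, so quantization error is not spread around the edge. A real codec would still have to code the structure layer compactly, which this sketch does not attempt.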
Maurício Kanada
Re: New way to represent images
Date: 2015-02-11 02:06 am (UTC)
Recent deep learning approaches look promising for replacing DCT-based models.
It's possible to apply an optimized set of (pre-trained) filters for a specific type of image/block - noisy, pattern, gradient, landscape, face, ...
References
http://the-locster.livejournal.com/110724.html
http://www.cs.nyu.edu/~ranzato/research/projects.html#sparse_coding