How to encode a Theora file to a specific file size

I have a 2-hour DVD that I would like to back up as an Ogg Theora file that fits on a CD. Here’s how:

First I quickly copy the DVD’s MPEG-2 video to my hard disk, using MPlayer’s “mencoder” tool:

mencoder dvd://1 -oac copy -ovc copy -o ~/movie.avi

This produces a 3.4 GB (3,622,278,868-byte) file, movie.avi.

I then use ffmpeg2theora (the latest stable release from the project’s website) with optimisations, some sharpening, keyframes every 500 frames, and a quality of 2/10 so that the result fits on a CD:

ffmpeg2theora --optimize -S 1 -K 500 -v 2 movie.avi

This produces a 606 MB (634,570,079-byte) file, which easily fits on a CD. The quality isn’t great on my awesome 24” widescreen, but it’s certainly watchable.

The default quality of 5/10 (-v 5) produces a 1.2 GB (1,283,240,720-byte) file of great quality, though.

Sadly, I don’t like this file that much though ;-)
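
To put those file sizes in perspective, here is a rough Python sketch that turns each one into an average overall bitrate (audio and video together). It assumes the film runs exactly two hours, which is only approximately true, and the name average_kbps is made up for illustration.

```python
DURATION_S = 2 * 60 * 60  # assumption: the film runs exactly two hours


def average_kbps(size_bytes, duration_s=DURATION_S):
    """Overall (audio + video) average bitrate in kbit/s for a file."""
    return size_bytes * 8 / duration_s / 1000


# File sizes in bytes, as reported above.
sizes = {
    "movie.avi (MPEG-2 copy)": 3_622_278_868,
    "ffmpeg2theora -v 2": 634_570_079,
    "ffmpeg2theora -v 5": 1_283_240_720,
}

for label, size in sizes.items():
    print(f"{label}: about {average_kbps(size):.0f} kbit/s")
```

Under that two-hour assumption, the -v 2 file averages roughly 705 kbit/s and the -v 5 file roughly 1426 kbit/s, against about 4025 kbit/s for the original MPEG-2 copy.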

While researching this, I popped into the #theora channel on freenode. Gotta love IRC:

10:45 < abattis> lo, how can i encode a theora file to a specific file size?
10:46 < abattis> like i have a 3gb mpeg2 file and I want to compress it to a 
                 650Mb theora?
11:49 < maikmerten> abattis, did you yet receive an answer on your 650MB
11:49 < maikmerten> (I had to leave)
11:57 < nessy> no he didn't :)
12:06 < maikmerten> well, basically you have to divide the amount of available
                    (kilo)bits with the length of the film in seconds
12:06 < maikmerten> then one would get the average bitrate allowed
12:06 < maikmerten> then substract the bitrate of the audio track
12:07 < maikmerten> then multiply it with like 0.95 to have some headroom
12:07 < maikmerten> and then tell the theora encoder to use this bitrate for
12:07 < maikmerten> notes:
12:07 < maikmerten> a) the bitrate management is horrible and will slash qualiy
12:07 < maikmerten> b) even with bitrate management there's no guarantee the
                    size will be hit
12:08 < maikmerten> c) what one would really want is a multipass encoder which
                    would analyze the film and adjust parameters so the second
                    run actually produces a file of given size
12:09 < maikmerten> I'm not aware of a Theora encoder doing c)
12:12 < maikmerten> < -- useful tool
12:13 < maikmerten> or that one:
12:38 < kfish> hmm, we had a soc project application to do a two-pass encoder,
               and it was rejected
12:39 < kfish> i think it would have been useful ...
13:03 < maikmerten> I wonder if xiphmont's bitrate management changes would
                    make doing a multipass encoder more easy
13:04 < maikmerten> I'm not sure if the current "management" is giving any
                    useful output for a second pass ;)
13:04 < maikmerten> (or would allow using prior-generated metrics)
13:14 < kfish> hmm, it sounds like it's a non-trivial problem
15:00 < abattis> maikmerten: awesome
15:01 < abattis> hmm ok
15:02 < abattis> so xvid 2 pass encoding is technically able to do this, but it 
                 makes nasty mpeg4
15:02 < maikmerten> yup
15:02 < abattis> may i post you comments on my blog?
15:02 < maikmerten> nasty, puppy-consuming mpeg
15:02 < maikmerten> sure, you're welcome
15:03 < abattis> s/puppy-consuming/patent-encumbered/ :)
15:03 < maikmerten> feel free to also use nonofficial 
15:03 < maikmerten>
15:04 < maikmerten>
15:04 < maikmerten> ;-)
15:05 < maikmerten> by the way, if you want to get an impression how bad the 
                    Theora bitrate management is: 
15:06 < maikmerten> also nicely shows how the Theora-format can deliver better 
                    quality than seen nowadays if the encoder is made to not do 
                    stupid things all the time
15:14 < abattis> ill see what ffmpeg2theora -v 2 comes out like
15:14 < maikmerten> right, just trying out some quality values eventually also 
                    may come close to the average bitrate needed
15:15 < abattis> as it was at default -v5 and made a 1.2G ogm from 3.4G mpeg2
15:15 < abattis> so i hope it will duck 0.7G with -v2

I cuss the Google Summer of Code people who rejected funding for this important feature :-)

Dear lazyweb, please make a simple GTK or Qt program that does this calculation.
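
In the meantime, the calculation maikmerten describes in the log can be sketched in a few lines of Python: divide the available kilobits by the film's length in seconds, subtract the audio bitrate, then multiply by 0.95 for headroom. The function name target_video_kbps is made up for illustration, and the inputs are assumptions: a 650 MB target counted as 650 * 1024 * 1024 bytes, a running time of exactly two hours, and a 64 kbit/s audio track.

```python
def target_video_kbps(target_bytes, duration_s, audio_kbps, headroom=0.95):
    """Average video bitrate (kbit/s) that should roughly fit target_bytes.

    Steps, as described in the IRC log:
      1. available kilobits / length in seconds = average bitrate allowed
      2. subtract the bitrate of the audio track
      3. multiply by about 0.95 to leave some headroom
    """
    total_kbps = target_bytes * 8 / 1000 / duration_s
    return (total_kbps - audio_kbps) * headroom


# Assumed inputs, not measured values: adjust them for the real film.
target_bytes = 650 * 1024 * 1024   # 650 MB target size
duration_s = 2 * 60 * 60           # exactly two hours
audio_kbps = 64                    # assumed audio bitrate

video_kbps = target_video_kbps(target_bytes, duration_s, audio_kbps)
print(f"aim for about {video_kbps:.0f} kbit/s of video")
```

With these assumed inputs the result is about 659 kbit/s. That figure would go to the encoder's video bitrate option instead of a quality value; ffmpeg2theora documents -V (--videobitrate, in kb/s) and -A (--audiobitrate) for this, but check --help on your version. As maikmerten warns, even then there is no guarantee the target size will be hit.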

Creative Commons License
The How to encode a Theora file to a specific file size by David Crossland, except the quotations and unless otherwise expressly stated, is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License.

