The goal behind porting to the C64x+ is to run on OMAP3 SoC from TI, which has an ARM Cortex A8 core and also has a C64x+ DSP coprocessor. This SoC (System on Chip) is best known as being the base behind Nokia’s N series of mobiles (including the N900), the Motorola Droid, Palm Pre, and the Beagle Board. The DSP coprocessor is commonly used for audo and video processing, including video encoding and decoding, and TI makes codecs available for MPEG-4 video decoding, AAC decoding, etc.Â Having Theora decoded on the DSP fits into Mozilla’s Fennec project, making Firefox with video useful on a mobile platform.
One of the engineering reasons behind having a separate processor for media handling is that it separates real-time tasks (media decoding) from non-real-time tasks, such as running web browser software. From the standpoint of software running on the ARM, the video decoder looks and acts just like a hardware video codec. The DSP on the OMAP3 is even more compelling for video decoding because attached to the DSP are several units that accelerate motion vector copying, VLC decoding, and loop deblocking. Unfortunately, these pieces are not publicly documented by TI, so the current Theora port (which is open source) is unable to use them. A future Entropy Wave project will likely add support for these acceleration units which would allow the performance of the Theora decoder to be similar to TI’s MPEG-4 codec, which can do 800×480 playback (possibly more?). As it looks now, the resulting code would necessarily be closed source until such a time when TI wishes to make the specifications public.
As it currently stands, the Theora decoder plays 640×360 24fps at slightly more than 100% speed on average. This isn’t quite good enough to call it “real time”, since some frames take longer than the allotted time to decode, but it’s pretty close and the results are good. Additional speed improvements in libtheora would require internal changes, which would be a project in itself. One clear area for improvement is that the DSP spends a substantial part of its time idle, because the host code is serialized with the DSP processing. Fixing this is likely to put the above case firmly into the “real time” category. Given that 640×360 is larger than the iPhone display resolution and almost as large as the N900 resolution, it’s clearly good enough, even if it is less than TI’s hardware accelerated MPEG-4.
On the Entropy Wave site is a page describing the demo, including where to download images and how to compile source code.
A big thanks to the people that laid the foundations for this work, especially Felipe Contreras.