MPEG

Yo... wavelets and fractals are "after the revolution" stuff, and the MPEG revolution is hitting us FIRST! Alright? What's important is compression itself, and beyond that, a nice overview of the DCT encoding behind JPEG and MPEG should become common knowledge. So here we go...

Intraframe Compression

MPEG and JPEG resemble each other in that they both use 3 stages to compress a still image or frame:


DCT Encoding | Quantization | Entropy Encoding

Contrary to myth, MPEG does not contain JPEG. There are numerous distinctions between the way these two codecs handle these 3 essential stages.

For those of you new to this, the 3 stages may be summarized briefly as follows:

DCT Encoding
Any signal can be broken down into component frequencies by a family of transforms, of which the Fourier Transform is the best known. The Discrete Cosine Transform (DCT) is applied by breaking the still image into 8x8 blocks of pixels and transforming each block's spatial data into a matrix of 64 coefficients in the frequency domain, running from low frequency to high frequency. Low-frequency data is essential and can be considered a summary of what is going on in each 8x8 block. High-frequency data contains the fine detail and is less important (except aesthetically).
Quantization
This is where the action is - data gets killed off by rounding numbers. Each coefficient is divided by a step size and rounded to the nearest whole number, so small coefficients (typically the high-frequency ones) round to zero. On top of that, the first (DC) coefficient of each block is stored as a difference from the previous block's, so a block that repeats its neighbor's average value costs next to nothing. As a general rule, higher compression ratios are accomplished by more dramatic quantization (rounding off with bigger step sizes).
Entropy Encoding
After the quantized numbers are read out of the 8x8 matrix in a zigzag order that runs from low frequencies to high frequencies, Run-Length Encoding (RLE) crunches consecutive zeros together (e.g., 0000000000 = 0 times 10). After RLE, the values that turn up most often in the data stream are replaced with short abbreviations, which is Huffman coding (MPEG imposes a fixed Huffman table, JPEG provides other choices).
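
For the curious, the three stages above can be sketched in a few lines of Python. This is only a toy under stated assumptions: a single step size stands in for the quantization tables the real codecs use, the run-length output format is made up for illustration, the Huffman step is left out, and the function names (dct_2d, quantize, zigzag, run_length) are mine, not anything from the standards.

```python
import math

N = 8  # block size used by JPEG and MPEG

def dct_2d(block):
    """Forward 2-D DCT of an N x N block (orthonormal scaling)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, step):
    """Divide by one step size and round (assumption: no per-coefficient table)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

def zigzag(matrix):
    """Read the matrix along its anti-diagonals, alternating direction."""
    order = sorted(((r, col) for r in range(N) for col in range(N)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [matrix[r][col] for r, col in order]

def run_length(values):
    """Collapse each run of zeros into (0, count); nonzero values pass through."""
    out, zeros = [], 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            if zeros:
                out.append((0, zeros))
                zeros = 0
            out.append(v)
    if zeros:
        out.append((0, zeros))
    return out

flat = [[128] * N for _ in range(N)]   # a block with no detail at all
codes = run_length(zigzag(quantize(dct_2d(flat), step=16)))
print(codes)  # [64, (0, 63)]
```

A perfectly flat block boils down to one number plus a single run of 63 zeros, which is why smooth areas of a picture compress so well.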

This is the order in which a still image or frame is encoded. A complete compression scheme provides a codec (encoder and decoder). The decoding of an MPEG or JPEG image runs in the opposite order: first expanding the data that was abbreviated in the entropy stage, then multiplying the data so that it has approximately the same numerical values it had before quantization (except now the numbers have all been rounded off), and finally bringing the frequency data back across the DCT (using the inverse transform, the IDCT) into the spatial domain.
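
The last two decoding steps can be sketched in Python too. Again this is a toy: using one step size for the whole block is an assumption (the real codecs use tables), and the names dequantize and idct_2d are made up for this example.

```python
import math

N = 8  # block size used by JPEG and MPEG

def dequantize(levels, step):
    """Multiply the rounded levels back up to approximate coefficients."""
    return [[v * step for v in row] for row in levels]

def idct_2d(coeffs):
    """Inverse 2-D DCT of an N x N coefficient matrix (orthonormal scaling)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = s
    return out

# A block whose only surviving level is the first (lowest-frequency) one.
levels = [[0] * N for _ in range(N)]
levels[0][0] = 64
pixels = idct_2d(dequantize(levels, step=16))
print(round(pixels[3][5]))  # 128

# The rounding loss in one line: 1030 goes in, 1024 comes back out.
lossy = round(1030 / 16) * 16
print(lossy)  # 1024
```

The second print is the whole story of lossy compression in miniature: the decoder gets back a number close to the original, not the original itself.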

The MPEG standard only formally specifies the decoding process. Also, MPEG-2 has many more complications than MPEG-1, but the above description holds for both in its essentials.

Interframe Compression

You may have heard of the famous I, P and B frames. If you've explored other compression schemes you may have heard of key and delta frames. Briefly, the idea is that a series of frames in video or film has a lot of redundancy from image to image, so a minority of images are selected to be I-frames (key images) and are given the complete Intraframe Compression treatment outlined above. The frames that fall between these may be compressed by recording only the ways in which they differ from the I-frames or key images. The advantage of all this complexity is that the in-between frames take much less space to store. (Note: JPEG has no interframe compression, so motion-JPEG is all I-frames. That makes it a good way to work with data before subjecting it to MPEG's interframe scheme, because you can wait until the last minute to decide what to use for your I-frames.)
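
Here is the key/delta idea in its crudest form, as a Python sketch. The frames are just short lists of pixel values, and encode_delta and decode_delta are made-up names. Real P-frames and B-frames do not store raw pixel differences like this; the sketch only shows why a frame that barely changes is cheap to store.

```python
def encode_delta(key, frame):
    """Record only the positions where the frame differs from the key frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(key, frame)) if old != new]

def decode_delta(key, delta):
    """Rebuild the frame by patching a copy of the key frame."""
    frame = list(key)
    for i, value in delta:
        frame[i] = value
    return frame

key_frame  = [10, 10, 10, 10, 200, 200, 10, 10]
next_frame = [10, 10, 10, 10, 10, 200, 200, 10]   # the bright patch moved right

delta = encode_delta(key_frame, next_frame)
print(delta)  # [(4, 10), (6, 200)]
```

Two changed positions are stored instead of eight pixels, and decode_delta(key_frame, delta) gives back next_frame exactly.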

The important thing to remember about MPEG interframe compression is that it's more about macroblocks than frames. A macroblock is a small group of 8x8 blocks, for example 2 across and 2 down (avoiding the topic of component color and YCrCb for the moment). The ways in which P-frames and B-frames can record the way the image-information has changed may be broken up from macroblock to macroblock with a high degree of variety (including, for example, using I-frame-type data for just a single macroblock in a B-frame). A consequence of this complexity is that MPEG encoding can involve a tremendous search to discover how to piece together the optimum mosaic of macroblocks. Affordable home-MPEG boards can't handle this complexity, and even service bureau encoders will only search for optimal mosaics to differing degrees (for example, if the customer is willing to pay for it).
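
To get a feel for why the search is so expensive, here is a brute-force sketch in Python for a single 16x16 macroblock. Everything in it is an assumption for illustration: the sum of absolute differences as the matching cost, the tiny search window of 4 pixels each way, and the names sad and best_motion_vector. Nothing here is prescribed by MPEG, which leaves the encoder's search strategy open.

```python
import random

def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))

def best_motion_vector(ref, cur, cx, cy, size=16, search=4):
    """Try every offset in the window; return (cost, mx, my) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[0]:
                best = (cost, mx, my)
    return best

# Made-up test data: a noisy reference frame, and a current frame that is
# the same picture shifted 2 pixels right and 1 pixel down.
rng = random.Random(0)
W = H = 32
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(W)]
       for y in range(H)]

cost, mx, my = best_motion_vector(ref, cur, cx=8, cy=8)
print(cost, mx, my)  # 0 -2 -1
```

Even this small window means 81 candidate positions at 256 pixel comparisons each, for one macroblock of one frame. Multiply that by every macroblock and every frame and the "tremendous search" becomes easy to believe.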

That's it for now...I'll expand this later. Visitors who don't know what PHADE is obviously haven't Yahooed yet.



footer-portion modified March 10, 1996