How It Works

And there's more. Modern music compression techniques (including MP3) also use stereo-related effects to reduce file size even further. You're probably familiar with the fact that you only need a single subwoofer in a room because the ear can't tell where low bass notes come from. The ear is also less precise about direction at the very top end of the audible spectrum. Intensity Stereo (often mistakenly referred to as Joint Stereo) is usually an option you have to enable with an extra parameter at encode time. It takes advantage of this lack of directional sensitivity by combining the two stereo channels into one for those high-frequency sounds. By storing the information of two channels in one, along with some nominal directional information, file size is reduced. When the file is decoded, the sound is 'split' back into two channels using the directional information as an aid.
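
To make the idea concrete, here is a minimal Python sketch of the combine-and-split step for a single high-frequency band. It is an illustration under stated assumptions, not the MP3 bitstream format: the function names and the single `balance` value are hypothetical, and a real encoder quantises its directional information and applies it band by band.

```python
import math

def intensity_encode(left_band, right_band):
    """Collapse one band into a mono signal plus one direction value.

    `balance` (a hypothetical representation) is the fraction of the
    band's energy that sat in the left channel.
    """
    mono = [l + r for l, r in zip(left_band, right_band)]
    e_left = sum(l * l for l in left_band)
    e_right = sum(r * r for r in right_band)
    total = e_left + e_right
    balance = e_left / total if total > 0 else 0.5  # silent band: centre it
    return mono, balance

def intensity_decode(mono, balance):
    """Split the mono band back into two channels using only `balance`."""
    gain_l = math.sqrt(balance)
    gain_r = math.sqrt(1.0 - balance)
    norm = gain_l + gain_r  # scale so that left + right == mono
    left = [m * gain_l / norm for m in mono]
    right = [m * gain_r / norm for m in mono]
    return left, right

# One band where the right channel is a quieter copy of the left.
mono, balance = intensity_encode([0.8, -0.4, 0.2], [0.4, -0.2, 0.1])
left, right = intensity_decode(mono, balance)
```

When the two channels differ only in loudness, as in this example, the split recovers them almost exactly. When they differ in any other way, the decoder produces two scaled copies of the same signal, which is why the technique is lossy.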


The graph represents the variable sensitivity of the human ear to the various audio frequencies. The curve refers specifically to the amount of gain that would need to be applied to an audio signal for the human ear to hear everything at a constant volume.


Intensity stereo is lossy; for higher bit rate files a different, but related, lossless stereo technique is used. This is called mid/side stereo encoding, and it takes advantage of the fact that, in most music, the left and right channels are quite similar. It simply rearranges the numbers in a more efficient, computer-friendly manner: the encoder takes an average of the left and right channel values (mid), then records the difference between those values (side). The result is (usually) one large number and one small number instead of two large numbers, which takes up less storage space than before. This, in turn, allows more bits to be dedicated to the rest of the signal, improving quality.
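
The mid/side rearrangement can be sketched in a few lines of Python. This is an illustrative version that works on whole-number sample values, which is an assumption made for clarity: MP3 itself applies the transform to frequency-domain values and scales the sum and difference by 1/√2 rather than taking a plain average. The function names are hypothetical.

```python
def ms_encode(left, right):
    """Turn integer left/right samples into (mid, side) lists."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]  # average, rounded down
    side = [l - r for l, r in zip(left, right)]        # difference
    return mid, side

def ms_decode(mid, side):
    """Recover the exact left/right samples from (mid, side)."""
    left, right = [], []
    for m, s in zip(mid, side):
        # l + r and l - r are both odd or both even, so the lowest bit
        # of `side` restores the bit lost when `mid` was rounded down.
        l = m + ((s + (s & 1)) >> 1)
        left.append(l)
        right.append(l - s)
    return left, right

left = [1000, 1002, -500, 7]
right = [998, 1001, -503, -8]
mid, side = ms_encode(left, right)
# mid  == [999, 1001, -502, -1]
# side == [2, 1, 3, 15]
assert ms_decode(mid, side) == (left, right)
```

Where the channels are similar, the side values stay small (2, 1 and 3 above), and small numbers need fewer bits. The last sample pair shows the "usually" in action: when left and right disagree, the side value is no smaller than the originals.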

Finally, the compression codec runs the resulting audio data through a process known as Huffman coding, which gives the shortest codes to the values that occur most often. This is similar to the technique used when zipping documents. It's not lossy (i.e. no information is discarded or lost); it merely ensures that the final file is as efficiently coded as possible. A useful side effect of Huffman coding is that it acts as a counterbalance during the periods in which the psychoacoustic techniques above don't work very well. Quiet sections of music, for instance, where little masking goes on, are compressed more effectively by Huffman coding than loud, complicated audio signals, during which the psychoacoustic compression techniques are highly active.
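
For the curious, the Huffman step can be sketched in Python as follows. One assumption to flag: this version builds its code table from the data it is given, whereas an MP3 encoder picks from a set of predefined tables. The function names are hypothetical, and the bits are kept as a string of '0' and '1' characters purely for readability.

```python
import heapq
from collections import Counter

def huffman_table(symbols):
    """Build a {symbol: bit string} table; frequent symbols get short codes."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Each heap entry is (count, tie_breaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)  # the two least frequent groups
        n2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]

def huffman_encode(symbols, table):
    return "".join(table[s] for s in symbols)

def huffman_decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # no code is a prefix of another
            out.append(inverse[current])
            current = ""
    return out

# A quiet passage: mostly zeros, with a few small values.
data = [0] * 12 + [1] * 4 + [-1] * 3 + [5]
table = huffman_table(data)
bits = huffman_encode(data, table)
assert huffman_decode(bits, table) == data  # nothing is lost
```

In this example the 20 values fit into 32 bits, against the 40 bits needed if each of the four distinct values were given a fixed two-bit code. The more a few values dominate, as they do in quiet passages, the bigger the saving.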
