Audio Compression

(BMAS Website 2003-2005)

Audio data can be compressed quite effectively whilst still retaining a fair amount of quality by using formats such as MP3 (http://www.mp3.com/) and Ogg Vorbis (http://www.vorbis.com/). This type of compression falls into the 'lossy' category, as the encoder throws away what it considers to be redundant data. The user can control the redundant data threshold by changing the bitrate, bandwidth etc.

MP3

MP3 is not free, contrary to what you might have thought. There are quite a few patents covering MP3 and the majority are owned by Fraunhofer IIS, so it is almost impossible to find a free MP3 encoder, and some OS vendors are not including the ability to play MP3s in recent releases. Even so, the popularity of the MP3 format has continued to increase, especially since the advent of peer-to-peer sharing software such as Napster, WinMX etc. Details of MP3 licensing may be found at http://www.mp3licensing.com/.

The default settings for most MP3 audio files seem to be CBR (constant bit rate) 128kbps, 44100Hz, 16-bit, J-Stereo (joint stereo), which gives a compression ratio of around 11:1 from the original audio data. Quite a lot of the MP3 tracks to be found on the Internet are encoded at much better settings than that, for example CBR 256-320kbps, 44100-48000Hz, 16-bit, Full-Stereo, which is very close to the original audio but still gives around 4-5:1 compression.

The following table shows how the compression ratio and output file size vary for a sample WAV file of 582677468 bytes, along with the low-pass frequency cut-off. The data was provided by LAME using CBR output. The last column shows that the MP3 data is MPEG-2.5 Layer III for the lower bitrates, changing to MPEG-2 Layer III and then MPEG-1 Layer III for the midrange to high bitrates.
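The 11:1 figure quoted for 128kbps can be checked with a little arithmetic. The sketch below assumes the source is standard CD audio (44100Hz, 16-bit, two channels) and ignores WAV header and MP3 tag overhead, so the sizes are estimates only. The function names are made up for illustration and are not part of LAME.

```python
# Rough CBR compression arithmetic for CD-quality audio.
# Assumption: the source WAV is 44100 Hz, 16-bit, stereo PCM.

CD_SAMPLE_RATE_HZ = 44100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def pcm_bitrate_kbps(sample_rate_hz=CD_SAMPLE_RATE_HZ,
                     bits_per_sample=CD_BITS_PER_SAMPLE,
                     channels=CD_CHANNELS):
    """Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bits/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(mp3_kbps, source_kbps=None):
    """Ratio of the uncompressed bitrate to the encoded CBR bitrate."""
    if source_kbps is None:
        source_kbps = pcm_bitrate_kbps()
    return source_kbps / mp3_kbps


def estimated_cbr_size_bytes(wav_bytes, mp3_kbps):
    """Estimated CBR output size, ignoring header and tag overhead."""
    return round(wav_bytes / compression_ratio(mp3_kbps))


WAV_BYTES = 582677468  # size of the sample WAV file quoted in the text

for kbps in (128, 256, 320):
    ratio = compression_ratio(kbps)
    size_mb = estimated_cbr_size_bytes(WAV_BYTES, kbps) / 1e6
    print(f"{kbps} kbps: {ratio:.1f}:1, about {size_mb:.1f} MB")
```

The same functions can be reused for other sources by passing a different `source_kbps`, for example the bitrate of a 48000Hz recording.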
(The default bitrate is -b 128)

Since the WAV file used came from a music CD, the lower settings are of little use, as the degradation in quality is quite pronounced. If the WAV file were based on speech alone, some of the lower settings would be more suitable due to the reduced frequency bandwidth of speech, but the lowest settings (8-32kbps) should be avoided.

The original MP3 standard is beginning to show its age, so several improvements have been made, one of which is to offer a Variable Bitrate (VBR) in addition to the standard Constant Bitrate (CBR). With VBR, the encoder chooses the most suitable bitrate for each part of the audio data, so very quiet patches of audio tend to be encoded at a low bitrate (around 16-32kbps) whilst the more complex patches such as percussion/strings tend to be encoded at a very high bitrate (around 256-320kbps). The overall effect is that bits are spent where they are most needed, which improves quality whilst still maintaining a reasonably good compression ratio. Older MP3 players cannot handle VBR encoded audio.

The following table shows how the compression ratio and output file size vary for a sample WAV file of 582677468 bytes, along with the low-pass frequency cut-off. The data was provided by LAME using VBR output (i.e. replacing the fixed bitrate setting with a variable quality setting).
(The default quality setting is -V 4)

Notice in the tables below how the bitrate spread changes with the quality setting. The output is from LAME running on a Linux system.
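A bitrate spread of this kind can be turned into an average bitrate and an estimated file size. The sketch below relies on every MPEG-1 Layer III frame holding 1152 samples, so all frames last the same time and the average is simply the frame-weighted mean. The frame counts in the example are invented for illustration and are not taken from the LAME output; the function names are likewise hypothetical.

```python
# Average bitrate and size estimate from a VBR bitrate histogram.
# Assumption: MPEG-1 Layer III, so each frame holds 1152 samples.

SAMPLES_PER_FRAME = 1152


def average_bitrate_kbps(frame_counts):
    """Frame-weighted mean bitrate; frame_counts maps kbps -> frame count."""
    total_frames = sum(frame_counts.values())
    if total_frames == 0:
        raise ValueError("histogram contains no frames")
    weighted = sum(kbps * count for kbps, count in frame_counts.items())
    return weighted / total_frames


def estimated_size_bytes(frame_counts, sample_rate_hz=44100):
    """Approximate audio payload size, ignoring tags and padding."""
    total_frames = sum(frame_counts.values())
    duration_s = total_frames * SAMPLES_PER_FRAME / sample_rate_hz
    bytes_per_second = average_bitrate_kbps(frame_counts) * 1000 / 8
    return round(bytes_per_second * duration_s)


# Hypothetical histogram: mostly mid bitrates, a few quiet and busy frames.
example = {32: 100, 128: 600, 192: 250, 320: 50}
print(f"average: {average_bitrate_kbps(example):.1f} kbps")
print(f"estimated size: {estimated_size_bytes(example)} bytes")
```

A wider spread at a given average means the encoder is moving more bits from quiet passages to complex ones.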
MP3pro

MP3pro was introduced more recently to improve the compression ratio even more whilst retaining much of the quality. An MP3pro encoder splits the audio data into two sections. The first section is encoded in much the same way as a standard MP3, but the second section, containing the higher frequency components, is remapped to a lower unused part of the spectrum and encoded alongside the first. The resultant MP3pro file can be played on older MP3 players without the enhancements from the second section, whereas MP3pro players decode both sections and play the file correctly. Much higher compression rates have been achieved than with 'normal' MP3. According to the MP3Pro website, "It gives 128kbps performance at a 64kbps encoding rate" and "24 albums could fit on a single CD-R".

Ogg Vorbis

Ogg Vorbis is patent free and there are no licensing fees associated with it. It is a true Open Source system, so anyone can download the source, compile it and modify it under its open source license (BSD-style for the reference libraries since 2001; earlier releases used the LGPL, the GNU Lesser General Public License). It can also produce a slightly better compression ratio, along with an improvement in quality, compared with either CBR or VBR encoded MP3 data. Ogg Vorbis is actually a two part system, with Ogg as the surrounding wrapper and Vorbis as the CODEC. There are other codecs that fit into the Ogg framework, but none are available for general release at the moment.
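The wrapper/codec split is visible in the first bytes of a file. As a rough sketch, an Ogg page begins with the capture pattern "OggS" and a 27-byte header followed by a segment table, and the first Vorbis packet begins with the byte 0x01 followed by "vorbis". The function below only inspects those markers; it does not verify the page checksum, and the function name and the synthetic test page are illustrative assumptions.

```python
# Minimal sniffing of the Ogg wrapper and the Vorbis codec inside it.
# Assumption: `data` holds the start of the file (first page at offset 0).

OGG_HEADER_LEN = 27          # fixed part of an Ogg page header
VORBIS_ID_PREFIX = b"\x01vorbis"  # start of the Vorbis identification header


def sniff_ogg_codec(data: bytes) -> str:
    """Return 'not ogg', 'ogg/vorbis' or 'ogg/other' for the given bytes."""
    if len(data) < OGG_HEADER_LEN or data[:4] != b"OggS":
        return "not ogg"
    n_segments = data[26]                      # length of the segment table
    payload = data[OGG_HEADER_LEN + n_segments:]
    if payload.startswith(VORBIS_ID_PREFIX):
        return "ogg/vorbis"
    return "ogg/other"


def make_demo_page(payload: bytes) -> bytes:
    """Build a synthetic single-segment page (checksum left as zero)."""
    header = (b"OggS" + bytes([0, 2])          # version 0, first-page flag
              + bytes(8) + bytes(4)            # granule position, serial
              + bytes(4) + bytes(4)            # page sequence, checksum
              + bytes([1, len(payload)]))      # one segment and its length
    return header + payload


demo = make_demo_page(VORBIS_ID_PREFIX + bytes(23))
print(sniff_ogg_codec(demo))
print(sniff_ogg_codec(b"RIFF" + bytes(40)))
```

For a real file, passing the first few hundred bytes read in binary mode should be enough, since the identification header sits in the first page.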
(The default quality setting is -q 3)

Sample Audio files

The sample audio files given in the table below came from a CD entitled 'Life in Make-Believe' by 'Dave Forward' (Copyright Evergreen Records), who composed, performed and recorded all the music. The only reason for choosing this track was that we, like the title, are in the 'North of England'. The audio was extracted ('ripped') from the CD using 'cdparanoia' running under SuSE Linux on a Toshiba laptop. The resultant 'WAV' files were encoded to 'MP3' using 'LAME' and to 'Ogg' using 'oggenc' on the same system.
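The ripping and encoding steps could be scripted roughly as follows. This is only a sketch: the text does not record the exact options that were used, so the commands below simply apply the default settings quoted earlier (-b 128, -V 4 and -q 3), and the helper names and file names are made up for illustration. The functions build the command lines without running them.

```python
# Build command lines for ripping a CD and encoding the result.
# Assumption: cdparanoia, lame and oggenc are installed and on the PATH.
import shlex


def rip_command():
    """Batch mode: rip every track to its own WAV file."""
    return ["cdparanoia", "-B"]


def lame_cbr_command(wav, mp3, kbps=128):
    """Constant bitrate MP3 at the given bitrate in kbps."""
    return ["lame", "-b", str(kbps), wav, mp3]


def lame_vbr_command(wav, mp3, quality=4):
    """Variable bitrate MP3; lower quality numbers mean higher quality."""
    return ["lame", "-V", str(quality), wav, mp3]


def oggenc_command(wav, ogg, quality=3):
    """Ogg Vorbis at the given quality level."""
    return ["oggenc", "-q", str(quality), "-o", ogg, wav]


wav = "track01.cdda.wav"  # name cdparanoia gives the first track in batch mode
commands = [
    rip_command(),
    lame_cbr_command(wav, "track01-cbr.mp3"),
    lame_vbr_command(wav, "track01-vbr.mp3"),
    oggenc_command(wav, "track01.ogg"),
]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run a step, each list can be handed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with unusual file names.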
MP4, AAC, M4A etc.

These names all refer to essentially the same codec, AAC (Advanced Audio Coding), used mainly by Apple as an improvement on the aging MP3 standard; MP4 and M4A are the file formats that carry it. One major reason for the new format is DRM (digital rights management), which makes it possible to control and track illegal copies of commercial audio tracks. MP3 has copyright tags in the ID, but they were never really used. With the iTunes program from Apple, a track downloaded from the Internet can be locked to a single system and made unplayable on other unregistered systems.

Increased quality is another major claim over the previous MP3 standard, but most of the quality improvements of MP4 seem to be relative to an MP3 recorded at the 'standard' settings (128k CBR 44100Hz J-Stereo). Under listening tests that may be true for most tracks, but using MP3 with a slightly higher bit rate, or with VBR (variable bit rate) or ABR (average bit rate), can narrow the perceived quality gap quite a lot. Ogg Vorbis at similar settings to MP4 appears to reverse the situation, giving noticeably higher quality than either MP4 or MP3 VBR. It must be stated here that a lot of the quality improvements are subjective and can vary either way from listener to listener.

Another claim for MP4 is reduced file size, which may be true when compared to MP3 CBR, but MP3 VBR files tend to be smaller than CBR ones, so the difference is minor. Ogg Vorbis tends to produce files smaller still (see the example files above).
