Audio Compression with Quick Recorder

Copyright © 1993 Microsoft Corporation, portions reprinted with permission from Microsoft Corporation

Editor's Note: The Windows Sound System v2.0 software user's guide is mostly concerned with the setup and control of audio into and out of a computer running Microsoft Windows. Within this rather dry manual we found several especially useful chapters that describe, in simple terms, audio compression, sampling frequencies, and related subjects. The material is so straightforward and helpful that we wanted to include it as part of the EUonline Continuing Education page. Please keep in mind that this material is from 1993, and that technology has since advanced to the point that 32 kHz sampling is really a minimum for broadcast-quality recording, storage, and transmission, but the underlying concepts are still valid. Although Microsoft's Quick Recorder software is described here, other sound software programs on a variety of computer platforms perform functions similar to Quick Recorder's.


  Have you noticed that when you return from vacation, your suitcase doesn't seem to hold as much as it did when you left home? Could it be that you don't pack as carefully for your return as you do when you set off? Or did you buy one souvenir too many? It doesn't matter; you still have to put everything into that suitcase. And you do it by squeezing things as much as possible, by discarding items you don't think you will need at home, and finally by asking everyone in the hotel to sit on your suitcase so you can zip it up.
Let's stuff 20 pounds into a 10-pound suitcase! This is known as compression, and each method you use to reduce the physical size of your belongings is known as a compression scheme. Just as you used compression schemes to get those items into your suitcase, you can use audio compression schemes when you have a large Quick Recorder recording that you need to fit into a limited amount of disk space.

Storing Digitized Sound

The better the sound quality you want, the more storage space you need. One minute of 16-bit (CD-quality) sound recorded on one channel (a monophonic recording) can take 5 megabytes (MB) of disk space, and one minute of monophonic, 8-bit (Tape-quality) sound can take over 1 MB. Can you afford this?
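The figures above follow from simple arithmetic: sample rate times bytes per sample times channels times 60 seconds. The following is a minimal Python sketch of that calculation. It assumes that CD quality means 44,100 samples per second and Tape quality means 22,050 (the second rate is an assumption, not something the text states), and it counts 1 MB as 1,048,576 bytes. The function name is illustrative only.

```python
def pcm_bytes_per_minute(sample_rate_hz, bits_per_sample, channels=1):
    """Size in bytes of 60 seconds of uncompressed (PCM) sound."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


# Assumed sample rates: 44,100 Hz for CD quality, 22,050 Hz for Tape quality.
cd_mono = pcm_bytes_per_minute(44_100, 16)
tape_mono = pcm_bytes_per_minute(22_050, 8)

print(f"CD quality:   {cd_mono:,} bytes = {cd_mono / 2**20:.2f} MB")
print(f"Tape quality: {tape_mono:,} bytes = {tape_mono / 2**20:.2f} MB")
# CD quality:   5,292,000 bytes = 5.05 MB
# Tape quality: 1,323,000 bytes = 1.26 MB
```

Passing channels=2 doubles either figure, which is why stereo recordings fill a disk twice as fast.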

Not if you're a corporate buyer on a multi-user system, concerned about network traffic and upgrading your hardware. Not if you are sending an online file over the network to contacts in several different cities. Certainly not if you are an average user with the average amount of disk space who can't afford to update your system to keep up.

Sound Quality vs. Disk Space

As you make recordings, you must balance two factors: the sound quality you want and the disk space it requires. Generally, when one increases, so does the other.

So, if you want high-quality sound, you'll need to take more samples along the sound wave to collect more digital waveform data. After all, the more samples you take, the more closely the digitized sound reflects the actual sound. And the more bits of data each sample holds, the lower the noise floor [1]. However, the more data you collect, the bigger your sound file will be.
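The "6 dB per bit" figure given in footnote 1 can be checked with a short calculation. An N-bit sample can take 2^N distinct values, so the ratio between full scale and a single quantization step, expressed in decibels, is 20 x log10(2^N). The sketch below is an idealized calculation only (real sound hardware adds noise of its own), and the function name is illustrative.

```python
import math


def quantization_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (4, 8, 16):
    print(f"{bits:2d} bits: {quantization_range_db(bits):.1f} dB")
#  4 bits: 24.1 dB
#  8 bits: 48.2 dB
# 16 bits: 96.3 dB
```

Each extra bit adds about 6.02 dB, so a 16-bit sample has a theoretical noise floor roughly 48 dB lower than an 8-bit sample.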

Compression reduces the size of a sound file by averaging sound samples, discarding some of your data. The more you compress a file, the more the sound samples are averaged and the more data that is lost. As a result, the sound quality is reduced, and the file is smaller.

Before recording in Quick Recorder, you'll need to balance sound quality against the amount of disk space you have available, so your results will suit the task at hand.

 

What's Available on Quick Recorder

In Quick Recorder, you can choose one of five sound quality options, three of which include some degree of compression. Two options are suitable for voice recordings and three are suitable for music and other types of recordings.

Options for Voice Recordings

Quick Recorder has two compression schemes designed specifically for high-quality human voice recordings. The following table lists the compression schemes available, the type of sound quality you can expect to get, and the amount of disk space a minute of recorded monophonic sound requires.

Compression Scheme    Sound Quality    Disk Space (in kilobytes per minute)
TrueSpeech            Voice            62K
IMA ADPCM             Voice            234K

TrueSpeech

TrueSpeech is ideal for recording and compressing human speech. The file size generally is small and you get excellent sound quality, a real advantage if you plan to send the file over a network. The main disadvantage of TrueSpeech is the time it takes to compress a file after you finish recording. [ed note: The sound quality of TrueSpeech is suitable for non-broadcast purposes, such as dictation, annotation and the like.]

Although you may have to wait while it compresses your file, the quality TrueSpeech yields may well be worth the wait for both annotations and narrations that will accompany an online presentation or demonstration.

Note: TrueSpeech works only for human speech; it does not work well for music.

Voice-IMA ADPCM [2]

This compression scheme works well for human speech alone, or for human speech combined with music or other sounds. Because it is also a standard adopted by other sound software manufacturers, it is the most appropriate choice if you are making voice recordings that you plan to send to people with sound software different from yours. Files compressed with this scheme are bigger than TrueSpeech files for the same sampling rate and have a lower sound quality, but they do not need the additional time required for compression. [ed note: IMA ADPCM compression sounds the same as the G.722 audio protocol. The G.722 protocol is an international standard describing how one will use data bits to encode and decode audio. Most ISDN audio coders/decoders (aka "codecs") support the G.722 protocol. It is a ubiquitous, if grainy, sound.]

Options for Music and Other Sound Recordings

Quick Recorder has one compression scheme and two uncompressed, high-quality schemes that are designed for recording music and other sounds. The following table lists the schemes available, the equivalent sound quality you can expect, and the amount of disk space a minute of recorded monophonic sound requires.

Scheme                Sound Quality    Disk Space (in kilobytes per minute)
IMA ADPCM             Radio            322K
8-bit Uncompressed    Tape             1291K
16-bit Uncompressed   CD               5167K

Radio-IMA ADPCM

If you are making recordings of music or other sounds, and you want good sound quality with some compression, this option is a good choice. The sound quality of this option is comparable to that of an AM [3] radio broadcast.

8-bit Uncompressed (PCM [4])

Uncompressed, 8-bit sound files are comparable in quality to cassette tape recordings made on low-quality tape with no noise reduction. An 8-bit uncompressed file can be quite large, but is still smaller than a 16-bit uncompressed file.

16-bit Uncompressed (PCM)

Uncompressed, 16-bit sound files give you the highest sound quality, equivalent to what you would hear on an audio compact disc (CD) player. At a sampling rate of 44.1 kHz, the files can be very large, about 5 MB for one minute of monophonic recorded sound. (For more information about sampling rates, see the chapter titled Understanding Sound.)
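As a rough check, the disk-space figures in the two tables can be reproduced from a bit rate for each option, if the K in the tables is read as 1,024 bytes and the result is rounded down. The bit rates in this sketch are assumptions chosen because they match the published figures; the only rate the text itself gives is the CD-quality sampling rate. In particular, the 8,500 bits per second for TrueSpeech and the 8,000 Hz and 11,025 Hz sampling rates for the two 4-bit IMA ADPCM options are not stated anywhere in the source.

```python
# Assumed bit rates (bits per second of mono sound). Only the CD-quality
# sampling rate comes from the text; every other figure is an assumption.
ASSUMED_BITS_PER_SECOND = {
    "TrueSpeech": 8_500,                # assumed codec bit rate
    "Voice (IMA ADPCM)": 8_000 * 4,     # assumed 8,000 Hz, 4 bits per sample
    "Radio (IMA ADPCM)": 11_025 * 4,    # assumed 11,025 Hz, 4 bits per sample
    "Tape (8-bit PCM)": 22_050 * 8,     # assumed 22,050 Hz
    "CD (16-bit PCM)": 44_100 * 16,     # 44,100 Hz
}


def kilobytes_per_minute(bits_per_second):
    """Whole kilobytes (1,024 bytes) needed for 60 seconds of sound."""
    return bits_per_second * 60 // 8 // 1024


cd_rate = ASSUMED_BITS_PER_SECOND["CD (16-bit PCM)"]
for name, rate in ASSUMED_BITS_PER_SECOND.items():
    size = kilobytes_per_minute(rate)
    print(f"{name:<20}{size:>5}K   {cd_rate / rate:5.1f}:1 versus CD quality")
# Sizes printed: 62K, 234K, 322K, 1291K, 5167K
```

Under these assumptions the Radio option is 16 times smaller than CD quality: a factor of 4 from the lower sampling rate and another factor of 4 from storing 4 bits per sample instead of 16.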

 

Choosing the Right Compression Scheme

Before you choose a compression scheme, decide what you plan to do with the recording.

If you plan to record voice annotations, select TrueSpeech or Voice quality, because both options provide the necessary clarity for recording speech. If time is of the essence, choose Voice quality. If you have the time to wait for your recording to be compressed, choose TrueSpeech. You'll save disk space and benefit from the superior sound quality this scheme yields.

For music or other types of sound recordings when disk space is an issue, select Radio quality, the option that gives you both good sound quality and some compression. This is probably adequate for most of your work needs.

Tape quality produces sound files that are smaller than those of the best recording quality (CD quality). It usually is adequate for most of your online demonstration and presentation needs.

When you need the best sound quality, perhaps for an online presentation with music and speech for an executive meeting, training, or conference, select CD quality. This level of sound quality takes the greatest number of samples, forgoes compression altogether, and requires the most disk space.

In some cases you may find that a 16-bit sound file that has been compressed to a 4-bit file produces better sound and less noise than an 8-bit sound file. Results vary depending on the hardware you are using. Experiment with different compression schemes until you find one that is suitable for most of your needs.
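The guidance in this section can be summarized as a small decision helper. The sketch below uses the kilobytes-per-minute figures from the two tables; the function name, its parameters, and the rule of picking the best-sounding option that fits are illustrative assumptions, not part of Quick Recorder.

```python
# Kilobytes per minute of mono sound, copied from the tables in this chapter.
KB_PER_MINUTE = {
    "TrueSpeech": 62,
    "Voice": 234,
    "Radio": 322,
    "Tape": 1291,
    "CD": 5167,
}


def suggest_option(voice_only, minutes, free_kb, can_wait=True):
    """Suggest a sound quality option, or None if nothing fits on disk.

    voice_only: True for speech annotations, False for music or other sounds.
    can_wait:   True if the TrueSpeech compression delay is acceptable.
    """
    if voice_only:
        candidates = ["TrueSpeech"] if can_wait else ["Voice"]
    else:
        candidates = ["CD", "Tape", "Radio"]  # best sound quality first
    for name in candidates:
        if KB_PER_MINUTE[name] * minutes <= free_kb:
            return name
    return None


print(suggest_option(voice_only=False, minutes=3, free_kb=5000))   # Tape
print(suggest_option(voice_only=True, minutes=10, free_kb=1000))   # TrueSpeech
```

A three-minute music recording at CD quality would need 15,501K, so with 5,000K free the helper falls back to Tape quality, which needs 3,873K.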

Note: Once you've made a recording, you can't improve its sound quality by switching to a scheme that uses more samples. For example, you cannot change a Radio-quality recording into a CD-quality recording.

You can, however, change the sampling rate and compression scheme to decrease the size of a file, for example, by changing a CD-quality recording into a Radio-quality recording. This, of course, diminishes sound quality, but can be useful when you plan to send a sound file to someone whose computer does not have the same capabilities as yours, or when you want to transmit your file over the network to several users without overloading server resources.
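Why the conversion only works in one direction can be illustrated with bit depth alone, leaving sampling rate and ADPCM coding aside. The sketch below reduces signed 16-bit samples to unsigned 8-bit samples (the convention used for 8-bit sound in WAV files) and then expands them again. The function names are illustrative, and this is not how Quick Recorder itself performs the conversion.

```python
def to_8bit(sample16):
    """Reduce a signed 16-bit sample (-32768..32767) to unsigned 8-bit (0..255)."""
    return (sample16 >> 8) + 128


def to_16bit(sample8):
    """Expand an unsigned 8-bit sample back to the signed 16-bit range."""
    return (sample8 - 128) << 8


original = [0, 100, 255, 256, 1000, -1, -32768, 32767]
restored = [to_16bit(to_8bit(s)) for s in original]

print(restored)
# [0, 0, 0, 256, 768, -256, -32768, 32512]
```

The low 8 bits of every sample are discarded on the way down and cannot be recovered on the way back up: 100 and 255 both come back as 0. The restored file is as large as the original 16-bit file but sounds no better than the 8-bit version.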

The Home Stretch

The first time you make a sound recording, you may be overwhelmed by all the choices you have to make, just as you may have difficulty choosing what to take with you when you're going on vacation. But as you become accustomed to including sound in your work, you'll find it's easy to decide when and what to compromise.


[1] In the absence of an input signal (that is, in the silent time between data bits), the playback device generates a noise, usually a soft hiss, known as the noise floor. For each additional data bit you add to your sample, the noise floor is reduced by about 6 dB. Thus the more data bits in your sample, the lower the background noise during playback.

[2] Interactive Multimedia Association Adaptive Differential Pulse Code Modulation. A variation on the pulse code modulation (PCM) encoding technique that stores digital audio data with excellent data compression properties.

[3] Amplitude Modulation.

[4] Pulse Code Modulation. A technique for encoding analog audio signals as digital data.
