MULTI-CHANNEL AUDIO

The history of multi-channel sound reproduction goes back seventy years. Today's surround sound has its origin in film and theater reproduction. In this area there had been many successful experiments before multi-channel reproduction entered the home market. Quadraphony was the first consumer system of commercial significance, although on a small scale. During the last decade surround sound systems have dominated the multi-channel market. Today the most significant company in this field is Dolby Laboratories. It has developed de facto standards for both cinema and home systems.

The earliest reproduction systems contained a single speaker. In the early years of recorded or transmitted sound, the level of technology did not allow more channels. Today monophonic reproduction is still considered to be the most common audio reproduction system. It is also dominant in TV because of its simplicity and low cost. Monophonic reproduction has many advantages and in some cases it is the only technique that can be used. Its main disadvantage is the lack of directivity information and its inability to re-create a realistic, acoustical, three-dimensional image of an original performance.

Limited directivity information in reproduction can be achieved by adding one extra speaker channel in front of the listener. This system is known as stereo. The improvement compared to monophonic reproduction was significant. In most cases, it is relatively easy to add one extra channel to an existing system, but further additions are not always possible. An LP record is an example of a medium where only one extra sound track can be added easily. Today, most recorded music is produced in stereo, and most FM radio stations transmit stereophonic sound. Stereophonic reproduction by and large works relatively well. However, stereophonic systems require higher quality equipment than monophonic systems.

Figure 1 Speaker positioning and listening area in a two-channel stereophonic system. A satisfactory stereophonic sound field can be achieved only in a relatively small area on the axis between the speakers.

The best listening area in a two-channel stereo environment is limited to quite a narrow area on the axis between the speakers (figure 1). The other disadvantage is that information on the room acoustics of the original performance is very difficult to reproduce via two channels.

It has long been known that two channels cannot provide realistic directionality. Multi-channel reproduction has its origin early in this century and since those days it has been studied intensively. Today we have technically and economically functioning multi-channel audio systems.

The experimental starting point of multi-channel audio reproduction was Bell Laboratories in the late twenties. There it was demonstrated that three channels give a better depth effect than two channels. From those first experiments it took almost forty years to produce economical, quality sound equipment for the consumer market. During those years experimenters and researchers discovered the theoretical basis of multi-channel reproduction.

During the four decades prior to 1960, several multi-channel sound reproduction systems were developed and demonstrated. At the same time, the quality of sound reproduction was increasing. When the experiments with multi-channel sound started, the only consumer format was monophonic. At that time AM radio transmissions and 78 rpm shellac records were used. There was no medium that could offer more than one channel. For this reason those early experiments were conducted in public venues such as theaters and conference rooms. Although the experiments showed the possibilities of multi-channel reproduction and listeners were mostly satisfied, the time was not yet ripe for a consumer product.

New inventions were still needed to decrease the overall cost of a multi-channel system. In the late forties Williamson introduced an amplifier using a high degree of feedback. As a consequence the distortion of the amplifier was very low. The next step towards multi-channel reproduction was taken in the fifties. Speaker technology had developed greatly. High quality closed and vented boxes were introduced onto the market. At that time HiFi started as a hobby. An invention to produce multi-channel sound economically was still needed. The answer came in the sixties in the form of four-track magnetic tape recorders and playback machines. Now the infrastructure was in place for practical multi-channel sound systems for consumers, based on prerecorded four-channel magnetic tapes.

As phonograph records and radio broadcasting could not carry multi-channel sound, consumer multi-channel sound systems remained the hobby of small amateur groups. At that time there was no great commercial interest in multi-channel sound reproduction. The outlook changed in the late sixties when Peter Scheiber described and demonstrated a technique for recording and reproducing four-channel sound through a two-channel medium. This was the starting point of intensive research on 4-2-4 matrixing systems. Once the basic technique became widely accepted, it took only a short time to develop commercial systems for phonograph records and FM broadcasting. Besides matrixed systems, discrete channel systems were studied and phonograph records for them were developed. Matrixed systems were called SQ, stereo-quadraphony (compatible), and discrete channel systems were known as CD.

Figure 2 Schematic diagram of a matrixed 4-2-4 multi-channel audio system. Four input channels are encoded into two channels for recording or broadcasting. For reproduction the two encoded channels are decoded back into four channels.

A typical, or conventional, quadraphony system, as it was called, contained an acoustical arrangement in which a stereophonic system was fitted with additional speakers located behind the listener. So-called unconventional four-channel systems were also introduced. In one such system speakers were placed at the sides, front and rear of the listener. These two quadraphony systems are presented in figure 3.

Figure 3 The conventional speaker arrangement in quadraphony is presented on the left and a newer system on the right. Normally multichannel audio was studied as an unlimited array of speakers surrounding the listener, with the speakers separated from each other by an angle.

Besides four-channel systems, three-channel solutions were also studied. Two of these systems had commercial significance. The first of them was ambiophony, in which one speaker was added behind the listener to the normal two-channel stereo system. Another approach to three-channel reproduction was the Finnish system called Orthoperspecta. In both of these systems, the rear channel was supplied with the difference between the normal stereo left and right channels. The aim was to produce some type of surround signal[1]. Unlike ambiophony, Orthoperspecta used a mono front channel and two rear speakers supplied with the same signal. These two three-channel systems are presented in figure 4.

Figure 4 Two three-speaker approaches to multichannel reproduction. Ambiophony is on the left and Orthoperspecta on the right. Orthoperspecta was also in production at Salora.

Although by the mid seventies four-channel records were available for consumer use, commercial success was not achieved. Quadraphony failed for many reasons. Different record companies and stereo equipment manufacturers backed different, incompatible encoding-decoding systems. Producers and recording engineers did not have a clear vision of how best to use the extra channels, and most consumers couldn't hear any advantage in the system. The price of these early systems was also very high, discouraging most people from buying them. People never associated quadraphony with the surround sound used in films. Home multi-channel sound was nearly forgotten for fifteen years. Only three-channel systems had some significance, because they used a two-channel sound source.

This was the situation, specifically in Europe. During that time, however, the theoretical basis for multi-channel reproduction was established. Consumers eventually became reacquainted with multi-channel sound. The film industry was interested in the new multi-channel sound reproduction because theater audiences were shrinking under the competition from television. High quality multi-channel sound gave theater owners a competitive advantage that TV could not match.

The only four-channel medium at the end of the sixties was prerecorded magnetic tape. The system was expensive and the supply of program material was small. Discrete four-channel sound could not be transmitted via radio or recorded onto records. The turning point in the development of quadraphony was the introduction of matrixed systems [PESC70]. Instead of four independent channels only two had to be stored or transmitted. All existing stereophonic systems were able to convey quadraphony. This was the starting point for the consumer format of multi-channel audio.

The principle of matrixed quadraphony is to encode four input channels into two for either recording or broadcasting. The coded channels are decoded back into four for reproduction (figure 2). The requirements for an ideal matrixed system are [PECH70]

** Four channel compatibility using standard components and construction wherever possible

** Stereo compatibility: the ability to reproduce the four channel program on all standard two channel stereo medium with all sounds in the four channel program with correct stereo localization

** Mono compatibility: Monaural playback possible on all standard medium without losing or altering

** Full playing time within a given format compared with the equivalent of stereo

These requirements were demanding and impossible to fulfill completely. Matrixed four-channel encoding can be described as the following operation

where chx are the input channels, chxt are the recorded channels, and xx and yx are the coefficients of the matrixed channels. The rank of the coefficient matrix is two. It contains only two orthogonal vectors, so the coefficient matrix has no inverse. Complete decoding would require four orthogonal vectors.
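The loss of information can be illustrated numerically. The following is a minimal sketch, not the encoding of any particular system: the coefficient rows X and Y, the test programs and the function name are hypothetical values chosen only for illustration. It shows that two different four-channel programs can produce exactly the same two recorded channels, which is why no decoder can recover the original four channels completely.

```python
import math

# Hypothetical 2x4 coefficient matrix (assumption, for illustration only).
K = 1 / math.sqrt(2)
X = [1.0, 0.0, K, K]    # contributions of ch1..ch4 to the first recorded channel
Y = [0.0, 1.0, K, -K]   # contributions of ch1..ch4 to the second recorded channel


def encode(ch):
    """Fold four input channels into the two recorded channels."""
    ch1t = sum(x * c for x, c in zip(X, ch))
    ch2t = sum(y * c for y, c in zip(Y, ch))
    return ch1t, ch2t


# An arbitrary four-channel program.
program_a = [0.3, -0.2, 0.5, 0.1]

# A four-channel signal that this particular matrix maps to zero on both
# recorded channels; such signals exist because the matrix has only two rows.
hidden = [K, K, -1.0, 0.0]

# A second, different program that differs from the first by the hidden signal.
program_b = [a + h for a, h in zip(program_a, hidden)]

print("program A encodes to", encode(program_a))
print("program B encodes to", encode(program_b))
```

Both programs print the same pair of recorded values up to rounding, even though their four input channels differ.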

However, incomplete decoding is possible. In this case crosstalk between channels in reproduction cannot be avoided. Channel separation decreases markedly, to only a few decibels. Decoding is the reverse of encoding. It is described with the following matrix.

The first sound reproduction method used in films was synchronized gramophone records[2]. At the same time an optical soundtrack was developed that stayed in use for two decades. In the late forties new inventions[3] significantly improved the sound quality of phonographic records. Microgroove records took the first step, and high quality tape for recorders was also introduced. As a consequence the frequency response of the new records was 30 Hz to 15 kHz within +/- 2 dB. This compared favorably to the 6 kHz high frequency limit and 45 dB signal-to-noise ratio of optical soundtracks. The improvements were significant [JOMO81].

The film industry adopted the new magnetic recording system within a short period because of its obvious benefits. In production it shortened the time from recording to final soundtrack. In the early fifties two new film formats[4] using magnetic sound stripes were introduced. Stereophonic sound was promoted together with a new wide-screen film format. Film stereo started out with a minimum of four channels. To play these films an extra playback head, like those in tape recorders, was needed in the projectors. In a normal set-up at least one of the channels contained rear information, known at that time as the effect channel. The rear channel was used only occasionally, during dramatic effects, because of its high noise level.

Figure 5 Typical film sound production organization in the mid eighties [JOMO81]

The market for film stereo shrank in the late sixties and early seventies for many reasons [DOLA94]. The magnetic method was expensive and the film industry was in turmoil because of competition from TV. The magnetic stripe was also sensitive to disturbance. In some cases the content of the sound track could be wiped out because of the poor quality of the reproduction equipment. In these conditions it was difficult to maintain sufficient sound quality, and the popularity of multi-channel reproduction decreased.

Sound mixers, however, continued experimenting, almost as a hobby. Six-track 70 mm film offered consistent signal-to-noise ratios for all channels. The effect channel could be used for continuous low level ambient sounds. This gave more realism overall and the effect channel got a new name, the surround channel. The method was called surround sound. In the rear of the theaters a pattern of speakers was installed to produce a diffuse sound field.

In production two different strategies have been used. A typical production organization for film sound is presented in figure 5. The production sound group works simultaneously with the other production groups, recording the sound and picture. In many cases this leads to an unsatisfactory result. Outside disturbances like the sounds of airplanes and filming equipment cause background noise that is very difficult to avoid. The microphone technique in use also caused problems, especially during dialogue recording. These and other reasons have led to the practice of recording the soundtrack afterward. The sound staff record only a reference track that helps the actors to remember the correct nuances during the post production phase. Similarly, in dancing scenes only a few musicians are used, with the final full orchestra being added during post production.

Figure 6 35 mm Dolby Stereo playback containing two optical matrixed soundtracks identified as Lt and Rt. [DOLA94].

The post production method is very widely used today. Dialogue, music and effects are recorded afterward and the film is mated with the soundtrack. This has many advantages. There are no unwanted sounds in recording studios. In many cases prerecorded effect libraries are used. In these conditions a very accurate sound field can be created and the mixer can create the most effective sound possible. The post production method is also used for dubbing, when the original dialogue is changed to a different language. The soundtracks of many old films have by now been rerecorded, too. Perhaps the most famous post recording company at this time is Lucasfilm and its Skywalker group, LucasArts.

In the mid seventies, Dolby Laboratories introduced a new sound technology for 35 mm film based on optical soundtracks instead of magnetic ones. In the system one extra sound stripe was fitted into the same space as the previous track to ensure compatibility with mono reproduction. To achieve an acceptable signal-to-noise ratio, a compression technique was used for noise reduction. This solution also decreased the cost of the sound tracks.

Two channels are not sufficient to produce good stereo sound in a cinema. In normal two-channel stereo, the maximum distance between speakers is 4.5 meters without discontinuity in the stereo image. At locations off the center of a wide movie screen it is also impossible to achieve satisfactory localization. To fill the hole and to make the narrow stereo operating area wide enough, an extra channel is needed between the left and right speakers. The task was to fit four channels into two soundtracks on the film. The solution was the 4-2-4 matrix technique first used for quadraphonic home stereo. The system contains left, center, right and surround channels and is known as Dolby Stereo (figure 6). The block diagram of the Dolby 4-2-4 cinema decoder is presented in figure 7. The optional subwoofer output is formed from the decoded channels.

A multi-channel digital sound format was introduced by Dolby in 1992. The new sound track was added between the perforation holes on the side of the film. The old analog stripe remains unchanged and contains normal Dolby surround information. Compared to the previous analog system, the new digital track provides stereo surround sound and an extra band-limited effect channel. Dolby Stereo Digital is presented in figure 8. The left of the picture shows the sound track arrangement described above and the right shows the speaker placement. Dolby Stereo Digital uses a new bit compression technology known as AC-3.

Figure 8 Dolby Stereo Digital multichannel film sound reproduction system [DOLA94]

Figure 9 Fantasia was the first film shown publicly with stereo sound utilizing three optical tracks on separate 35 mm film played in sync with picture. [DOLA94]

Dolby surround and Dolby Stereo Digital, SR*D, are methods to encode and decode multi-channel sound. They set no special requirements for the sound equipment or the quality of sound in the theaters. To overcome this unsatisfactory[5] situation Lucasfilm started to develop requirements and standards for theater sound reproduction. The result of this work was THX. THX describes accurately the minimum requirements for theater acoustics and equipment to reproduce the film soundtrack as the mixer intended. Besides sound, THX specifies the minimum requirements for picture quality. Theaters that fulfill these standards can use the THX logo as a quality label. To maintain a high level of performance Lucasfilm monitors these theaters strictly so that the public can be assured of a satisfactory cinema experience.

By the early eighties, the time was ripe for home surround sound. Stereo video cassette recorders and laser discs were introduced, offering high quality sound channels. TV manufacturers launched the first models with stereo amplifiers, and stereo video broadcasting was ready for testing. Unlike sound-only reproduction, television never used traditional quadraphony. Instead, TV adopted the sound system that was in use in theaters.

Dolby Surround was introduced in late 1982. The original soundtrack stays unchanged when the film is transferred onto stereo videocassette or laser disc, or transmitted on stereo TV. A few years later, Dolby introduced Pro Logic, which markedly improved the channel separation. It also made it possible to decode the center channel. Market acceptance of Dolby Surround has been very good.

The timeline for multichannel sound reproduction is presented in figure 9 for both consumer and cinema format.

Dolby surround is a variation of the quadraphony systems. Four channels are used, with the rear speakers creating ambiance for the front speakers. Three front channels (left, center and right) are used for larger listening areas. The rear or side speakers are supplied by one mono channel and may be two normal enclosed speakers, two cardioid speakers or one gradient radiator. A normal system with two side speakers is presented in figure 10.

Dolby surround belongs to the family of 4-2-4 matrixed quadraphony. Four input channels are converted to two channels. The conversion operation is [PESC76]

The Lt and Rt signals can be recorded on any normal stereo medium such as a CD or HiFi stereo video tape. All stereo transmission methods can also be used. In reproduction the opposite operation is used to produce four output channels. Decoding for playback is

Figure 10 Speaker arrangement in the Dolby surround system. The rear left and right speakers reproduce the same signal in phase. The center channel is also known as the dialogue channel.

Figure 11 Channel separation map of the Dolby surround sound system. Separation from the left and right channels to the center and surround channels, and vice versa, is only 3 dB [RODR92].

Channel separation between C' and L' or R' is only 3 dB. The situation is the same between the decoded surround and the left and right channels. Separation between the left and right channels is good, as it is between the center and surround channels. The separation map is presented in figure 11.
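The 3 dB figures can be checked with a small phasor calculation. The sketch below is an illustration under stated assumptions, not the cited equations: it assumes the commonly published form of the matrix, in which the center is added to both transmitted channels at -3 dB and the surround is added at -3 dB with opposite 90 degree phase shifts, and a passive decoder that passes Lt and Rt through and forms the center from their sum and the surround from their difference. The function names and the use of complex numbers to represent a single-frequency signal are likewise assumptions made for the example.

```python
import math

K = 1 / math.sqrt(2)  # -3 dB gain factor


def encode(L, R, C, S):
    """Assumed 4-2 encoding; 1j models a 90 degree phase shift of a phasor."""
    Lt = L + K * C + 1j * K * S
    Rt = R + K * C - 1j * K * S
    return Lt, Rt


def decode(Lt, Rt):
    """Assumed passive 2-4 decoding: sum gives center, difference gives surround."""
    return {"L": Lt, "R": Rt, "C": K * (Lt + Rt), "S": K * (Lt - Rt)}


def separation_db(source, other):
    """Level of `source` in its own output relative to its leakage into `other`."""
    inputs = {"L": 0.0, "R": 0.0, "C": 0.0, "S": 0.0}
    inputs[source] = 1.0
    out = decode(*encode(**inputs))
    wanted = abs(out[source])
    leak = abs(out[other])
    if leak == 0.0:
        return math.inf  # no leakage at all in this idealized model
    return 20 * math.log10(wanted / leak)


for src, dst in [("C", "L"), ("S", "R"), ("L", "C"), ("L", "R"), ("C", "S")]:
    print(f"{src} -> {dst}: {separation_db(src, dst):.1f} dB")
```

In this idealized model the adjacent pairs (center or surround against left or right) come out at about 3 dB, while the opposite pairs (left against right, center against surround) show no leakage at all. Real decoders fall short of the infinite figure because of gain and phase tolerances.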

Because of the low channel separation, a set of psychoacoustic methods has been used to improve perceived directivity. First, a 7 kHz low-pass filter decreases the leakage, which tends to increase in magnitude toward high frequencies. Secondly, there is a delay in the surround channel. If two similar signals arrive at different times, the sound is localized in the direction of the first received signal [HHA72]. Also, modified Dolby B-type noise reduction is used, partly to reduce noise in the surround channel but also to decrease the front channel leakage. The dominance of image over sound also significantly improves sound localization on the screen while decreasing psychoacoustic crosstalk from the surround channel.
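The first two measures can be sketched as a simple processing chain for the surround signal. This is only a rough illustration: the 48 kHz sample rate, the 20 ms delay, the one-pole filter structure and the function name surround_path are assumptions made for the example, and only the 7 kHz cutoff comes from the text. The noise reduction stage is left out.

```python
import math


def surround_path(samples, fs=48000, delay_ms=20.0, cutoff_hz=7000.0):
    """Delay the surround signal, then low-pass it with a one-pole filter.

    fs, delay_ms and the one-pole structure are illustrative assumptions.
    """
    delay = int(round(fs * delay_ms / 1000.0))            # delay in samples
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)  # smoothing factor

    # Prepend silence so the surround sound arrives after the front sound.
    delayed = [0.0] * delay + list(samples)

    out = []
    state = 0.0
    for x in delayed:
        state += alpha * (x - state)  # one-pole low-pass update
        out.append(state)
    return out


# A single click in the surround channel appears only after the delay.
click = surround_path([1.0, 0.0, 0.0])
first_nonzero = next(i for i, v in enumerate(click) if v != 0.0)
print("first output sample with signal:", first_nonzero)
```

Because the front channels are reproduced without this delay, any leakage of a front sound into the surround speakers arrives later than the direct front sound, so the listener localizes it at the front.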

Subjectively, front-to-back separation has been perceived as better than 3 dB. Together with a phantom center image, Dolby surround can offer satisfactory spatial effects. In some cases a center speaker has also been used to improve dialogue localization at the expense of a reduced stereo image. The block diagram of the Dolby surround encoder is presented in figure 13 and the decoder in figure 14. Low-pass filters and modified Dolby B noise reduction are used both in the encoder and in the decoder.

Figure 14 Dolby surround sound decoder block diagram. The system is also known as a passive decoder [RODR92].

In the 4-2-4 matrixed quadraphony system, four directional input signals are encoded into two channels, so some directional information is lost. In reproduction the original directional sound field is impossible to re-create in the listening room. Many methods have been proposed to decrease crosstalk between channels.

One approach is to feed the part of the frequency range that contains speech to the front speakers and the balance of the frequencies to the remaining four speakers. In another system, the sound level of each of the four loudspeakers is adjusted without changing the relative contribution of the two channels. This reduces the crosstalk. The system is known as gain riding. Another possibility is to produce the four output signals from the two channels according to a special mathematical algorithm that varies the relative contributions of the two channels. The objective is again to reduce the effect of crosstalk. The system is known as the variable matrix approach.

Control signals derived from the two encoded channels are widely used. Two signals have been proposed to control separation in rectangular quadraphony. The front and rear channels each have their own control signal. The front-rear level is also controlled. Systems containing up to ten control signals have been introduced.

Directional enhancement has also been developed for Dolby surround sound. Two alternative solutions can be used to improve channel separation. An enhancing circuit can be connected in cascade with the decoder or constructed as a part of the decoding process. Dolby has solved the enhancement problem with an adaptive matrix [MTAD89] [MTAD90] [MTD91] that is included in the decoder. The system is known as Dolby surround Pro Logic. An extra block, the Pro Logic adaptive matrix, has been added to the surround decoder. A noise sequencer is also included to help the user align the channel levels. The directional enhancement system is presented in figure 16 and the block diagram of the adaptive matrix is in figure 4.17. Encoders stay unchanged and channel separation is dependent only on the decoder [MTAD89] [MTAD90] [MTD91].

Although separation between some channels is only 3 dB, there is still enough directivity information to re-create almost the original sound field by using a more complex algorithm than a simple add and subtract function. Channel separations on the left-right and center-surround axes are over 40 dB. First, dominance vectors are derived from band-pass filtered Lt and Rt. The left-right and center-surround dominance vectors DLR and DCS are respectively

where a is a constant. Four directional control signals EL, ER, EC and ES have been derived from low-pass filtered dominance vectors

where a refers to the base of the logarithm in equation (7) and b is a constant (the value that best matches the referred directional sensation). A vector V is defined as

Cancellation matrices GX are used by Dolby Pro Logic. These matrices have been derived from signals where only one is reproducing the matrixed sound. The size of matrix GX is five by two. The output signals L', R', C' and S' of the adaptive matrix can now be written

The adaptive matrix described above detects the direction of one individual sound very accurately. For more complex signals directivity is lower. If one source is dominant, it is easily detected and reproduced in the correct direction. The balance of the sources forms an ambiance signal. Channel separation is increased to about 30 dB, compared to 3 dB without the adaptive matrix. The separation map is presented in figure 17.

Figure 17 Separation map of Dolby Pro Logic adaptive matrix surround sound system [RODR92]
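The detection of a dominant direction can be sketched in a few lines. This is a minimal sketch under assumptions, not the decoder's actual equations: it assumes that each dominance value is the logarithm of a ratio of rectified signal levels (Lt against Rt for the left-right axis, their sum against their difference for the center-surround axis), which is consistent with the logarithm base a mentioned above but otherwise hypothetical. The function name, the block averaging and the small eps guard are also assumptions, and the band-pass filtering and smoothing stages are omitted.

```python
import math


def dominance(lt, rt, base=10.0, eps=1e-12):
    """Return assumed (D_LR, D_CS) for one block of Lt/Rt samples.

    Positive D_LR points left, negative right; positive D_CS points to the
    center, negative to the surround. eps avoids division by zero.
    """
    def level(block):
        # Mean rectified level of the block.
        return sum(abs(v) for v in block) / len(block) + eps

    lt_level = level(lt)
    rt_level = level(rt)
    sum_level = level([a + b for a, b in zip(lt, rt)])
    diff_level = level([a - b for a, b in zip(lt, rt)])

    d_lr = math.log(lt_level / rt_level, base)
    d_cs = math.log(sum_level / diff_level, base)
    return d_lr, d_cs


tone = [math.sin(0.1 * n) for n in range(200)]
silence = [0.0] * 200

print("center source:", dominance(tone, tone))     # equal and in phase
print("left source:  ", dominance(tone, silence))  # present in Lt only
```

A source that appears equally and in phase in both channels gives no left-right dominance and a large center dominance, while a source present only in Lt gives a large left dominance and no center-surround dominance. Values like these would then steer the cancellation described above.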

[1]Personally I think this is a mistake and is not based on a deep understanding of the nature of room and free space acoustics.

[2]The system used 16" records, was known as "Vitaphone" and was used in the 1920s.

[3]Ampex Corporation, Redwood City, Ca introduced 200 series tape recorders and 3M Company, Minneapolis, MN and Audio Devices Inc., Stamford, CT introduced a new tape coating (3M series 111).

[4]35 mm CinemaScope 4 tracks, Twentieth Century Fox Film Corporation, Beverly Hills, Ca: 1953, "The Robe" and 70 mm Todd-AO 6 tracks, Todd-AO Corporation, Hollywood, CA: 1955, "Oklahoma"

[5]As previously seen, a vast number of people participate in producing an impressive, high quality sound track for a film, yet most theaters had only a mono sound system. The situation was very frustrating.