SOUND SOURCE LOCALIZATION

Human hearing is well attuned to detecting the direction of sounds in the horizontal plane [JUTO84]. It is most sensitive below 1 kHz. For frequencies above 2 kHz the sensitivity drops only slightly. In the range from 1 kHz to 2 kHz the ability to detect direction is lower. The main mechanism for separating sound directions is the time difference between the ears below 1.5 kHz and the level difference between the ears above 1.5 kHz.

If the same sound reaches the two ears at different times or at different levels, the sound source is localized in a direction that depends on the relative loudness or on the order of arrival. Level and time differences are interchangeable within limits, and the amounts needed to compensate each other are frequency dependent. At low frequencies the time needed to compensate for level is about 2 µs/dB, and at high frequencies up to 100 µs/dB is needed. When the time difference exceeds 2 ms, the mechanism no longer works and the first-arriving sound is totally dominant.
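The trading relation between time and level can be sketched in a few lines. This is only an illustration of the figures quoted above: the function names and the constant `FUSION_LIMIT_US` are hypothetical, and the trading ratio is passed in by the caller because the text gives only its two extremes (about 2 µs/dB at low frequencies, up to 100 µs/dB at high frequencies).

```python
# Hypothetical helper names; the numbers come from the paragraph above.
FUSION_LIMIT_US = 2000.0  # 2 ms: beyond this the first-arriving sound dominates


def trading_time_us(level_diff_db, ratio_us_per_db):
    """Time lead (in microseconds) that offsets a given interaural level difference."""
    return level_diff_db * ratio_us_per_db


def trading_applies(time_diff_us):
    """True while the time difference is small enough for time/level trading."""
    return abs(time_diff_us) <= FUSION_LIMIT_US


# A 6 dB level difference at the two extremes of the quoted trading ratio.
low_freq_offset = trading_time_us(6.0, 2.0)     # low-frequency ratio
high_freq_offset = trading_time_us(6.0, 100.0)  # high-frequency ratio
```

Under these assumptions the same 6 dB difference corresponds to a far larger time offset at high frequencies than at low frequencies, and both stay well inside the 2 ms limit.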

Pinna cues of sound direction

Interaural differences in time and level are considered the major factors in directional hearing. They cannot, however, account for localization in three-dimensional space. Sound sources that cause identical interaural time differences lie on a hyperbolic cone around the interaural axis. Neither the mechanism by which humans detect the elevation of a sound source nor front-back discrimination is well understood. One way to approach the problem is to study how the pinnae respond to sound sources in three-dimensional space.
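The ambiguity described above can be illustrated with a deliberately simplified model. The sketch below assumes two point receivers on the interaural axis and a distant source, so that the hyperbolic surface reduces to a plain cone. The ear spacing of 0.18 m and the speed of sound of 343 m/s are assumed values, not taken from the text, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING = 0.18      # m, assumed distance between the two point "ears"


def itd_seconds(azimuth_deg, elevation_deg):
    """Far-field interaural time difference for two point receivers.

    Azimuth 0 is straight ahead and grows toward the right ear;
    elevation is measured upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Component of the unit direction vector along the interaural axis.
    lateral = math.sin(az) * math.cos(el)
    return EAR_SPACING * lateral / SPEED_OF_SOUND


front = itd_seconds(30.0, 0.0)    # front right, ear level
back = itd_seconds(150.0, 0.0)    # mirrored position behind the listener
raised = itd_seconds(90.0, 60.0)  # to the side, elevated 60 degrees
```

In this model all three directions give the same time difference (up to rounding), which is why the time cue alone cannot separate front from back or resolve elevation.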

Figure 5 Descriptive diagram of pinna

Above 1 kHz, it appears that directional information is processed basically in the frequency domain as spectral variations, while only secondary cues are handled in the time domain. Below 1 kHz the dimensions of the head and ears are such that time delay is the only primary cue to the direction of sound. The first 500 µs to 1 ms is dominant for detecting the source. Repeated directional information arriving after 1 ms has a smaller effect, until zero impact is reached at 10 ms (Haas effect?) [HBR88]. For early man this was all that was necessary for survival: there was no time for a complex process to detect the direction of an arriving sound. The consequence is that directional cues have to be as simple and as clear as possible.
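The time window described above can be written as a simple weighting function. The text fixes only the end points (full weight up to 1 ms, no effect at 10 ms). The linear fall-off in between, and the function name, are assumptions made purely for illustration.

```python
def directional_weight(delay_ms):
    """Relative weight of directional information arriving delay_ms after onset.

    Full weight up to 1 ms, zero from 10 ms onward. The linear ramp between
    the two is an assumption; the source only says the effect decreases.
    """
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    if delay_ms <= 1.0:
        return 1.0
    if delay_ms >= 10.0:
        return 0.0
    return (10.0 - delay_ms) / 9.0
```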

Figure 6 Frequencies of judgment front (f), behind (b) and above (a) for one-third octave bands of noise [JEBL70]. Directional bands: bordered, at the 90 % level of significance; shaded, most likely.

Figure 7 Spectral features that vary monotonically with decreasing sound source elevation in the vertical plane (above) and with increasing source azimuth in the horizontal plane [HOHA94].

The processing of cues has to be effective and well suited to the neural system. Some experiments conclude that, for the sake of simplicity, directional information is in some cases processed monaurally [HBR88].

In the head-related transfer function (HRTF) there appears to be a notch in the high-frequency range that is a function of the elevation of the sound source. The notch itself may not be the primary cue, but its left slope, which varies systematically and monotonically from 6 to 13 kHz as the elevation increases, certainly is. This is the only cue below ear level. Although detecting a spectral slope is a difficult task for electronic equipment, it is simple for the neural system, whereas detecting a spectral minimum is a more complex task. Experimentally, the sensation of an elevated sound source can be achieved by [HOHA94]:

** Low-pass cutoffs at 3.9...8.0 kHz. Increasing the cutoff within this range increases the elevation angle from 0 to 60 degrees.

** Bandpass filtering at 4.0...7.2 kHz, which is perceived in front at an elevation of 60 degrees.

** Notches at 7.4...10.8 kHz, which raise the elevation from 0 to 60 degrees as the notch frequency increases.

** Low-pass filtering at 10.3 kHz, which causes an elevation of 90 degrees.

** A bandpass signal at 8.1...9.1 kHz, which causes the sensation of 90 to 60 degrees elevation in the rear section.

** A notch at 12.0...17.8 kHz, which causes an elevation of 90 degrees.

Besides the level, the right-hand edge of a spectral notch is shifted toward higher frequencies in the right ear when the sound source moves clockwise. This cue works from 40 to 180 degrees. In the frontal section a double notch would work as a directional cue from -40 to +30 degrees. It is the same double notch that indicates elevation at zero azimuth. In horizontal-plane localization the concha plays the major role in the HRTF.

The sensation of a sound source in the rear can be achieved with 13.2...15.5 kHz high-pass or 14.5 kHz band-pass filtered noise. In the first case there is no elevation, and in the latter the source is elevated 30 degrees. The common denominator is a spectral slope that ascends toward the high frequencies. In the HRTF there is also a deep notch at 16 kHz that does not appear in the frontal section. Front-back discrimination appears to be based on a level difference in a band at these frequencies. In the lower frequency band from 3.75...7.5 kHz there is, depending on the azimuth angle, a deep notch for sound sources in the rear section but none for frontal sources. A third mechanism for achieving a rear sensation might be a boosted frequency band between 7 and 12 kHz.

The frequency bands affecting the sensation of a front, back or overhead sound source, with one-third octave band noise as the stimulus, are presented in Figure 6. Figure 7 shows the main pinna cues that are probably active in detecting the elevation and azimuth of the sound source. In these measurements a torso in an anechoic chamber was used, with the microphone mounted in the right ear of the torso.

Perception of distance

Human hearing is not very accurate in detecting the distance of a sound source. Loudness is perhaps the most widely accepted cue for distance. In anechoic conditions loudness decreases in inverse proportion to distance, and a change of about 20 phon [MBGA69] is needed to perceive half the distance. In reverberant conditions the situation is a little different. The decrease in loudness is not as large as in anechoic conditions, and the loudness difference needed to double the perceived distance varies from 22 to 41 phon [SØNI93]. For accurate perception of distance the absolute loudness must be known. The distance of the source is overestimated at low sound levels, but the estimate improves considerably as the sound level increases. Variations between individuals are large. In anechoic conditions it is almost impossible to estimate the distance with no cue other than loudness.
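The figures above can be compared with the physical level change using a short sketch. It assumes that the inverse-distance relation applies to sound pressure in a free field (about 6 dB per doubling of distance) and that a fixed number of phon corresponds to each doubling or halving of perceived distance. Both function names are hypothetical. At 1 kHz the phon value equals the level in dB SPL by definition, so the two functions can be chained for a 1 kHz tone.

```python
import math


def level_change_db(r_from_m, r_to_m):
    """Free-field level change when the source moves from r_from_m to r_to_m.

    Assumes sound pressure falls in inverse proportion to distance.
    """
    return -20.0 * math.log10(r_to_m / r_from_m)


def perceived_distance_ratio(loudness_change_phon, phon_per_doubling=20.0):
    """Ratio of perceived distance after and before a loudness change.

    A rise of phon_per_doubling halves the perceived distance and an equal
    drop doubles it. The default of 20 phon is the anechoic figure quoted
    above; 22 to 41 phon is the reported reverberant range.
    """
    return 2.0 ** (-loudness_change_phon / phon_per_doubling)


# Physical doubling of distance for a 1 kHz tone in a free field.
physical_drop = level_change_db(1.0, 2.0)
perceived = perceived_distance_ratio(physical_drop)
```

Under these assumptions a physical doubling of distance lowers the level by roughly 6 dB, which corresponds to a perceived distance ratio well below 2. This is consistent with the remark that loudness alone is a poor distance cue.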

The shape of the frequency spectrum is another factor that may give a cue to the distance of a sound source. Air damping is frequency dependent: the loss at high frequencies is greater than the loss at low frequencies. Besides direct air damping, in reverberant rooms the many reflections are partly absorbed by the walls, and the resulting loss of loudness depends on the wall materials. Estimating the distance of a sound source therefore requires knowledge of the characteristics of the room and of the frequency spectrum of the source.

The ratio between direct sound, early reflections and reverberation also offers a cue to the distance of a sound source. In this case a known sound and known room characteristics are required for good perception. With a known source such as the human voice, the reverberation ratio appears to be the most important factor in estimating the distance [SØNI93]. Binaural differences may also sometimes be a valuable cue, although the significance of this factor is not very strong. Experiments have also shown that the distance of a source is overestimated when the source is at an angle to the receiver rather than directly in front.

SOUND REPRODUCTION
