
On the source-identification method

William M. Hartmann,a) Brad Rakerd,b) and Joseph B. Gaalaas c)
Michigan State University, East Lansing, Michigan 48824

(Received 8 June 1997; revised 25 August 1998; accepted 28 August 1998)

The source identification method is a standard psychophysical procedure for studying the ability of listeners to localize the source of a sound. The method can be described in terms of a statistical model in which listeners' responses are determined by the width and bias of an internal distribution. This article presents a theoretical study of the method, particularly the relationships between the average experimental observables, rms error and variability, and parameters of the internal distribution. The theory is tested against source-identification experiments, both easy and difficult. Of particular interest is the experimental dependence of observable statistics on the number of sources in the stimulus array, compared with theoretical predictions. It is found that the model gives a good account of several systematic features seen in the experiments. The model leads to guidelines for the design and analysis of source-identification experiments. © 1998 Acoustical Society of America. [S0001-4966(98)02712-X]

PACS numbers: 43.66.Qp, 43.66.Yw [RHD]

INTRODUCTION

The source-identification method is an experimental technique for studying the ability of human (or other) listeners to localize the source of a sound. The method is easy to describe. The listener is in an environment with a number, $N$, of sound sources. One source is caused to emit a signal, and it is the listener's task to identify the location of the source. The location may be identified by name, number, or by coordinates on a prearranged scale. Over trials the listener receives presentations from all the sources, typically many times.

The source-identification method, hereafter called the "SIM," is especially applicable for localization experiments in a room. Here, the experimenter may be interested in localization as a function of the signal, or the listener, or the room itself. However, because of standing waves in the room, an experiment done with a sound source in any one location may be special and not representative of the system of interest. By averaging performance over a number of source locations, the experimenter achieves greater generality. Therefore, SIM data are normally averaged over the source array.

The SIM is naturally modeled in terms of statistical decision theory (Searle et al., 1975, 1976; Hartmann, 1983b). The present article is primarily a theoretical study of that model. It shows how observable variables, rms error and variability, averaged over the source array, are related to parameters of the model internal distribution. Therefore, this article provides a guide to the design of SIM experiments that are intended to discover the internal parameters. The article is concerned especially with the choice of the number of sources to be used in an experiment that measures localization ability over a fixed angular range.

a) W.M.H. is at the Department of Physics and Astronomy.
b) B.R. is at the Department of Audiology and Speech Sciences.
c) J.B.G. is at the Department of Mechanical Engineering at the University of Texas at Austin.

J. Acoust. Soc. Am. 104 (6), December 1998. 0001-4966/98/104(6)/3546/12/$15.00

The SIM experiments studied here are constrained by the following assumptions. First, it is assumed that the allowed response set is identical with the stimulus set. For example, there might be $N = 24$ loudspeakers in front of a listener labeled 1 through 24. After presentation of a sound from one of the speakers, the listener must respond with a number from 1 to 24. Next, it is assumed that the sources are equally spaced by a common angle, $A$, measured in degrees along a single angular dimension, for example azimuth or elevation. For definiteness, the following discussion will be couched in terms of the azimuthal dimension, but the method is applicable to sources in any plane. The decision theory model used for calculations below is one dimensional. Therefore, the model is inappropriate when the perceptual character of the localization task is multidimensional. It is assumed that sources are arranged over part of a circle, to be called the span, with angular extent $\Gamma = (N-1)A$, and with source number 1 at one extreme and source number $N$ at the other.

A SIM experiment begins with a choice of statistics to describe localization error. Searle et al. (1975, 1976) used the absolute value of the discrepancy between response and target. Hartmann (1983b) used the root-mean-square (rms) error, which has theoretical advantages described below. The rms statistic is designated by the symbol $\underline{D}$, the square root of an average squared error, computed as follows:

$$\underline{D} = \sqrt{\overline{D^{2}}} = \sqrt{\sum_{k=1}^{N} W(k)\,D^{2}(k)}, \tag{1}$$

where $W(k)$ is the fraction of the trials on which source $k$ was presented, and $D^{2}(k)$ is the mean square localization error for source $k$. This function is given by

$$D^{2}(k) = A^{2}\,\frac{1}{M_k}\sum_{i=1}^{M_k} (R_i - k)^{2}, \tag{2}$$

where $R_i$ is the listener's response, on the scale of source numbers, to the $i$th trial on which source $k$ is presented.

There are a total of $M_k$ such trials. Equation (1) introduces the notation whereby a bar over a symbol indicates an average over sources and a bar under a symbol indicates the square root of that average.

Statistic $\underline{D}$ includes both variability and constant error. A second statistic, $\underline{s}$, measures only variability by computing error with respect to the mean response. It is the square root of quantity $\overline{s^{2}}$ given by

$$\overline{s^{2}} = \sum_{k=1}^{N} W(k)\,s^{2}(k), \tag{3}$$

where the variability for source $k$ is given by

$$s^{2}(k) = A^{2}\,\frac{1}{M_k}\sum_{i=1}^{M_k} \left[R_i - R(k)\right]^{2}, \tag{4}$$

and $R(k)$ is the average response of the listener, in terms of source numbers, when a given source $k$ is presented,

$$R(k) = \frac{1}{M_k}\sum_{i=1}^{M_k} R_i. \tag{5}$$

Statistic $s(k)$ is a biased estimate of response variability that tends to underestimate the actual standard deviation for small sample sizes. For comparison with the variability observed experimentally or in a Monte Carlo simulation, $s(k)$ should be multiplied by $\sqrt{M_k/(M_k-1)}$, a factor which becomes important if the number of presentations is small.

In addition to variability, there is constant error. The constant error, $C(k)$, measured in degrees, is the difference between the true location of a source, $k$, and the mean perceived location of the source, $C(k) = A\,[R(k) - k]$. It may be positive or negative except when $k$ is a well-defined extreme location. Rakerd and Hartmann (1986) noted a Pythagorean relationship among rms error, variability, and constant error:

$$D^{2}(k) = s^{2}(k) + C^{2}(k). \tag{6}$$

Therefore $D(k)$ was called the overall error. It follows that $\overline{D^{2}} = \overline{s^{2}} + \overline{C^{2}}$, where $\overline{C^{2}}$ is an average over sources analogous to $\overline{D^{2}}$ and $\overline{s^{2}}$. The calculations below are devoted to calculating these statistics, particularly $\underline{D}$ and $\underline{s}$.

I. DECISION THEORY MODEL

The decision theory model for a listener's response, given a sound coming from source $k$, includes several basic assumptions. The first is that the listener has an internal coordinate $\theta$ for the source positions, undoubtedly established visually if the sources are visible, and that the presentation of source $k$ leads to a normally distributed representation of location cues on that coordinate system. The probability density that source $k$ leads to internal value $\theta$ is given by

$$P(\theta) = \frac{1}{\sigma_k\sqrt{2\pi}}\; e^{-(\theta-\theta_k-b_k)^{2}/2\sigma_k^{2}}. \tag{7}$$

Here, parameter $\theta_k$ is the location on the reference coordinate for source $k$, and $b_k$ is a bias such that the acoustical cues for source $k$ are not centered exactly on this referent.¹ Bias leads to constant error, $C(k)$, and increases the size of the overall error, $D(k)$.

A key parameter is the angular standard deviation, $\sigma_k$, called the width of the internal distribution, or, simply, the width. It depends on the listener, the type of sound that must be localized, the environment in which the experiment is performed, and the position of the source. The sound may be easy to localize (small $\sigma_k$), e.g., a broadband impulsive noise, or it may be difficult (large $\sigma_k$), e.g., a spectrally sparse tone without onset transient. Normally, the purpose of a SIM experiment is to determine the width as a function of experimental conditions. Because the width is not zero the listener makes inconsistent responses to a given source. The width is generally a function of $k$ because some sources are more difficult to localize than others. In the azimuthal plane sources to left and right are more difficult than sources in front, and in the median sagittal plane sources overhead are more difficult than others.

A second assumption of the model calculation is that responses are quantized; when a listener experiences internal coordinate $\theta$, the listener responds by choosing the source with referent $\theta_k$ that is closest to $\theta$. (Alternatively, listener responses on a continuum scale may be quantized in the process of recording the data.) There are two kinds of calculation, terminated span or wrapped span. For a terminated-span calculation, the span has well-defined ends, typical of a span that is much less than a complete circle. Here, the probability of making a particular response given a particular source is a simple monotonic function of the distance along the span between the two locations. By contrast, a wrapped-span calculation includes both errors along the span and error outside the span; it is defined in more detail below.
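The model just described can be explored with a short Monte Carlo sketch. It is an illustration under stated assumptions, not the calculation used in this article: the function names, the use of NumPy, equal presentation weights $W(k) = 1/N$, a width that is the same for every source, and the rule that internal values beyond either end of a terminated span are assigned to the nearest end source are all choices made here. The sketch draws internal values from the normal density of Eq. (7), quantizes them to the nearest source, and evaluates the statistics of Eqs. (1) through (6).

```python
import numpy as np

def simulate_responses(n_sources, span_deg, sigma_deg, bias_deg=0.0,
                       trials_per_source=2000, seed=0):
    """Simulate quantized responses for a terminated span (illustrative only).

    Returns the source spacing A in degrees and a list whose entry k-1 holds
    the response numbers (1..N) given on the trials that presented source k.
    """
    rng = np.random.default_rng(seed)
    spacing = span_deg / (n_sources - 1)          # common angle A between sources
    responses = []
    for k in range(1, n_sources + 1):
        noise = sigma_deg * rng.standard_normal(trials_per_source)
        # Internal coordinate on the source-number scale, centered on k plus bias.
        u = k + (bias_deg + noise) / spacing
        # Nearest source; values off either end go to the end source (assumption).
        responses.append(np.clip(np.rint(u), 1, n_sources).astype(int))
    return spacing, responses

def source_statistics(spacing, k, responses_k):
    """Mean square error, variability, and constant error for one source."""
    mean_response = responses_k.mean()                                   # Eq. (5)
    d2 = spacing**2 * np.mean((responses_k - k) ** 2)                    # Eq. (2)
    s2 = spacing**2 * np.mean((responses_k - mean_response) ** 2)        # Eq. (4)
    c = spacing * (mean_response - k)                                    # C(k)
    return d2, s2, c

def array_statistics(spacing, responses):
    """Square roots of the source-averaged D^2, s^2, and C^2, with W(k) = 1/N."""
    per_source = [source_statistics(spacing, k, r)
                  for k, r in enumerate(responses, start=1)]
    weight = 1.0 / len(responses)
    d2 = sum(weight * d for d, _, _ in per_source)                       # Eq. (1)
    s2 = sum(weight * s for _, s, _ in per_source)                       # Eq. (3)
    c2 = sum(weight * c**2 for _, _, c in per_source)
    return float(np.sqrt(d2)), float(np.sqrt(s2)), float(np.sqrt(c2))
```

For example, `simulate_responses(13, 60.0, 5.0)` followed by `array_statistics` gives an rms error close to the 5° width assumed in the simulation, and the three returned values satisfy the Pythagorean relationship of Eq. (6) to within rounding error. The small-sample correction factor $\sqrt{M_k/(M_k-1)}$ is deliberately not applied in this sketch.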

A. Calculations without bias

The present section examines statistics $\underline{D}$ and $\underline{s}$ when there is no bias ($b_k = 0$). The calculations were motivated by the conjecture that for a given source array span, the values of overall error, $\underline{D}$, and variability, $\underline{s}$, should be insensitive to the number of sources in the array. The logic was simple: As the number of sources is reduced the listener is less likely to make an error because the sources are farther apart. However, when the listener does make an incorrect choice, the contribution to the overall error sum is a larger number of degrees. The conjecture that $\underline{D}$ and $\underline{s}$ should be insensitive to $N$ follows from the expectation that these two effects should largely cancel one another. One purpose of the calculations below was to test that conjecture.

The dependence of $\underline{D}$ and $\underline{s}$ on the number of sources was tested in a computation where each source is presented an equal number of times [$W(k) = 1/N$]. The calculation used an analytic form for the cumulative normal function to determine the probabilities of each possible response for each possible source.

1. The small-span limit

A source array with a small span extends over a limited range of azimuth values. Therefore, a small-span source-identification experiment can provide the same information as a minimum audible angle experiment with the advantage that the source-identification method should be less sensitive to standing waves in the environment. When the span is small, the width may be regarded as independent of the source number, i.e., $\sigma_k$ becomes a constant, $\sigma_0$. Calculations in the small-span limit are normally terminated-span calculations.

From the structure of the equations it is possible to come to some general conclusions. There is reason to expect that function $D^{2}(k)$ should be approximately equal to $\sigma_0^{2}$, because the second moment of a normal density is the variance. Function $D^{2}(k)$ resembles the second moment of density $P$. This is a theoretical advantage of the rms quantities $\underline{D}$ and $\underline{s}$. However, $D^{2}(k)$ is not exactly equal to $\sigma_0^{2}$, both because the formula is a discrete sum, not an integral, and because of end effects. In the limit that the width $\sigma_0$ becomes very small while the number of sources $N$ becomes large, $D(k)$ approaches $\sigma_0$, as long as $k$ is not close to the edges of the source array. In those limits, the discrete sum approaches an integral, and end effects are not important because the distribution has little strength near the ends. Also, in those limits the value of $\underline{D}$ approaches $\sigma_0$ because the fraction of sources near the end becomes small, and $\underline{D}$ is determined primarily from values of $D(k)$ that are away from the ends.

A logical problem with terminated-span calculations is that when the width $\sigma_0$ becomes comparable to the source span $\Gamma$, the model sometimes predicts performance that is worse than random guessing. When this unreasonable result occurred in calculations below, the calculations were halted and the limiting point was noted in the graphical presentation of the results. The random guessing limits for $\underline{D}$ and $\underline{s}$ are given by Eqs. (A8) and (A12) of the Appendix, where they are derived.

The results of the calculations are given in scaled units, normalized to either the span $\Gamma$ or the width $\sigma_0$. Therefore, the calculations are not immediately applicable to any particular experiment, but, with a little work, they are applicable to all particular experiments. Parameter $\sigma_0$ is always given in units of the span. The work of Searle et al. (1976) suggests that the internal width $\sigma_0$ increases in proportion to the span. Therefore, the normalized parameter $\sigma_0/\Gamma$, as used here, is a convenient choice.²

Figure 1 shows the predictions of the analytic cumulative normal calculation for $\underline{D}$ as a function of increasing number of sources, $N$. The figure shows that $\underline{D}$ converges to the width when $N$ is large and $\sigma_0$ is a small fraction of the span. For example, when $\sigma_0/\Gamma = 0.025$, $\underline{D}$ converges to within one percent of $\sigma_0$ when there are 50 sources. When $\sigma_0/\Gamma$ is not small, $\underline{D}$ always converges to a value that is less than $\sigma_0$. The discrepancy is caused by end effects, but see Sec. I A below.

Figure 1 also shows that the expected value of $\underline{D}$ is close to its asymptotic value (for large $N$) when there are enough sources that the spacing between the sources is less than or equal to $\sigma_0$. These adequate values of $N$ are indicated with a filled star. Although Fig. 1 shows that $\underline{D}/\sigma_0$ decreases with increasing $\sigma_0$, in fact, $\underline{D}$ itself increases monotonically with increasing $\sigma_0$: the larger the width, the larger the rms error.

FIG. 1. rms error, $\underline{D}$, expressed in units of $\sigma_0$, the width of the listener's localization probability density function. Statistic $\underline{D}$ is presented as a function of the number of sources in the array, assuming that span $\Gamma$ remains constant. The parameter is $\sigma_0$ in units of the span. A filled star indicates the value of $N$ where the spacing between sources is equal to the width $\sigma_0$.

The quantity $\underline{D}/\sigma_0$ decreases because $\underline{D}$ increases less rapidly than linearly with increasing $\sigma_0$.

For practical purposes, Fig. 1, and other figures in this article, must be used iteratively to find a self-consistent solution for the width. The experimenter begins by knowing $\Gamma$ and $N$. The experimenter measures $\underline{D}$. The self-consistent calculation begins with the assumption that $\sigma_0 = \underline{D}$. This leads to a value of the graph parameter $\sigma_0/\Gamma$. The graph then leads to a predicted value of $\underline{D}/\sigma_0$, and hence a revised value of $\sigma_0$. Because the plots in Fig. 1 are smooth, one expects the calculation to converge to a stable value of $\sigma_0$ after only one or two iterations.

The insensitivity of $\underline{D}$ to the number of sources is further demonstrated in Fig. 2, which shows $\underline{D}/\sigma_0$ as a continuous function of $\sigma_0/\Gamma$. The calculated value of $\underline{D}$ varies by less than 10% as the number of sources is varied, provided that there are at least six sources and $\sigma_0$ is greater than 5% of the span. When $\sigma_0$ is greater than 20% of the span, $\underline{D}$ becomes extremely insensitive to the number of sources. Parallel calculations for variability, $\underline{s}$, for the case of no bias show that $\underline{s}$ is very similar to $\underline{D}$, as would be expected.
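The terminated-span calculation and the self-consistent iteration can be sketched in a few lines. This is a minimal sketch under assumptions made here rather than the authors' program: the function names are hypothetical, all quantities are expressed in units of the span, each source is weighted by $1/N$, the response boundaries lie midway between adjacent sources, and internal values beyond either end of the array are assigned to the end source.

```python
import math

def normal_cdf(x):
    """Cumulative standard normal function, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rms_error_over_span(n_sources, sigma_over_span, bias_over_span=0.0):
    """rms error divided by the span for a terminated span with W(k) = 1/N."""
    spacing = 1.0 / (n_sources - 1)               # A in units of the span
    total = 0.0
    for k in range(1, n_sources + 1):
        center = (k - 1) * spacing + bias_over_span
        for j in range(1, n_sources + 1):
            # Response j is chosen when the internal value falls between the
            # midpoints on either side of source j; end bins are open (assumption).
            lo = -math.inf if j == 1 else (j - 1.5) * spacing
            hi = math.inf if j == n_sources else (j - 0.5) * spacing
            p = (normal_cdf((hi - center) / sigma_over_span)
                 - normal_cdf((lo - center) / sigma_over_span))
            total += p * ((j - k) * spacing) ** 2
    return math.sqrt(total / n_sources)

def estimate_width(measured_d_over_span, n_sources, iterations=5):
    """Self-consistent width estimate: start from sigma = D, then rescale."""
    sigma = measured_d_over_span
    for _ in range(iterations):
        ratio = rms_error_over_span(n_sources, sigma) / sigma   # predicted D/sigma
        sigma = measured_d_over_span / ratio                    # revised width
    return sigma
```

As a usage example, feeding the output of `rms_error_over_span(10, 0.1)` into `estimate_width(..., 10)` should recover a width close to 0.1 of the span. The sketch makes no attempt to detect the random guessing limit, so it should not be trusted when the width is comparable to the span.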

FIG. 2. rms error, as a function of the continuous variable $\sigma_0/\Gamma$, the width of the listener's internal distribution expressed as a fraction of the span. Each function is cut off at the random guessing limit.

Although $\underline{s}$ is logically required to be smaller than $\underline{D}$, calculated plots of $\underline{s}$ vs $N$ or $\underline{s}$ vs $\sigma_0/\Gamma$ almost coincide with the corresponding plots for $\underline{D}$ (Figs. 1 and 2) so long as the width is less than 10% of the span (i.e., $\sigma_0/\Gamma < 0.1$). The discrepancy between $\underline{s}$ and $\underline{D}$ grows as $\sigma_0/\Gamma$ increases, but the difference is not more than 10%, even when $\sigma_0/\Gamma$ is as large as 0.5.

2. Spans approaching 180 degrees

As the source span increases it becomes more important to take account of the dependence of the width on source location. For definiteness, we continue to assume that the sources are in the horizontal plane. The dependence of the width, $\sigma_k$, on the angular position of the source, $\theta_k$, is modeled by assuming a constant difference limen for the interaural time difference. This model is known to capture some, but not all, of the azimuthal dependence of the width. In this model, the localization error is inversely proportional to the derivative of the interaural time difference with respect to angular position. For an azimuthal coordinate system, with $\theta = 0^\circ$ directly in front of the listener, the interaural time difference is described by the Woodworth formula (1938),

$$\Delta t = a\,(\theta + \sin\theta), \tag{8}$$

where $\theta$ is in radians and $a$ is a constant equal to the head radius divided by the speed of sound. Differentiating with respect to $\theta$ and inverting gives

$$\frac{d\theta}{d(\Delta t)} = \frac{1}{a\,(1+\cos\theta)}. \tag{9}$$

Since $\sigma_k$ is proportional to $d\theta_k$,

$$\sigma_k = \frac{2\sigma_0}{1 + |\cos\theta_k|}, \tag{10}$$

where $\sigma_0$ is the width directly in front of the listener. The absolute value in the denominator is necessary to account for the sign of $\cos\theta$ in the different quadrants.

As the span approaches 180°, there is a second, and structurally more important, effect that must be considered in the computations, namely "wrapped" probabilities. If, for example, the source is at 80° to the left of center, the probability of choosing a response that is 70° to the right of center is not just the probability of making an error of 150°; one must add also the probability of making an error of 210° (360 − 150 = 210). The need to include wrapped probabilities signifies the departure from the terminated-span calculation considered in Sec. I A 1. For example, it is no longer necessary to consider the random guessing limit because large probabilities for responses off the ends of the array are correctly wrapped. The calculations shown in Figs. 3 and 4 below include both the effect of source-dependent width and wrapped probability.

Figure 3 illustrates how $\underline{D}$ depends on span $\Gamma$ when the array is centered on the forward direction and extends equally to the listener's left and right by $\Gamma/2$. The figure shows the effect of the variation of $\sigma$ with source angle for various values of $\sigma_0$ when the number of sources is large. If the span is small, $\sigma$ is approximately constant. The fact that $\underline{D}/\sigma$ changes by less than 10% as $\Gamma$ increases to about 120° shows that the assumption of constant ITD is equivalent to a constant-sigma approximation even as a source span becomes as large as ±60°.

FIG. 3. rms error as a function of span $\Gamma$ when width $\sigma$ changes with source position such that the width expressed as interaural time difference remains constant. rms error $\underline{D}$ is normalized to the width directly in front of the listener, $\sigma_0$. The number of sources in the calculation was $N = 50$.

As the span increases beyond 120°, $\underline{D}$ begins to rise. When $\sigma_0 \le 0.1\Gamma$ this rise is proportional to the increase in the average value of $\sigma$. Therefore, if the plot of $\underline{D}$ is normalized to the value of $\sigma$ averaged over the span the plot becomes almost a flat line, independent of $\Gamma$. The average value of $\sigma$ from integrating Eq. (10) is

$$\overline{\sigma} = 4\sigma_0\,\frac{\tan(\Gamma/4)}{\Gamma} \quad (\Gamma \le \pi), \qquad \overline{\sigma} = 4\sigma_0\,\frac{2-\tan(\pi/2-\Gamma/4)}{\Gamma} \quad (\pi < \Gamma \le 2\pi), \tag{11}$$

where $\Gamma$ is expressed in radians. For $\sigma_0$ greater than 10% of $\Gamma$, the average-sigma model is less successful. For a span greater than 160°, there is an anomalous curvature when $\sigma_0 = 0.2\Gamma$.
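Equations (10) and (11) are easy to check numerically. The sketch below, with hypothetical function names chosen here, evaluates the width ratio $\sigma_k/\sigma_0$ of Eq. (10), the closed-form average of Eq. (11), and a midpoint-rule average of Eq. (10) over a span centered on the forward direction, so that the closed form can be compared with direct integration.

```python
import math

def width_ratio(theta_rad):
    """sigma_k / sigma_0 from Eq. (10) for a source at azimuth theta (radians)."""
    return 2.0 / (1.0 + abs(math.cos(theta_rad)))

def average_width_ratio(span_rad):
    """Closed-form average of sigma / sigma_0 over the span, Eq. (11)."""
    if span_rad <= math.pi:
        return 4.0 * math.tan(span_rad / 4.0) / span_rad
    return 4.0 * (2.0 - math.tan(math.pi / 2.0 - span_rad / 4.0)) / span_rad

def average_width_ratio_numeric(span_rad, steps=20000):
    """Midpoint-rule average of Eq. (10) over a span centered on theta = 0."""
    h = span_rad / steps
    total = sum(width_ratio(-span_rad / 2.0 + (i + 0.5) * h) for i in range(steps))
    return total * h / span_rad
```

For a span of 180° the closed form reduces to $4/\pi$, and for a span of 270° it evaluates to about 1.35, in agreement with direct numerical averaging of Eq. (10).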

FIG. 4. rms error for source-dependent width as a function of the number of sources. The span is 180° centered on the forward direction. This figure can be compared with Fig. 1 to see the effects of source-dependent width and wrapped probability. The tick mark on the right axis shows the average width over 180°.

Figure 4 shows $\underline{D}$ as a function of the number of sources for a span of 180°. As described in connection with Fig. 3, the asymptotic values in the large $N$ limit are similar to Fig. 1 except that they are scaled by the average of $\sigma/\sigma_0$. From Eq. (11) for $\Gamma = 180^\circ$, this is equal to a scale factor $4/\pi$ or 1.27. Figure 4 shows that when $N$ is not asymptotically large this simple scaling does not always apply. The figure also shows that $\underline{D}$ does not vary monotonically with $\sigma_0$; the value for $\sigma_0 = 0.2\Gamma$ seems to be out of order. Figure 3 suggests that this nonmonotonic behavior is restricted to spans greater than about 160°. The curiously large curvature for the plot with $\sigma_0 = 0.2\Gamma$ occurs only for such large spans. The nonmonotonic behavior is the result of the combined effects of source-dependent width and wrapped probability. Calculations that exclude either one of these show only a monotonic dependence on width.

Calculations with a 180° span and wrapped probability were also done for a constant (source-independent) value of the width. The calculations led to a plot of $\underline{D}$ vs $N$ that was almost identical to the terminated-span calculation in Fig. 1, except for the extreme case, $\sigma_0 = 0.4\Gamma$. For both, $\underline{D}$ systematically underestimated the width. For the terminated span the reason was end effects, as noted in Sec. I A 1. For the wrapped span the reason is the wrapped probabilities themselves. If the width is less than 20% of the span, wrapped probability has a negligible effect on $\underline{D}$ ($\le 1\%$) when the span is not greater than 180°. Because wrapping complicates the analysis of data, an experimenter would do well to avoid spans approaching 180° if the experimental conditions promote large internal width, 30° or more.

FIG. 5. rms error for source-dependent width, and for a large span, $\Gamma = 270^\circ$. Part (a) does not give the subject the benefit of a front-to-back reflection; part (b) has reflection scoring.

3. Span greater than 180 degrees

When a span exceeds 180°, the source array cannot be entirely in front of the listener. Some sources must extend toward the rear, and this changes the perceptual nature of the localization task. Sources which differ considerably in azimuth may lie on the same cone of confusion and be perceptually similar. This multidimensional aspect of perception is not captured in our one-dimensional localization model. For purposes of illustration we proceed with the model anyway.

When $\Gamma$ becomes greater than 180°, the array itself wraps around so that some sources are closer to each other across the gap between source 1 and source $N$ than along the span. This possibility requires a new computational rule for scoring such that the maximum error charged against the listener is 180°. Any error that is found to be greater than 180° is replaced by its 360° complement. Thus for any pair of sources in the array, there is a unique magnitude and direction of the difference between them.

When the source array extends behind the listener, it is common to deal with the multidimensional character of the task by regarding confusions between front and back sources as separate from azimuthal confusions. Therefore azimuthal errors are computed by giving the listener the benefit of a reflection in the frontal plane (includes the points at ±90° azimuth and the point overhead) if that leads to a smaller error (Wightman and Kistler, 1989). Below, the calculations that employ that rule are called "reflection scoring." It is not necessary that an actual source be present at the site of the reflection. When reflection scoring is introduced, the final value of the error is the smallest of the listener's choice or its 360° complement, or the reflected choice or its 360° complement.

As an example of large spans, we chose a span $\Gamma = 270^\circ$. The array was centered on the midline, with one end at −135°, the other end at +135°, and the remaining sources ($N-2$) equally spaced in between. The internal width was taken to depend on source angle per Eq. (10).

Figure 5(a) shows the results without reflection scoring. As before, $\underline{D}/\sigma_0$ is quite insensitive to the number of sources. Upon careful observation, periodic variations can be observed in the $\underline{D}/\sigma_0$ data, especially for small $\sigma_0$. This effect is due to the arrangement of the sources based on $\Gamma$ and $N$. When $\Gamma = 270^\circ$, there are sources located at $\theta = \pm 90^\circ$ whenever $N = 6n+1$ (where $n = 1, 2, \ldots$). This creates peaks because the average $\sigma_k$ is increased. (The same effect occurs for circular spans whenever $N = 4n$.) The analogous plot of $\underline{s}/\sigma_0$ is the same as $\underline{D}/\sigma_0$ in Fig. 5(a) within 10%, except when $\sigma_0/\Gamma = 0.4$ where the discrepancy becomes about 15%.

Figure 5(b) shows the effect on $\underline{D}$ when reflection scoring is introduced. The values of $\underline{D}$ are generally reduced, of course. Further, the tendency for peaks at $N = 6n+1$ is greatly enhanced. A better description of the effect is that reflection scoring introduces a valley centered on $N$ values given by $N = 6n+4$. Valleys result from source placements in which the localization score benefits the most when the listener is given credit for a correct answer despite a front-to-back reversal.

According to Eq. (11), the average width, $\overline{\sigma}(270)$, is equal to $1.35\sigma_0$. In the limit of a large number of sources, $\underline{D}$ agrees very well with the expectation $\underline{D} = 1.35\sigma_0$ if reflection scoring is not used [Fig. 5(a)]. Only as the ratio of width to span grows to 0.4 is there appreciable departure. (For a 270° array a ratio of 0.4 means that the internal width is more than 100°, a case of extreme uncertainty.) Even if the number of sources is not large, the $\underline{D}$ values in Fig. 5(a) do not differ from the expected value by more than about ten percent. The same statements cannot be made about the calculation with reflection scoring [Fig. 5(b)]. Then statistic $\underline{D}$ is less stable both with respect to $\sigma_0$ and with respect to the number of sources.

The peak and valley structure is, however, particularly apparent for a 270° span. For general span $\Gamma$ ($\Gamma > 180^\circ$), peaks and valleys are not as frequent. A peak occurs for $N$ sources when there are two integers $N$ and $k$ that satisfy the condition

$$N = \frac{2G(2k-1)+1}{2G+1}, \tag{12}$$
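The scoring rules for large spans and the peak condition of Eq. (12) can be written out as a short sketch. The function names are hypothetical, azimuth is taken in degrees from straight ahead, and the reflection in the frontal plane is assumed here to map an azimuth $\theta$ to $180^\circ - \theta$. In Eq. (12), $G$ is treated as the span in degrees divided by 360, which is how the text defines the span fraction.

```python
def wrapped_error(response_deg, target_deg):
    """Angular error of at most 180 degrees: larger errors use the 360 complement."""
    diff = abs(response_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)

def reflect_front_back(azimuth_deg):
    """Mirror an azimuth in the frontal plane through +/-90 degrees (assumed form)."""
    return 180.0 - azimuth_deg

def scored_error(target_deg, response_deg, reflection_scoring=False):
    """Error charged to the listener, optionally with reflection scoring."""
    error = wrapped_error(response_deg, target_deg)
    if reflection_scoring:
        # Give the listener the benefit of the reflected choice if it is closer.
        error = min(error, wrapped_error(reflect_front_back(response_deg), target_deg))
    return error

def peak_source_counts(span_deg, max_sources, tol=1e-9):
    """Source counts N >= 2 that satisfy Eq. (12) for some integer k."""
    g = span_deg / 360.0                      # span fraction
    counts = []
    k = 1
    while True:
        n = (2.0 * g * (2 * k - 1) + 1.0) / (2.0 * g + 1.0)
        if n > max_sources + tol:
            break
        if n >= 2.0 - tol and abs(n - round(n)) < tol:
            counts.append(int(round(n)))
        k += 1
    return counts
```

For a 270° span, `peak_source_counts(270.0, 30)` returns the counts 7, 13, 19, and 25, which are the values $N = 6n+1$ quoted in the text. A response at 135° to a target at 45° is charged 90° without reflection scoring and 0° with it.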

Here $G$ is the span fraction, $G = \Gamma/360^\circ$.

It is somewhat difficult to evaluate the significance of the structure observed for reflection scoring because we do not believe that our one-dimensional calculation is appropriate perceptually for sources that extend to the rear. However, this objection to the calculation is not fatal. The actual cause of the valleys in the structure is a series of source locations that particularly benefit the listener when reflection scoring is introduced. To some degree, this experimental artifact is bound to appear with reflection scoring. The precise size of the artifact depends on the perceptual model.

4. Summary

At the outset of this section on the SIM without bias, it was conjectured that the values of $\underline{D}$ and $\underline{s}$ might be insensitive to the number of sources. It was expected that the smaller probability of making an error when the number of sources is small would be compensated by the larger penalty when an error is actually made. Therefore, it was further conjectured that experimental values of $\underline{D}$ and $\underline{s}$ should provide reliable estimates of internal width $\sigma$. In the end, Secs. I A 1–3 above support these conjectures. The conjectures hold for a wide range of widths and source spans. However, the relationship between quantities $\underline{D}$ and $\underline{s}$ and the width parameter depends on the width parameter itself, in the form $\sigma/\Gamma$, as shown by Figs. 1–5. Therefore, an actual determination of the width from $\underline{D}$ or $\underline{s}$ may require some modest iteration. The functions in the figures are so well behaved that convergence is assured.

B. Calculations with bias

The model of Sec. I A described a listener without bias. When the sound originated from source $k$, the internal distribution for auditory localization cues was centered at the location $\theta_k$, corresponding to the reference position of the source as established visually. It is this reference coordinate that the listener uses in making responses. Therefore, the statistics of the responses to source $k$ depended only on a single parameter, the width $\sigma_k$.

The model without bias is, however, an idealization. Unfortunately, in sound localization, bias is the rule and not the exception. Bias is introduced by visual cues (ventriloquism) and by acoustical cues, such as the reflections from walls in an asymmetrical room environment. Bias can be introduced into an experiment deliberately; a large visual bias is caused by directing a listener's gaze to the end of a source array (Hartmann, 1983a). A large acoustical bias can be created by putting a single reflecting surface in an otherwise anechoic room (Rakerd and Hartmann, 1985). But although bias can be experimentally controlled, it cannot be entirely eliminated; it is normally present for any listener whether one wants it or not (Hartmann, 1983b).

Bias consists of a displacement of internal acoustical cues with respect to the angular reference coordinate system, $\theta_k$. Therefore, bias can be seen in plots of $R(k)$, and it is measured for individual sources by constant error $C(k)$. An average measure of bias is $\underline{C}$. Because the rms error, $\underline{D}$, includes $\underline{C}$ [Eq. (6)], the bias also appears in $\underline{D}$.

In this article we take the view that the goal of the experimenter is to use the source identification method to learn about the width of the internal distribution $\sigma$. The presence of bias poses a problem, and the purpose of the present section is to try to deal with it. Although $\sigma$ can be determined from either $\underline{s}$ or $\underline{D}$ in the absence of bias, the presence of bias has a major direct effect on $\underline{D}$ which makes it unreliable for estimating $\sigma$. By contrast, the variability $\underline{s}$ should, in principle, be independent of bias because variability is calculated with respect to the mean response made by the listener and not with respect to a physical referent. In practice, however, $\underline{s}$ is affected by bias, both because of effects at the ends of the arrays and because of the quantization of the responses. Therefore, statistic $\underline{s}$ is the best statistic to use to estimate $\sigma$ in the case of bias, but it is not without troubles of its own, as will be seen below.

What makes it difficult to discuss bias is that bias can take many forms. Below, we deal with two types, constant bias and central bias. Calculations are presented in the small-span limit.

1. Constant bias

Constant bias means that the displacement of the acoustical cues with respect to the reference coordinate system is constant, independent of the source. Constant bias is a common occurrence, especially if the array of sources is small. The effect of directed gaze on the localization of sources in a 28° span was found to be modeled best by a constant bias (Hartmann, 1983a).

Numerical studies, using the decision theory model and constant width, on the effects of constant bias showed that bias can always be neglected if the number of sources is large enough. If the bias is large, it may not be practical to run as many sources as are needed for $\underline{s}$ to give a good estimate of $\sigma_0$, but large $N$ is an important limit to keep in mind.

FIG. 6. The role of constant bias. Open symbols show variability $\underline{s}$ when there is no bias. Filled symbols show the effect of making the bias, $b$, twice the width, $\sigma_0$. Two values of $\sigma_0/\Gamma$ are shown.

The effect of bias on $\underline{s}$ depends sensitively on the ratio of bias $b$ to width $\sigma_0$. Bias effects are shown in the plot of $\underline{s}$ in Fig. 6 for the special case that the bias is twice the width. The filled-circle plot in Fig. 6 shows $\underline{s}$ when the bias is 5% of the source span ($b/\Gamma = 0.05$) and $\sigma_0/\Gamma = 0.025$. It can be compared with $\underline{s}$ in the absence of bias (open circles). When there is no bias, $\underline{s}$ gives a good estimate of $\sigma_0$ if the number of sources is about $N = 14$ or greater. Adding the bias has a dramatic effect on the variability, leading to a peak at $N = 9$. The peak overestimates $\sigma_0$ by a factor of 2.

The behavior shown by the circles in Fig. 6 is typical. Whenever the bias is twice the width there is a peak (height $1.5 < \underline{s}/\sigma_0 < 2.5$) as a function of $N$. The peak occurs at $N = N_{\max}$, where $N_{\max} \approx \mathrm{Int}(0.2\,\Gamma/\sigma_0) + 1$. Not surprisingly, a given bias has the largest effect for the smallest $\sigma_0$, and the number of sources needed to eliminate that effect may become large. The square symbols in Fig. 6 check the above statements when the bias is $b/\Gamma = 0.2$ and the width is $\sigma_0/\Gamma = 0.1$.

If the bias becomes as large as $4\sigma_0$, $\underline{s}$ becomes an oscillating function of $N$ and cannot estimate $\sigma_0$. On the other hand, if the bias is no larger than $\sigma_0$ itself then the effects of bias on $\underline{s}$ are less than 10%, so long as there are four or more sources in the array and $\sigma_0/\Gamma$ is larger than about 0.02. Then it is possible to ignore the bias in determining $\sigma_0$ as the large-$N$ limit of $\underline{s}$.

2. Central bias

Whereas constant bias is necessarily directed toward one end of the source array or the other, central bias is directed toward the center of the array. In the common case of a symmetrical array with the subject looking at the center, a central bias may be a visual effect. In general, any central tendency, such as a reluctance to choose extreme responses, appears as a central bias. The central bias function itself might take different forms: straight line, S-curve, step function, etc. The calculations of this section employ a step-function bias function because the experiments described below often found R(k) functions approximately of this form. In a step-function bias the auditory cues for all the sources to the left of center are biased toward the right by a constant ($b_l$) and all the sources to the right of center are biased toward the left by a constant ($b_r$). The bias can be characterized by a single central-bias parameter $b_c$ if it is symmetrical ($b_l = b_c$ and $b_r = -b_c$). Model calculations for small spans indicated that $\underline{s}$ calculated with central bias was very similar to $\underline{s}$ calculated with a constant bias of the same magnitude. Typical differences between the two kinds of bias were less than 10% for N large enough to provide a reasonable estimate of $\underline{s}$. The sign of the difference was always the same; central bias led to the larger $\underline{s}$. The difference grew with increasing bias magnitude. However, as long as the bias was not greater than twice the width, the difference was less than 33% even when the bias was as large as 80% of the span.

II. EXPERIMENTS

To test the model calculations we performed localization experiments. We were particularly interested in how $\underline{D}$ and $\underline{s}$ depend on the number of sources in a given span. Therefore, the experiments were performed using 3, 6, 12, and 24 sources.

A. Tasks

In order to test the computations in several ranges of $\sigma_0$, we used two tasks, one in which the localization was easy and one in which it was difficult. Both tasks were performed in a reverberation room.

1. Easy (EL) experiment

In the easy localization (EL) task, listeners sat 3 m away from an array of speakers in the horizontal plane. The array extended 23° to the left and right of the midline ($\Gamma = 46°$). Broadband noise at a level of 55 dB SPL was given a step-function amplitude envelope and played through one of the speakers. The subjects' task was to declare which loudspeaker had sounded.

2. Difficult (DL) experiment

The difficult localization (DL) task was made much more difficult than the EL task. Listeners were 6 m away from the source array, again in a 46° span. Because of the larger distance to the source, incoherent reverberant sound was a larger fraction of the total sound power, making localization more difficult. The stimulus was broadband noise that had been low-pass filtered (corner frequency of 5 kHz, −48 dB/octave). Therefore, listeners could not use high-frequency interaural intensity cues that are especially helpful in this room. The SPL of the noise before filtering was identical to that in the EL experiment. The filtered noise was given a linearly rising amplitude envelope with a duration of 2 s. During the onset, uncorrelated broadband noise was played at a level of 85 dB through a speaker behind the subject's neck to mask the onset of the stimulus. Therefore, listeners gained no benefit from the precedence effect, further degrading localization ability. Again, the task was to declare which loudspeaker sounded.

B. Method

The reverberant room was rectangular with dimensions 7.67 × 6.35 × 3.58 m high. It had a reverberation time of 4 s at midrange frequencies. The orientation of the array in the room is best described as a nonspecial geometry. The 24 loudspeakers were Realistic Minimus 3.5, consisting of a single driver in a sealed box. They had been chosen from a set of 85 based on similar on-axis frequency response in an anechoic environment. The configurations for the different number of sources were as follows: $N = 24 \Rightarrow A = 2° \Rightarrow \Gamma = 46°$, $N = 12 \Rightarrow A = 4° \Rightarrow \Gamma = 44°$, $N = 6 \Rightarrow A = 8° \Rightarrow \Gamma = 40°$, $N = 3 \Rightarrow A = 23° \Rightarrow \Gamma = 46°$. The loudspeakers were at ear level of a seated subject. A bar rested on the head of the subject to help the subject maintain a constant, forward-facing position. Each source was labeled with a number, and the subject made a response by using a button box to increment a numerical display up or down. The display reading was then recorded by the computer running the experiment.

C. Subjects and procedure

Four subjects participated in these experiments. Subjects W, R, and G were males, ages 57, 45, and 21, respectively, and were the coauthors of this article. Subject J was a female of age 17. Subjects W and R had extensive experience in localization experiments and had high-frequency hearing losses typical of males their age. Subjects G and J had recent experience as subjects and had normal hearing. The experiments were performed in blocks of runs for both easy (EL) and difficult (DL) tasks. A block consisted of a run for each source spacing condition for either the EL or the DL case. The runs of a given block were performed on the same day, and the order of the runs within a block was randomized. Each run consisted of 48 stimulus-response pairs and lasted 10–15 min. Within each run, all stimuli were presented an equal number of times in random order. Therefore, a particular source was presented twice for N = 24, four times for N = 12, eight times for N = 6, and 16 times for N = 3. There was no feedback, but a curious subject was allowed to view the results at the end of a run. Each subject did three blocks for both EL and DL conditions.

D. Results

The experimental results appear in their greatest detail in plots of R(k), the average response of a listener to source k. For illustration, plots of R(k) are shown for listener W in Fig. 7(a) and (b) for the EL and DL experiments, respectively. Perfect performance corresponds to an R(k) plot that is a 45-degree line. It can be seen that Fig. 7(a) approximates a 45-degree line, although there is considerable central bias, as described above. The plot for the DL experiment in Fig. 7(b) shows enormous deviations from the 45-degree ideal as well as central bias. Figure 7(a) and (b) is typical of R(k) plots for all the listeners, although different listeners had different forms of bias, some better approximated as constant bias, not central. Of primary interest in the present article are the average quantities $\underline{D}$ and $\underline{s}$ for the eight different conditions (N = 24, 12, 6, and 3 for both the EL and DL experiments). These are given in Table I, averaged over the three runs for each listener. These averages and corresponding standard deviations (n − 1) over the three runs appear in Figs. 8 and 9.

FIG. 7. Function R(k), the average response of listener W to source number k. Error bars are plus and minus the variability, s(k). Experiments with different numbers of sources (N) are plotted on the same graph: stars for N = 24, open circles for N = 12, open squares for N = 6, and filled squares for N = 3. Part (a) is for the EL experiment. Part (b) is for the DL experiment. Each small division on horizontal and vertical axes corresponds to 2°.

III. COMPARISON: THEORY AND EXPERIMENT

The principal comparison between theory and experiment was a test of the prediction of the decision theory model for the dependence of $\underline{s}$ and $\underline{D}$ on the number of sources in the array. This dependence was the primary focus of model calculations themselves, for small and large spans, with and without bias. Because the experimental span was only 46° the model calculation could be done in the small-span limit. The input parameters to the model were the width of the internal distribution and the bias. The bias was assumed to be of the constant type, or, equivalently, central. The biases observed experimentally were of both types, but, as described in Sec. I B 2, these two types of bias have similar effects on the average statistics of interest. It was assumed that the width and bias parameters depend only on the listener and the experimental conditions (EL or DL). Therefore, it was expected that the dependence on number of sources, for both $\underline{D}$ and $\underline{s}$, should be predicted by the model.

TABLE I. Experimental values of rms error ($\underline{D}$), variability ($\underline{s}$), and constant error ($\underline{C}$) for four listeners in two source identification experiments, easy (EL) and difficult (DL). The arrays spanned 46 degrees and included N = 3, 6, 12, or 24 sources. Values of width and bias are model parameters determined from the asymptotic variability and constant error, respectively. The parameters were used for model calculations in the comparison plots that follow.

                          Experiment (degrees)       Model (degrees)
Experiment  Listener   N      D       s       C      width    bias
EL          G          3     0       0       0       1.21     1.60
                       6     0       0       0
                      12     2.21    1.68    1.44
                      24     2.01    1.21    1.60
EL          J          3     0       0       0       1.26     2.73
                       6     1.09    0.84    0.69
                      12     3.09    1.67    2.60
                      24     3.01    1.26    2.73
EL          R          3     0       0       0       1.64     2.67
                       6     2.66    2.37    1.21
                      12     3.26    2.03    2.55
                      24     3.13    1.64    2.67
EL          W          3     0       0       0       1.43     3.29
                       6     3.21    2.44    2.09
                      12     3.85    1.93    3.33
                      24     3.59    1.43    3.29
DL          G          3    10.14    8.60    5.37    6.70    10.02
                       6     8.54    6.88    5.06
                      12    11.39    6.73    9.19
                      24    11.70    6.04   10.02
DL          J          3     8.08    7.82    2.03    4.50     8.67
                       6     8.70    6.48    5.81
                      12     9.21    6.79    6.22
                      24     9.61    4.14    8.67
DL          R          3    11.60   10.00    5.88    7.40     9.09
                       6     9.70    8.29    5.04
                      12    11.46    6.56    9.40
                      24    11.13    6.42    9.09
DL          W          3     7.12    6.73    2.32    5.70    10.15
                       6    10.00    6.13    7.90
                      12    10.27    5.94    8.38
                      24    11.37    5.13   10.15
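The Table I entries can be checked against the Pythagorean relation of Eq. (6) in its array-averaged form, where the squared rms error equals the squared variability plus the squared constant error. The sketch below is only an illustration. The four rows are copied from Table I under the assumption that the flattened columns pair up as D, s, C; the function name is hypothetical and not part of the article.

```python
import math

# Rows copied from Table I: (label, D, s, C), all in degrees.
# The pairing of the three columns is an assumption about the flattened table.
TABLE_ROWS = [
    ("EL, listener W, N=24", 3.59, 1.43, 3.29),
    ("EL, listener J, N=12", 3.09, 1.67, 2.60),
    ("DL, listener G, N=3", 10.14, 8.60, 5.37),
    ("DL, listener R, N=6", 9.70, 8.29, 5.04),
]


def pythagorean_residual(d, s, c):
    """Return sqrt(s^2 + C^2) - D, which should vanish up to rounding."""
    return math.hypot(s, c) - d


for label, d, s, c in TABLE_ROWS:
    residual = pythagorean_residual(d, s, c)
    print(f"{label}: residual = {residual:+.4f} deg")
```

Residuals well below the two-decimal rounding of the table suggest that the columns have been paired correctly.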

FIG. 8. Comparison between experiment (points) and model (solid lines) for the easy localization (EL) experiment. Each row is for a single listener, $\underline{s}$ and $\underline{D}$. Error bars are two standard deviations (n − 1 = 2 weight) in overall length. Dashed lines connect the experimental points.

FIG. 9. Same as Fig. 8 but for the difficult localization (DL) experiment.

The procedure for assigning model parameters was simple. We assumed that the width should be determined by the large-N limit of $\underline{s}$, i.e., N = 24 in Table I. When the model width is small it is equal to $\underline{s}(24)$; when the width is not small, it must be taken to be somewhat larger than the experimental $\underline{s}(24)$ in order that $\underline{s}(N)$ agrees with experiment in the limit that N = 24. We also determined the bias parameter from the experimental constant error, $\underline{C}(24)$, in Table I. As a measure of bias, this constant error approximately agreed with the vertical shifts seen in plots of R(k). For example, $\underline{C}(24)$ for listener W in Table I is 3.29°. This agrees with R(k) in Fig. 7(a), which suggests a central bias averaging 1.5–2.0 divisions, or 3°–4°. Therefore, the nature of the comparison was to determine the model parameters from the width and estimated bias for N = 24 and to compare the model predictions, for both $\underline{s}$ and $\underline{D}$, with the experimental results for N = 3, N = 6, and N = 12. The model parameters are shown in the right two columns in Table I.

The comparisons between calculations and the EL experiments are shown in Fig. 8. The comparisons show that the model is in reasonable numerical agreement with experiment, even though the parameters were not chosen to provide an optimum fit. Further, the model captures a number of features seen in the experiments: There is a tendency for a peak in $\underline{s}$ and $\underline{D}$ as a function of N when the width is small. However, theoretically the peak is less prominent for $\underline{D}$ than for $\underline{s}$, and experimentally no significant peak appears in $\underline{D}$.

The comparisons between calculations and the DL experiments are shown in Fig. 9. In the DL experiments the width is large. For large width, theory and experiment agree that there is no peak for 3

IV. CONCLUSIONS

The source-identification method (SIM) is a standard technique used to measure the ability to localize a sound. The method uses an array of source positions, which is particularly useful when there is reason to expect that the perception of any one source would be special. Such conditions occur in rooms. The experimental data from this method are in the form of variability (theoretically insensitive to bias) and rms error (includes both variability and bias). The method can be analyzed with a decision theory model based on a coordinate system imagined to be internal to the listener. Sources from the physical world lead to distributions of localization cues on this internal coordinate, characterized by a model width and a displacement bias of the mean. A similar model was used to analyze the minimum audible angle method (Hartmann and Rakerd, 1989).

Calculations are simplest for a terminated-span model. Here, the array is short enough that points on the internal coordinate that are to the left of the leftmost source must be assigned to the leftmost source; they do not wrap around and become confused with positions on the right. Terminated-span calculations find that if bias is negligible, both the rms error and the variability can provide good estimates of the average width of the internal distribution if there are enough sources in the array. The results are very insensitive to the number of sources if the spacing between the sources is less than or approximately equal to the width.

The variability appears to be a good measure of the width even in the presence of bias if the bias is smaller than the width. Bias that is larger than the width (a frequent occurrence) complicates the relationship between experimental results and the parameters of the internal distribution. The variability (not the rms error) may still be a reliable measure of the width if the number of sources is large enough. To determine the required number of sources, one must model the bias in some way and fit the experimental data to width and bias model parameters. Two simple bias models, constant and central, were found to give similar results.

When the angular span of the model is not terminated, probabilities are wrapped around a complete circle. Calculations indicate that the variability continues to provide a good measure of the internal width, as long as the width is not greater than 20% of the span. When the angular span of the actual sources is wrapped beyond 180°, source localization becomes a multidimensional perceptual problem, and the perceptual distance between two sources is not a monotonic function of the azimuth difference. Therefore, our one-dimensional model is not applicable.
Applying the model anyway reveals complicated effects that occur when localization scores are given the benefit of a front-to-back reversal. Similar effects are expected to occur independent of the model.

Finally, experiments with human listeners were done in order to test the model calculations. The experiments used a small span in which the number of sources varied from 3 to 24. To provide a stringent test, both easy localization (EL) and difficult localization (DL) experiments were done. The experiments were done in a reverberation room, and constant errors (biases) were a major component of the overall errors. It was found that the model gave a reasonable account of the experimental results, even though the model treatment of bias was simple. To improve on the methods used here would require a treatment of bias peculiar to each individual listener. The resulting model would lead to better agreement with experiment, at the cost of generality.

Because of its internal consistency and satisfactory experimental validation, the decision theory model in this article can serve as a guide to the design and analysis of source identification experiments. In the matter of experimental design, the model can determine the correct number of sources to use in an array, based on anticipated results. After the rms error and variability data are experimentally known, the model can be used, first to decide whether a reliable value of the width of the internal distribution can be determined from the data, and second to calculate the actual values of the width and the bias.

ACKNOWLEDGMENTS

This research was supported by Grant No. DC00181 from the NIDCD of the NIH. Additional funding was provided by the National Science Foundation, which supported Joseph Gaalaas through its Research Experience for Undergraduates grant to the MSU Department of Physics and Astronomy. We are grateful to Joy Hsu for her participation in this study. Steve Colburn, Barbara Shinn-Cunningham, and Raymond Dye made many useful comments on a previous version of this article.

APPENDIX: LIMITS OF HIGH UNCERTAINTY

In the limit of high uncertainty, the width of the internal distribution becomes large compared to the span. In the extreme uncertainty limit, there is a negligible probability that the internal representation of the source lies within the span of allowed responses. Therefore, terminated-span model calculations find that all responses become extreme responses. In the absence of bias, sources 1 and N are chosen equally. Then $\overline{D^2}$ is given by summing the squared differences between the N sources and the extremes. The two extreme sums get the same weight (1/2), and they are, in fact, equal. Therefore,

$$\overline{D^2} = \frac{A^2}{N} \sum_{k=1}^{N} (k-1)^2. \tag{A1}$$

The finite sum can be done, and

$$\overline{D^2} = A^2 \, \frac{(2N-1)(N-1)}{6}. \tag{A2}$$

Because the span is $\Gamma = (N-1)A$,

$$\underline{D} = \frac{\Gamma}{\sqrt{3}} \sqrt{1 + \frac{1}{2(N-1)}}. \tag{A3}$$

The second term inside the square root can be neglected when the number of sources becomes large; even if there are as few as four sources, dropping this term makes less than a 10% change in $\underline{D}$. In the limit that all responses are extreme responses, statistic $\underline{s}$ can be calculated from the differences between the extremes and the mean. If there is no bias, the mean of the extremes is (N+1)/2, and

$$\overline{s^2} = A^2 \left[ \frac{1}{2}\left(1 - \frac{N+1}{2}\right)^2 + \frac{1}{2}\left(N - \frac{N+1}{2}\right)^2 \right], \tag{A4}$$

so that

$$\underline{s} = A \, \frac{N-1}{2} = \frac{\Gamma}{2}. \tag{A5}$$

The extreme response results for $\underline{D}$ and $\underline{s}$ [Eqs. (A3) and (A5)] are the correct limits for the statistical technique used in this article as the uncertainty becomes infinite. However, these limits are unreasonable because listeners can achieve better performance by guessing randomly among the sources. Better large uncertainty limits are the random guessing limits calculated below. If the N sources are presented equally often, $\underline{D}$ is the square root of

$$\overline{D^2} = \frac{A^2}{N} \sum_{k'=1}^{N} \sum_{k=1}^{N} P(k'|k)\,(k-k')^2, \tag{A6}$$

where $P(k'|k)$ is the probability of choosing source $k'$ given that source k was presented. If listeners guess randomly then, in the absence of bias, they make each response $k'$ equally often, independent of the source k, and $P(k'|k) = 1/N$. The double sum can be done and

$$\overline{D^2} = A^2 \, \frac{(N-1)(N+1)}{6}, \tag{A7}$$

or, in terms of span $\Gamma$,

$$\underline{D} = \frac{\Gamma}{\sqrt{6}} \sqrt{1 + \frac{2}{N-1}}. \tag{A8}$$

Equation (A8) is less than (A3) as expected. Similarly $\underline{s}$ can be calculated from

$$\overline{s^2} = \frac{A^2}{N} \sum_{k'=1}^{N} \sum_{k=1}^{N} P(k'|k)\,[k' - R(k)]^2, \tag{A9}$$

where R(k) is the mean response given source k. In the random guessing limit and in the absence of bias, the mean response to source k is the mean location, independent of k, $R(k) = (N+1)/2$. Therefore,

$$\overline{s^2} = \frac{A^2}{N} \sum_{k'=1}^{N} \left[ k' - \frac{N+1}{2} \right]^2. \tag{A10}$$

Doing the finite sum leads to

$$\overline{s^2} = A^2 \, \frac{N^2 - 1}{12}, \tag{A11}$$

and in terms of span $\Gamma$,

$$\underline{s} = \frac{\Gamma}{\sqrt{12}} \sqrt{1 + \frac{2}{N-1}}. \tag{A12}$$

Equation (A12) is less than (A5) as expected. From Eqs. (6), (A8), and (A12), $\underline{C} = \underline{s}$ and the overall rms error is equally divided between variability and central bias.
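The closed-form limits of the Appendix can be checked numerically. The sketch below evaluates Eqs. (A3), (A5), (A8), and (A12) for the four array sizes used in the experiments, and compares the random-guessing rms error with a brute-force evaluation of the double sum in Eq. (A6) with P(k'|k) = 1/N. The function names are hypothetical and the 46° span is taken from the experiments; this is an illustration, not the authors' code.

```python
import math


def extreme_response_limits(n_sources, span):
    """Eqs. (A3) and (A5): rms error and variability when all responses are extreme."""
    d = span / math.sqrt(3.0) * math.sqrt(1.0 + 1.0 / (2.0 * (n_sources - 1)))
    s = span / 2.0
    return d, s


def random_guess_limits(n_sources, span):
    """Eqs. (A8) and (A12): rms error and variability for random guessing."""
    factor = math.sqrt(1.0 + 2.0 / (n_sources - 1))
    return span / math.sqrt(6.0) * factor, span / math.sqrt(12.0) * factor


def random_guess_rms_brute_force(n_sources, span):
    """Evaluate the double sum of Eq. (A6) directly with P(k'|k) = 1/N."""
    spacing = span / (n_sources - 1)
    total = sum((k - kp) ** 2
                for k in range(1, n_sources + 1)
                for kp in range(1, n_sources + 1))
    return spacing * math.sqrt(total / n_sources ** 2)


for n in (3, 6, 12, 24):
    d_ext, s_ext = extreme_response_limits(n, 46.0)
    d_rnd, s_rnd = random_guess_limits(n, 46.0)
    print(n, round(d_ext, 2), round(s_ext, 2), round(d_rnd, 2), round(s_rnd, 2))
```

For N greater than 2 the random-guessing values fall below the extreme-response values, consistent with the statements following Eqs. (A8) and (A12).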

1 This kind of bias, depending on the source, was called "sensory" bias by Hartmann and Rakerd (1989). Mathematically, it behaves similarly to the "response bias" introduced by Braida and Durlach (1972), which, however, is a function of the response and not the source.

2 Searle et al. (1976) concluded that the width of the internal distribution scales with the span of the sources. This conclusion paralleled the earlier discovery that the width for absolute identification of intensities scales with the range of intensities (Durlach and Braida, 1969; Braida and Durlach, 1972). A problem with this parallel is that the work by Searle et al. (also Shelton and Searle, 1978) failed to distinguish between width and bias. The more recent work by Koehnke and Durlach (1989), while not strictly involving localization, may have remedied that problem. That work found incomplete scaling, as predicted by Hartmann and Rakerd (1989).

Braida, L. D., and Durlach, N. I. (1972). "Intensity perception II. Resolution in one-interval paradigms," J. Acoust. Soc. Am. 51, 483–502.
Durlach, N. I., and Braida, L. D. (1969). "Intensity perception I. Preliminary theory of intensity resolution," J. Acoust. Soc. Am. 46, 372–383.
Hartmann, W. M. (1983a). "Localization of sound in rooms: The effect of a visual fixation," Proc. 11th ICA, 139–142.
Hartmann, W. M. (1983b). "Localization of sound in rooms," J. Acoust. Soc. Am. 74, 1380–1391.
Hartmann, W. M., and Rakerd, B. (1989). "On the minimum audible angle: A decision theory approach," J. Acoust. Soc. Am. 85, 2031–2041.
Koehnke, J., and Durlach, N. I. (1989). "Range effects in the identification of lateral position," J. Acoust. Soc. Am. 86, 1176–1178.
Rakerd, B., and Hartmann, W. M. (1985). "Localization of sound in rooms. II: The effect of a single reflecting surface," J. Acoust. Soc. Am. 78, 524–533.
Rakerd, B., and Hartmann, W. M. (1986). "Localization of sound in rooms. III: Onset and duration effects," J. Acoust. Soc. Am. 80, 1695–1706.
Searle, C. L., Braida, L. D., Cuddy, D. R., and Davis, M. F. (1975). "Binaural pinna disparity: another auditory localization cue," J. Acoust. Soc. Am. 57, 448–455.
Searle, C. L., Braida, L. D., Davis, M. F., and Colburn, H. S. (1976). "Model for sound localization," J. Acoust. Soc. Am. 60, 1164–1175.
Shelton, B. R., and Searle, C. L. (1978). "Two determinants of localization acuity in the horizontal plane," J. Acoust. Soc. Am. 64, 689–691.
Wightman, F. L., and Kistler, D. (1989). "Headphone simulation of free-field listening. II: Psychophysical validation," J. Acoust. Soc. Am. 85, 868–878.
Woodworth, R. S. (1938). Experimental Psychology (Holt, New York).


The SIM experiments studied here are constrained by the following assumptions: First, it is assumed that the allowed response set is identical with the stimulus set. For example, there might be N = 24 loudspeakers in front of a listener labeled 1 through 24. After presentation of a sound from one of the speakers, the listener must respond with a number from 1 to 24. Next, it is assumed that the sources are equally spaced by a common angle, A, measured in degrees along a single angular dimension, for example azimuth or elevation. For definiteness, the following discussion will be couched in terms of the azimuthal dimension, but the method is applicable to sources in any plane. The decision theory model used for calculations below is one dimensional. Therefore, the model is inappropriate when the perceptual character of the localization task is multidimensional. It is assumed that sources are arranged over part of a circle, to be called the span, with angular extent $\Gamma = (N-1)A$, and with source number 1 at one extreme and source number N at the other.

A SIM experiment begins with a choice of statistics to describe localization error. Searle et al. (1975, 1976) used the absolute value of the discrepancy between response and target. Hartmann (1983b) used the root-mean-square (rms) error, which has theoretical advantages described below. The rms statistic is designated by the symbol $\underline{D}$, the square root of an average squared error, computed as follows:

$$\underline{D} = \sqrt{\overline{D^2}} = \sqrt{\sum_{k=1}^{N} W(k)\, D^2(k)}, \tag{1}$$

where W(k) is the fraction of the trials on which source k was presented, and $D^2(k)$ is the mean square localization error for source k. This function is given by

$$D^2(k) = A^2 \, \frac{1}{M_k} \sum_{i=1}^{M_k} (R_i - k)^2, \tag{2}$$

where $R_i$ is the listener's response, on the scale of source numbers, to the ith trial on which source k is presented.

a) W.M.H. is at the Department of Physics and Astronomy.
b) B.R. is at the Department of Audiology and Speech Sciences.
c) J.B.G. is at the Department of Mechanical Engineering at the University of Texas at Austin.

J. Acoust. Soc. Am. 104 (6), December 1998. 0001-4966/98/104(6)/3546/12/$15.00. © 1998 Acoustical Society of America.

There are a total of $M_k$ such trials. Equation (1) introduces the notation whereby a bar over a symbol indicates an average over sources and a bar under a symbol indicates the square root of that average. Statistic $\underline{D}$ includes both variability and constant error. A second statistic, $\underline{s}$, measures only variability by computing error with respect to the mean response. It is the square root of quantity $\overline{s^2}$ given by

$$\overline{s^2} = \sum_{k=1}^{N} W(k)\, s^2(k), \tag{3}$$

where the variability for source k is given by

$$s^2(k) = A^2 \, \frac{1}{M_k} \sum_{i=1}^{M_k} [R_i - R(k)]^2, \tag{4}$$

and R(k) is the average response of the listener, in terms of source numbers, when a given source k is presented,

$$R(k) = \frac{1}{M_k} \sum_{i=1}^{M_k} R_i. \tag{5}$$

Statistic s(k) is a biased estimate of response variability that tends to underestimate the actual standard deviation for small sample sizes. For comparison with the variability observed experimentally or in a Monte Carlo simulation, s(k) should be multiplied by $\sqrt{M_k/(M_k - 1)}$, a factor which becomes important if the number of presentations is small. In addition to variability, there is constant error. The constant error, C(k), measured in degrees, is the difference between the true location of a source, k, and the mean perceived location of the source, $C(k) = A[R(k) - k]$. It may be positive or negative except when k is a well-defined extreme location. Rakerd and Hartmann (1986) noted a Pythagorean relationship among rms error, variability, and constant error:

$$D^2(k) = s^2(k) + C^2(k). \tag{6}$$
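As a concrete illustration of Eqs. (2) and (4)–(6), the following sketch computes the per-source statistics from a list of responses. The function name and the sample responses are hypothetical, chosen only so that the arithmetic is easy to follow by hand; they are not data from the article.

```python
import math


def source_statistics(responses, k, spacing):
    """Return D(k), s(k), C(k) in degrees for the responses given to source k.

    responses: response numbers R_i on the scale of source numbers
    k:         number of the source that was presented
    spacing:   angular spacing A between adjacent sources, in degrees
    """
    m = len(responses)
    mean_response = sum(responses) / m                                       # Eq. (5)
    d2 = spacing ** 2 * sum((r - k) ** 2 for r in responses) / m             # Eq. (2)
    s2 = spacing ** 2 * sum((r - mean_response) ** 2 for r in responses) / m # Eq. (4)
    c = spacing * (mean_response - k)                                        # constant error
    return math.sqrt(d2), math.sqrt(s2), c


# Hypothetical example: four responses to source 5 in an array with A = 4 degrees.
d, s, c = source_statistics([5, 6, 6, 7], k=5, spacing=4.0)
print(d ** 2, s ** 2, c ** 2)  # the first value equals the sum of the other two, Eq. (6)
```

In this example the mean response is 6, so the constant error is 4° and the squared quantities are 24, 8, and 16 square degrees, which satisfy Eq. (6).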

Therefore D(k) was called the overall error. It follows that $\overline{D^2} = \overline{s^2} + \overline{C^2}$, where $\overline{C^2}$ is an average over sources analogous to $\overline{D^2}$ and $\overline{s^2}$. The calculations below are devoted to calculating these statistics, particularly $\underline{D}$ and $\underline{s}$.

I. DECISION THEORY MODEL

The decision theory model for a listener's response, given a sound coming from source k, includes several basic assumptions. The first is that the listener has an internal coordinate $\theta$ for the source positions, undoubtedly established visually if the sources are visible, and that the presentation of source k leads to a normally distributed representation of location cues on that coordinate system. The probability density that source k leads to internal value $\theta$ is given by

$$P(\theta) = \frac{1}{\sigma_k \sqrt{2\pi}} \, e^{-(\theta - \theta_k - b_k)^2 / 2\sigma_k^2}. \tag{7}$$

Here, parameter $\theta_k$ is the location on the reference coordinate for source k, and $b_k$ is a bias such that the acoustical cues for source k are not centered exactly on this referent.$^1$ Bias leads to constant error, C(k), and increases the size of the overall error, D(k).

A key parameter is the angular standard deviation, $\sigma_k$, called the width of the internal distribution, or, simply, the width. It depends on the listener, the type of sound that must be localized, the environment in which the experiment is performed, and the position of the source. The sound may be easy to localize (small $\sigma_k$), e.g., a broadband impulsive noise, or it may be difficult (large $\sigma_k$), e.g., a spectrally sparse tone without onset transient. Normally, the purpose of a SIM experiment is to determine the width as a function of experimental conditions. Because the width is not zero, the listener makes inconsistent responses to a given source. The width is generally a function of $k$ because some sources are more difficult to localize than others. In the azimuthal plane, sources to the left and right are more difficult than sources in front, and in the median sagittal plane, sources overhead are more difficult than others.

A second assumption of the model calculation is that responses are quantized; when a listener experiences internal coordinate $\theta$, the listener responds by choosing the source with referent $\theta_k$ that is closest to $\theta$. (Alternatively, listener responses on a continuum scale may be quantized in the process of recording the data.)

There are two kinds of calculation, terminated span or wrapped span. For a terminated-span calculation, the span has well-defined ends, typical of a span that is much less than a complete circle. Here, the probability of making a particular response given a particular source is a simple monotonic function of the distance along the span between the two locations. By contrast, a wrapped-span calculation includes both errors along the span and error outside the span; it is defined in more detail below.
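The two assumptions above (a normal internal distribution and nearest-source quantization) can be sketched for a terminated span as follows. This is a minimal Python illustration, not the authors' program. It assumes equally spaced sources and a constant width. It also assumes that the end sources collect all probability beyond the ends of the array, and that the array statistics are rms averages over sources with equal weights. All function and parameter names are hypothetical.

```python
import math

def norm_cdf(x):
    """Cumulative standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def response_probabilities(k, n_sources, span, sigma, bias=0.0):
    """P(response j | source k) for a terminated span; indices run 0..N-1."""
    spacing = span / (n_sources - 1)
    center = k * spacing + bias            # mean of the internal distribution
    probs = []
    for j in range(n_sources):
        # Decision boundaries lie halfway between adjacent sources;
        # the end sources absorb everything beyond the ends of the array.
        lower = -math.inf if j == 0 else (j - 0.5) * spacing
        upper = math.inf if j == n_sources - 1 else (j + 0.5) * spacing
        probs.append(norm_cdf((upper - center) / sigma)
                     - norm_cdf((lower - center) / sigma))
    return probs

def array_statistics(n_sources, span, sigma, bias=0.0):
    """Return (D, s_bar): rms error and variability averaged over the array."""
    spacing = span / (n_sources - 1)
    d2_sum = s2_sum = 0.0
    for k in range(n_sources):
        p = response_probabilities(k, n_sources, span, sigma, bias)
        mean_j = sum(j * pj for j, pj in enumerate(p))
        d2_sum += sum(pj * (j - k) ** 2 for j, pj in enumerate(p))
        s2_sum += sum(pj * (j - mean_j) ** 2 for j, pj in enumerate(p))
    d = spacing * math.sqrt(d2_sum / n_sources)
    s_bar = spacing * math.sqrt(s2_sum / n_sources)
    return d, s_bar

# Width equal to 2.5% of the span, 50 sources (span in arbitrary units).
d, s_bar = array_statistics(n_sources=50, span=1.0, sigma=0.025)
print(d / 0.025, s_bar / 0.025)
```

Working in source-number units and multiplying by the spacing at the end keeps the bookkeeping parallel to the definitions of $D(k)$, $s(k)$, and $C(k)$.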

A. Calculations without bias

The present section examines statistics $D$ and $\bar{s}$ when there is no bias ($b_k = 0$). The calculations were motivated by the conjecture that for a given source array span, the values of overall error, $D$, and variability, $\bar{s}$, should be insensitive to the number of sources in the array. The logic was simple: As the number of sources is reduced, the listener is less likely to make an error because the sources are farther apart. However, when the listener does make an incorrect choice, the contribution to the overall error sum is a larger number of degrees. The conjecture that $D$ and $\bar{s}$ should be insensitive to $N$ follows from the expectation that these two effects should largely cancel one another. One purpose of the calculations below was to test that conjecture.

The dependence of $D$ and $\bar{s}$ on the number of sources was tested in a computation where each source is presented an equal number of times [$W(k) = 1/N$]. The calculation used an analytic form for the cumulative normal function to determine the probabilities of each possible response for each possible source.

1. The small-span limit

A source array with a small span extends over a limited range of azimuth values. Therefore, a small-span source-identification experiment can provide the same information as a minimum audible angle experiment, with the advantage that the source-identification method should be less sensitive to standing waves in the environment. When the span is small, the width may be regarded as independent of the source number, i.e., $\sigma_k$ becomes a constant, $\sigma_0$. Calculations in the small-span limit are normally terminated-span calculations.

From the structure of the equations it is possible to come to some general conclusions. There is reason to expect that function $D^2(k)$ should be approximately equal to $\sigma_0^2$, because the second moment of a normal density is the variance. Function $D^2(k)$ resembles the second moment of density $P$. This is a theoretical advantage of the rms quantities $D$ and $\bar{s}$. However, $D^2(k)$ is not exactly equal to $\sigma_0^2$, both because the formula is a discrete sum, not an integral, and because of end effects. In the limit that the width $\sigma_0$ becomes very small while the number of sources $N$ becomes large, $D(k)$ approaches $\sigma_0$, as long as $k$ is not close to the edges of the source array. In those limits, the discrete sum approaches an integral, and end effects are not important because the distribution has little strength near the ends. Also, in those limits the value of $D$ approaches $\sigma_0$ because the fraction of sources near the end becomes small, and $D$ is determined primarily from values of $D(k)$ that are away from the ends.

A logical problem with terminated-span calculations is that when the width $\sigma_0$ becomes comparable to the source span $G$, the model sometimes predicts performance that is worse than random guessing. When this unreasonable result occurred in calculations below, the calculations were halted and the limiting point was noted in the graphical presentation of the results. The random guessing limits for $D$ and $\bar{s}$ are given by Eqs. (A8) and (A12) of the Appendix, where they are derived.

The results of the calculations are given in scaled units, normalized to either the span $G$ or the width $\sigma_0$. Therefore, the calculations are not immediately applicable to any particular experiment, but, with a little work, they are applicable to all particular experiments. Parameter $\sigma_0$ is always given in units of the span. The work of Searle et al. (1976) suggests that the internal width $\sigma_0$ increases in proportion to the span. Therefore, the normalized parameter $\sigma_0/G$, as used here, is a convenient choice.$^2$

Figure 1 shows the predictions of the analytic cumulative normal calculation for $D$ as a function of increasing number of sources, $N$. The figure shows that $D$ converges to the width when $N$ is large and $\sigma_0$ is a small fraction of the span. For example, when $\sigma_0/G = 0.025$, $D$ converges to within one percent of $\sigma_0$ when there are 50 sources. When $\sigma_0/G$ is not small, $D$ always converges to a value that is less than $\sigma_0$. The discrepancy is caused by end effects, but see Sec. I A below. Figure 1 also shows that the expected value of $D$ is close to its asymptotic value (for large $N$) when there are enough sources that the spacing between the sources is less than or equal to $\sigma_0$. These adequate values of $N$ are indicated with a filled star. Although Fig. 1 shows that $D/\sigma_0$ decreases with increasing $\sigma_0$, in fact, $D$ itself increases monotonically with increasing $\sigma_0$: the larger the width, the larger the rms error.


FIG. 1. rms error, $D$, expressed in units of $\sigma_0$, the width of the listener's localization probability density function. Statistic $D$ is presented as a function of the number of sources in the array, assuming that span $G$ remains constant. The parameter is $\sigma_0$ in units of the span. A filled star indicates the value of $N$ where the spacing between sources is equal to the width $\sigma_0$.

The quantity $D/\sigma_0$ decreases because $D$ increases less rapidly than linearly with increasing $\sigma_0$.

For practical purposes, Fig. 1, and other figures in this article, must be used iteratively to find a self-consistent solution for the width. The experimenter begins by knowing $G$ and $N$. The experimenter measures $D$. The self-consistent calculation begins with the assumption that $\sigma_0 = D$. This leads to a value of the graph parameter $\sigma_0/G$. The graph then leads to a predicted value of $D/\sigma_0$, and hence a revised value of $\sigma_0$. Because the plots in Fig. 1 are smooth, one expects the calculation to converge to a stable value of $\sigma_0$ after only one or two iterations.

The insensitivity of $D$ to the number of sources is further demonstrated in Fig. 2, which shows $D/\sigma_0$ as a continuous function of $\sigma_0/G$. The calculated value of $D$ varies by less than 10% as the number of sources is varied, provided that there are at least six sources and $\sigma_0$ is greater than 5% of the span. When $\sigma_0$ is greater than 20% of the span, $D$ becomes extremely insensitive to the number of sources.

Parallel calculations for variability, $\bar{s}$, for the case of no bias show that $\bar{s}$ is very similar to $D$, as would be expected.
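The iterative use of the figures described above can be sketched in Python by letting a model function stand in for the graph. The sketch assumes a terminated-span, no-bias model with equally spaced sources and a constant width, with end sources absorbing all responses beyond the array. The function names are hypothetical, and the "measured" value is synthetic.

```python
import math

def norm_cdf(x):
    """Cumulative standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def predicted_d(n_sources, span, sigma):
    """Array-averaged rms error for a terminated span with no bias."""
    spacing = span / (n_sources - 1)
    total = 0.0
    for k in range(n_sources):
        for j in range(n_sources):
            # Bin edges for response j, measured from source k.
            lower = -math.inf if j == 0 else (j - 0.5 - k) * spacing
            upper = math.inf if j == n_sources - 1 else (j + 0.5 - k) * spacing
            p = norm_cdf(upper / sigma) - norm_cdf(lower / sigma)
            total += p * (j - k) ** 2
    return spacing * math.sqrt(total / n_sources)

def estimate_width(d_measured, n_sources, span, iterations=10):
    """Self-consistent width: start from sigma0 = D, then rescale repeatedly."""
    sigma = d_measured
    for _ in range(iterations):
        ratio = predicted_d(n_sources, span, sigma) / sigma   # the "graph" value D/sigma0
        sigma = d_measured / ratio                            # revised sigma0
    return sigma

# Synthetic check: generate D from a known width, then recover the width.
true_sigma = 4.0                       # degrees, hypothetical
d_obs = predicted_d(12, 46.0, true_sigma)
print(estimate_width(d_obs, 12, 46.0))
```

The update is a fixed-point iteration. Because $D$ increases with $\sigma_0$, but less rapidly than linearly, each step shrinks the remaining error.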

FIG. 2. rms error, as a function of the continuous variable $\sigma_0/G$, the width of the listener's internal distribution expressed as a fraction of the span. Each function is cut off at the random guessing limit.

Although $\bar{s}$ is logically required to be smaller than $D$, calculated plots of $\bar{s}$ vs $N$ or $\bar{s}$ vs $\sigma_0/G$ almost coincide with the corresponding plots for $D$ (Figs. 1 and 2) so long as the width is less than 10% of the span (i.e., $\sigma_0/G < 0.1$). The discrepancy between $\bar{s}$ and $D$ grows as $\sigma_0/G$ increases, but the difference is not more than 10%, even when $\sigma_0/G$ is as large as 0.5.

2. Spans approaching 180 degrees

As the source span increases it becomes more important to take account of the dependence of the width on source location. For definiteness, we continue to assume that the sources are in the horizontal plane. The dependence of the width, $\sigma_k$, on the angular position of the source, $\theta_k$, is modeled by assuming a constant difference limen for the interaural time difference. This model is known to capture some, but not all, of the azimuthal dependence of the width. In this model, the localization error is inversely proportional to the derivative of the interaural time difference with respect to angular position. For an azimuthal coordinate system, with $\theta = 0^\circ$ directly in front of the listener, the interaural time difference is described by the Woodworth formula (1938),

$$\Delta t = \alpha(\theta + \sin\theta), \tag{8}$$

where $\theta$ is in radians and $\alpha$ is a constant equal to the head radius divided by the speed of sound. Differentiating with respect to $\theta$ and inverting gives

$$\frac{d\theta}{d(\Delta t)} = \frac{1}{\alpha(1 + \cos\theta)}. \tag{9}$$

Since $\sigma_k$ is proportional to $d\theta_k$,

$$\sigma_k = \frac{2\sigma_0}{1 + |\cos\theta_k|}, \tag{10}$$

where $\sigma_0$ is the width directly in front of the listener. The absolute value in the denominator is necessary to account for the sign of $\cos\theta$ in the different quadrants.

As the span approaches 180°, there is a second, and structurally more important, effect that must be considered in the computations, namely "wrapped" probabilities. If, for example, the source is at 80° to the left of center, the probability of choosing a response that is 70° to the right of center is not just the probability of making an error of 150°; one must add also the probability of making an error of 210° (360° − 150° = 210°). The need to include wrapped probabilities signifies the departure from the terminated-span calculation considered in Sec. I A 1. For example, it is no longer necessary to consider the random guessing limit because large probabilities for responses off the ends of the array are correctly wrapped. The calculations shown in Figs. 3 and 4 below include both the effect of source-dependent width and wrapped probability.

FIG. 3. rms error as a function of span $G$ when width $\sigma$ changes with source position such that the width expressed as interaural time difference remains constant. rms error $D$ is normalized to the width directly in front of the listener, $\sigma_0$. The number of sources in the calculation was $N = 50$.

Figure 3 illustrates how $D$ depends on span $G$ when the array is centered on the forward direction and extends equally to the listener's left and right by $G/2$. The figure shows the effect of the variation of $\sigma$ with source angle for various values of $\sigma_0$ when the number of sources is large. If the span is small, $\sigma$ is approximately constant. The fact that $D/\sigma$ changes by less than 10% as $G$ increases to about 120° shows that the assumption of constant ITD is equivalent to a constant-sigma approximation even as a source span becomes as large as ±60°. As the span increases beyond 120°, $D$ begins to rise. When $\sigma_0 \le 0.1G$ this rise is proportional to the increase in the average value of $\sigma$. Therefore, if the plot of $D$ is normalized to the value of $\sigma$ averaged over the span, the plot becomes almost a flat line, independent of $G$. The average value of $\sigma$ from integrating Eq. (10) is

$$\bar{\sigma} = \begin{cases} 4\sigma_0 \, \dfrac{\tan(G/4)}{G} & (G \le \pi), \\[2ex] 4\sigma_0 \, \dfrac{2 - \tan(\pi/2 - G/4)}{G} & (\pi < G \le 2\pi), \end{cases} \tag{11}$$

where $G$ is expressed in radians. For $\sigma_0$ greater than 10% of $G$, the average-sigma model is less successful. For a span greater than 160°, there is an anomalous curvature when $\sigma_0 = 0.2G$.
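Equations (10) and (11) can be checked against each other numerically. The Python sketch below evaluates the source-dependent width, the closed-form average, and a midpoint-rule average over an array centered on the forward direction. The function names and the step count are illustrative choices, not part of the original calculation.

```python
import math

def width(theta_deg, sigma0=1.0):
    """Source-dependent width, Eq. (10); theta in degrees from straight ahead."""
    return 2.0 * sigma0 / (1.0 + abs(math.cos(math.radians(theta_deg))))

def average_width(span_deg, sigma0=1.0):
    """Closed-form average of Eq. (10) over a centered span, Eq. (11)."""
    g = math.radians(span_deg)
    if g <= math.pi:
        return 4.0 * sigma0 * math.tan(g / 4.0) / g
    return 4.0 * sigma0 * (2.0 - math.tan(math.pi / 2.0 - g / 4.0)) / g

def average_width_numeric(span_deg, sigma0=1.0, steps=20000):
    """Midpoint-rule average of Eq. (10) from -span/2 to +span/2."""
    h = span_deg / steps
    total = sum(width(-span_deg / 2.0 + (i + 0.5) * h, sigma0)
                for i in range(steps))
    return total / steps

for span in (60.0, 120.0, 180.0, 270.0):
    print(span, average_width(span), average_width_numeric(span))
```

For a 180° span the closed form reduces to $4\sigma_0/\pi$, and for 270° it gives about $1.35\sigma_0$, matching the scale factors quoted in the text.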

FIG. 4. rms error for source-dependent width as a function of the number of sources. The span is 180° centered on the forward direction. This figure can be compared with Fig. 1 to see the effects of source-dependent width and wrapped probability. The tick mark on the right axis shows the average width over 180°.

Figure 4 shows $D$ as a function of the number of sources for a span of 180°. As described in connection with Fig. 3, the asymptotic values in the large-$N$ limit are similar to Fig. 1 except that they are scaled by the average of $\sigma/\sigma_0$. From Eq. (11) for $G = 180^\circ$, this is equal to a scale factor $4/\pi$ or 1.27. Figure 4 shows that when $N$ is not asymptotically large this simple scaling does not always apply. The figure also shows that $D$ does not vary monotonically with $\sigma_0$; the value for $\sigma_0 = 0.2G$ seems to be out of order. Figure 3 suggests that this nonmonotonic behavior is restricted to spans greater than about 160°. The curiously large curvature for the plot with $\sigma_0 = 0.2G$ occurs only for such large spans. The nonmonotonic behavior is the result of the combined effects of source-dependent width and wrapped probability. Calculations that exclude either one of these show only a monotonic dependence on width.

Calculations with a 180° span and wrapped probability were also done for a constant (source-independent) value of the width. The calculations led to a plot of $D$ vs $N$ that was almost identical to the terminated-span calculation in Fig. 1, except for the extreme case, $\sigma_0 = 0.4G$. For both, $D$ systematically underestimated the width. For the terminated span the reason was end effects, as noted in Sec. I A 1. For the wrapped span the reason is the wrapped probabilities themselves. If the width is less than 20% of the span, wrapped probability has a negligible effect on $D$ ($\le 1\%$) when the span is not greater than 180°. Because wrapping complicates the analysis of data, an experimenter would do well to avoid spans approaching 180° if the experimental conditions promote large internal width, 30° or more.
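A wrapped-span probability can be sketched in Python by summing the normal density over 360° images of each response region. This illustrates the idea only and is not the authors' program. It assumes an array centered on the forward direction, so that the boundary between the two end-source regions lies directly behind the listener at ±180°. It also assumes nearest-source responses and a width supplied by the caller. The names and the `images` parameter are hypothetical.

```python
import math

def norm_cdf(x):
    """Cumulative standard normal, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def wrapped_probabilities(k, angles_deg, sigma_deg, images=3):
    """P(response j | source k) with probability wrapped around the circle.

    angles_deg : source azimuths in ascending order, centered on 0 degrees
    images     : number of 360-degree wraps included on each side
                 (images=0 keeps only the direct, unwrapped term)
    """
    n = len(angles_deg)
    # Region edges: midpoints between neighbors, closed off behind the listener.
    edges = ([-180.0]
             + [(angles_deg[i] + angles_deg[i + 1]) / 2.0 for i in range(n - 1)]
             + [180.0])
    probs = []
    for j in range(n):
        p = 0.0
        for m in range(-images, images + 1):
            lo = edges[j] + 360.0 * m - angles_deg[k]
            hi = edges[j + 1] + 360.0 * m - angles_deg[k]
            p += norm_cdf(hi / sigma_deg) - norm_cdf(lo / sigma_deg)
        probs.append(p)
    return probs

# Example from the text: source 80 degrees left, response 70 degrees right.
angles = list(range(-90, 91, 10))          # 19 sources spanning 180 degrees
k, j = angles.index(-80), angles.index(70)
direct = wrapped_probabilities(k, angles, 60.0, images=0)[j]
wrapped = wrapped_probabilities(k, angles, 60.0)[j]
print(direct, wrapped)
```

With a large width, the wrapped value exceeds the direct value because the 210° route around the back contributes in addition to the 150° route along the span.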

FIG. 5. rms error for source-dependent width, and for a large span, $G = 270^\circ$. Part (a) does not give the subject the benefit of a front-to-back reflection; part (b) has reflection scoring.

3. Span greater than 180 degrees

When a span exceeds 180°, the source array cannot be entirely in front of the listener. Some sources must extend toward the rear, and this changes the perceptual nature of the localization task. Sources that differ considerably in azimuth may lie on the same cone of confusion and be perceptually similar. This multidimensional aspect of perception is not captured in our one-dimensional localization model. For purposes of illustration we proceed with the model anyway.

When $G$ becomes greater than 180°, the array itself wraps around so that some sources are closer to each other across the gap between source 1 and source $N$ than along the span. This possibility requires a new computational rule for scoring such that the maximum error charged against the listener is 180°. Any error that is found to be greater than 180° is replaced by its 360° complement. Thus for any pair of sources in the array, there is a unique magnitude and direction of the difference between them.

When the source array extends behind the listener, it is common to deal with the multidimensional character of the task by regarding confusions between front and back sources as separate from azimuthal confusions. Therefore azimuthal errors are computed by giving the listener the benefit of a reflection in the frontal plane (the plane that includes the points at ±90° azimuth and the point overhead) if that leads to a smaller error (Wightman and Kistler, 1989). Below, the calculations that employ that rule are called "reflection scoring." It is not necessary that an actual source be present at the site of the reflection. When reflection scoring is introduced, the final value of the error is the smallest of the listener's choice or its 360° complement, or the reflected choice or its 360° complement.

As an example of large spans, we chose a span $G = 270^\circ$. The array was centered on the midline, with one end at −135°, the other end at +135°, and the remaining sources ($N - 2$) equally spaced in between. The internal width was taken to depend on source angle per Eq. (10).

Figure 5(a) shows the results without reflection scoring. As before, $D/\sigma_0$ is quite insensitive to the number of sources. Upon careful observation, periodic variations can be observed in the $D/\sigma_0$ data, especially for small $\sigma_0$. This effect is due to the arrangement of the sources based on $G$ and $N$. When $G = 270^\circ$, there are sources located at $\theta = \pm 90^\circ$ whenever $N = 6n + 1$ (where $n = 1, 2, \ldots$). This creates peaks because the average $\sigma_k$ is increased. (The same effect occurs for circular spans whenever $N = 4n$.) The analogous plot of $\bar{s}/\sigma_0$ is the same as $D/\sigma_0$ in Fig. 5(a) within 10%, except when $\sigma_0/G = 0.4$, where the discrepancy becomes about 15%.

Figure 5(b) shows the effect on $D$ when reflection scoring is introduced. The values of $D$ are generally reduced, of course. Further, the tendency for peaks at $N = 6n + 1$ is greatly enhanced. A better description of the effect is that reflection scoring introduces a valley centered on $N$ values given by $N = 6n + 4$. Valleys result from source placements in which the localization score benefits the most when the listener is given credit for a correct answer despite a front-to-back reversal.

According to Eq. (11), the average width, $\bar{\sigma}(270^\circ)$, is equal to $1.35\sigma_0$. In the limit of a large number of sources, $D$ agrees very well with the expectation $D = 1.35\sigma_0$ if reflection scoring is not used [Fig. 5(a)]. Only as the ratio of width to span grows to 0.4 is there appreciable departure. (For a 270° array a ratio of 0.4 means that the internal width is more than 100°, a case of extreme uncertainty.) Even if the number of sources is not large, the $D$ values in Fig. 5(a) do not differ from the expected value by more than about ten percent.

The same statements cannot be made about the calculation with reflection scoring [Fig. 5(b)]. Then statistic $D$ is less stable both with respect to $\sigma_0$ and with respect to the number of sources. The peak and valley structure is, however, particularly apparent for a 270° span. For a general span $G$ ($G > 180^\circ$), peaks and valleys are not as frequent. A peak occurs for $N$ sources when there are two integers $N$ and $k$ that satisfy the condition

$$N = \frac{2\Gamma(2k - 1) + 1}{2\Gamma + 1}, \tag{12}$$

where $\Gamma$ is the span fraction, $\Gamma = G/360^\circ$.

It is somewhat difficult to evaluate the significance of the structure observed for reflection scoring because we do not believe that our one-dimensional calculation is appropriate perceptually for sources that extend to the rear. However, this objection to the calculation is not fatal. The actual cause of the valleys in the structure is a series of source locations that particularly benefit the listener when reflection scoring is introduced. To some degree, this experimental artifact is bound to appear with reflection scoring. The precise size of the artifact depends on the perceptual model.

4. Summary

At the outset of this section on the SIM without bias, it was conjectured that the values of $D$ and $\bar{s}$ might be insensitive to the number of sources. It was expected that the smaller probability of making an error when the number of sources is small would be compensated by the larger penalty when an error is actually made. Therefore, it was further conjectured that experimental values of $D$ and $\bar{s}$ should provide reliable estimates of internal width $\sigma$. In the end, Secs. I A 1–3 above support these conjectures. The conjectures hold for a wide range of widths and source spans. However, the relationship between quantities $D$ and $\bar{s}$ and the width parameter depends on the width parameter itself, in the form $\sigma/G$, as shown by Figs. 1–5. Therefore, an actual determination of the width from $D$ or $\bar{s}$ may require some modest iteration. The functions in the figures are so well behaved that convergence is assured.

B. Calculations with bias

The model of Sec. I A described a listener without bias. When the sound originated from source $k$, the internal distribution for auditory localization cues was centered at the location $\theta_k$, corresponding to the reference position of the source as established visually. It is this reference coordinate that the listener uses in making responses. Therefore, the statistics of the responses to source $k$ depended only on a single parameter, the width $\sigma_k$.

The model without bias is, however, an idealization. Unfortunately, in sound localization, bias is the rule and not the exception. Bias is introduced by visual cues (ventriloquism) and by acoustical cues, such as the reflections from walls in an asymmetrical room environment. Bias can be introduced into an experiment deliberately; a large visual bias is caused by directing a listener's gaze to the end of a source array (Hartmann, 1983a). A large acoustical bias can be created by putting a single reflecting surface in an otherwise anechoic room (Rakerd and Hartmann, 1985). But although bias can be experimentally controlled, it cannot be entirely eliminated; it is normally present for any listener whether one wants it or not (Hartmann, 1983b).

Bias consists of a displacement of internal acoustical cues with respect to the angular reference coordinate system, $\theta_k$. Therefore, bias can be seen in plots of $R(k)$, and it is measured for individual sources by constant error $C(k)$. An average measure of bias is $C$. Because the rms error, $D$, includes $C$ [Eq. (6)], the bias also appears in $D$.

In this article we take the view that the goal of the experimenter is to use the source identification method to learn about the width of the internal distribution $\sigma$. The presence of bias poses a problem, and the purpose of the present section is to try to deal with it. Although $\sigma$ can be determined from either $\bar{s}$ or $D$ in the absence of bias, the presence of bias has a major direct effect on $D$, which makes it unreliable for estimating $\sigma$. By contrast, the variability $\bar{s}$ should, in principle, be independent of bias because variability is calculated with respect to the mean response made by the listener and not with respect to a physical referent. In practice, however, $\bar{s}$ is affected by bias, both because of effects at the ends of the arrays and because of the quantization of the responses. Therefore, statistic $\bar{s}$ is the best statistic to use to estimate $\sigma$ in the case of bias, but it is not without troubles of its own, as will be seen below.

What makes it difficult to discuss bias is that bias can take many forms. Below, we deal with two types, constant bias and central bias. Calculations are presented in the small-span limit.

1. Constant bias

Constant bias means that the displacement of the acoustical cues with respect to the reference coordinate system is constant, independent of the source. Constant bias is a common occurrence, especially if the array of sources is small. The effect of directed gaze on the localization of sources in a 28° span was found to be modeled best by a constant bias (Hartmann, 1983a).

Numerical studies, using the decision theory model and constant width, on the effects of constant bias showed that bias can always be neglected if the number of sources is large enough. If the bias is large, it may not be practical to run as many sources as are needed for $\bar{s}$ to give a good estimate of $\sigma_0$, but large $N$ is an important limit to keep in mind.

FIG. 6. The role of constant bias. Open symbols show variability $\bar{s}$ when there is no bias. Filled symbols show the effect of making the bias, $b$, twice the width, $\sigma_0$. Two values of $\sigma_0/G$ are shown.

The effect of bias on $\bar{s}$ depends sensitively on the ratio of bias $b$ to width $\sigma_0$. Bias effects are shown in the plot of $\bar{s}$ in Fig. 6 for the special case that the bias is twice the width. The filled-circle plot in Fig. 6 shows $\bar{s}$ when the bias is 5% of the source span ($b/G = 0.05$) and $\sigma_0/G = 0.025$. It can be compared with $\bar{s}$ in the absence of bias (open circles). When there is no bias, $\bar{s}$ gives a good estimate of $\sigma_0$ if the number of sources is about $N = 14$ or greater. Adding the bias has a dramatic effect on the variability, leading to a peak at $N = 9$. The peak overestimates $\sigma_0$ by a factor of 2.

The behavior shown by the circles in Fig. 6 is typical. Whenever the bias is twice the width there is a peak (height $1.5 < \bar{s}/\sigma_0 < 2.5$) as a function of $N$. The peak occurs at $N = N_{\max}$, where $N_{\max} \approx \mathrm{Int}(0.2G/\sigma_0) + 1$. Not surprisingly, a given bias has the largest effect for the smallest $\sigma_0$, and the number of sources needed to eliminate that effect may become large. The square symbols in Fig. 6 check the above statements when the bias is $b/G = 0.2$ and the width is $\sigma_0/G = 0.1$.

If the bias becomes as large as $4\sigma_0$, $\bar{s}$ becomes an oscillating function of $N$ and cannot estimate $\sigma_0$. On the other hand, if the bias is no larger than $\sigma_0$ itself, then the effects of bias on $\bar{s}$ are less than 10%, so long as there are four or more sources in the array and $\sigma_0/G$ is larger than about 0.02. Then it is possible to ignore the bias in determining $\sigma_0$ as the large-$N$ limit of $\bar{s}$.

2. Central bias

Whereas constant bias is necessarily directed toward one end of the source array or the other, central bias is directed toward the center of the array. In the common case of a symmetrical array with the subject looking at the center, a central bias may be a visual effect. In general, any central tendency, such as a reluctance to choose extreme responses, appears as a central bias.

The central bias function itself might take different forms: straight line, S-curve, step function, etc. The calculations of this section employ a step-function bias function because the experiments described below often found $R(k)$ functions approximately of this form. In a step-function bias the auditory cues for all the sources to the left of center are biased toward the right by a constant ($b_l$) and all the sources to the right of center are biased toward the left by a constant ($b_r$). The bias can be characterized by a single central-bias parameter $b_c$ if it is symmetrical ($b_l = b_c$ and $b_r = -b_c$).

Model calculations for small spans indicated that $\bar{s}$ calculated with central bias was very similar to $\bar{s}$ calculated with a constant bias of the same magnitude. Typical differences between the two kinds of bias were less than 10% for $N$ large enough to provide a reasonable estimate of $\bar{s}$. The sign of the difference was always the same; central bias led to the larger $\bar{s}$. The difference grew with increasing bias magnitude. However, as long as the bias was not greater than twice the width, the difference was less than 33% even when the bias was as large as 80% of the span.

II. EXPERIMENTS

To test the model calculations we performed localization experiments. We were particularly interested in how $D$ and $\bar{s}$ depend on the number of sources in a given span. Therefore, the experiments were performed using 3, 6, 12, and 24 sources.

A. Tasks

In order to test the computations in several ranges of $\sigma_0$, we used two tasks, one in which the localization was easy and one in which it was difficult. Both tasks were performed in a reverberation room.

1. Easy (EL) experiment

In the easy localization (EL) task, listeners sat 3 m away from an array of speakers in the horizontal plane. The array extended 23° to the left and right of the midline ($G = 46^\circ$). Broadband noise at a level of 55 dB SPL was given a step-function amplitude envelope and played through one of the speakers. The subjects' task was to declare which loudspeaker had sounded.

2. Difficult (DL) experiment

The difficult localization (DL) task was designed to be much harder than the EL task. Listeners were 6 m away from the source array, again in a 46° span. Because of the larger distance to the source, incoherent reverberant sound was a larger fraction of the total sound power, making localization more difficult. The stimulus was broadband noise that had been low-pass filtered (corner frequency of 5 kHz, −48 dB/octave). Therefore, listeners could not use high-frequency interaural intensity cues that are especially helpful in this room. The SPL of the noise before filtering was identical to the EL experiment. The filtered noise was given a linearly rising amplitude envelope with a duration of 2 s. During the onset, uncorrelated broadband noise was played at a level of 85 dB through a speaker behind the subject's neck to mask the onset of the stimulus. Therefore, listeners gained no benefit from the precedence effect, further degrading localization ability. Again, the task was to declare which loudspeaker sounded.

B. Method

The reverberant room was rectangular with dimensions 7.67 × 6.35 × 3.58 m high. It had a reverberation time of 4 s at midrange frequencies. The orientation of the array in the room is best described as a nonspecial geometry. The 24 loudspeakers were Realistic Minimus 3.5, consisting of a single driver in a sealed box. They had been chosen from a set of 85 based on similar on-axis frequency response in an anechoic environment. The configurations for the different number of sources were as follows: $N = 24 \Rightarrow A = 2^\circ \Rightarrow G = 46^\circ$; $N = 12 \Rightarrow A = 4^\circ \Rightarrow G = 44^\circ$; $N = 6 \Rightarrow A = 8^\circ \Rightarrow G = 40^\circ$; $N = 3 \Rightarrow A = 23^\circ \Rightarrow G = 46^\circ$. The loudspeakers were at ear level of a seated subject. A bar rested on the head of the subject to help the subject maintain a constant, forward-facing position. Each source was labeled with a number, and the subject made a response by using a button box to increment a numerical display up or down. The display reading was then recorded by the computer running the experiment.

C. Subjects and procedure

Four subjects participated in these experiments. Subjects W, R, and G were males, ages 57, 45, and 21, respectively, and were the coauthors of this article. Subject J was a female of age 17. Subjects W and R had extensive experience in localization experiments and had high-frequency hearing losses typical of males their age. Subjects G and J had recent experience as subjects and had normal hearing.

The experiments were performed in blocks of runs for both easy (EL) and difficult (DL) tasks. A block consisted of a run for each source spacing condition for either the EL or the DL case. The runs of a given block were performed on the same day, and the order of the runs within a block was randomized. Each run consisted of 48 stimulus-response pairs and lasted 10–15 min. Within each run, all stimuli were presented an equal number of times in random order. Therefore, a particular source was presented twice for $N = 24$, four times for $N = 12$, eight times for $N = 6$, and 16 times for $N = 3$. There was no feedback, but a curious subject was allowed to view the results at the end of a run. Each subject did three blocks for both EL and DL conditions.

D. Results

The experimental results appear in their greatest detail in plots of $R(k)$, the average response of a listener to source $k$. For illustration, plots of $R(k)$ are shown for listener W in Fig. 7(a) and (b) for the EL and DL experiments, respectively. Perfect performance corresponds to an $R(k)$ plot that is a 45-degree line. It can be seen that Fig. 7(a) approximates a 45-degree line, although there is considerable central bias, as described above. The plot for the DL experiment in Fig. 7(b) shows enormous deviations from the 45-degree ideal as well as central bias. Figure 7(a) and (b) is typical of $R(k)$ plots for all the listeners, although different listeners had different forms of bias, some better approximated as constant bias, not central.

FIG. 7. Function $R(k)$, the average response of listener W to source number $k$. Error bars are plus and minus the variability, $s(k)$. Experiments with different numbers of sources ($N$) are plotted on the same graph: stars for $N = 24$, open circles for $N = 12$, open squares for $N = 6$, and filled squares for $N = 3$. Part (a) is for the EL experiment. Part (b) is for the DL experiment. Each small division on horizontal and vertical axes corresponds to 2°.

Of primary interest in the present article are the average quantities $D$ and $\bar{s}$ for the eight different conditions ($N = 24$, 12, 6, and 3 for both the EL and DL experiments). These are given in Table I, averaged over the three runs for each listener. These averages and corresponding standard deviations ($n - 1$) over the three runs appear in Figs. 8 and 9.

III. COMPARISON: THEORY AND EXPERIMENT

The principal comparison between theory and experiment was a test of the prediction of the decision theory model for the dependence of s̄ and D on the number of sources in the array. This dependence was the primary focus of model calculations themselves, for small and large spans, with and without bias. Because the experimental span was only 46°, the model calculation could be done in the small-span limit. The input parameters to the model were the width of the internal distribution and the bias. The bias was assumed to be of the constant type, or, equivalently, central. The biases observed experimentally were of both types, but, as described in Sec. I B 2, these two types of bias have similar effects on the average statistics of interest. It was assumed that the width and bias parameters depend only on the listener and the experimental conditions (EL or DL). Therefore, it was expected that the dependence on number of sources, for both D and s̄, should be predicted by the model.

TABLE I. Experimental values of rms error (D), variability (s̄), and constant error (C) for four listeners in two source identification experiments, easy (EL) and difficult (DL). The arrays spanned 46 degrees and included N = 3, 6, 12, or 24 sources. Values of width and bias are model parameters determined from the asymptotic variability and constant error, respectively. The parameters were used for model calculations in the comparison plots that follow.

                     Experiment (degrees)        Model (degrees)
Listener    N        D        s̄        C         width     bias

EL experiment
G           3        0        0        0         1.21      1.60
            6        0        0        0
           12        2.21     1.68     1.44
           24        2.01     1.21     1.60
J           3        0        0        0         1.26      2.73
            6        1.09     0.84     0.69
           12        3.09     1.67     2.60
           24        3.01     1.26     2.73
R           3        0        0        0         1.64      2.67
            6        2.66     2.37     1.21
           12        3.26     2.03     2.55
           24        3.13     1.64     2.67
W           3        0        0        0         1.43      3.29
            6        3.21     2.44     2.09
           12        3.85     1.93     3.33
           24        3.59     1.43     3.29

DL experiment
G           3       10.14     8.60     5.37      6.70     10.02
            6        8.54     6.88     5.06
           12       11.39     6.73     9.19
           24       11.70     6.04    10.02
J           3        8.08     7.82     2.03      4.50      8.67
            6        8.70     6.48     5.81
           12        9.21     6.79     6.22
           24        9.61     4.14     8.67
R           3       11.60    10.00     5.88      7.40      9.09
            6        9.70     8.29     5.04
           12       11.46     6.56     9.40
           24       11.13     6.42     9.09
W           3        7.12     6.73     2.32      5.70     10.15
            6       10.00     6.13     7.90
           12       10.27     5.94     8.38
           24       11.37     5.13    10.15
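The entries of Table I can be cross-checked against one another. The sketch below assumes that Eq. (6), which is cited in the Appendix but not reproduced in this part of the article, has the form D² = s̄² + C²; that form is consistent with the Appendix statement that the rms error divides between variability and bias. Only the rows for listener W are transcribed here, and the names `TABLE_I` and `max_discrepancy` are illustrative, not from the article.

```python
import math

# Rows transcribed from Table I for listener W, in degrees: (N, D, s_bar, C).
TABLE_I = {
    ("EL", "W"): [(3, 0.0, 0.0, 0.0), (6, 3.21, 2.44, 2.09),
                  (12, 3.85, 1.93, 3.33), (24, 3.59, 1.43, 3.29)],
    ("DL", "W"): [(3, 7.12, 6.73, 2.32), (6, 10.00, 6.13, 7.90),
                  (12, 10.27, 5.94, 8.38), (24, 11.37, 5.13, 10.15)],
}


def max_discrepancy(rows):
    """Largest |D - sqrt(s_bar^2 + C^2)| over the rows of one condition.

    Assumes the rms error combines variability and constant error in
    quadrature (the assumed form of Eq. (6)).
    """
    return max(abs(d - math.hypot(s_bar, c)) for _, d, s_bar, c in rows)


if __name__ == "__main__":
    for condition, rows in TABLE_I.items():
        print(condition, "max discrepancy (deg):", max_discrepancy(rows))
```

If the assumed relation holds, the discrepancies should be no larger than the rounding of the tabulated values to two decimal places.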


FIG. 8. Comparison between experiment (points) and model (solid lines) for the easy localization (EL) experiment. Each row is for a single listener, s̄ and D. Error bars are two standard deviations (n − 1 = 2 weight) in overall length. Dashed lines connect the experimental points.

The procedure for assigning model parameters was simple. We assumed that the width should be determined by the large-N limit of s̄, i.e., N = 24 in Table I. When the model width is small it is equal to s̄(24); when the width is not small, it must be taken to be somewhat larger than the experimental s̄(24) in order that s̄(N) agrees with experiment in the limit that N = 24. We also determined the bias parameter from the experimental constant error, C(24), in Table I. As a measure of bias, this constant error approximately agreed with the vertical shifts seen in plots of R(k). For example, C(24) for listener W in Table I is 3.29°. This agrees with R(k) in Fig. 7(a), which suggests a central bias averaging 1.5–2.0 divisions, or 3°–4°. Therefore, the nature of the comparison was to determine the model parameters from the width and estimated bias for N = 24 and to compare the model predictions, for both s̄ and D, with the experimental results for N = 3, N = 6, and N = 12. The model parameters are shown in the right two columns in Table I.

The comparisons between calculations and the EL experiments are shown in Fig. 8. The comparisons show that the model is in reasonable numerical agreement with experiment, even though the parameters were not chosen to provide an optimum fit. Further, the model captures a number of features seen in the experiments: There is a tendency for a peak in s̄ and D as a function of N when the width is small. However, theoretically the peak is less prominent for D than for s̄, and experimentally no significant peak appears in D.

The comparisons between calculations and the DL experiments are shown in Fig. 9. In the DL experiments the width is large. For large width, theory and experiment agree that there is no peak for 3

FIG. 9. Same as Fig. 8 but for the difficult localization (DL) experiment.
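The dependence of D and s̄ on the number of sources can be explored with a small Monte Carlo sketch. This is not the calculation behind Figs. 8 and 9. It assumes a Gaussian internal distribution, a constant bias added to the internal mean, a nearest-source decision rule, and a terminated span in which values beyond the end of the array are assigned to the end source. The function name `simulate_sim` and its parameters are hypothetical.

```python
import math
import random


def simulate_sim(n_sources, span_deg, width_deg, bias_deg=0.0,
                 trials_per_source=2000, seed=0):
    """Return (D, s_bar) in degrees for a terminated-span model (a sketch).

    D is the rms error and s_bar the rms variability, both averaged over
    the source array. The Gaussian internal distribution and the
    nearest-source rule are assumptions of this sketch.
    """
    rng = random.Random(seed)
    spacing = span_deg / (n_sources - 1)
    sum_sq_error = 0.0
    sum_variance = 0.0
    for k in range(n_sources):
        target = k * spacing
        responses = []
        for _ in range(trials_per_source):
            internal = rng.gauss(target + bias_deg, width_deg)
            # Nearest source, clamped to the ends of the array (terminated span).
            nearest = math.floor(internal / spacing + 0.5)
            nearest = min(max(nearest, 0), n_sources - 1)
            responses.append(nearest * spacing)
        mean_response = sum(responses) / trials_per_source
        sum_sq_error += sum((r - target) ** 2 for r in responses) / trials_per_source
        sum_variance += sum((r - mean_response) ** 2 for r in responses) / trials_per_source
    return math.sqrt(sum_sq_error / n_sources), math.sqrt(sum_variance / n_sources)


if __name__ == "__main__":
    # Array spans from the apparatus description; width and bias are the
    # model parameters listed for listener W in the EL experiment.
    spans = {3: 46.0, 6: 40.0, 12: 44.0, 24: 46.0}
    for n, span in spans.items():
        d, s_bar = simulate_sim(n, span, width_deg=1.43, bias_deg=3.29)
        print(f"N={n:2d}  D={d:5.2f}  s_bar={s_bar:5.2f}")
```

With a very large width the sketch should approach the extreme-response limits derived in the Appendix, and with a vanishing width and no bias both statistics should go to zero, which gives two simple sanity checks.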

IV. CONCLUSIONS

The source-identification method (SIM) is a standard technique used to measure the ability to localize a sound. The method uses an array of source positions, which is particularly useful when there is reason to expect that the perception of any one source would be special. Such conditions occur in rooms. The experimental data from this method are in the form of variability (theoretically insensitive to bias) and rms error (includes both variability and bias). The method can be analyzed with a decision theory model based on a coordinate system imagined to be internal to the listener. Sources from the physical world lead to distributions of localization cues on this internal coordinate, characterized by a model width and a displacement bias of the mean. A similar model was used to analyze the minimum audible angle method (Hartmann and Rakerd, 1989).

Calculations are simplest for a terminated-span model. Here, the array is short enough that points on the internal coordinate that are to the left of the leftmost source must be assigned to the leftmost source; they do not wrap around and become confused with positions on the right. Terminated-span calculations find that if bias is negligible, both the rms error and the variability can provide good estimates of the average width of the internal distribution if there are enough sources in the array. The results are very insensitive to the number of sources if the spacing between the sources is less than or approximately equal to the width. The variability appears to be a good measure of the width even in the presence of bias if the bias is smaller than the width.

Bias that is larger than the width, a frequent occurrence, complicates the relationship between experimental results and the parameters of the internal distribution. The variability (not the rms error) may still be a reliable measure of the width if the number of sources is large enough. To determine the required number of sources, one must model the bias in some way and fit the experimental data to width and bias model parameters. Two simple bias models, constant and central, were found to give similar results.

When the angular span of the model is not terminated, probabilities are wrapped around a complete circle. Calculations indicate that the variability continues to provide a good measure of the internal width, as long as the width is not greater than 20% of the span. When the angular span of the actual sources is wrapped beyond 180°, source localization becomes a multidimensional perceptual problem, and the perceptual distance between two sources is not a monotonic function of the azimuth difference. Therefore, our one-dimensional model is not applicable. Applying the model anyway reveals complicated effects that occur when localization scores are given the benefit of a front-to-back reversal. Similar effects are expected to occur independent of the model.

Finally, experiments with human listeners were done in order to test the model calculations. The experiments used a small span in which the number of sources varied from 3 to 24. To provide a stringent test, both easy localization (EL) and difficult localization (DL) experiments were done. The experiments were done in a reverberation room, and constant errors (biases) were a major component of the overall errors. It was found that the model gave a reasonable account of the experimental results, even though the model treatment of bias was simple. To improve on the methods used here would require a treatment of bias peculiar to each individual listener. The resulting model would lead to better agreement with experiment, at the cost of generality.

Because of its internal consistency and satisfactory experimental validation, the decision theory model in this article can serve as a guide to the design and analysis of source identification experiments. In the matter of experimental design, the model can determine the correct number of sources to use in an array, based on anticipated results. After the rms error and variability data are experimentally known, the model can be used, first to decide whether a reliable value of the width of the internal distribution can be determined from the data, and second to calculate the actual values of the width and the bias.

ACKNOWLEDGMENTS

This research was supported by Grant No. DC00181 from the NIDCD of the NIH. Additional funding was provided by the National Science Foundation, which supported Joseph Gaalaas through its Research Experience for Undergraduates grant to the MSU Department of Physics and Astronomy. We are grateful to Joy Hsu for her participation in this study. Steve Colburn, Barbara Shinn-Cunningham, and Raymond Dye made many useful comments on a previous version of this article.

APPENDIX: LIMITS OF HIGH UNCERTAINTY

In the limit of high uncertainty, the width of the internal distribution becomes large compared to the span. In the extreme uncertainty limit, there is a negligible probability that the internal representation of the source lies within the span of allowed responses. Therefore, terminated-span model calculations find that all responses become extreme responses. In the absence of bias, sources 1 and N are chosen equally. Then D² is given by summing the squared differences between the N sources and the extremes. The two extreme sums get the same weight (1/2), and they are, in fact, equal. Therefore,

$$D^2 = \frac{A^2}{N} \sum_{k=1}^{N} (k-1)^2 . \tag{A1}$$

The finite sum can be done, and

$$D^2 = A^2 \, \frac{(2N-1)(N-1)}{6} . \tag{A2}$$

Because the span is G = (N − 1)A,

$$D = \frac{G}{\sqrt{3}} \sqrt{1 + \frac{1}{2(N-1)}} . \tag{A3}$$

The second term inside the square root can be neglected when the number of sources becomes large; even if there are as few as four sources, dropping this term makes less than a 10% change in D.

In the limit that all responses are extreme responses, statistic s̄ can be calculated from the differences between the extremes and the mean. If there is no bias, the mean of the extremes is (N + 1)/2, and

$$\bar{s}^2 = A^2 \left[ \frac{1}{2} \left( 1 - \frac{N+1}{2} \right)^2 + \frac{1}{2} \left( N - \frac{N+1}{2} \right)^2 \right] , \tag{A4}$$

so that

$$\bar{s} = A \, \frac{N-1}{2} = \frac{G}{2} . \tag{A5}$$

The extreme response results for D and s̄ [Eqs. (A3) and (A5)] are the correct limits for the statistical technique used in this article as the uncertainty becomes infinite. However, these limits are unreasonable because listeners can achieve better performance by guessing randomly among the sources. Better large uncertainty limits are the random guessing limits calculated below. If the N sources are presented equally often, D is the square root of

$$D^2 = \frac{A^2}{N} \sum_{k'=1}^{N} \sum_{k=1}^{N} P(k'|k) \, (k-k')^2 , \tag{A6}$$

where P(k′|k) is the probability of choosing source k′ given that source k was presented. If listeners guess randomly then, in the absence of bias, they make each response k′ equally often, independent of the source k, and P(k′|k) = 1/N. The double sum can be done and

$$D^2 = A^2 \, \frac{(N-1)(N+1)}{6} , \tag{A7}$$

or, in terms of span G,

$$D = \frac{G}{\sqrt{6}} \sqrt{1 + \frac{2}{N-1}} . \tag{A8}$$

Equation (A8) is less than (A3) as expected. Similarly s̄ can be calculated from

$$\bar{s}^2 = \frac{A^2}{N} \sum_{k'=1}^{N} \sum_{k=1}^{N} P(k'|k) \, [k' - R(k)]^2 , \tag{A9}$$

where R(k) is the mean response given source k. In the random guessing limit and in the absence of bias, the mean response to source k is the mean location, independent of k, R(k) = (N + 1)/2. Therefore,

$$\bar{s}^2 = \frac{A^2}{N} \sum_{k'=1}^{N} \left[ k' - \frac{N+1}{2} \right]^2 . \tag{A10}$$

Doing the finite sum leads to

$$\bar{s}^2 = A^2 \, \frac{N^2-1}{12} , \tag{A11}$$

and in terms of span G,

$$\bar{s} = \frac{G}{\sqrt{12}} \sqrt{1 + \frac{2}{N-1}} . \tag{A12}$$

Equation (A12) is less than (A5) as expected. From Eqs. (6), (A8), and (A12), C = s̄ and the overall rms error is equally divided between variability and central bias.
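The closed-form limits of the Appendix can be checked numerically against the defining double sums. The sketch below evaluates Eqs. (A6) and (A9) directly for two response-probability tables, extreme responses and random guessing, and compares the results with Eqs. (A3), (A5), (A8), and (A12). The function names are illustrative and not part of the article.

```python
import math


def extreme_limits(n, span):
    """Closed forms for all-extreme responses: (D, s_bar) from Eqs. (A3), (A5)."""
    d = span / math.sqrt(3) * math.sqrt(1 + 1 / (2 * (n - 1)))
    return d, span / 2


def guessing_limits(n, span):
    """Closed forms for random guessing: (D, s_bar) from Eqs. (A8), (A12)."""
    factor = math.sqrt(1 + 2 / (n - 1))
    return span / math.sqrt(6) * factor, span / math.sqrt(12) * factor


def brute_force(n, span, prob):
    """Evaluate the double sums of Eqs. (A6) and (A9) for P(k'|k) = prob(kp, k, n)."""
    a = span / (n - 1)  # source spacing A, from G = (N - 1) A
    d2 = 0.0
    s2 = 0.0
    for k in range(1, n + 1):
        # R(k): mean response (in source-number units) given source k.
        mean_r = sum(prob(kp, k, n) * kp for kp in range(1, n + 1))
        for kp in range(1, n + 1):
            p = prob(kp, k, n)
            d2 += p * (k - kp) ** 2
            s2 += p * (kp - mean_r) ** 2
    return a * math.sqrt(d2 / n), a * math.sqrt(s2 / n)


def extreme_prob(kp, k, n):
    """Sources 1 and N are each chosen with probability 1/2, whatever k is."""
    return 0.5 if kp in (1, n) else 0.0


def random_prob(kp, k, n):
    """Every response is equally likely, whatever k is."""
    return 1.0 / n


if __name__ == "__main__":
    for n in (3, 6, 12, 24):
        print(n, brute_force(n, 46.0, extreme_prob), extreme_limits(n, 46.0))
        print(n, brute_force(n, 46.0, random_prob), guessing_limits(n, 46.0))
```

In the random guessing limit the sketch also shows that D² = 2s̄², which is the numerical counterpart of the statement that C = s̄.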


1 This kind of bias, depending on the source, was called "sensory" bias by Hartmann and Rakerd (1989). Mathematically, it behaves similarly to the "response bias" introduced by Braida and Durlach (1972), which, however, is a function of the response and not the source.

2 Searle et al. (1976) concluded that the width of the internal distribution scales with the span of the sources. This conclusion paralleled the earlier discovery that the width for absolute identification of intensities scales with the range of intensities (Durlach and Braida, 1969; Braida and Durlach, 1972). A problem with this parallel is that the work by Searle et al. (also Shelton and Searle, 1978) failed to distinguish between width and bias. The more recent work by Koehnke and Durlach (1989), while not strictly involving localization, may have remedied that problem. That work found incomplete scaling, as predicted by Hartmann and Rakerd (1989).

Braida, L. D., and Durlach, N. I. (1972). "Intensity perception II. Resolution in one-interval paradigms," J. Acoust. Soc. Am. 51, 483–502.
Durlach, N. I., and Braida, L. D. (1969). "Intensity perception I. Preliminary theory of intensity resolution," J. Acoust. Soc. Am. 46, 372–383.
Hartmann, W. M. (1983a). "Localization of sound in rooms—The effect of a visual fixation," Proc. 11th ICA, 139–142.
Hartmann, W. M. (1983b). "Localization of sound in rooms," J. Acoust. Soc. Am. 74, 1380–1391.
Hartmann, W. M., and Rakerd, B. (1989). "On the minimum audible angle—A decision theory approach," J. Acoust. Soc. Am. 85, 2031–2041.
Koehnke, J., and Durlach, N. I. (1989). "Range effects in the identification of lateral position," J. Acoust. Soc. Am. 86, 1176–1178.
Rakerd, B., and Hartmann, W. M. (1985). "Localization of sound in rooms. II: The effect of a single reflecting surface," J. Acoust. Soc. Am. 78, 524–533.
Rakerd, B., and Hartmann, W. M. (1986). "Localization of sound in rooms. III: Onset and duration effects," J. Acoust. Soc. Am. 80, 1695–1706.
Searle, C. L., Braida, L. D., Cuddy, D. R., and Davis, M. F. (1975). "Binaural pinna disparity: another auditory localization cue," J. Acoust. Soc. Am. 57, 448–455.
Searle, C. L., Braida, L. D., Davis, M. F., and Colburn, H. S. (1976). "Model for sound localization," J. Acoust. Soc. Am. 60, 1164–1975.
Shelton, B. R., and Searle, C. L. (1978). "Two determinants of localization acuity in the horizontal plane," J. Acoust. Soc. Am. 64, 689–691.
Wightman, F. L., and Kistler, D. (1989). "Headphone simulation of free-field listening. II: Psychophysical validation," J. Acoust. Soc. Am. 85, 868–878.
Woodworth, R. S. (1938). Experimental Psychology (Holt, New York).
