Sound localization is fundamental to hearing. In nature, sound degradation and noise erode directional cues and can generate conflicting directional perceptions across different subcomponents of sounds. Little is known about how sound localization is achieved in the face of conflicting directional cues in non-human animals, although this is relevant for many species in which sound localization in noisy conditions mediates mate finding or predator avoidance. We studied the effects of conflicting directional cues in male grasshoppers, Chorthippus biguttulus, which orient towards signaling females. We presented playbacks varying in the number and temporal position of song syllables providing directional cues in the form of either time or amplitude differences between two speakers. Males oriented towards the speaker broadcasting a greater number of leading or louder syllables. For a given number of syllables providing directional information, syllables with timing differences at the beginning of the song were weighted most heavily, while syllables with intensity differences were weighted most heavily when they were in the middle of the song. When timing and intensity cues conflicted, the magnitude and temporal position of each cue determined their relative influence on lateralization, and males sometimes quickly corrected their directional responses. We discuss our findings with respect to similar results from humans.
Communication involved in mate selection requires localization so that potential mates can find one another. Localization of acoustic signals is particularly challenging because they are often transient and broadcast over long distances (Bradbury and Vehrencamp, 2011). Even in pristine acoustic conditions, sound localization is a difficult task for small-bodied animals because the short distance between the ears diminishes the magnitude of physical differences in the timing and intensity of signals arriving at the two ears (Gerhardt and Huber, 2002; Robert, 2005). However, most natural animal communication occurs in environments in which background noise and sound degradation are prevalent (Brumm, 2013). This reduces the quality of directional information in signals reaching the receiver, sometimes to the extent of obscuring or even reversing the perceived sound location (Gilbert and Elsner, 2000; Kostarakos and Römer, 2010; Michelsen and Rohrseitz, 1997; Römer, 2015). Furthermore, neuronal responses are not perfectly consistent, even in response to the same stimuli (Faisal et al., 2008; Neuhofer et al., 2011; Vogel et al., 2005). In principle therefore, different directional perceptions could arise from an acoustic signal broadcast from a single location.
Many animal sounds (e.g. ‘pulsed’ or ‘trilled’ songs) consist of repetitions of the same acoustic subunit (Mowles and Ord, 2012; Payne and Pagel, 1997; Price, 2013). This repetition may in part serve to aid receivers in localization because each repetition provides separate directionality cues (Ronacher and Krahe, 2000). However, because of the various sources of noise, these multiple cues may sometimes provide conflicting information, and little is known about how animals weight and accumulate directional information across multiple components of a signal. In general, how decisions are made based on temporally variable and conflicting evidence is a major question in biology, psychology and economics (Gold and Shadlen, 2007; Wyart and Koechlin, 2016). Studies of animals with clearly defined behavior and a relatively well-understood sensory system lend themselves well to addressing this challenge.
One solution to the problem of integration of conflicting directional information would be to simply weight each acoustic subunit equally and combine the evidence to arrive at a decision, which is an optimal strategy in stable (i.e. non-conflicting) environments (Bogacz et al., 2006). However, there are several reasons why this could be a poor solution in the context of mate localization. First, there is often high competition for mates, so animals that detect a high-quality mate must localize it quickly before another individual does (Corcobado et al., 2010; Danielson-François et al., 2012). Fast mate localization also reduces predation risk (Bonachea and Ryan, 2011; Magnhagen, 1991). Accordingly, models of decision making with temporal integration demonstrate that increased urgency to decision making results in heavier weighting of early arriving information (Carland et al., 2015). Second, acoustic signals are often not of a predictable duration. Therefore, animals that delay their decision too long risk losing critical directional information, particularly if directional cue integration is leaky (Glaze et al., 2015). Together, this suggests a trade-off between information gathering and speed of decision making (Bogacz et al., 2010; Chittka et al., 2009), in which we may expect that animals will act on less than the entire signal's worth of information.
In humans, a series of interrelated phenomena lead to directional cues accumulating with repetitions of acoustic subunits, but directional information is not weighted evenly across the course of the signal. The precedence effect arises when, for certain delays between a leading and a lagging sound source, these are perceived as a single sound localized in the direction of the leading source (Brown et al., 2015; Litovsky et al., 1999). Multiple repetitions of these sounds result in ‘buildup’ phenomena in which two sounds with even greater delay between them are perceived as originating from a single source (Clifton and Freyman, 1989; Freyman et al., 1991). There appear to be differences in the temporal weighting of the two primary types of directional cues: inter-aural time differences (ITDs) and inter-aural level (intensity) differences (ILDs). ITDs at the beginning of the stimulus are weighted more heavily than those at the end, and the very first repetition is especially important, a phenomenon known as onset dominance (Freyman et al., 1997; Saberi, 1996; Stecker and Hafter, 2002). Onset dominance may persist even when a single leading stimulus is followed by dozens of lagging stimuli (Freyman et al., 1997; Saberi and Perrott, 1995). ILDs at the beginning of the stimulus are also weighted heavily, although less so than ITDs (Brown and Stecker, 2010). Furthermore, for ILDs but not ITDs, the end of the stimulus is also weighted heavily in directional perception (Stecker et al., 2013).
We studied directional hearing in an insect species, the grasshopper Chorthippus biguttulus (Linnaeus). Although insects evolved hearing independently from vertebrates, comparisons between the two taxa are constructive in elucidating general principles of perception in complex environments even if the specific mechanisms involved are not homologous (Albert and Kozlov, 2016; Gerhardt and Huber, 2002; Manley, 2017). In C. biguttulus, acoustic signals are the primary mechanism for long-range mate attraction and localization. Males produce a calling song, and receptive females respond to attractive calling songs by producing a response song (von Helversen, 1997; von Helversen and von Helversen, 1997). Males use these response songs to localize females: through a series of turns and runs towards the female, interspersed with song exchanges, the male approaches the female and initiates close-range courtship (von Helversen and von Helversen, 1983, 1994). In C. biguttulus, pattern recognition seems to involve a neuronal summation of the inputs of the two ears, while directional information is processed in a parallel channel (von Helversen, 1984, 1997; von Helversen and von Helversen, 1995). Therefore, directional hearing can be studied using a two-speaker setup (Fig. 1A) in which time or intensity differences are introduced between the speakers, without confounding effects of stimulus attractiveness. Because of this parallel processing, the stimuli broadcast by the two speakers will be perceived by the animal as originating from a single source, therefore allowing us to simulate natural situations in which directional cues in female response song are conflicting because of environmental interference (Gilbert and Elsner, 2000; Kostarakos and Römer, 2010; Michelsen and Rohrseitz, 1997; Römer, 2015).
Sound localization in C. biguttulus involves lateralization: males do not pinpoint the exact angle of the sound, but they produce an unambiguous turning response to the left or the right (von Helversen, 1997). Despite their small body size, males are very precise at lateralization on the basis of both time and intensity cues (virtually perfect lateralization with a 2 dB difference between two speakers, or a 1.5 ms time difference between the two signals; see von Helversen and Rheinlaender, 1988). This high performance is achieved, as in other small insects, with a pressure-gradient receiver system that magnifies the intensity difference between the ears, and additional neuronal mechanisms that further enhance directionality cues (Hennig et al., 2004; Krahe and Ronacher, 1993; Michelsen and Rohrseitz, 1995; Michelsen et al., 1994; Wolf, 1986). Female songs are highly repetitive, consisting of multiple triangular-shaped pulses grouped into syllables, and the syllables themselves are repeated after a brief pause (Fig. 1B; von Helversen and von Helversen, 1997). This redundancy potentially provides multiple opportunities to accumulate directional information, which is important for localization given the low amplitude of female signals. However, most previous experiments were performed in quiet laboratory settings providing optimal conditions for detecting directional cues. In natural conditions, signal degradation and background noise levels are much higher and these may affect male lateralization performance (Michelsen and Rohrseitz, 1997; Reichert, 2015; Ronacher et al., 2000).
We designed a series of playback stimuli to investigate sound localization performance in C. biguttulus when directional cues are inconsistent within a signal. These behavioral data may give clues on how directional information is accumulated and weighted within the central nervous system. We addressed three major questions. (1) What is the effect of the number of repetitions that provide consistent directional information? (2) How is directional information weighted according to its temporal position in the song? Several pieces of evidence suggested that the beginning of the song may be particularly important. First, sexual selection likely favors rapid mate localization. Indeed, males may turn towards the female even before her song has finished (Kriegbaum and von Helversen, 1992; Ronacher et al., 2000), and males respond readily to truncated female songs (Ronacher and Hennig, 2004; Ronacher and Krahe, 1998). Second, behavioral data and neural modeling demonstrate that female C. biguttulus weight syllables at the beginning of a male song much more heavily than syllables later in the song when evaluating male attractiveness (Clemens et al., 2014). Similar processes may therefore be operating in males, albeit in a different context. (3) Do the answers to the above questions differ depending on whether the direction cue is a time or intensity difference? Furthermore, when both types of cue are present, but conflicting, within the same song, which plays a more important role in the male's directional response?
MATERIALS AND METHODS
Chorthippus biguttulus were caught in the field near Berlin in 2014 and 2015 (Germany: N 52°32′3.33″; E 13°40′23.01″) or were raised in the lab from collected eggs. Only males were used in the experiments. Males were group-housed separately from females at room temperature, and were fed with fresh grass and fish food flakes ad libitum. The experiments adhered to the ASAB/ABS Guidelines for the Use of Animals in Research and the current laws for animal care in Germany.
We performed experiments in an anechoic room that was heated to 30±2°C. Animals were tested on a table with a hand-held two-speaker system that could be aligned with the male's orientation such that speakers were at right angles to the male's longitudinal axis, each at a distance of 20 cm (Fig. 1; von Helversen and Rheinlaender, 1988). After each turning response or movement of the male, the experimenter readjusted the speaker system to maintain its alignment. The primary behavioral response measure was the stereotyped, unambiguous turning response exhibited by males of this species. Turning angles range between 50 and 150 deg, largely independent of the angle of the sound source (von Helversen, 1997). Song models were broadcast at 60 dB sound pressure level (SPL), measured with a Brüel and Kjaer (Nærum, Denmark) 2231 SPL meter and 4133 microphone bandpass filtered between 3 and 10 kHz, unless otherwise noted. The song itself was a digitally synthesized female C. biguttulus song containing 12 syllables of 6 pulses each (average pulse duration: 10.7 ms, pause between pulses: 3.7 ms, syllable duration: 82.8 ms, pause between syllables: 17.5 ms, total song duration: 1186 ms). Males are highly responsive to this song and turn towards it as they do to natural female songs (Reichert, 2015). The stimuli were stereo files in which we manipulated time and/or intensity differences on one or both channels, and each speaker was driven by a single channel. For each playback, we noted to which speaker the male turned. We frequently switched the stimuli between the left and right speaker to prevent side bias, but evaluated all male turns relative to the same reference speaker channel. Each male was exposed to 10 repetitions in a row of a given stimulus and its response recorded. Most males were tested with multiple different stimuli, presented in random order. 
Motivation to respond was tested prior to each set of stimuli by broadcasting the female song from a single speaker, to which males are normally highly responsive. Non-responsive males were returned to their cage and sometimes tested again later.
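The timing values given for the song model can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the synthesis procedure itself: the constant and function names are illustrative, and the only inputs are the durations quoted in the text. The syllable duration rebuilt from pulses (82.7 ms) differs slightly from the quoted 82.8 ms, which is consistent with the pulse values being rounded averages.

```python
# Timing values quoted in the text (all in milliseconds).
PULSES_PER_SYLLABLE = 6
SYLLABLES_PER_SONG = 12
PULSE_MS = 10.7
PULSE_PAUSE_MS = 3.7
SYLLABLE_MS = 82.8
SYLLABLE_PAUSE_MS = 17.5


def syllable_from_pulses():
    """Syllable duration rebuilt from 6 pulses and the 5 pauses between them."""
    return (PULSES_PER_SYLLABLE * PULSE_MS
            + (PULSES_PER_SYLLABLE - 1) * PULSE_PAUSE_MS)


def syllable_onsets():
    """Onset time of each syllable, measured from the start of the song."""
    period = SYLLABLE_MS + SYLLABLE_PAUSE_MS
    return [k * period for k in range(SYLLABLES_PER_SONG)]


def song_duration():
    """Total duration: 12 syllables plus the 11 pauses between them."""
    return (SYLLABLES_PER_SONG * SYLLABLE_MS
            + (SYLLABLES_PER_SONG - 1) * SYLLABLE_PAUSE_MS)


if __name__ == "__main__":
    print(round(syllable_from_pulses(), 1))   # close to the quoted 82.8 ms
    print(round(song_duration(), 1))          # close to the quoted 1186 ms
    print([round(t, 1) for t in syllable_onsets()])
```

The onsets are useful for relating a syllable index (e.g. syllables 5–8) to the time within the song at which a directional cue becomes available.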
We tested a large number of stimuli, which varied along the following dimensions: (1) the number of syllables with directional cues favoring the reference speaker (3, 4, 6, 9 or 12 of the 12 total syllables in the song); (2) the temporal position of syllables providing a directional cue favoring the reference speaker (denoted as the number of the first syllable in the song providing that directional information, and sometimes simplified as corresponding to the beginning, middle or end of the song); (3) the number and position of syllables providing either directional cues favoring the opposite speaker or providing no directional cues (i.e. simultaneous syllables of equal amplitude from the two speakers); and (4) the type of directional cue provided by each syllable (time or intensity difference). Not all combinations of all of these variables were used as stimuli: the specific stimuli used are described in the Results (see also Table S1 for a full description of the stimuli, the number of males tested and the total number of turning responses obtained for each stimulus).
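The four stimulus dimensions listed above can be captured in a compact per-syllable representation. The sketch below is only an illustration of that bookkeeping: the `Stimulus` class, the `make_stimulus` helper and the labels `ref`, `opp` and `neutral` are hypothetical names, not part of the original stimulus-generation procedure.

```python
from dataclasses import dataclass

N_SYLLABLES = 12
REF, OPP, NEUTRAL = "ref", "opp", "neutral"   # hypothetical labels


@dataclass
class Stimulus:
    cue_type: str   # "ITD" (time difference) or "ILD" (intensity difference)
    cues: list      # one label per syllable: which speaker the cue favors


def make_stimulus(cue_type, n_ref, first_ref, rest=NEUTRAL):
    """Build a 12-syllable stimulus description.

    n_ref     -- number of consecutive syllables favoring the reference speaker
    first_ref -- 1-based index of the first such syllable (as in the text)
    rest      -- label for all remaining syllables (neutral or opposite)
    """
    if cue_type not in ("ITD", "ILD"):
        raise ValueError("cue_type must be 'ITD' or 'ILD'")
    last_ref = first_ref + n_ref - 1
    if n_ref < 1 or first_ref < 1 or last_ref > N_SYLLABLES:
        raise ValueError("reference syllables must fit inside the song")
    cues = [REF if first_ref <= k <= last_ref else rest
            for k in range(1, N_SYLLABLES + 1)]
    return Stimulus(cue_type, cues)


# 4 ITD syllables in the middle of the song (syllables 5-8), rest neutral.
middle_itd = make_stimulus("ITD", 4, 5)
# Conflicting ILD stimulus: 3 syllables favor the reference speaker,
# the remaining 9 favor the opposite speaker.
conflict_ild = make_stimulus("ILD", 3, 1, rest=OPP)
```

Describing each stimulus this way makes it explicit that the number of cue syllables, their position, and the content of the remaining syllables vary independently.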
Grasshopper hearing system and background of the experimental design
We presented our stimuli in the free field. Thus, the sound from each speaker reached both ears. This experimental design differs from a common paradigm used in human studies in which stimuli varying in time and/or intensity are presented independently to each ear via earphones (Brown and Stecker, 2013). This raises the question of what were the actual ITD and ILD experienced by the animal for a given difference in time or intensity at the source.
To simulate an intensity difference, we generated stereo stimuli in which (with a few exceptions, see below) specific syllables were removed entirely from one channel of the recording. Because of the species' small body size, the external difference in amplitude between the sound arriving at each ear is small, but the effective amplitude difference between the ears is increased because the grasshopper hearing system is a pressure gradient receiver in which the two ears are internally coupled via tracheal sacs (Michelsen and Rohrseitz, 1995; von Helversen, 1997). For a speaker set at a right angle to the animal's longitudinal axis (as in our design, see Fig. 1A), the attenuation at the contralateral ear as a result of the pressure gradient mechanism is approximately 8 dB (von Helversen and Rheinlaender, 1988; Wolf, 1986). If, for example, a 12-syllable song was presented via the left speaker while on the right speaker the last 6 syllables were removed, the grasshopper would perceive in its right ear not a truncated song but instead a 12-syllable song whose last 6 syllables were 8 dB quieter than the first 6 syllables (i.e. there is an ILD of 8 dB for these syllables). We therefore refer to the speaker broadcasting all 12 syllables as the ‘louder’ speaker.
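The worked example in this paragraph can be restated numerically. The sketch below assumes a deliberately simplified rule that is not taken from the source: at each ear, the effective level of a syllable is set by the dominant speaker alone (the ipsilateral speaker at 60 dB if it is playing, otherwise the contralateral speaker attenuated by 8 dB), and summation of the two sources is ignored. Function names are illustrative.

```python
SOURCE_DB = 60.0              # broadcast level quoted in the text
CONTRA_ATTENUATION_DB = 8.0   # contralateral attenuation quoted in the text


def effective_level(ipsi_on, contra_on):
    """Effective level at one ear for one syllable (dominant source only).

    Returns None if neither speaker broadcasts the syllable.
    """
    if ipsi_on:
        return SOURCE_DB
    if contra_on:
        return SOURCE_DB - CONTRA_ATTENUATION_DB
    return None


def ild_profile(left_on, right_on):
    """Per-syllable ILD (left ear minus right ear) in dB."""
    profile = []
    for l_on, r_on in zip(left_on, right_on):
        left_ear = effective_level(l_on, r_on)
        right_ear = effective_level(r_on, l_on)
        if left_ear is None or right_ear is None:
            profile.append(None)
        else:
            profile.append(left_ear - right_ear)
    return profile


# Example from the text: all 12 syllables from the left speaker,
# last 6 syllables removed from the right speaker.
left_speaker = [True] * 12
right_speaker = [True] * 6 + [False] * 6
example_ild = ild_profile(left_speaker, right_speaker)
print(example_ild)
```

Under this simplification the first 6 syllables carry no ILD and the last 6 carry an 8 dB ILD favoring the left ('louder') speaker, matching the description above.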
To simulate a time difference, we generated stimuli in which all syllables were broadcast at the same intensity but specific syllables on one channel were delayed by 4 ms relative to those on the other channel. This corresponds to a unilateral stimulation during the first 4 ms. With respect to a single sound source, the external difference in time between the sound arriving at each ear is approximately 10 µs, which is too small to be resolved by the nervous system (Mörchen et al., 1978). However, as described above, the sound affecting the contralateral ear is attenuated by 8 dB. Because the onset of spiking in the tympanal nerve is intensity dependent (Krahe and Ronacher, 1993; Mörchen et al., 1978; Rheinlaender and Mörchen, 1979; Ronacher and Krahe, 1997), spiking in the contralateral ear in response to a 60 dB stimulus will only begin after an additional delay of 6–8 ms compared with the ipsilateral ear (Mörchen et al., 1978; Römer, 1976). This intensity-dependent latency difference results in the sound from the leading channel not interfering with the neuronal responses caused by the delayed sound on the lagging channel, such that spiking on the ear ipsilateral to the lagging speaker will occur at a delay of 4 ms relative to spiking on the ear ipsilateral to the leading speaker (von Helversen and Rheinlaender, 1988). Thus, the delay in spike onset between the two ears matches the delay in stimulus broadcast at the source (i.e. this paradigm generates an ITD of 4 ms). In this scenario, because the intensities of the stimuli on each channel are the same, there will be approximately equivalent excitation in terms of spike count from the two ears, and instead the difference in the onset of spike timing can be used by grasshoppers to infer sound direction, as demonstrated by von Helversen and Rheinlaender (1988; see also Rheinlaender and Mörchen, 1979).
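The latency argument above can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: each ear's first spike is triggered by whichever speaker drives it earliest, sound from the contralateral speaker incurs an additional intensity-dependent latency (6–8 ms in the text), and a common baseline latency (a hypothetical parameter here) applies to both ears and cancels out. The acoustic travel-time difference of roughly 10 µs is ignored.

```python
def first_spike_time(ipsi_onset_ms, contra_onset_ms,
                     contra_extra_ms, base_latency_ms=0.0):
    """Time of the first spike in one ear.

    The ear responds to the ipsilateral speaker after the baseline latency,
    and to the (attenuated) contralateral speaker after an extra delay.
    """
    via_ipsi = ipsi_onset_ms + base_latency_ms
    via_contra = contra_onset_ms + base_latency_ms + contra_extra_ms
    return min(via_ipsi, via_contra)


def spike_itd(delay_ms, contra_extra_ms, base_latency_ms=0.0):
    """Difference in spike onset between the two ears.

    The leading speaker starts at 0 ms and the lagging speaker at delay_ms.
    """
    lead_ear = first_spike_time(0.0, delay_ms, contra_extra_ms, base_latency_ms)
    lag_ear = first_spike_time(delay_ms, 0.0, contra_extra_ms, base_latency_ms)
    return lag_ear - lead_ear


if __name__ == "__main__":
    for extra in (6.0, 8.0):
        # Extra contralateral latency exceeds the 4 ms source delay,
        # so the neural ITD equals the source delay.
        print(extra, spike_itd(4.0, extra))
    # Hypothetical case: if the extra latency were shorter than the source
    # delay, the leading speaker would drive both ears first and the
    # neural ITD would shrink.
    print(2.0, spike_itd(4.0, 2.0))
```

The sketch illustrates why a 4 ms source delay is expected to translate into a 4 ms ITD only as long as the intensity-dependent contralateral latency exceeds the delay at the source.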
The aim of this study was to simulate the situation in which the directional cues in a female song were perceived as ambiguous or conflicting, and to determine whether weighting of these cues is similar for time and intensity differences. We chose physical time and intensity differences that were large enough that, in the absence of conflicting information, males could lateralize these stimuli with essentially no errors (Ronacher and Krahe, 1998, 2000; Ronacher et al., 1986). We could then contrast this nearly perfect performance with what we predicted to be less consistent performance when directional cues conflicted. Furthermore, the magnitude of the time and intensity differences at the source was chosen so that the timing or intensity cue, respectively, would dominate the directional response, as described above. Although we acknowledge that our experimental paradigm to present ITD and ILD cues differs from that most often used for larger animals, we nevertheless assert that the result of this paradigm is fundamentally the same: we generate actual inter-aural time and level differences, with known magnitudes. Therefore, the use of ‘ITD’ for stimuli with a time difference between the speakers and ‘ILD’ for stimuli with an intensity difference between the speakers is appropriate. There were three stimuli that differed from all other stimuli in this experiment because they involved songs that were broadcast from both speakers, with one speaker attenuated (but not silenced) but leading in time relative to the other speaker. In this case, we simply refer to these stimuli by their time and intensity differences at the source.
Male responses were quantified as the percentage of turns towards the reference speaker out of all turns during the 10 repetitions of each stimulus (occasionally males were inadvertently given more than 10 repetitions of a stimulus, in which case we calculated this percentage out of the total number of stimulus presentations). For an identical signal presented via the two speakers, the null expectation is for males to turn with equal probability to the left or the right. Indeed, when we presented such a stimulus, on average males directed 53.7% of their turns to the left speaker, which did not differ from the null expectation of 50% (Wilcoxon signed-rank test, V=300, P=0.6, N=35 males). Directional cues induce males to bias their turns towards one of the speakers; our analyses involved comparing this bias across stimuli. Males do not always turn towards a speaker: sometimes they remain in place and sing, or move in a forward direction (von Helversen, 1997). We did not include these responses when calculating the percentage of turns towards a given speaker. In some cases, males turned towards one speaker, and then turned towards the opposite speaker before the stimulus broadcast was complete. These reversals were noted, but only the initial turn was included in the calculation of the percentage of turns to the reference speaker. Reversals usually occurred in response to stimuli in which the directional cues favoring one speaker switched to favor the other speaker (see below). We analyzed the effects of the number, position and type of directional cue (where applicable, depending on the experiment) on the proportion of male turns towards the reference speaker using generalized estimating equations. The number of turns towards the reference speaker was coded as a binary variable referenced to the total number of turns to either speaker. Male identity was entered as a random effect because most males were tested on more than one stimulus. All other variables and their interactions were entered as fixed effects. We performed these analyses using the geeglm function, with an exchangeable correlation structure, from the geepack package version 1.2-1 (Halekoh et al., 2006) in R version 3.5.2 software (http://www.R-project.org/). All statistical analyses were two tailed and performed with α=0.05.
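The response measure and the signed-rank statistic can be illustrated as follows. This is a minimal sketch, not the analysis code used in the study: the response labels `ref`, `opp` and `none` are hypothetical, and V is assumed to be the sum of the ranks of the positive differences from the null value, which is how the statistic labelled V is defined in R's `wilcox.test`. The example values are made up purely to exercise the functions.

```python
def percent_to_reference(responses):
    """Percentage of turns towards the reference speaker.

    responses -- initial response on each presentation: 'ref', 'opp', or
                 'none' (male stayed in place, sang, or moved forward).
    Non-turns are excluded; returns None if the male never turned.
    """
    turns = [r for r in responses if r in ("ref", "opp")]
    if not turns:
        return None
    return 100.0 * turns.count("ref") / len(turns)


def wilcoxon_v(values, mu=50.0):
    """Sum of ranks of positive differences from mu (ties get mean ranks)."""
    diffs = sorted((v - mu for v in values if v != mu), key=abs)
    v_stat = 0.0
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        mean_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        v_stat += mean_rank * sum(1 for d in diffs[i:j] if d > 0)
        i = j
    return v_stat


def null_expected_v(n):
    """Expected V under the null hypothesis for n non-zero differences."""
    return n * (n + 1) / 4.0


if __name__ == "__main__":
    # Illustrative, made-up per-male percentages (not data from the study).
    example = [60.0, 40.0, 70.0, 50.0, 80.0]
    print(wilcoxon_v(example))
    print(null_expected_v(35))
```

For 35 males with no zero differences, the null expectation of V is 315, so the reported V=300 for the identical-signal control lies close to the value expected when turns are unbiased.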
RESULTS
Effects of syllable number and position on ITD lateralization
We tested male lateralization towards stimuli consisting of 12 syllables in which some syllables (3, 4, 6 or 9) showed a lead–lag relationship and the remaining syllables were simultaneous. The syllables with timing differences were placed at either the beginning or end of the song. Because male directional responses were expected to be biased towards the speaker broadcasting leading syllables, we used this speaker as the reference speaker in analyses. We found a significant interaction between the effects of the number of leading syllables and their position within the song (Table 1, Fig. 2A). For a given number of leading syllables, those syllables were more effective at the beginning of the song than at the end. Males were more likely to turn towards the reference speaker as that speaker broadcast more leading syllables, but this effect was more linear for leading syllables at the end of the song (Fig. 2A). When we presented males (N=44) with a song in which all syllables were leading from one speaker, 97.9% of turns were towards the leading speaker.
For songs with 4 ITD syllables, we also tested a stimulus in which the ITD was on syllables 5–8 (i.e. the middle of the song), and compared responses to this stimulus with responses to the same number of syllables at the beginning or end of the song. We used generalized estimating equation (GEE) analysis with an ordinal factor corresponding to position in the song, which allowed us to test whether the pattern of responses was linear or quadratic across the temporal position of the ITD syllables. A significant linear effect would indicate that responses were strongest to ITD syllables at one end of the song and weakest to those at the other end, with the sign of the coefficient determining whether the beginning or ending of the song elicited a greater bias in turning. A significant quadratic effect would indicate that the ITD syllables in the middle of the song elicited either a greater (negative coefficient) or lesser (positive coefficient) bias in turning than ITD syllables at either end of the song. We found a significant linear (effect estimate=−1.06, χ²₂=11.44, P<0.001) but not quadratic (effect estimate=0.12, χ²₂=0.19, P=0.66) effect of position, and the linear coefficient was negative, demonstrating that ITD syllables at the beginning of the song were most effective and those at the end of the song were least effective (Fig. 3A).
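The sign conventions for the linear and quadratic effects can be checked with a short calculation. The sketch below rests on assumptions not stated in the text: that the ordinal position factor (beginning, middle, end) was coded with orthogonal polynomial contrasts, as R does by default for ordered factors, and that the effect estimates are on the link scale of the model. It reports only the position-dependent deviation from the overall mean, because the intercept is not given.

```python
import math


def poly_contrasts_3():
    """Orthonormal linear and quadratic contrasts for 3 ordered levels."""
    linear = [x / math.sqrt(2.0) for x in (-1, 0, 1)]
    quadratic = [x / math.sqrt(6.0) for x in (1, -2, 1)]
    return linear, quadratic


def position_effects(b_linear, b_quadratic):
    """Link-scale deviation from the mean at beginning, middle and end."""
    linear, quadratic = poly_contrasts_3()
    return [b_linear * l + b_quadratic * q
            for l, q in zip(linear, quadratic)]


if __name__ == "__main__":
    # Effect estimates reported in the text for 4 ITD syllables.
    beginning, middle, end = position_effects(-1.06, 0.12)
    print(round(beginning, 2), round(middle, 2), round(end, 2))
```

With these contrasts, a negative linear coefficient raises the predicted response at the beginning and lowers it at the end, and a negative quadratic coefficient would raise the middle relative to both ends, in line with the interpretation given above.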
Effects of syllable number and position on ILD lateralization
We tested male lateralization towards stimuli consisting of 12 syllables in which some syllables (3, 4 or 6) were missing from one speaker, effectively creating an ILD; the remaining syllables were simultaneous. As for ITDs, ILDs were placed at either the beginning or end of the song, and the reference speaker was the speaker that was not missing any syllables. Temporal position clearly had a different effect for ILD compared with ITD stimuli (Fig. 2). However, for ILD stimuli there was no significant interaction between the number of ILD syllables and their position, nor was there a main effect of ILD position (Table 1). There was, however, a significant effect of the number of ILD syllables: more syllables that were louder on one side of the male elicited more turns towards that side (Table 1, Fig. 2B).
Males (N=23) responded to a song in which all syllables were broadcast from only one speaker with a turn in 216 of 240 (90%) trials, and 100% of these turns were towards the speaker broadcasting syllables. When we presented males with a song in which, as above, all syllables were broadcast from one speaker, but the total stimulus duration was 3 syllables, males (N=12) responded with a turn in 60 out of 160 (37.5%) trials, and 96.7% of these were towards the speaker broadcasting syllables. Note that male orientation to a 3-syllable stimulus was substantially more accurate than when this same 3-syllable stimulus had an additional 9 syllables with non-directional information appended to it (i.e. the stimulus with three louder syllables at the beginning followed by 9 simultaneous syllables of equal amplitude from the two speakers in Fig. 2B, in which an average of 66% of turns were correct).
We tested a more extensive series of stimuli with ILD syllables in the middle of the song because our first trials suggested that, unlike for ITDs, the middle was more influential than the beginning or end of the song for ILDs. We tested a total of 10 stimuli: songs with 3 consecutive ILD syllables beginning at syllables 1, 4, 7 or 10, songs with 4 consecutive ILD syllables beginning at syllables 1, 5 or 9, and songs with 6 consecutive ILD syllables beginning at syllables 1, 4 or 7. We performed separate statistical analyses for each set of stimuli with the same number of ILD syllables because these syllables were positioned at different absolute locations within the song. In all cases, there was a significant quadratic negative effect (Table 2), confirming that ILDs in the middle elicit the greatest directional bias in the turning response (Fig. 3B).
Response to songs with conflicting ITDs
The stimuli discussed above were all consistent in the sense that directional information favored one speaker and the remaining simultaneous syllables were essentially neutral. We performed a second set of experiments in which we presented songs in which some ITD syllables were leading on one side and other ITD syllables were leading on the opposite side and asked whether the number and temporal position of the leading syllables from the reference speaker affected the likelihood of males turning towards that speaker. Defining the reference speaker is less straightforward because both speakers present leading syllables at some point in the song. For simplicity, we define the reference speaker as that which first presented leading syllables. The three stimuli began with 3, 6 or 9 leading syllables followed by 9, 6 or 3 lagging syllables, respectively, from the reference speaker. For the stimulus with 6 leading followed by 6 lagging syllables, if the temporal position of the syllables had no effect, then we predicted males would direct an equal proportion of turns towards each speaker because the total directional information is the same from each speaker. We therefore compared the turns towards the reference speaker to a null hypothesis of 50% turns using a Wilcoxon signed-rank test. Although turns were somewhat biased towards the speaker that first presented leading syllables, this did not differ significantly from the null expectation (Fig. 4A; V=100, N=18, P=0.1).
The remaining two stimuli are the inverse of each other: both contain 3 leading and 9 lagging syllables from one of the speakers, but in one case the leading syllables are at the beginning and in the other case they are at the end. We therefore compared the influence of the temporal position of the 3 leading syllables using a Wilcoxon signed-rank test (responses were paired; 18 subjects were tested with both stimuli). There was a difference in the response to these syllables, with 3 leading syllables at the beginning of the song being more influential than 3 leading syllables at the end of the song (Fig. 4A; V=80, N=18, P=0.02). Based on a linear regression, we extracted an equivalence point from Fig. 4A: 4.23 leading syllables at the beginning of the song are equivalent to 7.77 leading syllables at the end of the song. Males rarely reversed direction after turning initially towards the speaker that first broadcast leading syllables when the leading speaker was then switched within the song (2/25, 8/67 and 0/94 reversals/turns for stimuli with 3, 6 and 9 leading syllables at the beginning, respectively).
Response to songs with conflicting ILDs
As above, we performed a second set of experiments with ILD syllables in which some syllables were louder on one side and the remaining syllables were louder on the other side. The reference speaker was defined as that which first presented the louder syllables. The three stimuli began with 3, 6 or 9 louder syllables followed by 9, 6 or 3 missing syllables, respectively, from the reference speaker (Fig. 4B). These stimuli were analyzed as for the analogous ITD stimuli above. For the stimulus with 6 ILDs favoring each speaker, there was a significant bias towards the speaker that first broadcast louder syllables (Fig. 4B; Wilcoxon signed-rank test, V=200, N=23, P=0.004). In addition, 3 louder syllables at the beginning of the song were more influential than 3 louder syllables at the end of the song, when the remaining syllables were quieter (Fig. 4B; V=80, N=23, P=0.04). Based on a linear regression, we extracted an equivalence point: 5.09 louder syllables at the beginning of the song are equivalent to 6.91 louder syllables at the end of the song. Reversals of direction when the speaker switched after initial turns towards the speaker that first broadcast louder syllables were much more common for ILD stimuli than for ITD stimuli (25/39, 88/147 and 41/209 reversals/turns for stimuli with 3, 6 and 9 louder syllables at the beginning, respectively).
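The contrast in reversal frequency between the two cue types is easier to see as rates. The following sketch simply converts the reversal/turn counts reported for the conflicting ITD and ILD stimuli into percentages; the variable and function names are illustrative, and pooling across the three stimuli is a descriptive summary added here, not an analysis from the study.

```python
# (reversals, initial turns towards the first-favored speaker), keyed by the
# number of leading/louder syllables at the beginning of the song.
ITD_REVERSALS = {3: (2, 25), 6: (8, 67), 9: (0, 94)}
ILD_REVERSALS = {3: (25, 39), 6: (88, 147), 9: (41, 209)}


def reversal_rates(counts):
    """Percentage of initial turns that were followed by a reversal."""
    return {n: 100.0 * rev / turns for n, (rev, turns) in counts.items()}


def pooled_rate(counts):
    """Reversal percentage pooled over all stimuli of one cue type."""
    total_rev = sum(rev for rev, _ in counts.values())
    total_turns = sum(turns for _, turns in counts.values())
    return 100.0 * total_rev / total_turns


if __name__ == "__main__":
    for label, counts in (("ITD", ITD_REVERSALS), ("ILD", ILD_REVERSALS)):
        rates = {n: round(r, 1) for n, r in reversal_rates(counts).items()}
        print(label, rates, round(pooled_rate(counts), 1))
```

Pooled over the three stimuli, reversals followed roughly 5% of initial turns for ITD stimuli (10/186) and roughly 39% for ILD stimuli (154/395).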
Response when timing and intensity information conflict
We presented several stimuli in which timing information favored one speaker while intensity information favored the other. For three of these stimuli, all of the syllables from one speaker were leading, while all of the syllables from the other speaker were louder (note that for these, and only these, stimuli, we broadcast sound from both speakers, but the sound from one speaker was quieter than the sound from the other speaker). In all three combinations tested, turning was biased towards the speaker broadcasting the leading syllables (Table 3).
A second series of stimuli separated conflicting ITD and ILD syllables in time (i.e. a certain portion of the song contained ITD cues favoring one speaker, and the remaining portion contained ILD cues favoring the opposite speaker). Specifically, we tested male turning response to 12-syllable stimuli with 3, 6 or 9 ILD cues favoring one speaker (syllables only broadcast from that speaker, i.e. an 8 dB ILD), and 9, 6 or 3 ITD cues, respectively, favoring the opposite speaker (4 ms lead). We tested both orders: ILD cues at the song beginning and ILD cues at the song end. We used the speaker broadcasting the louder ILD cues as the reference. There was a significant effect of both the number of ILD syllables (effect estimate=0.615, χ²₂=107.4, P<0.001) and their temporal position in the song (effect estimate=−1.24, χ²₂=13.4, P<0.001); a non-significant interaction effect was removed from the model. More ILD syllables and ILD syllables at the beginning of the song elicited more turns towards the speaker broadcasting the louder syllables (Fig. 4C). The effect of temporal position was strongest for songs containing 6 ILD syllables favoring one speaker and 6 ITD syllables favoring the opposite speaker. Here, 6 ILD syllables at the song's beginning were very effective in biasing turns towards the louder speaker, while 6 ILD syllables at the song's end were far less influential (Fig. 4C). In contrast, 6 ITD syllables at the beginning had only a marginal effect on lateralization (triangle at 6 syllables; Fig. 4C).
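One way to read the two effect estimates is as odds ratios. This holds only if they are coefficients on the log-odds scale of a logistic-type model, which is an assumption here: this section states neither the link function nor how the predictors were coded. Under that assumption, a brief sketch of the conversion (the function name is hypothetical):

```python
import math


def odds_ratio(log_odds_estimate):
    """Convert a coefficient on the log-odds scale into an odds ratio."""
    return math.exp(log_odds_estimate)


# Effect estimates as reported in the text; treating them as log-odds
# coefficients is an ASSUMPTION, not something the text confirms.
estimate_n_ild_syllables = 0.615
estimate_temporal_position = -1.24

print(f"number of ILD syllables: odds ratio {odds_ratio(estimate_n_ild_syllables):.2f}")
print(f"temporal position: odds ratio {odds_ratio(estimate_temporal_position):.2f}")
```

On this reading, a positive estimate corresponds to an odds ratio above 1 (more turns towards the reference speaker) and a negative estimate to an odds ratio below 1.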
We investigated how male lateralization decisions are affected by inconsistent or conflicting directional cues within female songs. Because of sound degradation and neuronal and environmental noise, such cue conflict is likely in nature, and we provide one of the first experimental demonstrations of how non-human animals accumulate and weight conflicting cues to determine the direction of an incoming sound. The critical variables are the proportion of the song providing directional cues favoring one side over the other, and the position of those directional cues within the song. The position effects varied with the type of directional cue: time differences were most heavily weighted at the beginning of the song, while intensity differences were more heavily weighted when they were in the middle of the song. We discuss the implications of these results for mate localization in challenging acoustic environments, with respect to the neurobiology of directional hearing.
Time course of decision making
As for many decision-making processes, mate searching by lateralization of sound signals in C. biguttulus requires quick decisions to be made with often noisy or imperfect information. Speed of decision making is advantageous for males searching for females in competitive conditions (Kriegbaum and von Helversen, 1992). Our results indicate a trade-off between the speed of decision making and the accumulation of sufficient directional information. Four key findings illustrate the time course of directional decision making in C. biguttulus. (1) The accuracy of lateralization increased with the number of syllables consistently favoring one side. However, this effect tended to tail off at higher numbers of consistent syllables (Fig. 2). In addition, for the conflicting cue stimuli, syllables at the end were less effective in biasing turning (Fig. 4). Both results suggest that the decision is often fixed before the last third of the song. (2) Males regularly and accurately turn towards substantially truncated stimuli (i.e. a 3-syllable song; Ronacher and Hennig, 2004; Ronacher and Krahe, 1998; Ronacher et al., 2000). We presented a stimulus that contained the same total directional information as in the 3-syllable song, but was prolonged by many simultaneous syllables with no directional cues (Fig. 2B). For this ILD stimulus, the accuracy of male turning was strongly reduced (66.0% correct turns) compared with the response to the 3-syllable song (96.7% correct turns). This difference demonstrates that normally the decision is not fixed after just three syllables, and hence that additional syllables with ambiguous directional cues could induce lateralization errors. (3) When directional cues are absent at the beginning of the song, a small number of syllables with directional cues at the song's end still strongly biased turning (Fig. 2B). This indicates that individuals may postpone a decision when clear cues are not available. (4) Males frequently reversed direction in response to the conflicting ILD stimuli. This demonstrates that males immediately begin accumulating new directional information after initiation of a turning decision. This rapid error-correcting behavior may be critical for overcoming effects of sound degradation in natural habitats. Female response songs are low amplitude, and males that erroneously move away from females will have a reduced likelihood of hearing more female response songs.
Time and intensity cues are weighted differently
The temporal position of directional cues within a song had a strong impact on the extent to which they biased male turning. However, the two types of directional cues differed in where they were most effective within the song. Time differences were most effective at the very beginning of the song (Fig. 3A). This effect is in accord with a model of adaptive neural coding showing that the song beginning offers the strongest directional cues (Hildebrandt et al., 2009, 2015). Furthermore, a model of song feature weighting for female evaluation of male signals also found a key role for the song's beginning: unattractive song syllables were much more likely to suppress female responses when they were at the beginning of the song than later on in the song (Clemens et al., 2014). However, the integration time constants for females are much longer than the natural duration of male songs (Clemens et al., 2014, 2017), while males often made a decision to turn well before the end of the female song.
In contrast, intensity differences at the beginning were least effective in biasing male turning responses when the remaining syllables provided no directional cues (Fig. 3B). A likely explanation for this result is that a few louder syllables at the beginning of the song result in adaptation in the ipsilateral ear and associated AN2 neuron, which carries directional information to the brain (Krahe et al., 2002; Ronacher and Krahe, 1998; Stumpner and Ronacher, 1991, 1994). When the stimuli then switch to the broadcast of simultaneous syllables, these are actually perceived as louder on the opposite ear (even though they are actually of equal amplitude), thus favoring lateralization to this side (Hildebrandt et al., 2009, 2015). However, male turning was strongly biased towards ILD cues in the middle of the song, even though these were also followed by some simultaneous syllables (Fig. 3B). Together, our results suggest that the region between the fourth and sixth syllables (i.e. the early middle portion of the song) is particularly influential in driving directional decisions based on amplitude differences.
Interactions between timing and intensity cues determine the strength of contralateral inhibition, which is an important mechanism of directional information processing (Pollack, 1998; Römer and Krusch, 2000; Siegert et al., 2011; Wolf, 1986). When timing and intensity information conflict, greater magnitudes of one cue type are needed to elicit a directional response in the hearing system favoring that cue type (Rheinlaender and Mörchen, 1979). In our study, when one speaker broadcast syllables leading the other speaker by 4 ms, but which were 4 dB quieter, most turns were directed towards the leading speaker. However, when one speaker broadcast 4 ms leading syllables for part of the song but then was missing the remaining syllables (i.e. there was an 8 dB ILD favoring the opposite speaker for these syllables), these louder syllables were at least as effective as 4 ms leading syllables, and sometimes more effective, in biasing males' turns. The stronger effects of an 8 dB ILD were especially pronounced when there were 6 of these ILD syllables at the beginning of the song followed by 6 conflicting ITD syllables favoring the opposite speaker (Fig. 4C); 6 syllables with an 8 dB ILD at the end of the song, preceded by 6 ITD syllables favoring the opposite speaker, were much less influential (Fig. 4C). The differences in the temporal position effects of ITD and ILD cues suggest that interactions in not only the magnitude but also the relative timing of these cue types influence how they are coded and integrated by the auditory system.
Exceptional directional hearing abilities have been documented in many species, and some of the sensory mechanisms for determining the location of a sound source are shared across taxa (Popper and Fay, 2005). These include the precedence effect (Brown et al., 2015; Litovsky et al., 1999; Reichert, 2018), as well as mechanisms to amplify directional cues in small-bodied animals such as pressure gradient receivers and contralateral inhibition (Robert, 2005). These commonalities are not surprising, as most if not all acoustic species face the challenge of evaluating and locating sounds of interest amidst a complex background of competing sounds (Bee and Micheyl, 2008). Relatively few studies have examined how directional cues are integrated across a signal, despite the clear relevance of this task for sound localization in natural environments. The phenomenon of the buildup of the precedence effect, in which multiple repetitions of stimuli containing directional cues result in stronger directional perception, has been demonstrated in a few species (Dent and Dooling, 2003; Tolnai et al., 2014). We found a similar effect: C. biguttulus also responded with a greater directional bias when more repetitions of syllables containing directional cues were presented. However, further studies are necessary to determine whether this reflects a process analogous to the buildup of the precedence effect.
We found several similarities between the behavior of C. biguttulus and multiple aspects of human directional hearing. First, in the phenomenon of onset dominance, the very first stimulus repetition is especially influential and directional cues from this repetition can outweigh many subsequent cues with conflicting directional information (Freyman et al., 1997; Saberi, 1996; Stecker and Hafter, 2002). Correspondingly, we found that when the song contained conflicting directional information, syllables at the stimulus beginning generally biased turning more than syllables at the stimulus end. Second, several directional hearing phenomena operate differently depending on whether ITD or ILD cues are used. For instance, when repetitive clicks are presented to human listeners, there is a difference in the temporal weighting of individual clicks for ITD and ILD cues (Stecker et al., 2013). Both cues are heavily weighted at the beginning, but ILD cues are often weighted heavily at the offset of the stimulus as well. We also found differences in the temporal weighting of ITD and ILD stimuli (Fig. 3), although for C. biguttulus, ILD cues in the early middle portion of the song were most effective. In humans, the precedence effect may ‘break down’ after a sudden switch in the speakers (a phenomenon in which the listener now perceives both sound sources, followed by localization dominance of the new leading or louder speaker; Clifton, 1987). This breakdown occurs with ILD cues but not when ITD cues are used (Krumbholz and Nobbe, 2002). A breakdown of the precedence effect may be analogous to our finding that males sometimes reversed their turning response if directional information switched between the two speakers over the course of a song. Intriguingly, males were far more likely to show this reversal behavior for ILD than ITD stimuli.
The effectiveness of communication systems ultimately depends on how well individuals overcome constraints imposed by the inevitably less than optimal signal transmission conditions in natural environments (Endler, 1993). Conflicting directional information is one of many challenges faced by receivers of acoustic signals. Our experiments on male C. biguttulus demonstrated that sound localization is affected in complex ways by conflicting directional cues. These results raise questions about both the mechanism and function of the differential weighting of directional information in this species. While some responses of C. biguttulus resembled those of human listeners faced with similar acoustic challenges, further study is needed to determine whether these grasshoppers actually experience precedence effects and associated phenomena. Electrophysiological recordings from freely moving animals (e.g. Wolf, 1986) could address the neuronal mechanisms involved in directional information processing. In terms of ultimate function, it is worth investigating whether the response of males to conflicting directional cues is adaptive in the sense that it enables them to localize mates more efficiently. We suggest that heavy weighting of directional cues at the beginning of the stimulus, combined with the possibility of reversals, may allow males to move more quickly and accurately towards females, but this should be tested in natural populations. Additionally, although we have argued that directional information is likely to often be conflicting or ambiguous in natural settings, few studies have directly measured this (Kostarakos and Römer, 2010). To better understand the relationships between environment, behavior and sensory systems, and to determine the generality of our findings, additional studies of communication in realistic settings are needed.
Members of the Abteilung Verhaltensphysiologie assisted with animal rearing and field collections. Michael Rumpold assisted with the experiments. Hannah Haberkern and two anonymous reviewers provided helpful comments on previous drafts of this manuscript.
Conceptualization: M.S.R., B.R.; Methodology: M.S.R., B.R.; Formal analysis: M.S.R.; Investigation: M.S.R.; Resources: B.R.; Data curation: M.S.R.; Writing - original draft: M.S.R., B.R.; Writing - review & editing: M.S.R., B.R.; Supervision: B.R.; Funding acquisition: M.S.R., B.R.
Funding was provided by the US National Science Foundation International Research Fellowship Program (IRFP 1158968) to M.S.R., and grants from the Deutsche Forschungsgemeinschaft (RO 547/12-1) and Leibniz-Gemeinschaft (GENART project; SAW-2012-MfN-3) to B.R.
The authors declare no competing or financial interests.