ABSTRACT
The detection, identification and discrimination of sound signals in a large and noisy group of signalers are problems shared by many animals equipped with ears. While the signaling behavior of the sender may present several solutions, various properties of the sensory system in receivers may also reduce the amount of signal masking. We studied the effect of spatial release from masking, which refers to the fact that the spatial separation between the signaler and the masker can contribute to signal detection and discrimination. Except in a limited number of cases, the contribution of peripheral directionality or central nervous processing for spatial unmasking is not clear. We report the results of a study using a neurophysiological approach in two species of acoustic insects, whereby the activity of identified interneurons that either receive contralateral inhibitory input (crickets) or inhibit one another reciprocally in a bilateral pair (katydids) was examined. The analysis of the responses of a pair of omega neurons in katydids with reciprocal inhibition revealed that spatial separation of the masker from the signal facilitated signal detection by 19–20 dB with intact binaural hearing, but only by 2.5–7 dB in the monaural system, depending on the kind of analysis performed. The corresponding values for a behaviorally important interneuron of a field cricket (ascending neuron 1) were only 7.5 and 2.5 dB, respectively. We compare these values with those reported for hearing in vertebrates, and discuss the contribution of spatial release from masking to signal detection under real-world chorus conditions.
INTRODUCTION
Many animals communicate acoustically in large and noisy groups (choruses), where the detection, identification and discrimination of sound signals is a common problem. In humans, considerable research over the last several decades has been devoted to discovering how we solve the so-called ‘cocktail party problem’, which refers to the difficulty of perceiving speech in noisy social settings (Cherry, 1953; Bronkhorst, 2000; Bee and Micheyl, 2008). The perceptual task of animals when they communicate in large aggregations is rather similar. Impressive examples of such aggregations can be found in frog and insect choruses, the dawn and dusk choruses of songbirds, and flocking and colonial birds (Aubin and Jouventin, 1998; Brumm and Slabbekoorn, 2005; Gerhardt and Huber, 2002; Greenfield, 2002; Hulse, 2002; Langemann and Klump, 2005). Field measurements in such insect populations have demonstrated that receivers may be required to discriminate the individual calls of up to four nearby males and more than 10 others within hearing range, some of which are quite similar in amplitude (Römer and Krusch, 2000). As pointed out by Bee and Micheyl (2008), the perceptual task in animal aggregations may be even more complex than the human ‘cocktail party problem’, because heterospecific signalers often contribute significantly to background noise. In environments with high species diversity, such as nocturnal tropical rainforests, the airborne sound channel is shared by several species of anurans, an estimated 50 or more species of insects and a large number of bats (Diwakar and Balakrishnan, 2007; Ellinger and Hödl, 2003; Kalko et al., 1996; Lang et al., 2005; Schmidt et al., 2011). The auditory system thus must be able to segregate many irrelevant sound sources from a few biologically important ones (i.e. signals from conspecifics, sound cues from predators).
The selection pressure in such acoustic environments has resulted in adaptations among signalers and receivers (for reviews, see Brumm and Slabbekoorn, 2005; Brumm, 2013). Among different taxa, signalers may increase the sound pressure level (SPL) of calls under noisy conditions (the Lombard effect; Brumm and Zollinger, 2011), time their signals during periods of relative silence (Narins, 1982; Gogala and Riede, 1995), use multimodal/alternative signals (Higham and Hebets, 2013) or increase the signal duration/signal redundancy to counteract the effect of masking by noise. Receivers have also evolved adaptive solutions for this problem, some of which are remarkably similar among different taxa. These include the development of more selective auditory filters for sound (Schmidt and Römer, 2011; Schmidt et al., 2011), comodulation masking release (Klump, 1996; Buus, 1998) or gain control mechanisms (Pollack, 1988; Römer and Krusch, 2000) in order to detect biologically relevant signals or discriminate among signal variants. An additional mechanism is spatial release from masking; human receivers, for example, may experience an improvement in speech recognition when the speech signal and masking noise are spatially separated to some degree (Arbogast et al., 2002; Bregman, 1990; Freyman et al., 2001; Klump, 1996).
A basic requirement for the proper functioning of this mechanism is some degree of peripheral directionality of the auditory system. When the signal and masker are spatially separated, either interaural differences in the time of arrival (interaural time differences) or interaural intensity differences, or both, are created that can be used for spatial release from masking (SRM). Such interaural cues are not available when the signal and masker arrive from the same location. Interaural time and intensity differences are biophysical cues that result in discharge differences in the responses of auditory afferents. These may be further enhanced at the level of auditory interneurons by central nervous (inhibitory) interactions. The biophysical solutions for directional receivers in animals are quite diverse in both vertebrates and invertebrates (Michelsen, 1992, 1998; Michelsen and Larsen, 2008; Robert, 2005), but all provide the directional cues necessary for spatial release from masking. However, although humans and animals share the same fundamental problem of sound source segregation in noisy social environments, relatively little is known about spatial release from masking in the context of animal acoustic communication (Bee and Micheyl, 2008; Hulse, 2002; Ratnam and Feng, 1998; Schmidt and Römer, 2011). The amount of unmasking, i.e. the reduction in the masked threshold for signal detection obtained in either behavioral or neurophysiological approaches, varies from 0 to 12 dB in humans (Gilkey and Good, 1995; Litovsky, 2005; Santon, 1987), 9 to 30 dB in birds (Dent et al., 1997, 2009), 12 to 19 dB in pinnipeds (Holt and Schustermann, 2007) and 3 to 6 dB in treefrogs (Bee, 2007; Nityananda and Bee, 2012; Schwartz and Gerhardt, 1989).
In all of these behavioral approaches, and most neurophysiological ones, it is completely unknown how much of the observed values of SRM is due to the peripheral directionality of the ears, and how much can additionally be attributed to central nervous processing. In the present paper, therefore, we addressed this question and report the results of a neurophysiological approach taken in two species of acoustic insects, measuring the activity of identified interneurons that either receive contralateral inhibitory input or inhibit each other reciprocally. The study system chosen was a pair of identified first-order interneurons in a katydid and a cricket, for several reasons. The omega neuron in katydids and crickets is a local interneuron located in the prothoracic ganglion; it receives excitatory input from most receptors on the soma-ipsilateral side and induces reciprocal inhibition in its side-homologous counterpart (Selverston et al., 1985; reviewed in Hedwig and Pollack, 2008). Where such connectivity exists, one would expect strong directionality due to the enhancement of contralateral inhibition. As a consequence, we expected the amount of SRM in the binaural system to be high as compared with a monaural system in which any contralateral inhibition would be excluded by eliminating the opposite ear. To quantify the effect of spatial unmasking in crickets, we extracellularly recorded the action potential (AP) activity of the ascending neuron 1 (AN1). AN1 is the only ascending interneuron that forwards the species-specific information about the calling song from the prothoracic ganglion to the brain (Wohlers and Huber, 1982), and the difference in activity between these neurons appears to be essential for directional steering (Schildberger and Hörner, 1988). AN1 also receives contralateral inhibition from the omega neuron, but not from its counterpart on the opposite side of the prothoracic ganglion (Horseman and Huber, 1994a,b). 
Another difference between the cricket and katydid systems is the degree of peripheral directionality at the frequencies of conspecific signals, which is higher in katydids as compared with crickets (Michelsen, 1998; Rheinlaender and Römer, 1980). Thus we expected that the contribution of both peripheral directionality and central inhibitory connections for the total amount of SRM would be smaller in crickets.
MATERIALS AND METHODS
Katydids
Experiments with katydids were performed with adult male and female Mecopoda elongata Linnaeus 1758 (Orthoptera, Tettigoniidae, Mecopodinae) equivalent to ‘species S’ in Sismondo (1990). Animals were originally collected in the field in Malaysia (Ulu Gombak, Selangor, Kuala Lumpur) and were later reared in a laboratory culture at the Department of Zoology in Graz, Austria. They were kept at 27°C and 70% relative humidity on a 12 h:12 h light:dark cycle and fed ad libitum with fish food, oat flakes and fresh lettuce.
Neurophysiology
We recorded the extracellular activity of a pair of local auditory interneurons, the so-called omega neurons in the prothoracic ganglion. They receive excitatory input from most of the receptors of the soma-ipsilateral hearing organ, and reciprocal inhibitory input from their side-homologous counterpart (reviewed in Hedwig and Pollack, 2008). For details of the preparation and recording technique, which allowed recording of the activity of both cells simultaneously, see Wiese and Eilts (1985) and Römer and Krusch (2000). Recordings were obtained with extracellular tungsten electrodes (0.5–1.5 MΩ resistance) placed close to the crossing segments of the two cells (see Fig. 1A,B). Neurophysiological experiments with katydids and crickets were conducted at a temperature of 21°C in an anechoic chamber.
Acoustic stimulation and experimental procedure
As a model of the conspecific calling song, we used a digitized chirp that was originally recorded from a singing male using a ¼ inch microphone (type 2540, Larson Davis, Depew, NY, USA) and a sound level meter (CEL 414, Casella, Bedford, UK), at a sampling rate of 192 kHz. The chirp had a syllable rate of 55 Hz and a total duration of 270 ms.
We could not use the original nocturnal noise from the habitat of M. elongata as background in the playback, because M. elongata was always singing in the background in nocturnal recordings. To prevent a conspecific signal in the background from contributing to the neuronal response, we used background noise recorded in the nocturnal rainforest of Panama as a masker, which has similar spectral content and variation of heterospecific signals (Siegert et al., 2011). A segment of 200 s was used in a loop for continuous playback. Power spectra of sound signals were calculated in Cool Edit Pro using a Hanning window function and a fast Fourier transform size of 512 points.
The playback was controlled in Cool Edit Pro (Syntrillium Software, Phoenix, AZ, USA) driving an external sound card (Edirol FA-101, Roland, Tokyo, Japan). The amplitudes of the chirp and the masker were controlled by two separate channels of an attenuator (PA-5, Tucker Davis Technologies, Alachua, FL, USA) and a stereo amplifier (NAD 214, NAD Electronics, Pickering, ON, Canada). Signal and noise were broadcast by a pair of string-tweeters (EAS-10TH400A, Technics, Kadoma, Japan) with a flat frequency response between 200 Hz and 40 kHz. Because the frequency response of the speakers is limited to 40 kHz, this mimics a receiver listening to a conspecific from medium distances as a result of excess attenuation (Römer and Lewald, 1992). The spectra of the chirp and background noise are shown in Fig. 1C. The two loudspeakers were positioned perpendicular to the animals' longitudinal body axis. When signal and masker were presented from the same side, the speakers were positioned next to each other. The SPL of the signal was calibrated relative to 20 µPa at the position of the preparation by continuous playback of only the last syllable within the chirp (exhibiting the maximum amplitude). Similarly, the SPL of the masker was calibrated by continuous playback of only the peak amplitudes in the masker. For calibration, a condenser microphone with a flat frequency response characteristic between 4 and 48 kHz was used (LD 2540, Type 4133, Larson Davis). Calibration was carried out in ‘fast’ reading mode with the sound level meter CEL 414 attached to a filter unit (CEL-296).
Spatial release from masking experiment
First, the threshold of one omega neuron was determined for both the conspecific signal and the masker with soma-ipsilateral stimulation. The threshold for responding to the chirp was defined as the lowest SPL that elicited a neuronal response in five out of 10 presentations. The background noise used as masker contained peaks that were the result of heterospecific signals randomly spaced in time. The threshold for the masker was defined as the lowest SPL that elicited a response to each of these peaks during the first 20 s of the masker presentation. As a control, the response of the omega neuron was recorded in the situation without a masker. In consecutive experiments, the chirp was presented at 20 dB above threshold, and the SPL of the masker was varied in increments of 5 dB from a signal-to-noise ratio (SNR) of +20 dB (signal 20 dB more intense as compared with the masker) to −20 dB (masker 20 dB more intense as compared with signal). Each stimulus configuration was presented 25 times; the first five responses were excluded from the analysis to eliminate potential adaptation effects.
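The stimulus series described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the playback software used in the study: the function names and the example threshold of 40 dB SPL are hypothetical, and the masker level is simply taken as the signal level minus the SNR.

```python
def stimulus_series(chirp_threshold_db, signal_offset_db=20,
                    snr_max_db=20, snr_min_db=-20, step_db=5):
    """Return the chirp SPL and a list of (SNR, masker SPL) pairs in dB.

    The chirp is fixed at `signal_offset_db` above its response threshold;
    the masker level is varied so that SNR = chirp SPL - masker SPL.
    """
    signal_spl = chirp_threshold_db + signal_offset_db
    series = []
    snr = snr_max_db
    while snr >= snr_min_db:
        series.append((snr, signal_spl - snr))
        snr -= step_db
    return signal_spl, series


def analysed_responses(responses, n_discard=5):
    """Drop the first `n_discard` presentations (potential adaptation effects)."""
    return responses[n_discard:]
```

With a hypothetical chirp threshold of 40 dB SPL, `stimulus_series(40)` gives a chirp at 60 dB SPL and nine masker levels from 40 to 80 dB SPL; of the 25 presentations per configuration, `analysed_responses` keeps the last 20.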
After completion of these presentations of signal and masker from the same soma-ipsilateral side, the masker was shifted to the opposite side and all stimuli were repeated at the same SNRs. The difference in the SNR at the masked threshold (see below) gives the amount of SRM in the intact system, when both peripheral directionality and central neural processing contribute to spatial unmasking. To investigate the contribution of the peripheral directionality for SRM alone, the contralateral leg with the ear was cut off in each katydid. Such a manipulation eliminates not only the contralateral ear, but also one of the four potential inputs for sound pressure onto the ipsilateral posterior tympanal membrane and, thus, could change the directionality of the pressure difference receiver (Michelsen et al., 1994). However, the manipulation leaves the main contralateral input via the acoustic spiracle intact, and the contralateral tympanal membrane has only minor effects with respect to directionality at the calling song frequency (Michelsen et al., 1994). Thus, we can be confident that the results after the manipulation reliably reflect the effects of the peripheral directionality. In the monaural situation, all stimulus presentations were repeated at the same SNRs as in a binaural system.
Data analysis and statistics
Because of the differences in the amplitudes of APs (Fig. 1B), the activity of both omega cells could be analyzed separately with a custom-written script (courtesy of M. Hartbauer) in Spike 2 (Cambridge Electronic Design, Cambridge, UK). The analysis of masking was performed in two different ways. First, the response to a chirp was measured in a time window of 350 ms (average of 20 responses). At the same time, the response to the masker was analyzed during the time period between two chirp presentations (average of 20 chirp periods). We defined the masked threshold as the SNR where the activity in response to the chirp was equal to the response to the masker.
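A minimal sketch of this first threshold criterion is given below. It assumes that the responses were measured at discrete SNRs ordered from high to low, and it locates the SNR at which the chirp and masker responses are equal by linear interpolation between neighboring SNRs. The interpolation step and the function name are assumptions made for illustration; they are not taken from the Spike 2 script.

```python
def masked_threshold(snrs_db, chirp_resp, masker_resp):
    """SNR (dB) at which the chirp response equals the masker response.

    `snrs_db` must be ordered from the highest to the lowest SNR.
    Returns None if the chirp response never drops to the masker response.
    """
    diffs = [c - m for c, m in zip(chirp_resp, masker_resp)]
    if diffs and diffs[0] == 0:
        return float(snrs_db[0])
    for i in range(len(snrs_db) - 1):
        d0, d1 = diffs[i], diffs[i + 1]
        if d0 > 0 and d1 <= 0:
            # Linear interpolation of the zero crossing of (chirp - masker).
            frac = d0 / (d0 - d1)
            return snrs_db[i] + frac * (snrs_db[i + 1] - snrs_db[i])
    return None
```

For example, invented (not measured) responses of 10 and 6 spikes to the chirp and 4 and 8 spikes to the masker at SNRs of +5 and 0 dB place the masked threshold at 1.25 dB.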
In a second approach, we used signal detection theory to define the masked threshold (Green and Swets, 1966; Wiley, 2013). The activity of the omega neuron was used for the development of a rule-based neuronal burst-detector algorithm that scans brief time segments of 0.35 s for the occurrence of bursts related to chirp responses. Within this ‘scanning window’, the spike rate had to exceed the mean spike rate calculated over 40 s plus two times the s.d. for a minimum time of 150 ms to be considered a burst. Bursts were classified as hits when coinciding with a conspecific chirp (i.e. occurring in a time window of 350 ms after the trigger for a chirp). When no burst activity was detected within this time window, although a chirp had been broadcast, the signal was classified as a missed signal. Similarly, bursts were classified as false alarms when they occurred in the inter-chirp interval. The hit rate could amount to 100% when all 20 chirps resulted in a neuronal burst in the relevant time window. The rate of false alarms could be higher than 100%, because during the inter-chirp interval of ∼2 s, several masker-related events could induce a burst activity that was similar to the one induced by chirps.
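The burst criterion and the classification into hits, misses and false alarms can be illustrated with the following Python sketch. It is not the algorithm used in the study: the spike rate is estimated here from spike counts in fixed 50 ms bins (an assumption; the text does not state how the rate was estimated), the mean and s.d. are taken over whatever recording segment is passed in rather than a fixed 40 s, and a burst is assigned to a chirp by its onset time. All function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def detect_bursts(spike_times, duration_s, bin_s=0.05,
                  min_burst_s=0.15, n_sd=2.0):
    """Return (start, end) times of runs where the binned spike rate exceeds
    mean + n_sd * s.d. for at least `min_burst_s`."""
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        if t < 0:
            continue
        idx = int(t / bin_s)
        if idx < n_bins:
            counts[idx] += 1
    rates = [c / bin_s for c in counts]
    threshold = mean(rates) + n_sd * pstdev(rates)
    min_bins = int(round(min_burst_s / bin_s))
    bursts, run_start = [], None
    # The trailing sentinel closes a run that lasts until the final bin.
    for i, rate in enumerate(rates + [float("-inf")]):
        if rate > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_bins:
                bursts.append((run_start * bin_s, i * bin_s))
            run_start = None
    return bursts


def classify(bursts, chirp_onsets, window_s=0.35):
    """Rates of hits, misses and false alarms relative to the number of chirps."""
    def in_window(start):
        return any(on <= start < on + window_s for on in chirp_onsets)

    hits = sum(
        1 for on in chirp_onsets
        if any(on <= start < on + window_s for start, _ in bursts)
    )
    false_alarms = sum(1 for start, _ in bursts if not in_window(start))
    n = len(chirp_onsets)
    return {"hit_rate": hits / n,
            "miss_rate": (n - hits) / n,
            "false_alarm_rate": false_alarms / n}
```

Because false alarms are counted per burst but normalized by the number of chirps, the false alarm rate in this sketch can exceed 1, as described above for the inter-chirp interval.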
Crickets
Last instar male and female crickets (Gryllus bimaculatus de Geer 1773) were taken from the colony at the Institute of Zoology, University of Graz, Austria, and maintained on a 12 h:12 h light:dark cycle. Animals were fed on fresh water, oat flakes, fish food and a rearing concentrate (Nekton Grillenzuchtkonzentrat, Pforzheim, Germany) ad libitum. Both sexes were separated and were used for experiments approximately 1 week after their final moult. All experiments were performed at 21–23°C.
Acoustic stimulation and experimental procedure
We digitally synthesized models of male calling songs at a rate of 48,000 samples s−1 with chirps of four pulses of 23 ms in duration, separated by a 16 ms inter-pulse interval (chirp duration 140 ms), using Cool Edit Pro software (version 2.00; Syntrillium, Phoenix, AZ, USA). The carrier frequency was set at 4.9 kHz, which represents the frequency of highest sensitivity in G. bimaculatus. As a background masker, we synthesized an amplitude-modulated band-pass noise with a duration of 12 s by filtering a white noise sequence (low and high cut-off frequencies 2.5 and 9 kHz, respectively; for the spectra of signal and masker, see Fig. 1C). The envelope of the masker was modulated with a randomly fluctuating amplitude peak. The masker was presented in an endless loop that had no fixed temporal relationship to the conspecific signal. Conspecific signals were broadcast via a Raveland MHX 138 speaker that was placed at a distance of 40 cm to the preparation at an angle of 90 deg relative to the longitudinal body axis. The background masker was broadcast with Sinuslive NEO 13s speakers (Kaltenkirchen, Germany) at the same distance from the preparation, either directly next to the speaker broadcasting the conspecific signal from the ipsilateral side, or 180 deg opposite, from the contralateral side. Sound stimuli were presented using custom-made high-frequency amplifiers and a digital attenuator (PA5, Tucker Davis Technologies).
Prior to the masking experiment, we determined the response threshold for both the signal and the masker in each individual for ipsilateral and contralateral stimulus presentation. The threshold was defined as the lowest SPL that evoked at least one AP per syllable of the calling song, or at least one AP in response to a short 250 ms noise sequence with the highest amplitude in the noise loop. The remaining stimulus protocol was identical to that followed in experiments with katydids.
Neurophysiology
Extracellular AP activity of AN1 was recorded with hook electrodes attached to the circumoesophageal connective (for details of the preparation technique, see Stabel et al., 1989; for a recording example, see Fig. 1E). AP activity was amplified using a custom-made amplifier and digitized at 20 kHz using a CED 1401 plus data acquisition interface (Cambridge Electronic Design). Data were recorded to the hard disk of a computer using Spike 2 software (Cambridge Electronic Design). Neural recordings were also displayed on an oscilloscope and monitored through headphones.
For offline analysis, the AP activity of AN1 was separated from the activity of other cells (large APs in Fig. 1E from AN2) with a custom-written script (courtesy of M. Hartbauer) using amplitude and time course of APs in Spike 2 (4.0) (Cambridge Electronic Design). Such spike sorting resulted in spike distributions with no overlap between the AN1 and AN2 spikes, and a less than 3% overlap with the APs of other cells. We computed peristimulus time histograms (PSTHs; bin size 5 ms) for responses to 56 chirp repetitions in each stimulus situation. To quantify SRM, the modulation depth of each PSTH was calculated as the spike-rate difference between the maximum of the stimulus response during the first half of the PSTH and the average noise response in the time period starting 60 ms after the end of a chirp (Fig. 1F). The spike rate difference in responses to chirps at 65 dB SPL and the background noise response at threshold was defined as 100%; complete masking was defined when this spike rate difference decreased to 50%. In this way, AN1 responses were quantified for SNRs between +20 and −10 dB, when the masker was broadcast from either the same or the opposite direction as the signal. To determine the contribution of only the peripheral directionality to spatial unmasking, the contralateral tibia with the ear was cut off, and all measurements were repeated in the monaural system.
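The PSTH-based measure can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis script itself: spike times are taken relative to chirp onset, the PSTH window length is a free parameter (it is not specified in the text), and the noise response is averaged from 60 ms after the chirp end to the end of the window. Function names are hypothetical.

```python
def psth(trials, window_s, bin_s=0.005):
    """Mean spike rate (spikes/s) per bin across trials.

    `trials` is a list of spike-time lists, each relative to chirp onset (s).
    """
    n_bins = int(round(window_s / bin_s))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t < 0:
                continue
            idx = int(t / bin_s)
            if idx < n_bins:
                counts[idx] += 1
    return [c / (len(trials) * bin_s) for c in counts]


def modulation_depth(rates, chirp_end_s, bin_s=0.005, noise_delay_s=0.06):
    """Peak rate in the first half of the PSTH minus the mean rate in the
    noise period starting `noise_delay_s` after the end of the chirp."""
    peak = max(rates[:len(rates) // 2])
    noise_start = int(round((chirp_end_s + noise_delay_s) / bin_s))
    noise = rates[noise_start:]
    return peak - sum(noise) / len(noise)


def relative_depth(depth, reference_depth):
    """Modulation depth in % of the reference (chirp at 65 dB SPL, masker at
    threshold); the masked threshold corresponds to 50%."""
    return 100.0 * depth / reference_depth
```

With the 140 ms chirp used here, `modulation_depth(rates, chirp_end_s=0.14)` averages the noise response from 200 ms onward; a condition counts as masked once `relative_depth` falls to 50%.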
RESULTS
Katydids
When a masker was absent, the omega cell fired high-frequency bursts of APs in response to conspecific chirps that were presented every 2 s from the respective ipsilateral (soma) side (see representative recording example in Fig. 2A). When a masker was also presented on the ipsilateral side at a SNR of 0 dB, the cell fired strongly in response to the masker. This resulted in a reduction in the response to the chirp, and bursts of APs also occurred in response to the masker (Fig. 2B). However, when the masker was presented from the opposite side, the same omega cell was nearly unaffected by the contralateral masker, and showed only a few APs in response to prominent masker peaks (large spikes in Fig. 2C), so that the response to the chirps demonstrated a strong SRM. At the same time, the opposite cell (small spikes in the figure) fired strongly in response to the masker, which was ipsilateral for this cell. Finally, after the ear contralateral to the omega cell with the large APs had been eliminated, the degree of SRM was strongly reduced. In this case, the cell fired bursts of APs in response to both the chirps and the masker, although the latter was presented from the contralateral side (compare Fig. 2C and 2D). No small spikes were observed in Fig. 2D because the excitatory input to this cell had been eliminated.
Results of 10 preparations such as the one illustrated in Fig. 2 were quantified for SNRs ranging from +20 to −20 dB. Fig. 3 summarizes these results based on the response strength of the omega neuron to the chirp and the masker. In the binaural system, the presentation of the masker from the same direction as the signal had two effects: as the SNR decreased, the response to the chirp decreased and the response to the masker increased. At a SNR of +5 dB, the response strength to both was the same (Fig. 3A; predefined masked threshold). A shift of the masker to the contralateral side (Fig. 3B) had no effect on the response to the chirp, and resulted in a gradual decrease of the response with decreasing SNR. However, the response to the masker remained small at any SNR, so that the masked threshold was reached at a SNR of −14 dB. Thus, in the binaural system the spatial separation of signal and masker by 180 deg resulted in a SRM of 19 dB. After elimination of the opposite ear, the results were almost identical to those observed for the intact system when the signal and masker were presented from the same side, including the masked threshold at +5 dB (Fig. 3C). However, the shift of the masker to the contralateral side resulted in a gradual increase of the response to the masker with decreasing SNR, and a masked threshold at −2 dB. Thus, in the monaural system, the value for SRM was only 7 dB.
We also analyzed the omega cell responses using signal detection theory (Green and Swets, 1966; Wiley, 2013) to define the masked threshold. In the recording example shown in Fig. 2, the masker induced strong background activity when presented from the same side as the signal, including bursts of APs that could incorrectly be interpreted by the central nervous system as a signal (false alarm). The analysis of the probability of hits, misses and false alarms for the 10 preparations is summarized in Fig. 4. When the masker was broadcast from the same direction as the signal in the binaural system, the probability of hits started to decrease with decreasing SNR at +5 dB, while the rate of misses increased (Fig. 4A). At the same time, the probability of false alarms steadily increased, and was the same as the rate of hits at a SNR of +3 dB (predefined masked threshold). A further reduction of the SNR increased the probability of false alarms to more than 1.5, and reduced the rate of hits to 0.57 at a SNR of −10 dB. With a shift of the masker by 180 deg to the contralateral side (Fig. 4B), the probability of hits was up to 1 for SNRs greater than −5 dB, and decreased as the SNRs declined. At the same time, the response to the masker did not influence the rate of misses and false alarms at SNRs down to −5 dB; these rates then increased until, at a SNR of −17 dB, the rate of false alarms matched the hit rate. Thus, in this analysis the spatial separation of signal and masker resulted in a SRM of 20 dB in the binaural system. In the monaural system, the value for the masked threshold (hit and false alarm rate being equal) was at +2.5 dB, almost identical to the binaural system (Fig. 4C), whereas the masker broadcast from the contralateral side resulted in a masked threshold at 0 dB. Thus, in the monaural system, the value for SRM was only 2.5 dB.
The quantitative values of SRM are summarized in Fig. 5. Based on the analysis of the response strength of the omega cell, the mean SRM in the binaural system is 19.17±4.9 dB (±s.d.). The corresponding values are significantly different in the monaural system (6.26±4.25 dB; two-tailed t-test, P<0.001). Based on the analysis of the rates of hits and false alarms, the mean SRM in the binaural system was 20.9±5.8 dB, which was again significantly different from that measured in the monaural system (4.05±2.7 dB; two-tailed t-test, P<0.001).
Crickets
Fig. 6A shows three PSTHs calculated for the ipsilateral AN1 at three SNRs of +10, 0 and −10 dB, when signal and masker were both broadcast from the same side. An increase of the masker level resulted in a reduction of the modulation depth in the PSTH for two reasons: on the one hand, there is an increase of the neuronal response to the increasing masker amplitude; on the other hand, the maximum stimulus response declined, so that even in a PSTH that averaged 56 responses, the temporal representation of the conspecific signal was almost lost at a SNR of −10 dB (Fig. 6A). The quantitative results of nine preparations are summarized in Fig. 6B,C. In binaural preparations, the modulation depth of PSTHs decreased as the SNRs decreased, and the predefined threshold of 50% modulation depth for the masked response occurred at a SNR of −3.0 dB. Presenting the masker from the contralateral side maintained maximal modulation depth values up to a SNR of +10 dB, before they decreased along with decreasing SNRs. In this spatially separated situation, the masked threshold was at a SNR of −10.5 dB (Fig. 6B). Thus the quantitative value for SRM in the binaural system (the difference in masked thresholds with ipsi- versus contralateral presentation of the masker) is 7.5 dB. In the monaural system, the degree to which the modulation depth decreased with decreasing SNRs was almost the same as in the binaural system when signal and masker were presented from the same, ipsilateral side (Fig. 6C). However, the spatial separation of signal and masker had only a minor effect. The quantitative values of SRM are summarized in Fig. 5. The mean SRM in the binaural system is 7.54±4.5 dB. The corresponding values are significantly different in the monaural system (3.49±1.9 dB; two-tailed t-test, P<0.05).
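As a worked check of the arithmetic, the SRM values quoted above follow directly from the masked thresholds for co-located and spatially separated presentation. The short Python sketch below uses only the group thresholds stated in the text; the dictionary layout and names are illustrative. Treating the binaural minus monaural difference as the share attributable to central inhibitory processing is an interpretation made for this example.

```python
def spatial_release(colocated_db, separated_db):
    """SRM (dB) = masked threshold with co-located masker minus masked
    threshold with the masker shifted to the contralateral side."""
    return colocated_db - separated_db


# (co-located, separated) masked thresholds in dB SNR, as quoted in the text.
THRESHOLDS = {
    ("katydid", "response strength", "binaural"): (5.0, -14.0),
    ("katydid", "response strength", "monaural"): (5.0, -2.0),
    ("katydid", "hits/false alarms", "binaural"): (3.0, -17.0),
    ("katydid", "hits/false alarms", "monaural"): (2.5, 0.0),
    ("cricket", "modulation depth", "binaural"): (-3.0, -10.5),
}

SRM = {key: spatial_release(*pair) for key, pair in THRESHOLDS.items()}

# Difference between binaural and monaural SRM (assumed here to reflect the
# contribution of central inhibitory interactions).
CENTRAL = {
    analysis: SRM[("katydid", analysis, "binaural")]
    - SRM[("katydid", analysis, "monaural")]
    for analysis in ("response strength", "hits/false alarms")
}
```

This reproduces the katydid values of 19 and 7 dB (response strength) and 20 and 2.5 dB (hits and false alarms), and the cricket binaural value of 7.5 dB.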
DISCUSSION
Spatial release from masking is one of several mechanisms that operate on the receiver side to augment communication under conditions of high noise levels. In humans and various other animal taxa, SRM has usually been studied with psychophysical methods, and the values measured for the increase in hearing performance obtained when spatially separating the signal from noise varied widely between 0 and 30 dB (Bee, 2007; Dent et al., 1997, 2009; Gilkey and Good, 1995; Holt and Schustermann, 2007; Litovsky, 2005; Nityananda and Bee, 2012; Santon, 1987; Schwartz and Gerhardt, 1989). Some of these differences may be attributed to the different behavioral tasks (signal detection or discrimination), while other differences may be related to the different amount of directionality inherent in the system under study, or the spectral and temporal features of signals and noise. In contrast, only a few studies have explored the underlying neural mechanisms of SRM. A central question in neurophysiological studies is how the relevant signal is represented in the auditory pathway and how its representation is affected by the masker. It is obvious from results of psychophysical experiments that signal detection and discrimination abilities are severely degraded in the monaural system (for a review, see Zurek, 1992). The goal of our study was, therefore, to advance our understanding of the neural mechanisms underlying SRM in insects. Specifically, we addressed the question of how much of the total amount of SRM can be attributed to peripheral directionality as compared with central nervous inhibitory interactions.
Spatial release from masking is not a solution that is present in all acoustic insects. In grasshoppers, information that allows pattern recognition and localization is processed in parallel (von Helversen, 1984; von Helversen and von Helversen, 1995). The input from both ears is pooled internally with the consequence that even when masker and signal are presented from opposite directions, the masker contributes to the masking of the signal in the pooled activity (for a behavioral confirmation, see Ronacher and Hoffmann, 2003). In contrast, serial processing has been proposed for crickets and katydids, where the information received by both ears remains separated in the afferent auditory pathway. This is a prerequisite for spatial unmasking, as the masker and the signal are then represented separately in two bilaterally paired neuronal networks (Doherty, 1985; Pollack, 1986; Stabel et al., 1989; Wendler, 1989; Schul et al., 1998; Römer and Krusch, 2000). Further evidence that supports this hypothesis has been provided by a recent study that examined the weighting of calling song cues for choice decisions in field crickets (Gabel et al., 2015).
Our results for the katydid indicate that signal detection improves substantially when the signal is spatially separated from the masker. As expected from the relatively strong degree of peripheral directionality, and the fact that the omega neurons inhibit each other reciprocally, SRM values of 19–20 dB were observed in the binaural system, compared with only 2.5–7 dB for monaural hearing, depending on the kind of analysis. In the cricket, comparable values are smaller, with 7.5 and 2.5 dB in the respective binaural and monaural systems, but these confirm that binaural interactions play an important role in SRM. In the few cases where this has been tested neurophysiologically in vertebrates, similar differences have been reported. For example, some auditory neurons in the frog torus semicircularis (a homolog of the inferior colliculus) demonstrated maximum masking release values of 9.4 dB, but only 2.9 dB for auditory nerve fibers (Lin and Feng, 2001). After eliminating the inhibitory action of GABA pharmacologically, the strength of spatial unmasking strongly decreased even for large angular separations of signal and masker to values obtained for auditory nerve fibers (Lin and Feng, 2003). For cells in the cat IC, maximal values for SRM of ∼20 dB have been found (Lane et al., 2005).
In katydids, an analysis of the activity of the mirror-image omega cells allowed us to follow the changes associated with a spatial shift of the masker on both sides of the auditory system. In the omega cell ipsilateral to the signal, the activity in response to the masker was strongly reduced after spatial separation of signal and masker (Fig. 3B). As in crickets (Selverston et al., 1985), the omega cells in katydids mutually inhibit one another (Molina and Stumpner, 2005). Such reciprocal inhibition is the most likely explanation for the fact that the masker-induced activity in one omega cell prevented most of the activity in the opposite cell, even when the SPL of the masker was high enough to be suprathreshold. Even more pronounced results were obtained in the analysis of the rates of hits and false alarms (Fig. 4B): spatial separation of signal and masker resulted in almost perfect rates of hits (100%) and false alarms (0%), although the masker was broadcast at an SPL 30 dB above threshold. However, strong masker-induced activity in the cell ipsilateral to the masker also decreased the response to the signal in the opposite cell at SNRs of −10 dB or less (Fig. 3B). Such a reduction of the response to the signal in the presence of a contralateral masker is less evident in the hits and false alarms analysis because, despite a small reduction in burst strength, the burst classifier still identified the burst as a response to the signal.
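How hit and false alarm rates follow from classified responses can be illustrated with a minimal Python sketch. This is not the burst detector used in the study, whose details are not described in this section. It assumes only that each analysis window has already been labeled as containing a signal or not, and as having triggered the classifier or not. The function name `hit_and_false_alarm_rates` and the toy data are hypothetical.

```python
def hit_and_false_alarm_rates(trials):
    """Return (hit_rate, false_alarm_rate) for a sequence of
    (signal_present, burst_detected) boolean pairs."""
    trials = list(trials)  # allow any iterable, since we scan it twice
    signal_windows = [detected for present, detected in trials if present]
    noise_windows = [detected for present, detected in trials if not present]
    if not signal_windows or not noise_windows:
        raise ValueError("need at least one signal window and one noise window")
    hit_rate = sum(signal_windows) / len(signal_windows)
    false_alarm_rate = sum(noise_windows) / len(noise_windows)
    return hit_rate, false_alarm_rate


# Toy data (hypothetical): every signal window detected, no noise window detected.
separated = [(True, True)] * 4 + [(False, False)] * 4
# Toy data (hypothetical): half the signals missed, one noise window misclassified.
colocated = [(True, True)] * 2 + [(True, False)] * 2 + [(False, True)] + [(False, False)] * 3

print(hit_and_false_alarm_rates(separated))
print(hit_and_false_alarm_rates(colocated))
```

A classifier of this kind scores a weakened burst as a hit as long as it is still detected. This is consistent with the observation above that a reduced response to the signal is less evident in this analysis than in the raw activity.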
Compared with the katydid, the value for SRM in the cricket binaural system was only 7.5 dB. This, however, is in the same range as values found for two species of rainforest crickets (6–9 dB), using nocturnal background noise as masker (Schmidt and Römer, 2011). Several reasons may account for the difference between the katydid and cricket SRM values. First, the peripheral directionality of the ear for the masker spectrum (2.5–9 kHz) was small, as indicated by the value of 2.5 dB for SRM in the monaural system. Moreover, AN1 in crickets is not inhibited by its mirror-image counterpart, but via the omega neuron, and although the inhibition induced by the omega neuron in AN1 is relatively strong, it appears to be less effective than the reciprocal inhibition observed in the pair of omega cells in katydids (Horseman and Huber, 1994a,b; Römer and Krusch, 2000). Finally, the different values for SRM between katydids and crickets may also arise from the different approaches taken during the analysis of the masked threshold. For crickets, we used as a criterion the neuronal representation of the four syllables in the chirp, which was based on averaged PSTHs, whereas in the katydid, the criterion was based on single burst responses to chirps. Despite the averaging process, the SNR of −10.5 dB at the masked threshold with spatial separation of signal and masker was higher in the cricket than the −19 to −20 dB in the katydid.
How much does the absolute value of SRM tell us about unmasking under natural conditions? Schmidt and Römer (2011) criticized the conventional method that has been used to determine SRM under laboratory conditions. Using neurophysiological methods, they investigated the effect of natural background noise on signal detection thresholds in a tropical cricket, both in the laboratory and in a natural setting, for the same individual and background noise. Displacing the masker by 180 deg from the signal in the laboratory improved SNRs by 6 dB, which is comparable to the value of 7.5 dB for the SRM in the present study. However, experiments carried out directly in the nocturnal rainforest yielded SNRs at the masked threshold of approximately −23 dB, which can be compared with those in the laboratory with the same masker, where average SNRs reached only −14.5 dB. Thus, the magnitude of SRM in the field was 13.5 dB. The authors argued that this was due to the fact that single-speaker masker playbacks do not properly reconstruct the noise situation in a spatially realistic way: in the natural habitat, multiple sound sources are distributed in space, rather than concentrated at a single point. In the present study, we also created these ‘unrealistic’ masker broadcast conditions and, therefore, the absolute values of SNRs at the masked threshold may overestimate the problem of masking in nature. However, the major aim in the present study was to investigate the contribution of peripheral directionality and central processing to masking, and we conclude that the differences between the binaural and monaural systems should reliably reflect the impact of central inhibitory interactions on spatial unmasking.
Acknowledgements
We thank M. Hartbauer for providing the burst detector algorithm, A. Schmidt and E. Schneider for helpful comments on an earlier draft of the manuscript, and S. Crockett for proofreading it. We also thank the two anonymous referees for their valuable comments and suggestions.
Footnotes
Author contributions
M.B., S.H. and H.R. designed the experiments. M.B. and S.H. conducted the experiments and analyzed the data on katydids and crickets, respectively. All authors contributed equally to drafting and writing the article.
Funding
This study was supported by the Austrian Science Foundation FWF, Project P23896-B24 and Project I1054-B25 to H.R.
Competing interests
The authors declare no competing or financial interests.