Bats are able to recognize and discriminate three-dimensional objects in complete darkness by analyzing the echoes of their ultrasonic emissions. Bats typically ensonify objects from different aspects to gain an internal representation of the three-dimensional object shape. Previous work suggests that, as a result, bats rely on the echo-acoustic analysis of spectral peaks and notches. Dependent on the aspect of ensonification, this spectral interference pattern changes over time in an object-specific manner. The speed with which the bats' auditory system can follow time-variant spectral interference patterns is unknown.
Here, we measured the detection thresholds for temporal variations in the spectral content of synthesized echolocation calls in the echolocating bat, Megaderma lyra. In a two-alternative, forced-choice procedure, bats were trained to discriminate synthesized echolocation-call sequences with time-variant spectral peaks or notches from echolocation-call sequences with invariant peaks or notches. Detection thresholds of the spectral modulations were measured by varying the modulation depth of the time-variant echolocation-call sequences for modulation rates ranging from 2 to 16 Hz. Both for spectral peaks and notches, modulation-detection thresholds were at a modulation depth of ∼11% of the centre frequency. Interestingly, thresholds were relatively independent of modulation rate. Acknowledging reservations about direct comparisons of active-acoustic and passive-acoustic auditory processing, the effectual sensitivity and modulation-rate independency of the obtained results indicate that the bats are well capable of tracking changes in the spectral composition of echoes reflected by complex objects from different angles.
Bats, dusk- and night-active animals, have to be able to orient in their environment with only sparse or no visual feedback. They produce short ultrasonic echolocation calls and evaluate the echoes produced by reflective surfaces. Echolocation allows bats to orient in space, localize objects and determine the distances of objects in complete darkness. Moreover, bats can identify the three-dimensional shape of objects in space on the basis of the auditory analysis of the echoes generated by the objects (Simmons et al., 1974; Schmidt, 1988).
In general, the intensity, temporal structure and spectral composition of an echo provide information about the object's size, shape and structure (Schmidt, 1988; Grunwald et al., 2004; Simon et al., 2006). When moving around an object, the echolocating bat will perceive amplitude and frequency modulations in the echoes' spectral envelopes that depend on the angle from which the object is ensonified. Stich and Winter (Stich and Winter, 2006) described this echo-acoustic perceptual experience as resembling a visual experience caused by so-called physical or metallic colours: due to spectral interferences, these colours change their appearance with the angle of illumination and observation.
von Helversen (von Helversen, 2004) showed that the bat Glossophaga soricina was able to discriminate two hollow forms, a hemisphere and a paraboloid with the same diameter and depth. Each object generated a spectral interference pattern with frequency peaks and notches that varied systematically with ensonification angle. This variation was highly specific to the object. It is conceivable that the bats solved this task by evaluating the changes in the peak and notch patterns in correlation with their movement around the objects (Moss and Surlykke, 2001; von Helversen and von Helversen, 2003).
Echo-acoustic analysis of complex surfaces is a challenging task for gleaning bats. These bats pick up prey from the ground. A well-studied species of gleaning bats is Megaderma lyra Geoffroy 1810, the great false vampire bat. For prey detection, M. lyra relies on prey-generated rustling noises (Neuweiler, 1990). To facilitate separation of prey objects from background, M. lyra employs short, multi-harmonic, downward modulated frequency sweeps as its echolocation calls (Schmidt et al., 2000). Complex objects of interest are typically ensonified from different aspects, and the aspect-dependent interference patterns of the perceived echoes provide important information about the three-dimensional shape of the ensonified object. In terms of auditory processing, this behaviour requires tracking changes of spectral interference patterns over time. This psychoacoustical study was designed to investigate the auditory sensitivity of M. lyra to changes in the position of spectral peaks and notches across a sequence of synthesized echolocation calls. These call sequences were generated to mimic the echoes as they would return from a three-dimensional object whose reflection characteristics change with ensonification angle. Thereby, we want to analyze the importance of time-variant spectral features for echo-acoustic object discrimination. Unlike previous studies, the changes of the peak and notch centre frequencies were time variant, varying sinusoidally with a certain modulation frequency. The bats' detection threshold for variations in the spectral envelope was measured by presenting a synthesized echolocation-call sequence filtered with time-variant filters.
MATERIALS AND METHODS
Experiment 1: time-variant peak detection
Four adult M. lyra, one male and three females, took part in the training. One of the female bats died within the data-acquisition period; thus, most of the data presented are from the remaining two females and one male. They were kept in a 12 m² room with free access to water. During training periods consisting of five consecutive days, the bats were fed with mealworms as a reward. Apart from the training rewards, the animals were fed one mouse per week.
All experiments were performed in an echo-attenuated chamber (3.5 m×2.2 m×2.2 m) with a wall foam coating. The setup consisted of a starting perch on one side of the room, ensuring a precise positioning of the bat, and two ultrasonic speakers, one in the left and one in the right hemifield. The two ultrasonic speakers were placed at the same distance and angle in each hemifield to the bat's starting position: the distance from the speakers to the bat's head was 1.2 m; the angle between the speakers and the bat's head was 90°. A feeding dish was placed below each speaker.
The source signal was a sequence of 17 synthesized echolocation calls. Each call was a multi-harmonic frequency sweep with a duration of 1.5 ms. The fundamental frequency swept from 23 to 19 kHz. Five harmonics were generated with attenuations of 30, 10, 5, 0 and 5 dB for harmonics one to five, respectively. The call was windowed with a raised-cosine window with a 0.2 ms rise time, 1.1 ms steady state and 0.2 ms decay time.
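The call parameters listed above can be turned into a waveform in a few lines. The following is a minimal sketch, not the authors' Matlab code: the linear sweep shape, the 260 kHz sampling rate (borrowed from the D/A converter described below) and the function name `synth_call` are assumptions made for illustration.

```python
import math

def synth_call(fs=260_000, dur=1.5e-3, f_start=23_000.0, f_end=19_000.0,
               atten_db=(30, 10, 5, 0, 5), ramp=0.2e-3):
    """Return one multi-harmonic downward sweep as a list of float samples."""
    n = round(fs * dur)        # 390 samples at 260 kHz
    n_ramp = round(fs * ramp)  # 52 samples per raised-cosine ramp
    samples = []
    for i in range(n):
        t = i / fs
        # Phase of the fundamental for a linear sweep (assumed sweep shape):
        # 2*pi times the integral of f(t) = f_start + (f_end - f_start)*t/dur
        phase = 2 * math.pi * (f_start * t
                               + 0.5 * (f_end - f_start) * t * t / dur)
        # Sum harmonics 1..5, each scaled by its attenuation in dB
        s = sum(10 ** (-a / 20) * math.sin(h * phase)
                for h, a in enumerate(atten_db, start=1))
        # Raised-cosine rise and decay with a flat steady state in between
        if i < n_ramp:
            w = 0.5 * (1 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:
            w = 0.5 * (1 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        else:
            w = 1.0
        samples.append(w * s)
    return samples

call = synth_call()
```

With these defaults the fifth harmonic starts at 115 kHz, which stays below the 130 kHz Nyquist limit of the assumed sampling rate.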
For the time-invariant echolocation-call sequence, a band-pass filter with a reference centre frequency (CF) of 60 kHz and a bandwidth of ±10% of the CF was applied to all 17 calls in the sequence. For the generation of the time-variant echolocation-call sequences, the CF of the band-pass (peak) filter was sinusoidally modulated around the reference CF along a log-frequency axis. The filter was designed as a finite-impulse-response, band-pass filter of order 62. The detection threshold for variations of spectral peaks was measured by varying the modulation depth (in % of the CF) of the time-variant filtered echolocation-call sequence. To measure the bats' sensitivity to the CF modulation, we presented modulation depths of 100, 52, 40, 30, 24, 18, 14, 11 and 9% of the CF. A modulation depth of 100% defined a frequency range of ± one octave around the CF and produced filter CFs between 30 and 120 kHz. The modulation rate of the CF modulation was 2, 4, 8 or 16 Hz. One echolocation-call sequence always contained two modulation periods. In consequence, the overall duration of the echolocation-call sequence and the temporal separation between the echolocation calls in the sequence decreased with increasing modulation rate. For a modulation rate of 2 Hz, the echolocation-call sequence was 1 s long and the temporal separation between the echolocation calls was about 61 ms; for a modulation rate of 16 Hz, the echolocation-call sequence was 125 ms long and the temporal separation between the echolocation calls was about 6 ms. Spectrograms of an unfiltered call and time-variant and time-invariant echolocation-call sequences are shown in Fig. 1A,B. These echolocation-call sequences simulate a bat moving twice around an abstract virtual acoustic object and ensonifying it from eight different angles. Different flight speeds are represented by modulation rates between 2 and 16 Hz.
While this range of modulation rates is low compared with many auditory studies on the perception and encoding of time-variant signals, the rates are certainly high enough to include the speed of spectro-temporal modulations encountered by a bat when it moves around an object ensonifying it from different angles.
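The relationship between modulation depth, modulation rate and the per-call filter CF can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the 17 calls are taken to be evenly spaced across the two modulation periods (16 equal onset intervals), and the log-axis modulation is written as CF·(1+depth)^sin(·), a mapping chosen here only because it reproduces the stated 30–120 kHz range at 100% depth. The function names are hypothetical.

```python
import math

N_CALLS = 17          # calls per sequence (from the text)
N_PERIODS = 2         # modulation periods per sequence (from the text)
CALL_DUR_MS = 1.5     # call duration in ms (from the text)

def cf_trajectory(ref_cf_khz, depth, phase=0.0):
    """Filter CF (kHz) for each call; depth is a fraction (1.0 = 100%)."""
    cfs = []
    for k in range(N_CALLS):
        # Assumed: evenly spaced calls, so eight calls sample each period
        arg = 2 * math.pi * N_PERIODS * k / (N_CALLS - 1) + phase
        # Assumed mapping: sinusoid on a log-frequency axis, scaled so that
        # depth = 1.0 reaches one octave above and below the reference CF
        cfs.append(ref_cf_khz * (1 + depth) ** math.sin(arg))
    return cfs

def timing_ms(rate_hz):
    """Sequence duration and silent gap between calls, both in ms."""
    duration = 1000 * N_PERIODS / rate_hz
    onset_interval = duration / (N_CALLS - 1)  # assumed: 16 equal intervals
    return duration, onset_interval - CALL_DUR_MS

full_depth = cf_trajectory(60.0, 1.0)
slow = timing_ms(2)    # (1000.0, 61.0)
fast = timing_ms(16)   # gap of roughly 6.3 ms
```

Under these assumptions the sketch recovers the values given in the text: a 1 s sequence with gaps of 61 ms at 2 Hz, and a 125 ms sequence with gaps of about 6 ms at 16 Hz.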
To preclude the bats' use of overall presentation level or absolute-frequency cues (Krumbholz and Schmidt, 1999), the presentation level was roved by ±6 dB and the reference CF was roved by ±10% over trials. Moreover, the phase of the sinusoidal frequency modulation was roved over trials.
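A per-trial roving scheme of this kind might look like the sketch below. The text gives only the roving ranges, so the uniform distributions, the independence of the three draws and the name `rove_trial` are assumptions.

```python
import math
import random

def rove_trial(rng, level_db=65.0, ref_cf_khz=60.0):
    """Draw the presentation level, reference CF and phase for one trial."""
    return {
        # Level rove of +/- 6 dB around the nominal level (uniform: assumed)
        "level_db": level_db + rng.uniform(-6.0, 6.0),
        # Reference-CF rove of +/- 10% (uniform: assumed)
        "ref_cf_khz": ref_cf_khz * (1 + rng.uniform(-0.10, 0.10)),
        # Random starting phase of the sinusoidal CF modulation
        "phase_rad": rng.uniform(0.0, 2 * math.pi),
    }

rng = random.Random(1)
trials = [rove_trial(rng) for _ in range(30)]
```

Because level and reference CF change from trial to trial, neither absolute loudness nor absolute frequency reliably identifies the time-variant sequence; only the presence of a modulation does.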
The echolocation-call sequences were computer generated (Matlab 5.3, Mathworks, Natick, MA, USA) and digital–analog converted (RX6, sampling rate 260 kHz; Tucker Davis Technologies, Gainesville, FL, USA). The echolocation-call sequences were amplified (Rotel RB 976 MK II; Worthing, UK) and presented over the ultrasonic loudspeakers (Matsushita EAS 10 TH 800D; Osaka, Japan) at a level of 65 dB SPL (preceding the roving level). The frequency response of all setup components, including speakers, was flat within ±5 dB between 5 and 100 kHz. The echolocation-call sequences were heterodyned by two DSPs (RP2, sampling rate 200 kHz; Tucker Davis Technologies), allowing the experimenter to follow the presentation acoustically via headphones.
In a two-alternative, forced-choice experiment, psychometric functions were obtained for variations in the spectral content of synthesized echolocation calls. The time-variant filtered echolocation-call sequence was played back by one speaker and the time-invariant filtered echolocation-call sequence by the other. While hanging on the perch, the bat perceived the echolocation-call sequences alternately from each speaker. There was a fixed inter-stimulus interval of 500 ms between successive echolocation-call sequence presentations. The echolocation-call sequence presentations stopped as soon as the bat left the perch. The bat therefore had to make its decision at the starting position. On the other side of the room, opposite to the perch, the experimenter was seated, controlling the procedure and the data storage via touch screen (WES TS, ELT121C-7SWA-1; Nidderau-Heldenbergen, Germany). The experimental program was written in Matlab 5.3.
The bats were trained to fly to the speaker from where they perceived the time-variant filtered echolocation-call sequence. For the initial training, the modulation depth was set to 40% of the CF. As a control, one bat was trained to fly to the time-invariant filtered echolocation-call sequence. Whether the time-variant echolocation-call sequence was presented at the left or right position was determined by a pseudo-random sequence, with the same echolocation-call sequence never occurring more than three times in a row at the same position. As soon as the bats were able to solve this task with a stable performance of >85% correct choices over several days, the modulation depth of the time-variant filtered echolocation-call sequence was decreased and increased. Thirty trials for each modulation depth were collected. The performance was calculated as decisions for the side of the time-variant echolocation-call sequence in percent correct as a function of the modulation depth. The significance level was set to 75% correct choices. After evaluating the threshold modulation depth for a specific modulation rate, the bats were trained for the next modulation rate and the corresponding threshold was measured.
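Two steps of this procedure lend themselves to a short sketch: drawing the left/right positions with at most three repeats in a row, and reading a threshold off a psychometric function at the 75% criterion. Both functions are hypothetical illustrations. The text does not say how the pseudo-random sequence was generated or how the threshold was interpolated, so the run-breaking rule and the linear interpolation are assumptions, and the percent-correct values in the example are made up rather than taken from the study.

```python
import random

def side_sequence(n_trials, rng, max_run=3):
    """Left/right positions with no side repeated more than max_run times."""
    sides = []
    for _ in range(n_trials):
        choice = rng.choice("LR")
        # If the last max_run trials were all on this side, switch sides
        if len(sides) >= max_run and all(s == choice for s in sides[-max_run:]):
            choice = "R" if choice == "L" else "L"
        sides.append(choice)
    return sides

def threshold_depth(depths, pct_correct, criterion=75.0):
    """Depth at which performance first rises through the criterion.

    depths must be in ascending order. Linear interpolation between the
    two bracketing points is an assumption, not the authors' stated method.
    """
    points = list(zip(depths, pct_correct))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            return d0 + (criterion - p0) / (p1 - p0) * (d1 - d0)
    return None  # criterion never crossed

sides = side_sequence(30, random.Random(0))

# Made-up illustrative values, NOT data from the study
example_depths = [9, 11, 14, 18]
example_pct = [60.0, 70.0, 80.0, 90.0]
example_threshold = threshold_depth(example_depths, example_pct)  # 12.5
```

The run-breaking rule slightly reduces the randomness of the sequence, which is the usual trade-off for preventing side biases from being rewarded by long runs.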
Experiment 2: time-variant notch detection
The animals, the experimental setup and the procedure were the same as in Experiment 1.
The source signals were the same synthetic call sequences as in Experiment 1. The filter was designed as a finite-impulse-response, band-stop (notch) filter of order 64, with a reference CF of 60 kHz and a bandwidth of ±10% of the CF. For the time-invariant echolocation-call sequence, this filter was applied to all 17 calls in the echolocation-call sequence.
For the generation of the time-variant echolocation-call sequences, the CF of the band-stop filter was sinusoidally modulated around the reference CF along a log-frequency axis. As in Experiment 1, the detection threshold for variations of spectral notches was measured by varying the modulation depth (in % of the CF) of the time-variant filtered echolocation-call sequence. The stimuli are illustrated in Fig. 1C,D.
RESULTS
Experiment 1: time-variant peak detection
The bats' performance in the time-variant peak detection task was very similar across individuals and, thus, the threshold was calculated as the mean across individuals. Psychometric functions for the modulation rates of 2, 4, 8 and 16 Hz for all bats are shown in the four panels of Fig. 2.
At a modulation rate of 2 Hz, the four bats were able to detect a frequency modulation depth of 10.9% of the CF, on average (Fig. 2A). At a modulation rate of 4 Hz, the four bats could detect a modulation depth of 10.9% of the CF, on average (Fig. 2B). At a modulation rate of 8 and 16 Hz, the three remaining bats could detect a modulation depth of 11.2 and 11% of the CF, respectively (Fig. 2C,D). For a CF of 60 kHz, 11% of the CF corresponds to a frequency bandwidth of 13 kHz.
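The 13 kHz figure can be checked with a few lines of arithmetic. The sketch below compares two readings of "11% of the CF", both of which are assumptions: a symmetric linear excursion of ±11%, and a log-axis excursion scaled so that a depth of 100% spans one octave in each direction.

```python
ref_cf_khz = 60.0
depth = 0.11

# Reading 1 (assumed): symmetric linear excursion of +/- depth around the CF
linear_span = 2 * depth * ref_cf_khz

# Reading 2 (assumed): log-axis excursion, CF*(1+depth) down to CF/(1+depth)
log_span = ref_cf_khz * (1 + depth) - ref_cf_khz / (1 + depth)

print(f"linear: {linear_span:.1f} kHz, log-axis: {log_span:.1f} kHz")
# prints: linear: 13.2 kHz, log-axis: 12.5 kHz
```

Both readings round to about 13 kHz, so the reported bandwidth is best understood as the total peak-to-peak excursion of the filter CF rather than the one-sided deviation of 6.6 kHz.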
Surprisingly, all animals readily transferred the discrimination task from one modulation rate to the next, although not only the modulation rate but also the overall echolocation-call sequence duration changed.
The slight decrease in the bats' performance when the modulation depth was increased from 40% to 100% of the CF can be attributed to the bats having been trained on a modulation depth of 40%. They seemed slightly irritated by the high modulation depths, which suggests that these signals may have sounded different from the initially trained condition (40%).
However, the modulation depth used for the training did not affect the threshold value: as a control, one female bat was retrained at a modulation rate of 4 Hz to a modulation depth of 100% after the data acquisition for all other experimental conditions was finished. Fig. 3 depicts the psychometric function obtained after initial training to a 40% modulation depth and the function obtained after initial training to a 100% modulation depth. Although the above-threshold performance differs somewhat between these data acquisitions, near-threshold performance is very similar, ensuring the validity of the obtained threshold values for all animals.
In general, we were able to observe that the spectral peaks have to vary by about 11% of the CF to be discriminated from the time-invariant peaks. Furthermore, this threshold seems to be independent of the modulation rate in the tested range.
Experiment 2: time-variant notch detection
Psychometric functions for the detection of a time-variant spectral notch are shown in Fig. 4 in the same format as for Experiment 1. Again, the threshold was calculated as a mean value for three bats. For a modulation rate of 2 Hz, the threshold was 11.3% of the CF. For modulation rates of 4, 8 and 16 Hz, the thresholds were 11.4, 10.9 and 11.8% of the CF, respectively. Overall, performance was slightly worse than, but not markedly different from, that in the first experiment.
As was the case for the time-variant peak detection, transfer to a new modulation rate did not require retraining and thresholds were largely independent of modulation rate.
DISCUSSION
The current psychoacoustical study was designed to investigate the auditory sensitivity of the bat M. lyra to time-variant spectral peaks and notches imposed on sequences of synthesized echolocation calls. We found that M. lyra is well able to discriminate a time-variant echolocation-call sequence from a time-invariant echolocation-call sequence. The detection threshold for the time-variant echolocation-call sequence in the tested range lies at 11% of the CF, independent of whether a spectral peak or notch was modulated. Furthermore, the detection threshold seems to be unaffected by the modulation rate across the tested range from 2 to 16 Hz. In the following, we will discuss the obtained data in regard to these three points: the obtained threshold values in general, the apparent modulation rate independency and the threshold values of the peak and notch signals in comparison.
The detection threshold for changes in the spectral domain lies at ∼11% of the CF, independent of whether the CF of a peak or notch filter was varied. This threshold is comparable to frequency modulations (7–21%) occurring in the active-acoustic object-discrimination experiment of Simon et al. (Simon et al., 2006) based on the assumption that the bats exploited spectral-notch changes in that experiment. In an earlier two-front, phantom-target study, Schmidt obtained similar threshold values for M. lyra of 6–13% for spectral-notch centre frequency changes (Schmidt, 1992). Note, however, that again, these thresholds were obtained in an active-acoustic paradigm where the bats evaluated the spectral content of echoes of their own calls. The frequency differences, on the other hand, were static within a trial. The current data obtained in a passive-acoustic paradigm with time-variant filtering corroborate these findings.
M. lyra is a gleaning bat; it rarely hunts actively for flying insects and therefore does not have to detect wing flutter. Nevertheless, the current thresholds, obtained in a passive-acoustic paradigm, are comparable to values obtained for other bat species in active-acoustic paradigms (Mogdans and Schnitzler, 1990; Bartsch and Schmidt, 1993; Esser and Kiefer, 1996). Typically, M. lyra catches its prey from the ground and first detects it by listening to prey-generated rustling noises. By first relying on passive rustling noises and then moving in to evaluate and catch possible prey, it might not need to analyse fine modulation differences.
Lyzenga and Carlyon measured in humans the detection of just noticeable differences for a sinusoidal modulation of the CF of a synthetic formant with a fixed fundamental (Lyzenga and Carlyon, 1999). The thresholds they obtained were larger, by a factor of two, than thresholds for the discrimination of (static) formant frequencies (Lyzenga and Horst, 1997). This seems to hold for starlings as well, which also show 2–3 times larger threshold depths for low modulation frequencies than for just noticeable frequency differences between pure tones (Langemann, 1991). This difference might explain our slightly increased thresholds in comparison with other studies, where frequency differences were static (Schmidt, 1992; Simon et al., 2006).
The current detection thresholds for spectral changes in the envelope were apparently independent of modulation rate. Temporal processing therefore does not seem to be a critical factor for the discrimination task. Instead, the current data are consistent with an analysis of place cues along a tonotopic frequency axis. Moore and Sek (Moore and Sek, 1995; Moore and Sek, 1996) and Sek and Moore (Sek and Moore, 1995; Sek and Moore, 2000) claim that the detection threshold for frequency modulations of low-frequency pure tones for humans is modulation-rate dependent, as the low-frequency tones are encoded by phase-locked, temporal cues. In mammals, phase locking is limited to frequencies below ∼5 kHz (Rose et al., 1968; Palmer and Russell, 1986; Oertel, 1999). Higher frequencies are encoded exclusively by place cues. In humans, spectral place cues provide worse frequency accuracy than phase-locked, temporal cues (Moore and Sek, 1995).
In the bat, each of the presented ultrasonic calls can only be encoded by auditory place cues. Thus, no phase-locked, temporal information concerning the current frequency composition of the call is available. Consequently, the frequency acuity is limited. The phase-locking, low-pass filter does not impair the peripheral auditory representation of the modulation rate as all tested modulation rates were considerably lower than the phase-locking filter cut-off frequency, meaning that the fluctuations of the spectral envelope can easily be encoded through phase-locking. In summary, the current data are consistent with the hypothesis that the spectral peaks and notches are encoded via place cues in the peripheral auditory system and that the bats' central auditory system is fast enough to follow the changes of these place cues over time for the range of modulation frequencies tested.
Several electrophysiological studies on temporal encoding in the mammalian auditory cortex have revealed low-pass characteristics of synchronous cortical discharges with a cut-off frequency around 20 Hz (Schulze and Langner, 1997; Lu et al., 2001; Liang et al., 2002). In an electrophysiological study with rising and falling FM stimuli, responses of neurons in the primary auditory cortex of the gerbil were recorded (Ohl et al., 2000). Across the range of tested modulation frequencies (1–24 Hz), the neurons' responses did not vary with modulation rates. This again fits with the modulation-rate-independent thresholds we obtained in this study for modulation rates lying in a similar range.
In a psychophysical study in the bat Tadarida brasiliensis, Bartsch and Schmidt tested perceptual sensitivity to sinusoidal frequency modulation at much higher rates (10–2000 Hz, CF=40 kHz) (Bartsch and Schmidt, 1993). They found that threshold modulation depths deteriorated with increasing modulation rate. As we only tested modulation rates between 2 and 16 Hz, we are not able to comment on whether the bats would have shown increased thresholds for even higher modulation rates. Note that our stimulus trains were intended to simulate a stationary complex object ensonified by a bat circling the object twice and ensonifying it from eight different angles. In this context, modulation rates above 16 Hz would have represented a highly unnatural situation, 16 Hz already representing an extreme.
Comparison of peak and notch thresholds
In the current study, the bats were equally sensitive for time-variant peaks and notches. In the following, we discuss this finding in regard to the question of whether the bat M. lyra extracts pitch information from its harmonically structured echolocation calls or whether echo analysis is based on the auditory processing of spectral place profiles.
Sedlmeier was able to show that M. lyra categorizes ultrasonic pure tones and complex harmonic structures with attenuated or missing fundamentals almost identically (Sedlmeier, 1992). This was interpreted as indicating that the bat perceives the 'missing fundamental', enabling it to integrate different acoustic qualities to a complex perception. Sedlmeier suggested that the bats perceive a pitch corresponding to the fundamental frequency of a sound and categorize sounds with different spectral features according to their pitches (Sedlmeier, 1992). Preisler and Schmidt further investigated this topic, and examined whether M. lyra evaluates complex harmonic structures according to their pitch or on the basis of overall spectral similarity (Preisler and Schmidt, 1998). They observed that the tested bats differed in which of the strategies they applied to solve the task. Krumbholz and Schmidt showed that M. lyra spontaneously classified test signals according to their broadband spectral similarity, using trained signals as spectral templates, not pitch cues (Krumbholz and Schmidt, 1999).
As the slope of the filters used in the current study was rather steep (filter-order 62), an echolocation call filtered with a band-pass (peak) filter centred at 60 kHz will cause a pitch percept corresponding to 60 kHz. Due to the time-variant filtering, the bats would hear a time-variant pitch. When the notch filters are applied, on the other hand, the bats always hear all harmonics except the one filtered out by the notch filter. Thus, the pitch would always correspond to the calls' fundamental frequency of ∼21 kHz. As pitch extraction is rather insensitive to amplitude modulations of higher harmonics, this percept would not be strongly affected by the time-variant filtering. In summary, if the bats had applied a pitch-based analysis, one would expect a better performance with the band-pass (peak) filters than with the band-stop (notch) filters. The finding that this is not the case corroborates the conclusions of Krumbholz and Schmidt (Krumbholz and Schmidt, 1999) that in most cases the bats recruit a spectral profile rather than a pitch analysis for echo imaging.
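To make the harmonic argument concrete, the sketch below lists which harmonics of the call fall inside the filter band for a given CF. It is an idealization under stated assumptions: the filter is treated as a brick-wall band spanning CF ± 10%, each harmonic is taken to occupy the full range swept by the fundamental (19–23 kHz times the harmonic number), and the name `harmonics_in_band` is hypothetical.

```python
F0_START_KHZ, F0_END_KHZ = 23.0, 19.0   # fundamental sweep (from the text)

def harmonics_in_band(cf_khz, rel_bw=0.10, n_harmonics=5):
    """Harmonic numbers whose swept range overlaps cf +/- rel_bw * cf."""
    lo, hi = cf_khz * (1 - rel_bw), cf_khz * (1 + rel_bw)
    hits = []
    for h in range(1, n_harmonics + 1):
        h_lo, h_hi = h * F0_END_KHZ, h * F0_START_KHZ
        # Two intervals overlap when each starts before the other ends
        if h_lo <= hi and h_hi >= lo:
            hits.append(h)
    return hits

at_reference = harmonics_in_band(60.0)    # [3]
at_upper_edge = harmonics_in_band(120.0)  # [5]
```

At the 60 kHz reference CF, only the third harmonic (57–69 kHz) overlaps the band, so a peak filter isolates a single harmonic while a notch filter removes it and leaves the other four, including the fundamental, in place.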
In summary, the current data show that the bat M. lyra can discriminate time-variant from time-invariant echolocation-call sequences with good accuracy. In the range of modulation rates tested (2–16 Hz), the discrimination performance was constant. The fact that sensitivity to time-variant spectral peaks and notches was similar argues in favour of a spectral profile analysis rather than a pitch-based analysis of the harmonic echolocation-call sequences. With the reservation of comparing passive-acoustical and active-acoustical auditory processing, the current data indicate that the bats' central auditory system is fast enough to track the changes in the spectral composition of returning echoes when the bat ensonifies an object while flying around it.
We thank Uwe Firzlaff, Holger Goerlitz and Sven Schoernich for helpful comments on earlier versions of this paper. We also thank two anonymous reviewers for very constructive reviews of the manuscript. This work was funded by a grant from the 'Deutsche Forschungsgemeinschaft' (Wi 1518/8) to L.W.