SUMMARY
The unique combination of flight and echolocation has opened the nocturnal air space as a rich ecological niche for bats. By analysing echoes of their sonar emissions, bats discriminate and recognize three-dimensional (3-D) objects. However, in contrast to vision, the 3-D information that can be gained by ensonifying an object from only one observation angle is sparse. To date, it is unclear how bats synchronize echolocation and flight activity to explore the 3-D shape of ensonified objects. We have devised an experimental design that allows creating 3-D virtual echo-acoustic objects by generating in real-time echoes from the bat's emissions that depend on the bat's position relative to the virtual object. Bats were trained to evaluate these 3-D virtual objects differing in their azimuthal variation of either echo amplitude or spectral composition. The data show that through a very effective coordination of sonar and flight activity, bats analyse an azimuthal variation of echo amplitude with a resolution of approximately 16 dB and a variation of echo centre frequency of approximately 19%. Control experiments show that the bats can detect not only these variations but also perturbations in the spatial arrangement of these variations. The current experimental paradigm shows that echolocating bats assemble echo-acoustic object information – acquired sequentially in flight – to reconstruct the 3-D shape of the ensonified object. Unlike previous approaches, the recruitment of virtual objects allows for a direct quantification of this reconstruction success in a highly controlled experimental approach.
INTRODUCTION
Object detection, classification and recognition are essential for successful navigation through complex natural habitats. Various combinations of sensory modalities are used to generate an internal spatial representation of the surroundings. Independent of the modality employed for the analysis of the environment, three-dimensional (3-D) object recognition is influenced by the observation angle. Visually, object analysis and recognition are aided by translating an object's shape from different viewpoints (Logothetis and Sheinberg, 1996); for example, humans and monkeys recognize objects seen from angles never previously viewed, with mental translation and rotation of the object's shape playing an important role (Murray et al., 1993; Logothetis and Sheinberg, 1996; Hamm and McMullen, 1998; Willems and Wagemans, 2001; Lloyd-Jones and Luckhurst, 2002). This mental shape representation can even be translated from one modality to another. For example, an object can be recognized visually when it had previously been explored haptically from different angles (Norman et al., 2004).
Some mammals have adapted to exploring habitats where vision is impracticable. By producing short ultrasonic calls through their mouth or nose, bats trigger echoes from reflective surfaces for both orientation and object analysis (Griffin, 1944; Griffin and Simmons, 1974). In general, the intensity, temporal structure and spectral composition of an echo provide information about the object's size and shape (Schmidt, 1988; Grunwald et al., 2004; Simon et al., 2006). Like an image in the visual system, an echo is dependent on the observation angle. Extraction of an object's 3-D shape from echo-acoustic information, however, is more difficult than in the visual system. An echo only encodes an object's distance and its depth dimension unambiguously. The height and width of an object, which are most easily accessed in vision, are not imaged explicitly in sonar. However, the integral of these dimensions is encoded in echolocation in terms of echo intensity (Wiegrebe, 2008; but see Heinrich et al., 2011). Therefore, not only one dimension, as in the visual system, but two dimensions need to be extracted through sequential echo analysis. Indeed, both bats and echolocating toothed whales typically scan an object with a series of echolocation calls with varying ensonification angles by moving around it (Helweg et al., 1996a; Helweg et al., 1996b; von Helversen, 2004). This produces object- and movement-correlated amplitude modulations (AMs) and frequency modulations (FMs) in the echo stream. Through the cognitive assembly of object information from this echo stream, a 3-D internal representation of an object's size and shape can be constructed (Falk et al., 2011; von Helversen, 2004).
We have developed an experimental setup that can create 3-D virtual echo-acoustic objects for bats by presenting digitally generated reflection characteristics, which change in accordance with the bats' ensonification angle with respect to the virtual object (VO). In previous studies with echo-acoustic VOs (e.g. Grunwald et al., 2004; Firzlaff et al., 2007; Weissenbacher and Wiegrebe, 2003; Schmidt, 1992), bats received a static VO independent of where the bat echolocated. The technical enhancement implemented in the present study allows us to investigate the correlation between a bat's movement in space and the sonar exploration of a 3-D VO. To analyse a bat's coordination of flight and sonar activity, we created two different VOs: one changed depending on the azimuthal ensonification angle and the other one did not. In a two-alternative forced choice (2-AFC) paradigm, bats were trained to explore two VOs in free flight. Thereby, they experienced echoes as if they were moving around two real objects. This highly controlled phantom echo paradigm allows us to investigate the bats' coordination of sonar and flight activity when exploring VOs of explicitly defined echo-acoustic properties.
MATERIALS AND METHODS
Animals
Six adult Megaderma lyra E. Geoffroy 1810, two males and four females, were used in the training. They were kept in a 12 m² room with free access to water. During training periods of five consecutive days, the bats were fed mealworms as a reward. Apart from the training rewards, the animals were fed one mouse and two crickets per week.
Experimental setup
All experiments were performed in an echo-attenuated chamber (3.5×2.2×2.2 m) with a foam coating on the walls. The setup consisted of a starting perch on one side of the room, ensuring a precise positioning of the bat, two virtual-object units (VO units) and a high-speed IEEE 1394a video camera (Basler A602f, Ahrensburg, Germany) above the VO units. The dimension of the camera's observation field was 1×2 m. The VO units were placed in the centre of each hemi-field of the observation field and consisted of an ultrasonic ¼ inch microphone (Brüel & Kjær 4135 capsule, Nærum, Denmark) with a preamplifier (MV 302, Microtech Gefell, Gefell, Germany), a power supply (Brüel & Kjær 2807), an omni-directional speaker (Elac 4PI PLUS.2, Elac Electroacustic, Kiel, Germany), a landing platform and a feeding dish (Fig. 1). The microphone was mounted vertically with its membrane facing downward and positioned concentrically in the circular loudspeaker. Thus, the microphone was always ensonified perpendicularly, independent of the bat's azimuthal position relative to the VO unit. Both the microphone and the omni-directional speaker had a flat frequency response (microphone ±5 dB; speaker ±10 dB between 20 and 120 kHz). The experimenter seated in the chamber rewarded the bat for a correct decision by opening an iris aperture (Linos Photonics, Göttingen, Germany) above the feeding dish if the bat landed on the corresponding platform. The chamber was illuminated by a 40 W red light bulb positioned above the camera.
The bats' echolocation calls were recorded by the microphones, band-pass filtered (20–100 kHz), amplified by 80 dB (PM 5171, Philips, Hamburg, Germany), analog–digital converted and filtered by a real-time processor (RX6, sampling rate 260 kHz, Tucker Davis Technologies, Gainesville, FL, USA). Only calls louder than 55 dB SPL were used to generate an echo from each VO. Thus, both VO units could be active simultaneously if they were ensonified loudly enough. The VOs were computer generated (MATLAB 5.3, MathWorks, Natick, MA, USA). The bat's azimuthal position relative to the two VO units was determined by a second computer running a customized program version of EyeSeeCam (Schneider et al., 2009). In the customized version, the eye-tracking module was replaced by a dedicated bat-tracking module tailored to this experiment. The camera signal (100 frames s⁻¹) was converted by subtracting each frame from the stationary background acquired when the program was initiated, calculating the centroid of the resulting darkest pixel group and converting the position of this centroid into an azimuthal angle with respect to each VO. The azimuthal angle was encoded by 36 circular sectors of 10 deg each. The bat's angular position relative to the two VO units was sent to the real-time processor via an input/output (I/O) device. Based on this information, the RX6 chose one of the predefined impulse responses (IRs; see Stimuli, below) with which to convolve the bats' calls to create the echo from each VO. If the bat was outside the 1×2 m area, IRs consisting of zeros were used for the convolution, resulting in no echo playback. The outgoing digital–analog converted echoes were amplified (RB 976 MK II, Rotel, Worthing, UK) and presented over the ultrasonic loudspeakers. The echoes were additionally heterodyned by two further real-time processors (RP2, sampling rate 200 kHz, Tucker Davis Technologies), allowing the experimenter to follow the presentation acoustically via headphones.
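The mapping from tracked position to IR selection can be illustrated with a minimal Python sketch. It is not the tracking module used in the experiment: the function names, the coordinate convention (metres, origin at one corner of the 1×2 m observation field) and the VO-unit positions at the centres of the two hemi-fields are assumptions made for illustration only.

```python
import math

N_SECTORS = 36                    # 10 deg sectors, as described in the text
SECTOR_DEG = 360.0 / N_SECTORS

def azimuth_sector(bat_xy, vo_xy):
    """Sector index (0..35) of the bat's azimuth as seen from a VO unit."""
    dx = bat_xy[0] - vo_xy[0]
    dy = bat_xy[1] - vo_xy[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # The extra modulo guards against angle == 360.0 from float rounding.
    return int(angle // SECTOR_DEG) % N_SECTORS

def select_ir(bat_xy, vo_xy, ir_set, field=(2.0, 1.0)):
    """Return the IR for the current sector, or an all-zero IR outside the field.

    Assumed layout: the field spans 0..2 m in x and 0..1 m in y.
    """
    x, y = bat_xy
    inside = 0.0 <= x <= field[0] and 0.0 <= y <= field[1]
    if not inside:
        return [0.0] * len(ir_set[0])   # zeros -> no echo playback
    return ir_set[azimuth_sector(bat_xy, vo_xy)]
```

With hypothetical VO units at (0.5, 0.5) and (1.5, 0.5), the same tracked position yields one sector index per VO, so each unit can look up its own IR independently.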
The duration of the IR, the I/O delay of the RX6 and the distance between the microphone membrane and the speaker membrane contribute to the overall delay of the echo presentation. The minimum I/O delay of 0.43 ms of the RX6 processor is increased by 0.313 ms for the FM-Peak and AM conditions and by 0.153 ms for the FM-Notch condition, because of the IR itself (see Stimuli, below). The path from the bat to the speaker membrane is 5.5 cm shorter than the path to the microphone membrane. Consequently, an additional delay of 0.162 ms is introduced for the VO. Thus, the position of the VO is always behind the first and strongest physical reflection from the VO unit by a distance of 15.4 cm (overall I/O delay of 0.905 ms; see Fig. 2B–D) for the FM-Peak and AM conditions and by a distance of 12.7 cm (overall I/O delay of 0.745 ms) for the FM-Notch condition.
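The delay budget above can be retraced with a few lines of Python. The sketch assumes a speed of sound of 340 m/s, a value that is not stated in the text but reproduces the quoted figures; the constant and function names are illustrative.

```python
SPEED_OF_SOUND = 340.0   # m/s; assumed, not stated in the text
IO_DELAY_MS = 0.43       # minimum I/O delay of the RX6
PATH_DIFF_M = 0.055      # speaker path is 5.5 cm shorter than microphone path

def path_delay_ms(c=SPEED_OF_SOUND):
    """Extra delay caused by the 5.5 cm path difference."""
    return PATH_DIFF_M / c * 1e3

def overall_delay_ms(ir_delay_ms, c=SPEED_OF_SOUND):
    """Sum of processor I/O delay, IR delay and path-difference delay."""
    return IO_DELAY_MS + ir_delay_ms + path_delay_ms(c)

def virtual_offset_cm(delay_ms, c=SPEED_OF_SOUND):
    """Range offset of the VO; halved because echo delay covers a two-way path."""
    return delay_ms * 1e-3 * c / 2.0 * 100.0

for name, ir_ms in (("FM-Peak/AM", 0.313), ("FM-Notch", 0.153)):
    d = overall_delay_ms(ir_ms)
    print(f"{name}: {d:.3f} ms -> {virtual_offset_cm(d):.1f} cm")
```

Under this assumption the sketch gives 0.905 ms and 15.4 cm for the FM-Peak and AM conditions, and 0.745 ms and 12.7 cm for the FM-Notch condition.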
It is important to note that monitoring the position of the bat relative to the VO unit is sufficient to create a VO. The bat's head aim or the main axis of its echolocation beam do not need to be monitored. This is because the VO unit creates an echo in real time of the call as it is picked up by the VO unit's microphone. If the bat chooses to change its sonar aim, the ensonification loudness, picked up by the microphone, and accordingly the echo loudness, radiated by the concentric speaker, will change, as it is the case for a real object placed at the VO unit's position.
Echo-acoustic calibration
To ensure that the physical echoes of the VO units did not mask the virtual echoes generated by the loudspeaker of the VO units, we measured the physical and virtual echoes reflected by a VO unit. Measurements were made using a virtual bat, consisting of an ultrasonic speaker (Panasonic EAS10TH800D, Osaka, Japan), a concentrically mounted ¼ inch measuring microphone (Brüel & Kjær 4135) and the corresponding conditioning amplifiers. Sound presentation, recording and cross-correlation were performed by an audio analyzer (Stanford Research SR780, Stanford Research Systems, Sunnyvale, CA, USA) running at a sampling rate of 260 kHz. Calibrations were performed with a periodically repeated, synthetic M. lyra echolocation call.
Because the VO units were axially symmetric, physical echoes were always the same irrespective of the bat's ensonification angle of the VO units. Fig. 2 illustrates the waveforms of the echoes extracted from the calibration. In all panels the first signal is the direct crosstalk from the speaker to the microphone. The second signal (at a correlation lag of approximately 2 ms, marked with a single asterisk) is the physical reflection of the emitted sound from the VO unit. The third signal in Fig. 2B–D (indicated by the two asterisks) is the reflection from the VO itself, at different attenuations in the AM condition (see Stimuli, below). As can be seen in these panels, physical and VO echoes are well separated in time and the VO can produce the louder echo.
Stimuli
Analogous to a situation where a bat ensonifies a real 3-D object from various directions and receives different echoes depending on the ensonification angle, the current experimental design permitted the presentation of real-time echoes that changed when the bat moved from one 10 deg sector to the next. The VOs are defined exclusively in terms of their IRs. An IR is defined by the reflection characteristics of an object ensonified with an impulse. The echo the bat receives from an object is the result of the convolution of the echolocation call with the IR of the object. In our experimental design, the angular position of the bat relative to the VOs determined the IR, which was used for the real-time convolution. All IRs were band-pass filtered with cut-off frequencies of 20 and 120 kHz and consisted of 80 or 164 coefficients. One object, made up of 36 different IRs (azimuthally modulated), changed depending on the azimuthal angle and the other, consisting of 36 identical IRs, was independent of the bat's azimuthal angle.
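The relationship between call, IR and echo described above is a linear convolution. The following Python sketch shows the operation in its direct form; it is only an illustration, as the experiment performed this step on the real-time processor, and the function name is hypothetical.

```python
def convolve(call, ir):
    """Direct-form linear convolution of a call with an impulse response."""
    out = [0.0] * (len(call) + len(ir) - 1)
    for i, c in enumerate(call):
        for j, h in enumerate(ir):
            out[i + j] += c * h
    return out

# A toy IR that delays the call by one sample and halves its amplitude.
echo = convolve([1.0, 2.0, 3.0], [0.0, 0.5])
print(echo)
```

An IR consisting only of zeros returns silence, which corresponds to the case in which the bat is outside the observation field.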
AM condition
In the AM experimental condition, bats were trained to detect angular changes in the amplitude of the IRs, i.e. in the target strength of the VOs. Each VO consisted of a set of 36 IRs, one for each 10 deg sector. Each IR was generated as a linear-phase, finite-impulse-response band-pass filter with 164 coefficients. The band-pass cut-off frequencies were 30 and 120 kHz at their –6 dB points. The azimuthally modulated IRs were generated in blocks of six IRs. The first four IRs had an attenuation of –70 dB, and the remaining two had an attenuation of –30 dB. The resulting AM depth is 40 dB. This azimuthal reflection pattern was repeated six times, creating a cylindrical VO that had 12 strong and 24 weak reflections. The unmodulated IRs all had the same attenuation level of –35 dB. To quantify the sensitivity for angular amplitude variations, IR sets were generated with AM depths between 40 and 0 dB in 5 dB steps. Care was taken that the overall attenuation across all six IRs comprising one cycle of the VO was constant and independent of modulation depth. The modulated VO with an azimuthal AM depth of 40 dB and the unmodulated VO are shown in Fig. 3A,B.
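One way to generate such attenuation patterns while holding the overall attenuation per cycle constant is sketched below in Python. The sketch assumes that "overall attenuation" refers to the mean echo power across the six IRs of a cycle; the text does not state whether power or amplitude was averaged, and the function name and defaults are illustrative.

```python
import math

def am_levels_db(depth_db, ref_db=-35.0, n_weak=4, n_strong=2, cycles=6):
    """Attenuation (dB) of 36 IRs: blocks of 4 weak + 2 strong, repeated 6 times.

    Assumption: the mean power over one cycle is pinned to ref_db for
    every modulation depth.
    """
    n = n_weak + n_strong
    # Mean power of one cycle relative to the strong reflection.
    mean_rel_power = (n_strong + n_weak * 10 ** (-depth_db / 10.0)) / n
    strong = ref_db - 10.0 * math.log10(mean_rel_power)
    weak = strong - depth_db
    return ([weak] * n_weak + [strong] * n_strong) * cycles

levels = am_levels_db(40.0)
print(round(min(levels), 2), round(max(levels), 2))
```

For a depth of 40 dB this power-based assumption gives levels close to the –70 and –30 dB quoted above, and a depth of 0 dB collapses to the unmodulated –35 dB.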
The AM condition was run in two versions: an active-acoustic version and a passive-acoustic version. In the active-acoustic version, the VO units provided real-time generated echoes of the bats' vocalisations towards the VOs. Here, the bats had to vocalise towards a VO unit to receive the VO's echo. In the passive-acoustic condition, the echoes were generated from a synthetic M. lyra echolocation call. Here, the bats received echoes as in the active condition, but these occurred periodically at a rate of 9.85 Hz (interpulse interval of 100 ms) and independent of whether the bats emitted echolocation calls. It is important to emphasize that in both conditions the bat's position still determined with which IR the echo was generated. The synthetic echolocation call was a multi-harmonic frequency sweep with a duration of 1.5 ms. The fundamental frequency swept from 23 to 19 kHz. Five harmonics were generated with attenuations of 30, 10, 5, 0 and 5 dB, beginning with the first harmonic. The call was windowed with a raised-cosine window with a 0.2 ms rise time, 1.1 ms steady state and 0.2 ms decay time.
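A call with the stated parameters could be synthesized as in the following Python sketch. Several details are assumptions not given in the text: the sweep is taken to be linear in frequency, the "first harmonic" is taken to be the fundamental, the ramps are half-cosine, and the 260 kHz sampling rate is borrowed from the real-time processor.

```python
import math

def synthetic_call(fs=260_000, dur=1.5e-3, f_start=23e3, f_end=19e3,
                   atten_db=(30, 10, 5, 0, 5), ramp=0.2e-3):
    """Multi-harmonic downward sweep with raised-cosine onset and offset."""
    n = round(dur * fs)
    amps = [10 ** (-a / 20.0) for a in atten_db]   # dB attenuation -> amplitude
    call = []
    for i in range(n):
        t = i / fs
        # Phase of a linear sweep (assumed): integral of the instantaneous frequency.
        phase = 2 * math.pi * (f_start * t + (f_end - f_start) * t * t / (2 * dur))
        if t < ramp:
            env = 0.5 * (1 - math.cos(math.pi * t / ramp))
        elif t > dur - ramp:
            env = 0.5 * (1 - math.cos(math.pi * (dur - t) / ramp))
        else:
            env = 1.0
        # Harmonic k has k times the phase of the fundamental.
        call.append(env * sum(a * math.sin((k + 1) * phase)
                              for k, a in enumerate(amps)))
    return call
```

With these settings the fifth harmonic starts at 115 kHz, which stays below the Nyquist frequency of 130 kHz at the assumed sampling rate.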
FM conditions
FM-Notch condition
In the FM-Notch condition, all IRs were finite-impulse-response, band-stop filters with a reference centre frequency (CF) of 60 kHz, a bandwidth of ±15% of the CF and 80 coefficients. The number of coefficients was reduced from 164 to 80 in this condition to facilitate the technical realisation (MATLAB ‘fir1’ function). All 36 IRs of the unmodulated VO had the same CF. For the generation of the modulated VO, the CF of the band-stop filters was sinusoidally modulated around the reference CF along a log-frequency axis. To measure the bats' sensitivity to the azimuthal FM, depths of 100, 52, 40, 30, 24, 18, 14, 11 and 9% of the CF were applied to the modulated VO. An FM depth of 100% defined a frequency range of ±1 octave around the CF. Therefore, filter CFs between 30 and 120 kHz were produced when using a CF of 60 kHz. Along the thirty-six 10 deg sectors, three complete modulation periods were presented, such that each modulation period was defined by 12 IRs. The CFs of the filters are depicted in Fig. 3C,D.
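The sinusoidal modulation of the CF along a log-frequency axis can be sketched in Python as follows. The sketch assumes that the FM depth scales linearly in octaves, so that 100% corresponds to ±1 octave and, for example, 52% to ±0.52 octaves; the text defines only the 100% case, and the function name is illustrative.

```python
import math

def notch_cfs(depth_pct, ref_cf=60e3, n_sectors=36, periods=3):
    """Centre frequency (Hz) per 10 deg sector, modulated on a log-frequency axis.

    Assumption: depth_pct / 100 is the peak excursion in octaves.
    """
    depth_oct = depth_pct / 100.0
    return [ref_cf * 2 ** (depth_oct * math.sin(2 * math.pi * periods * k / n_sectors))
            for k in range(n_sectors)]

cfs = notch_cfs(100.0)
print(round(min(cfs)), round(max(cfs)))
```

Three modulation periods across 36 sectors give one period per 12 IRs, matching the description above, and a depth of 0% reproduces the unmodulated VO with all CFs at the reference.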
FM-Peak condition
The stimuli in the FM-Peak condition were created analogously to those in the FM-Notch condition except that the IRs were defined as band-pass filters with a reference CF of 60 kHz, a bandwidth of ±10% of the CF and a filter order of 164.
FM-Control condition
This control experiment was implemented to address the question of the extent to which the bats can exploit the deterministic, sinusoidal variation of the CF of the modulated VO. Here, the band-stop CF was modulated for both the rewarded and the unrewarded VO, but for the rewarded VO, the azimuthal modulation was sinusoidal whereas the CF varied randomly across the same range in the unrewarded VO. The CFs of the filters are depicted in Fig. 3E,F. The random variation was refreshed for each trial.
To preclude the bats' use of residual differences between the overall target strengths of the two VOs, the overall target strength was roved by ±6 dB over trials and VOs. To eliminate absolute position cues and to force the bats to fly around the VOs, the modulated VO was randomly rotated around the vertical axis, meaning that the peak positions were rotated but the overall periodicity was kept intact. This was implemented for all AM and FM conditions. Furthermore, for the FM conditions, the reference CF was roved by ±10% over trials to prevent the use of absolute-frequency cues.
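The per-trial roving and rotation can be sketched in Python as follows. The sketch assumes that the rotation is applied in whole 10 deg sectors (a circular shift of the IR set) and that the level rove is drawn from a uniform distribution; neither detail is specified in the text, and the function name is hypothetical.

```python
import random

def rove_trial(ir_levels_db, rng, level_rove_db=6.0):
    """Rotate a VO by a random number of sectors and rove its overall level.

    A circular shift moves the peak positions but keeps the periodicity intact.
    """
    shift = rng.randrange(len(ir_levels_db))
    rotated = ir_levels_db[shift:] + ir_levels_db[:shift]
    offset = rng.uniform(-level_rove_db, level_rove_db)   # assumed uniform
    return [lvl + offset for lvl in rotated], shift, offset

rng = random.Random(1)
block = [-70.0] * 4 + [-30.0] * 2
levels, shift, offset = rove_trial(block * 6, rng)
```

Because the same offset is added to all 36 IRs, the modulation depth of the VO is unchanged by the rove.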
Procedure
In a 2-AFC experiment, psychometric functions were obtained for the detection of modulated VOs. The bats were trained to fly around the VOs. Each echolocation call was convolved with the current IR. The camera tracking the bat's position in space determined the IR for filtering. The modulated echoes were played back by one VO unit and the space-unmodulated echoes were played back by the other. Playback did not start until the bat had left the perch. On the other side of the room, opposite to the perch, the experimenter was seated, controlling the procedure and the data storage via a touch screen (WES TS, ELT121C-7SWA-1, Nidderau-Heldenbergen, Germany). The experimental program was written in MATLAB 5.3.
For all conditions, the bats were trained to approach the feeder associated with the VO that provided the orderly angular target strength or CF variation. The bat made a choice by landing on the platform of the VO unit. For correct choices, the feeding dish was opened and the bat was rewarded with mealworms. Whether the modulated VO was presented at the left or right VO unit was determined by a pseudo-random sequence, with the same VO never occurring more than three times in a row at the same VO unit. As soon as the bats were able to solve this task with a stable performance of >80% correct choices over several days, the azimuthal AM or FM depth was decreased. Psychometric functions for AM and FM detection were acquired with 30 trials per modulation depth. The psychometric functions were fitted with a sigmoidal function and the 75% correct value of the fit was taken as threshold.
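The threshold extraction can be illustrated with a short Python sketch. The exact sigmoidal function used for the fits is not stated; the sketch assumes a logistic function rising from chance (50%) to 100% and fits it by a simple least-squares grid search, so that the 75% correct point coincides with the fitted midpoint. All names and the example data are illustrative.

```python
import math

def psychometric(x, midpoint, slope):
    """2-AFC sigmoid from 0.5 (chance) to 1.0; equals 0.75 at x == midpoint."""
    return 0.5 + 0.5 / (1.0 + math.exp(-(x - midpoint) / slope))

def fit_threshold(depths, prop_correct, midpoints, slopes):
    """Least-squares grid search; returns (midpoint, slope) of the best fit."""
    best = None
    for m in midpoints:
        for s in slopes:
            err = sum((psychometric(x, m, s) - p) ** 2
                      for x, p in zip(depths, prop_correct))
            if best is None or err < best[0]:
                best = (err, m, s)
    return best[1], best[2]

# Synthetic example data generated from a known curve, not measured values.
depths = [0, 5, 10, 15, 20, 25, 30, 35, 40]
data = [psychometric(x, 16.0, 3.0) for x in depths]
grid_m = [i * 0.5 for i in range(81)]
grid_s = [1.0 + i * 0.5 for i in range(19)]
threshold, slope = fit_threshold(depths, data, grid_m, grid_s)
```

A grid search is used here only to keep the sketch free of dependencies; a nonlinear least-squares or maximum-likelihood fit would be the more usual choice.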
Flight and sound data acquisition
During the FM conditions, the momentary position of the bat was sampled with a frequency of 100 Hz in each trial and the echoes generated by the real-time processor were recorded for each VO using a Firewire sound card (Phase 24, Terratec Electronic, Nettetal, Germany) at a sampling frequency of 192 kHz. The recordings were analysed off-line in terms of the number of calls and the average repetition period in a 4 s interval preceding the bats' decision in each trial. Because the sounds are recorded after the processor, only those sounds loud enough to trigger playback are included in the analysis.
To analyse the flight patterns and sonar emissions, 120 trials were selected for each bat, consisting of 30 trials for each of four modulation depths, two above and two below threshold. In the flight-pattern analysis, the average number of different sectors visited by a bat for at least 10 ms was calculated separately for these 120 trials. Both sound and flight data were analysed according to which VO was selected in each trial.
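Counting the sectors visited in a trial can be sketched as follows in Python. At the 100 Hz tracking rate, 10 ms corresponds to a single frame, so the criterion reduces to a sector appearing in at least one frame; the sketch keeps the minimum dwell as a parameter. The representation of the track as one sector index per frame, with None when the bat is outside the field, is an assumption.

```python
def sectors_visited(sector_track, min_frames=1):
    """Number of distinct sectors occupied for at least min_frames consecutive frames."""
    visited = set()
    run_sector, run_len = None, 0
    for s in sector_track:
        if s == run_sector:
            run_len += 1
        else:
            run_sector, run_len = s, 1
        if s is not None and run_len >= min_frames:
            visited.add(s)
    return len(visited)

track = [0, 0, 1, None, 1, 2, 2, 2]
print(sectors_visited(track), sectors_visited(track, min_frames=2))
```

In this toy track, sector 1 is never held for two consecutive frames, so it counts under the one-frame criterion but not under a two-frame criterion.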
In a subset of trials, the bat's flight path was saved to a video file for off-line analysis to depict exemplary flight paths. Flight paths are displayed in terms of the summed difference images of the recorded video.
RESULTS
Psychoacoustic results
AM condition
Four bats were successfully trained to discriminate the AM VOs from unmodulated VOs. Psychometric functions for the active-acoustic AM condition are shown in Fig. 4A. At an AM depth of 40 dB, all animals detected the AM reliably. At this high AM depth, the target strength of the rewarded VO fluctuated strongly when the bat flew around it. Note that because of the roving-level paradigm, the rewarded VO was not necessarily the one producing the higher or the lower overall target strength. With decreasing AM depth, the azimuthal variation of target strength from the two VOs becomes more similar and, consequently, performance decreases. Threshold azimuthal AM depths, derived from a sigmoidal fit to the psychometric functions, are given in the legend of Fig. 4A. Whereas bat 1 detected an azimuthal AM depth of only 9 dB and bat 5 a depth of 12 dB, bats 2 and 3 needed larger azimuthal AM depths of 21 and 23 dB to discriminate the AM VOs from unmodulated VOs.
The thresholds obtained in the passive-acoustic AM condition were 11, 12 and 15 dB AM depth (psychometric functions are depicted in Fig. 4B), comparable to those measured in the active-acoustic AM condition, indicating that the evaluation of the AM depth in the echoes is independent of the emitted echolocation call and that a call rate of approximately 10 calls s⁻¹ is sufficient to solve the task.
FM conditions
Three bats were successfully trained to the FM-Notch condition. At an FM depth of 100% of the CF, the frequency content of the returning echoes fluctuated strongly when the bats flew around the modulated VO and the bats could easily discriminate between the FM VO and the unmodulated VO. With decreasing FM depth, the two VOs increasingly resembled each other until the bats were no longer able to distinguish between them. The resulting performance curves are given in Fig. 5A. The bats' performance resulted in threshold FM depths of 15% of CF for bat 2, 20% for bat 3 and 29% for bat 6.
In the FM-Peak condition, psychometric functions (shown in Fig. 5B) were obtained for four bats. FM depths of 15% of the CF for bat 1, 18% for bat 2, 18% for bat 3 and 19% for bat 4 were needed in order to discriminate between the two VOs.
In the FM-Control condition, the CF of the azimuthal modulation varied sinusoidally for the rewarded VO whereas the CF varied randomly across the same range for the unrewarded VO. In this very difficult experimental condition, data acquisition was only possible for two bats. Performance thresholds were 41% of CF for bat 2 and 69% of CF for bat 3 (Fig. 5C).
Analysis of flight patterns and sonar emissions of the bats around the VOs
Supplementary material Movie 1 shows an exemplary trial for bat 2 with a time expansion factor of 10. Two typical flight paths pursued in an experimental trial are depicted for each bat in Fig. 6 (the bat number is indicated on the left side of each panel). Shown is a flight path during one trial from the moment the bat left the starting perch until it made a decision by landing on the platform of one of the two VO units. Each bat pursued a different strategy to solve the task. Bat 1 usually flew around the VO whereas bat 2 typically flew back and forth between the two VOs. Bats 3 and 6 often made a decision based on the information gained from only one VO.
The number of 10 deg sectors of the selected or not-selected VO in each trial visited by each bat for at least 10 ms was analysed for a subset of 120 trials, divided into above- and below-threshold trials (Fig. 7). In the 60 above-threshold trials (Fig. 7A), all bats visited a significantly higher number of sectors at the VO they ultimately selected than at the VO they did not select (Wilcoxon rank-sum test, P<0.001). This same significance can be seen for the 60 below-threshold trials (Fig. 7B), except for bat 2. On average, bats 1, 2 and 4 visited more sectors at the selected VO than bats 3 and 6. All bats visited on average at least 10 sectors at either the selected VO or the not-selected VO. Because one complete sine wave of the azimuthal CF modulation was presented along 12 consecutive sectors, it can be concluded that all bats had the possibility to gather enough azimuthal information to solve the task.
Sound analysis
Fig. 8 depicts the average number of emitted calls towards the selected and not-selected VO for each bat during the 120 trials (above and below threshold). When a bat ensonified a VO unit, usually only this VO unit played back echoes, as the calls were not loud enough to trigger playback by the other VO unit. In rare cases where echoes might have been triggered by the other VO's echo, all echoes from one VO following the echo of the other VO in a time window of 7 ms were omitted from the analysis. The bats ensonified on average the selected VO three to four times as much as the not-selected VO (>20 and <10 calls, respectively). This call distribution is significant for all above- and below-threshold trials (Wilcoxon rank-sum test, P<0.001).
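The 7 ms exclusion rule can be expressed as a small Python sketch. It assumes that the recordings have already been reduced to lists of echo onset times (in seconds) per VO unit, which is an illustrative data format rather than the one used in the analysis, and the function name is hypothetical.

```python
def drop_crosstalk(times_a, times_b, window=7e-3):
    """Remove echoes that follow an echo of the other VO within the window."""
    def keep(own, other):
        # Only the later echo of a pair is dropped (0 < lag <= window).
        return [t for t in own if not any(0 < t - u <= window for u in other)]
    return keep(times_a, times_b), keep(times_b, times_a)

a, b = drop_crosstalk([0.100, 0.300], [0.104, 0.500])
print(a, b)
```

In this example the echo of the second unit at 0.104 s follows an echo of the first unit by 4 ms and is therefore discarded, whereas all other echoes are retained.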
DISCUSSION
The current experiments investigated the behavioural strategy and perceptual sensitivity of echolocating bats inspecting the 3-D shape of VOs. The presentation of echo-acoustic VOs was implemented to allow both the systematic variation of echo-acoustic object features and the assessment of the behavioural strategies of the echolocating bats. Because of its excellent airborne manoeuvring capacity and the recruitment of echolocation to investigate objects, M. lyra proved to be a well suited animal model with which to study object-related echolocation strategies. The psychophysical sensitivity to the echo-acoustic properties of the VOs was described in terms of the minimum AM and FM depth required to discriminate an azimuthally modulated VO from an unmodulated VO. The threshold azimuthal AM depth varied between 9 and 23 dB and the FM depth was between 15 and 29% of the CF.
Falk et al. (Falk et al., 2011) trained Eptesicus fuscus to discriminate a smooth 16 mm sphere from different spheres with increasingly structured textures. Inspection of the ensonification-angle-dependent reflection characteristics of the textured objects shows that both amplitude and frequency modulations provided potential cues for the bats. The current paradigm confirms that bats can effectively use these cues, but it allows for a direct quantification of isolated echo-acoustic parameters underlying the behavioural performance.
In the following we will discuss the obtained data in terms of active versus passive echo analysis, sonar frequency analysis, FM and AM detection in bats, ensonification-correlated movements and the reconstruction of 3-D shape from consecutive echoes.
Active echolocation versus passive echo analysis
In a passive-acoustic paradigm, Genzel and Wiegrebe (Genzel and Wiegrebe, 2008) measured FM thresholds of 11% of CF both for spectral peaks and notches. The thresholds were in a range similar to those of the present study (15 to 29% of CF). This supports the hypothesis that in an active-acoustic paradigm, the bats mainly recruit a spectral profile analysis for echo imaging (Krumbholz and Schmidt, 1999; Genzel and Wiegrebe, 2008). The increased threshold values observed in the present study in comparison to other studies (Schmidt, 1992; Genzel and Wiegrebe, 2008) might be explained by different factors. In the Genzel and Wiegrebe experiment (Genzel and Wiegrebe, 2008), the whole stimulus train was always presented to the bats. In the current experiment, the amount of information gained by the bats depended on both the echolocation and flight activity of the bats: the bats had to actively move around the VO units while echolocating towards the VOs, analysing the returning echoes and developing an efficient strategy for obtaining sufficient information.
In the passive-acoustic AM condition, the bats could acquire object information when they flew around the VOs and listened to the periodically occurring echoes from the VOs. The bats did not need to echolocate. In the active-acoustic AM condition, however, the bats could only acquire object information when they flew around the VOs, echolocated towards them and processed the real-time generated echoes from the VOs. Interestingly, threshold spatial AM depths did not differ significantly (Wilcoxon rank-sum test, P=0.743) between these conditions, indicating that information gathered actively via sonar in the active-acoustic AM condition was equivalent to information provided passively by the synthetic calls. The object information acquired in these conditions seems to be of the same order of magnitude. This coincides with a similar information rate: in the active-acoustic AM condition, the bats' call rates were on average 10 Hz for the selected VO; in the passive-acoustic AM condition, the presented repetition rate was 9.85 Hz. The comparison of the performances of the bats that took part both in the active-acoustic and the passive-acoustic AM conditions shows that bat 3 performed much worse in the active-acoustic AM condition. Thus, this bat apparently gained much less object information in the active-acoustic AM condition than in the passive-acoustic AM condition. This could result from a less effective vocalization strategy in the active-acoustic AM condition; compared with the other bats, bat 3 emitted very few calls to the not-selected VO, i.e. it made its decision after only sparsely sampling that VO (see Fig. 8).
Sonar frequency analysis
Several studies have investigated the sensitivity of bats to FMs. Simon et al. (Simon et al., 2006) conducted an active-acoustic experiment in which the bat Glossophaga soricina had to discriminate hollow hemispheres differing in size. The size differences resulted in systematic spectral echo differences. The minimum perceivable size difference corresponded to a change of the spectral notch frequency of 7 to 21%.
Schmidt (Schmidt, 1992) presented phantom targets to M. lyra that mimicked echoes reflected from an object with two parallel planes. The bats could discriminate between targets differing in internal delay by approximately 1 μs, corresponding to a shift in the first spectral notch by 6 to 13%.
The current FM thresholds are well in the range of Simon et al.'s (Simon et al., 2006) results and only slightly higher than the thresholds obtained by Schmidt (Schmidt, 1992). In contrast to the present study, however, where the frequency composition of the echoes varied within one trial over time, the differences in the frequency composition of the echoes produced by the (phantom) objects were constant within each trial in the two comparative studies. It has been shown for humans that FM depth thresholds for the CF of formant-like harmonic complexes were roughly two times larger than just-noticeable differences for formant frequency (Lyzenga and Carlyon, 1999; Lyzenga and Horst, 1997).
FM and AM detection in bats
The current experiments investigated the time-variant echo information a bat receives when moving around a complex 3-D object and ensonifying it from different angles. A complementary and much more studied case of time-variant echoes is the analysis of echoes generated by fluttering targets such as flying insects. In this case, the echoes are time-variant because of the movement (the beating wings) of the ensonified object. The AMs in the echo sequences perceived by a bat ensonifying such a fluttering target are in the range of 15 to 30 dB (Roeder, 1963; Kober and Schnitzler, 1990; Moss and Zagaeski, 1994). Psychophysical experiments with fluttering targets have been implemented with real objects, specifically rotating propellers, and the animals were trained to detect changes in the rotation speed. This approach allows varying the modulation frequency but not the modulation depth. The AM depths created by insects as fluttering targets are well within the range of the azimuthal AMs used here. An electrophysiological study by Reimer et al. (Reimer et al., 1987) showed that in some neurons in the inferior colliculus of the bat Rhinolophus rouxi, a 6% sinusoidal AM still elicited response synchronization. High neural sensitivity for sinusoidally frequency- or amplitude-modulated stimuli is also found in the inferior colliculus of the horseshoe bat, Rhinolophus ferrumequinum (Schuller, 1979). The lowest psychophysical threshold of 9 dB azimuthal AM depth observed here would correspond to a modulation depth of 31%. A study by von der Emde and Menne (von der Emde and Menne, 1989) with virtual targets measured a threshold between 6 and 12% for the discrimination of wingbeat rates in R. ferrumequinum. In a follow-up study, von der Emde and Schnitzler (von der Emde and Schnitzler, 1990) demonstrated that these bats recognized previously learned virtual objects representing fluttering insects when these were ensonified from novel angles, suggesting an internal 3-D representation of these targets.
A crucial difference between the AM/FM depth detection investigated here and AM/FM detection in rhinolophid bats is that the latter emit long-duration, constant-frequency tones (constant-frequency bats). Their call duration is thus much longer than a modulation cycle, which allows for continuous monitoring of amplitude changes. Rhinolophid bats rely on these AMs and FMs in echoes from fluttering insects to detect and classify their airborne prey (Kober and Schnitzler, 1990; Schnitzler and Flieger, 1983; Tian and Schnitzler, 1997; von der Emde and Schnitzler, 1990; von der Emde and Schnitzler, 1986; Gustafson and Schnitzler, 1979).
FM bats such as M. lyra produce very short echolocation calls, which makes both AM and FM detection much more difficult. However, passive-acoustic studies with FM bats have measured threshold FM depths of 4 to 8% of the CF in the bat Tadarida brasiliensis (Bartsch and Schmidt, 1993) and FM depths between 0.5 and 4.4% of the CF in the bat Phyllostomus discolor (Esser and Kiefer, 1996). The threshold FM depths in these two passive-acoustic studies are clearly lower than those obtained here. Esser and Kiefer (Esser and Kiefer, 1996) suggested that the high FM sensitivity in P. discolor is related to the extensive use of FM communication calls. It is therefore conceivable that M. lyra and other FM bats that also employ FM communication calls of longer duration (Schmidt-French et al., 2006; Schwartz et al., 2007; Janssen and Schmidt, 2009; Schmidt, 2001; Schmidt and Seidl, 2000; Schmidt et al., 2007; Voelk et al., 2001) can detect much smaller FMs in a communication context than in an echolocation context.
Ensonification-correlated movements
The echo-acoustic analysis of 3-D objects requires movements of the bat around the object. The current experimental paradigm included a rove both of the overall target strength or spectral composition and of the azimuthal position of the ‘glints’ of the rewarded VO. This forced the bats to evaluate the correlation between their own movement and the sequence of perceived echo levels or echo spectra. The cognitive analysis of this correlation is mandatory to create an echo-acoustic representation of the ensonified object: if an animal in the AM condition inspected each VO from just one angle, the target strength it perceived could not be used to discriminate the modulated from the unmodulated VO. The current movement analysis shows that the bats followed quite stereotyped but individually different flight paths around the VOs in the experiments: bat 2 took a relatively long time to decide to land on a VO unit, as reflected in the high number of visited sectors (Fig. 7A,B); in contrast, bat 6 visited the smallest number of sectors, which in turn resulted in a higher perceptual threshold. The bats investigated the VO they ultimately selected in a trial more thoroughly than the other VO. This is reflected in the higher number of visited 10 deg sectors of the selected VO and the higher number of calls emitted towards it. This is not due to a landing buzz [typical landing buzz interpulse intervals of 5–16 ms (Arlettaz et al., 2001; Russo et al., 2007; Melcon et al., 2007; Melcon et al., 2009)], which the bats did not typically emit (in our study the shortest interpulse intervals were greater than 20 ms). Rather, it reflects an early decision for a VO that the bats then confirmed by flying around the VO of interest. This finding indicates that they adjusted their echolocation and flight activity according to the reflection dynamics of the VOs.
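The number of visited 10 deg sectors can be derived from tracked bat positions. The following is a minimal sketch of one way to do this, not the analysis code used in this study: it assumes two-dimensional positions in a horizontal plane and a known VO centre, and the function names, coordinates and example track are hypothetical.

```python
import math


def sector_index(bat_xy, vo_xy, sector_deg=10):
    """Index of the azimuthal sector (around the VO) in which the bat is located."""
    n_sectors = int(360 // sector_deg)
    dx = bat_xy[0] - vo_xy[0]
    dy = bat_xy[1] - vo_xy[1]
    azimuth_deg = math.degrees(math.atan2(dy, dx)) % 360.0
    # The final modulo guards against floating-point values that round to 360.0.
    return int(azimuth_deg // sector_deg) % n_sectors


def count_visited_sectors(track, vo_xy, sector_deg=10):
    """Number of distinct sectors visited along a sequence of (x, y) positions."""
    return len({sector_index(p, vo_xy, sector_deg) for p in track})


# Hypothetical track: four positions on a 0.5 m circle around an assumed VO centre,
# at azimuths of 5, 15, 15 and 95 deg (two positions fall into the same sector).
vo_centre = (1.0, 2.0)
example_track = [
    (vo_centre[0] + 0.5 * math.cos(math.radians(a)),
     vo_centre[1] + 0.5 * math.sin(math.radians(a)))
    for a in (5, 15, 15, 95)
]
n_visited = count_visited_sectors(example_track, vo_centre)
```

Counting calls per sector would follow the same pattern, with the position of the bat at each call emission used in place of the full track.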
Reconstruction of a complex shape from consecutive echoes
In the FM-Control condition, two bats were able to discriminate between a VO in which the notch CFs varied sinusoidally along the azimuth and a VO with a random azimuthal arrangement of notch CFs. The thresholds obtained for both bats were higher than those obtained in the FM-Notch condition. Nevertheless, these data provide clear evidence that the bats could detect modulations of the echo-acoustic properties of the VOs along the azimuth axis, and they also demonstrate that changes of VO properties along the azimuth axis are memorized by the bats. These memorized azimuth-dependent properties could be used as building blocks for an internalized 3-D representation of object shape.
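The logic of this control can be illustrated with a short sketch. It assumes that the random VO is a permutation of the same notch CFs used in the sinusoidal VO, so that the two VOs differ only in their spatial arrangement; this assumption, the number of sectors, the centre value of 60 kHz, the depth of 19% and all function names are hypothetical and serve only as an illustration.

```python
import math
import random


def sinusoidal_profile(n_sectors, centre_khz, depth_fraction, cycles=1):
    """Notch CF per azimuthal sector; depth_fraction is peak-to-peak relative to centre."""
    return [
        centre_khz * (1.0 + 0.5 * depth_fraction
                      * math.sin(2.0 * math.pi * cycles * i / n_sectors))
        for i in range(n_sectors)
    ]


def shuffled_profile(profile, seed=0):
    """Same notch CFs in a random azimuthal order (seeded for reproducibility)."""
    rng = random.Random(seed)
    shuffled = list(profile)
    rng.shuffle(shuffled)
    return shuffled


def mean_neighbour_step(profile):
    """Mean absolute CF change between neighbouring sectors around the full circle."""
    n = len(profile)
    return sum(abs(profile[i] - profile[(i + 1) % n]) for i in range(n)) / n


# Hypothetical parameters for illustration only.
sinus_vo = sinusoidal_profile(n_sectors=36, centre_khz=60.0, depth_fraction=0.19)
random_vo = shuffled_profile(sinus_vo, seed=0)
same_values = sorted(sinus_vo) == sorted(random_vo)
smoother = mean_neighbour_step(sinus_vo) < mean_neighbour_step(random_vo)
```

In this sketch both VOs contain identical sets of notch CFs, so a discrimination cannot rest on the range of CFs encountered; it requires relating each echo spectrum to the azimuth from which it was obtained.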
These results are in accordance with theories of how 3-D objects are mentally reconstructed from visual or haptic input. When viewing or haptically exploring an unknown object, humans and monkeys probe the object from different angles. These different viewpoints aid the internal construction of the object. Additionally, this mental translation of object shape is crucial for recognizing previously viewed or haptically explored objects when they are encountered from a novel observation angle (Murray et al., 1993; Logothetis and Sheinberg, 1996; Hamm and McMullen, 1998; Willems and Wagemans, 2001; Lloyd-Jones and Luckhurst, 2002; Norman et al., 2004).
The current data highlight the exceptional demands on sensory-motor control in sonar object perception. The experiments quantify the extent to which spectral and amplitude information from ultrasonic echoes serves to assess the 3-D shape of ensonified objects. Data from the FM-Control condition indicate that bats can not only detect changes of these echo parameters in flight but also process the information, in conjunction with flight-motor information, to construct an internalized 3-D representation of the ensonified object. Understanding the neural basis of these audio–motor interactions appears to be one of the most challenging tasks in animal sonar research. The current experiments provide a rigorously controlled experimental framework in which tackling these questions may become feasible.
LIST OF ABBREVIATIONS
Acknowledgements
We thank Thomas Dera and Erich Schneider for the technical realisation of the real-time tracking of the bats. Thanks also go to Benedikt Grothe and Tobias Bonhoeffer for helpful comments and ideas during the early development of this study.
FOOTNOTES
FUNDING
This work was funded by a grant from the Deutsche Forschungsgemeinschaft [Wi 1518/8] to L.W.