Acoustic communication signals degrade as they propagate between signalers and receivers. While we generally understand the degrading effects of sound propagation on the structure of acoustic signals, we know considerably less about how receivers make behavioral decisions based on the perception of degraded signals in sonically and structurally complex habitats where communication occurs. In this study of acoustic mate recognition in Cope's gray treefrog, Hyla chrysoscelis (Cope 1880), we investigated how the temporal structure of male advertisement calls was compromised by propagation in a natural habitat and how females responded to stimuli mimicking various levels of temporal degradation. In a sound transmission experiment, we quantified changes in the pulsed structure of signals by broadcasting synthetic calls during active choruses from positions where we typically encountered signalers, and re-recording the signals from positions where we typically encountered potential receivers. Our main finding was that the silent gaps between pulses become increasingly ‘filled in’ by background noise and reverberations as a function of increasing propagation distance. We also conducted female phonotaxis experiments to determine the threshold modulation depth required to elicit recognition of the pulsatile structure of the call. Females were surprisingly tolerant of degraded temporal structure, and there was a tendency for greater permissiveness at lower playback levels. We discuss these results in terms of presumed mechanisms of call recognition in complex environments and the acoustic adaptation hypothesis.
Many animals must contend with constraints on communication posed by the degradation and masking of long-range acoustic signals that propagate through physically and sonically complex habitats (Brumm and Slabbekoorn, 2005; Forrest, 1994; Richards and Wiley, 1980; Wiley and Richards, 1978). These constraints potentially act as sources of selection that can influence the evolution of how signals are designed (Boncoraglio and Saino, 2007; Ey and Fischer, 2009; Ryan and Kime, 2003), how signalers behave to produce signals in time and space (Brumm and Slabbekoorn, 2005; Ryan and Kime, 2003) and how the auditory systems of receivers extract biologically relevant signal properties (Bee and Micheyl, 2008; Brumm and Slabbekoorn, 2005; Langemann and Klump, 2005). In addition, receivers in some taxa can actually use the extent of degradation in acoustic signals to obtain information about the distance to signalers (Naguib and Wiley, 2001). Beyond the context of acoustic distance determination, however, few studies have investigated how receivers make other behavioral decisions (e.g. mate choice) based on the perception of degraded signals.
Anuran amphibians (frogs and toads) represent ideal animal systems for investigating sound transmission and the perception of masked or degraded acoustic signals. In many anuran species, reproduction depends on the ability of females to detect, recognize and localize the long-range advertisement calls of males in acoustically and structurally complex environments. Detailed experimental studies of phonotaxis behavior have elucidated the spectro-temporal call properties that mediate species recognition and female mate choice in a number of species (Gerhardt and Bee, 2007; Gerhardt and Huber, 2002; Ryan, 2001; Wells and Schwartz, 2007). The background noise of breeding choruses is a well-known constraint on the perception of male calls by female frogs (Bee, 2007; Bee, 2008a; Bee, 2008b; Gerhardt and Klump, 1988; Schwartz et al., 2001; Schwartz and Gerhardt, 1998). Additionally, previous studies of frogs have shown that male advertisement calls undergo predictable forms of attenuation and degradation as they propagate through typical breeding habitats (Castellano et al., 2003; Kime et al., 2000; Penna et al., 2006; Ryan et al., 1990; Ryan and Sullivan, 1989). Comparatively few studies, however, have explicitly investigated how habitat-induced effects of signal propagation influence the responses of female frogs to advertisement calls [but see for example Gerhardt and Murphy (Gerhardt, 1976; Murphy, 2008)].
Our objective was twofold in this study of Cope's gray treefrog (Hyla chrysoscelis). First, we conducted a sound transmission experiment in active choruses during the breeding season to assess changes in the received temporal structure of calls at likely positions of female receivers. Male gray treefrogs produce a trilled advertisement call comprising a series of about 20–60 discrete pulses delivered at species-specific rates (35–50 pulses s−1) and with species-specific shapes and spectral content (Fig. 1A) (Gerhardt, 2001). Individual pulses are about 9–14 ms in duration and they contain acoustic energy in two distinct regions between 1000–1500 Hz and 2000–2800 Hz (Gerhardt, 2001). The pulsed temporal structure of the call is crucially important for species recognition and mate attraction, as females show strong selectivity for calls with conspecific pulse rates (Bush et al., 2002; Gerhardt, 2005; Gerhardt, 2008; Gerhardt and Doherty, 1988; Schul and Bush, 2002). In dense choruses, however, in which the calls of multiple males may overlap in time and frequency (Schwartz et al., 2002), the temporal structure of pulsed calls could become compromised to such a degree at the position of a receiver so as to interfere with call recognition or the perception of call attractiveness (Marshall et al., 2006; Schwartz, 1987; Schwartz and Gerhardt, 1995; Schwartz and Marshall, 2006). More specifically, the silent ‘gaps’ between pulses in a male's call (Fig. 1A) could become ‘filled in’ at the position of a receiver due to the effects of both background noise and reverberations associated with physical structures (e.g. vegetation) in the habitat (Ryan and Sullivan, 1989). This ‘filling in’ effect could, in turn, impair sound pattern recognition based on perception of the call's temporal structure. In Experiment 1, we quantify this ‘filling in’ effect of signal propagation through the sonically and structurally complex habitat of breeding choruses.
Our second objective was to test the hypothesis that ‘filling in’ the usual silent gaps between pulses impairs recognition of the signal as that of an appropriate mate. We did this by mimicking the conspecific advertisement call using sinusoidally amplitude-modulated (SAM) tones (Fig. 1B). We used SAM tones instead of naturally degraded calls for two reasons. First, they afford considerable experimenter control over the temporal structure of the stimulus and allowed us to focus our investigation on the ‘filling in’ effect of transmission through the habitat without introducing potential confounds (e.g. excess attenuation of higher frequencies). Second, SAM tones have been used widely in previous studies of hearing in humans and other animals (reviewed in Joris et al., 2004), including frogs (e.g. Diekamp and Gerhardt, 1995; Rose and Capranica, 1985). In Experiment 2, we first confirmed that SAM tones were treated as conspecific advertisement calls by showing that responses to both types of stimuli exhibited similar patterns of temporal selectivity for differences in pulse rate. Then, in Experiment 3, we mimicked the ‘filling in’ effects of sound propagation by systematically varying the modulation depth of the SAM tones to determine the minimum modulation depth required to elicit reliable behavioral responses indicative of call recognition. Our results indicate that under some conditions females can be surprisingly tolerant of calls having compromised temporal structures in which the silent gaps between pulses become filled with sound.
MATERIALS AND METHODS
Subjects and study areas
This study was conducted between 12 May and 1 July in 2008 and 2009, at several ponds located in the Carver Park Reserve (44°52′49″N, 93°43′3″W; Carver County, MN, USA) and the Crow-Hassan Park Reserve (45°11′20″N, 93°38′21″W; Hennepin County, MN, USA). At both study areas, male gray treefrogs called from positions at the surface of the water on floating mats of filamentous algae (e.g. Pithophora sp.) or on dead or emergent aquatic vegetation, such as pondweed (Potamogeton sp.), reed grasses (Phragmites sp.) and reed canary grass (Phalaris arundinacea), which together represented much of the dominant vegetation in the pond habitats. We have observed unpaired females approaching males along the surface of the water, and pairs in amplexus are commonly found in the same areas where we observed calling males.
Animal collections and handling followed procedures published elsewhere (Bee, 2007; Bee, 2008a). Briefly, females were collected in amplexus between 22:00 h and 01:00 h, returned to the laboratory, and maintained at 2°C to delay oviposition until they were tested (usually the following day). On the day of testing, females were transferred to a 20°C incubator at least 30 min prior to testing to allow their body temperatures to equilibrate to the temperature at which all phonotaxis tests were conducted (20°C). Females were returned to their location of capture following testing, typically within 1–3 days of collection. A total of 85 females were used as subjects in 691 phonotaxis tests. Individual females were typically tested 5–10 times each, as described in more detail below [see Gerhardt et al. for evidence for a lack of carry-over effects (Gerhardt et al., 2000)].
Experiment 1: call transmission in a breeding chorus
The objective of Experiment 1 was to quantify changes in the received depth of modulation in the pulsatile advertisement call as a result of transmission through a breeding chorus. To this end, we conducted a sound transmission experiment in which we broadcast and re-recorded advertisement calls in the frogs' natural habitat. The test signal (Fig. 1A) was a synthetic advertisement call based on values close to the averages of calls recorded in local Minnesota populations (M.A.B., unpublished). The call comprised 36 pulses (11 ms pulse duration; 50% pulse duty cycle) shaped with species-typical envelopes (4 ms inverse exponential rise time, 7 ms exponential fall time). The pulse rate was 45.5 pulses s−1 (22 ms pulse period). Each pulse was constructed by adding two phase-locked sinusoids with frequencies (and relative amplitudes) of 1300 Hz (−9 dB) and 2600 Hz (0 dB). We hereafter refer to this signal as the ‘standard call’.
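The construction of the standard call can be illustrated with a minimal sketch that uses the pulse parameters given above (36 pulses, 22 ms period, 4 ms rise, 7 ms fall, 1300 Hz at −9 dB plus 2600 Hz at 0 dB). The sample rate, the exact envelope equations and the helper names (`_exp_ramp`, `make_pulse`, `make_call`) are assumptions made for illustration and are not taken from the original synthesis software.

```python
import numpy as np

FS = 44_100  # sample rate in Hz (assumed; the text does not state one)

def _exp_ramp(n, tau_frac=0.25):
    """Saturating exponential ramp from 0 toward 1 over n samples (assumed shape)."""
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    return (1.0 - np.exp(-x / tau_frac)) / (1.0 - np.exp(-1.0 / tau_frac))

def make_pulse(fs=FS, rise_ms=4.0, fall_ms=7.0):
    """One 11 ms pulse built from two phase-locked sinusoids."""
    n_rise = int(round(fs * rise_ms / 1000.0))
    n_fall = int(round(fs * fall_ms / 1000.0))
    t = np.arange(n_rise + n_fall) / fs
    # 1300 Hz component at -9 dB relative to the 2600 Hz component (0 dB).
    carrier = (10 ** (-9 / 20)) * np.sin(2 * np.pi * 1300 * t) \
        + np.sin(2 * np.pi * 2600 * t)
    rise = _exp_ramp(n_rise)          # climbs from 0 toward 1
    fall = 1.0 - _exp_ramp(n_fall)    # decays from 1 toward 0
    return carrier * np.concatenate([rise, fall])

def make_call(n_pulses=36, period_ms=22.0, fs=FS):
    """Place n_pulses pulses at a fixed period; gaps between pulses are silent."""
    pulse = make_pulse(fs)
    period = int(round(fs * period_ms / 1000.0))
    call = np.zeros((n_pulses - 1) * period + len(pulse))
    for k in range(n_pulses):
        start = k * period
        call[start:start + len(pulse)] = pulse
    return call / np.max(np.abs(call))  # peak-normalize to 1.0

call = make_call()
print(f"call duration: {len(call) / FS * 1000:.1f} ms")
```

In this sketch the inter-pulse gaps are exactly zero, which corresponds to the undegraded signal at the source; absolute playback level would be set separately by calibration.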
We broadcast the standard call at 85 dB SPL (sound pressure level) [re. 20 μPa, fast, root mean square (RMS), C-weighted] at a recording distance of 1 m. This SPL is near the lower limit of the range of natural variation recorded in natural populations (Gerhardt, 1975). Prior to commencing playbacks in each new location, the sound level was calibrated in the field using a CEL-430.A1 sound level meter (Casella USA, Amherst, NH, USA) held 1 m away from and aimed toward the front of the speaker. We re-recorded 10 repetitions of the call over a 1-min period at each of five recording distances (1 m, 2 m, 4 m, 8 m and 16 m). These distances were chosen to encompass the likely distances over which female treefrogs assess potential mates in a chorus (Murphy and Gerhardt, 2002). We broadcast calls using a portable CD player (Sony Electronics Inc., Park Ridge, NJ, USA) connected to a SME-AFS speaker (Mineroff Electronics Inc., Elmont, NY, USA) that floated on a foam platform placed on the surface of the water. Broadcast calls were recorded onto a Marantz PMD 670 solid-state recorder (D&M Professional, Itasca, IL, USA) using a hand-held Sennheiser ME62 omni-directional microphone (Sennheiser USA, Old Lyme, CT, USA) mounted on a Sennheiser MZS20-1 shockmount pistol grip inside a Sennheiser MZW-701 blimp windscreen. The microphone was held 10 cm above the surface of the water pointing in a straight line toward the face of the playback speaker.
We replicated the experiment at 23 different broadcast sites in four separate ponds at our two study areas (two ponds per area); we regularly collected females for testing from the same ponds. We selected between four and eight different broadcast sites within each pond (depending on pond size). These sites were located within 1–3 m of the pond bank and were selected randomly with the constraint that the selected site had to fall within an area of the pond from which males were calling at the time. Once a broadcast site was selected, we randomly determined the direction of the broadcast along the pond bank (e.g. to the left or right while facing the center of the pond). The sequence of recording distances was determined randomly for each broadcast site. We typically conducted playbacks at 2–3 sites on a given night. All transmission playbacks were conducted between 21:00 h and 01:00 h during the peak of active choruses during the breeding season.
Experiment 2: validation of SAM tones as attractive signals
Our aim in Experiment 2 was to test the hypothesis that SAM tones are effective signals that elicit patterns of responsiveness similar to those elicited by conspecific advertisement calls. We took advantage of findings from previous studies showing that female gray treefrogs are exquisitely sensitive to differences in pulse rate, with females strongly preferring the rates typical of conspecific males over faster and slower rates (Bush et al., 2002; Gerhardt, 2005; Gerhardt, 2008; Gerhardt and Doherty, 1988; Schul and Bush, 2002). We reasoned that if females exhibited both phonotaxis and species-typical preferences for conspecific pulse rates in response to SAM tones, then we could reasonably conclude that SAM tones possessed efficacy as an artificial mate attraction signal for use in Experiment 3 (see below).
We conducted five two-choice phonotaxis tests. In two tests, we paired the standard call (45.5 pulses s−1; Fig. 1A) against alternatives that consisted of synthetic advertisement calls with pulse rates of either 22.7 pulses s−1 or 90.1 pulses s−1. Pulse rates of 22.7 pulses s−1 and 90.1 pulses s−1 fall outside the range of natural variation, and the slower pulse rate is similar to that of a closely related congener, the eastern gray treefrog (Hyla versicolor). Differences in pulse rate were created by varying the inter-pulse interval while maintaining a constant pulse duration (i.e. variable pulse duty cycle). Females of H. chrysoscelis are pure pulse rate discriminators and are relatively insensitive to variation in both pulse duration and pulse duty cycle over wide ranges of values (Schul and Bush, 2002). The two alternatives in each test had the same total call duration (759 ms) and therefore differed in total pulse number; the slower-pulse-rate and faster-pulse-rate alternatives to the standard call had 18 and 69 pulses, respectively. These synthetic calls were generated as described above.
Full descriptions of our equipment, testing apparatus and general testing procedures have been published elsewhere (Bee, 2008a; Bee, 2008b; Bee and Schwartz, 2009). Briefly, phonotaxis tests were performed in a 2 m-diameter circular test arena located inside a temperature-controlled, semi-anechoic sound chamber with wall and ceiling treatments to reduce reverberations (Industrial Acoustics Corporation, Bronx, NY, USA; inside dimensions: 220 cm×280 cm×216 cm, L×W×H). The arena walls (60 cm high) were acoustically transparent but visually opaque, and the perimeter of the arena floor was divided into 24 bins of 15 deg. The two A/D/S L210 speakers (Directed Electronics, Vista, CA, USA) used to broadcast stimulus alternatives were located on the chamber floor just outside the arena wall, 180 deg and 2 m apart, and directed toward the center of the arena. The positions of the two speakers were systematically rotated around the arena between testing days to control for any possibility of a directional response bias in the chamber. Acoustic stimuli were broadcast using Adobe Audition v1.5 (Adobe Systems Inc., San Jose, CA, USA) running on a PC interfaced with an M-Audio FireWire 410 soundcard (M-Audio, Irwindale, CA, USA) and HTD 1235 amplifier (Home Theater Direct, Inc., Plano, TX, USA). In all tests, the two alternatives repeated continuously with a period of 5 s, and the two alternatives were temporally arranged so that there was an equal period of silence preceding and following each stimulus alternative. The amplitudes of all acoustic stimuli were calibrated to be 85 dB SPL (fast RMS, C-weighted) at a distance of 1 m by placing the microphone of a Brüel & Kjær Type 2250 sound level meter (Brüel & Kjær, Norcross, GA, USA) at the approximate position of a subject's head at the start of a trial.
Phonotaxis tests were performed under infrared illumination, recorded with an overhead CCTV camera and observed from outside the chamber on a video monitor. A test began by placing the subject in a small holding cage located on the chamber floor at the center of the test arena. Following a 1-minute acclimatization period, we began broadcasts of the alternating stimuli. After four repetitions of both alternatives, the subject was remotely released using a rope and pulley system that could be operated from outside the chamber. Subjects were given up to 5 min to respond by touching the wall of the test arena within a 15 deg bin centered in front of a playback speaker. A total of 20 subjects was tested in each of the five phonotaxis tests. We conducted two-tailed binomial tests (α=0.05) of the hypothesis that a proportion of subjects greater than 0.50 would approach one of the two alternatives against the null hypothesis that equal proportions (0.50) of subjects would respond to each alternative.
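The two-tailed binomial test described above can be sketched with the standard library alone. Because the null proportion is 0.50, the distribution is symmetric and the two-tailed P-value is twice the upper-tail probability of the more extreme count. The function name `binom_two_tailed_half` is hypothetical, and this is an illustration rather than the software used in the study.

```python
from math import comb

def binom_two_tailed_half(k, n):
    """Exact two-tailed binomial P-value for k of n choices against p0 = 0.5."""
    k_ext = max(k, n - k)  # the more extreme of the two counts
    upper = sum(comb(n, i) for i in range(k_ext, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)

# With N = 20 subjects per test, a unanimous outcome and an 18-of-20
# outcome (both reported in the Results) each give P < 0.01.
for k in (20, 18):
    print(k, binom_two_tailed_half(k, 20) < 0.01)
```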
Experiment 3: response thresholds as a function of modulation depth
The objective of Experiment 3 was to investigate how females perceived signals with a degraded temporal structure. In separate no-choice phonotaxis tests, we presented individual females (N=20) with each of six different SAM tones with modulation rates of 45.1 Hz (i.e. the conspecific pulse rate) and modulation depths ranging between m=0.0 and m=1.0 in steps of 0.2 (see insets in Fig. 5 in the Results section). By varying the depth of modulation between 100% (m=1.0; i.e. the natural condition) and 0% (m=0.0; i.e. unmodulated), we simulated the full range of temporal degradation that could occur during sound transmission in the natural habitat. In addition, tests of all six levels of modulation depth were replicated at each of three playback levels (61 dB, 73 dB and 85 dB SPL) using different groups of subjects randomly assigned to each level (N=20 per level; total N=60). With the exception that tests in Experiment 3 consisted of single-stimulus playbacks at multiple sound levels (calibrated as above), our testing protocol generally followed that described above for Experiment 2. Each subject was tested in a sequence of 10 tests, six of which corresponded to a test of one of the six levels of modulation depth, and four of which constituted reference conditions. The reference conditions consisted of broadcasts of the standard call both at 85 dB SPL and at the playback level at which the SAM tones were also broadcast. We counted a response as occurring when the subject touched the wall of the test arena in the 15 deg bin centered in front of the playback speaker. We also measured the latency of responses and the angle (relative to the speaker) at which females first touched the wall of the test arena. Reference conditions were designed to allow us to monitor a female's response motivation throughout testing. 
Females (N=5) that did not respond to a reference condition were considered unmotivated, their data were discarded and they were replaced by a new female added to the subject pool. The test sequence for a given subject began with a reference condition at that female's assigned playback level, followed by three tests using the SAM tones, then two additional reference conditions (one at the assigned playback level and one at 85 dB SPL), followed by tests of the remaining three SAM tones, and then a final reference condition at the assigned playback level. We tested the six SAM tones differing in modulation depth in a different random order for each female. Subjects were given 5–10 min ‘timeout’ intervals between consecutive tests in a sequence. Previous studies have shown these general methods to be effective for using within-subjects designs to test motivated females (Bee, 2007; Bee, 2008a; Bee and Schwartz, 2009; Bush et al., 2002; Schul and Bush, 2002).
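The SAM stimuli used in Experiment 3 can be illustrated with the following sketch, in which the envelope is 1 + m·sin(2π·fm·t) and m is the modulation depth. The modulation rate (45.1 Hz) and the six depths come from the text; the carrier composition, the stimulus duration, the starting phase and the sample rate are assumptions chosen only to make the example runnable.

```python
import numpy as np

def sam_tone(m, fm=45.1, dur=0.78, fs=44_100,
             carriers=((1300.0, -9.0), (2600.0, 0.0))):
    """Return (signal, envelope) for a SAM tone with modulation depth m.

    carriers: (frequency in Hz, relative level in dB) pairs. The values
    mirror the standard call and are an assumption for the SAM stimuli.
    """
    t = np.arange(int(round(dur * fs))) / fs
    carrier = sum(10 ** (db / 20) * np.sin(2 * np.pi * f * t)
                  for f, db in carriers)
    # Phase offset of -pi/2 starts the envelope at its minimum (1 - m).
    envelope = 1.0 + m * np.sin(2 * np.pi * fm * t - np.pi / 2)
    return envelope * carrier, envelope

def measured_depth(envelope):
    """Modulation depth recovered from envelope extremes."""
    return (envelope.max() - envelope.min()) / (envelope.max() + envelope.min())

for m in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    _, env = sam_tone(m)
    print(f"m = {m:.1f}  recovered depth = {measured_depth(env):.3f}")
```

At m=1.0 the envelope reaches zero once per cycle, and at m=0.0 the tone is unmodulated, which spans the range of ‘filling in’ simulated in the experiment.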
We determined the lowest modulation depth that reliably elicited call recognition by estimating behavioral response thresholds in two ways following Bee and Schwartz (Bee and Schwartz, 2009). The first method was based on response proportions. Using the same general methods and response criterion as in the present study, Vélez and Bee (Vélez and Bee, 2010) estimated a ‘false alarm’ rate of 20%; i.e. about 2 in 10 females released at the center of our test arena in the absence of any broadcast sound are expected to exhibit behaviors that our criterion would consider to be a response. Importantly, subjects in that study were collected at the same times and from the same localities and tested with the same equipment and apparatus as those used in the present study. Therefore, using one-tailed binomial tests we estimated the upper bound of a response threshold as the lowest modulation depth eliciting responses from a proportion of subjects significantly greater than 0.20 at that depth and all greater modulation depths. The lower bound estimate was the next lowest modulation depth. As a second method of estimating response thresholds, we used circular statistics (V-tests) (Zar, 1999) to determine whether the angles at which females first touched the arena wall were randomly distributed or oriented toward the playback speaker (0 deg). We estimated the upper threshold bound as the lowest modulation depth eliciting significant orientation at that depth and all larger modulation depths; the lower bound was the next lowest modulation depth. Threshold modulation depths (mt) based on both response probabilities and orientation were calculated as the arithmetic mean of the upper and lower bound estimates and are also expressed in dB based on the following equation: $\text{threshold in dB} = 10 \log_{10}(m_t^{2})$.
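The threshold procedure described above can be sketched as follows. The one-tailed binomial test against the 0.20 false alarm rate and the bound-averaging rule follow the text; the function names and the representation of results as a list of significance flags are assumptions for illustration.

```python
import math
from math import comb

FALSE_ALARM = 0.20  # false alarm rate cited in the text

def binom_upper_tail(k, n, p0=FALSE_ALARM):
    """One-tailed P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
               for i in range(k, n + 1))

def threshold_from_flags(depths, significant):
    """Return (mt, threshold in dB), or None if no threshold can be bracketed.

    Upper bound: lowest depth that is significant at that depth and at all
    greater depths. Lower bound: the next lowest depth tested.
    """
    pairs = sorted(zip(depths, significant))
    upper_idx = None
    for i in range(len(pairs) - 1, -1, -1):  # walk down from the deepest
        if pairs[i][1]:
            upper_idx = i
        else:
            break
    if upper_idx is None or upper_idx == 0:
        return None  # nothing significant, or significant at every depth
    mt = (pairs[upper_idx][0] + pairs[upper_idx - 1][0]) / 2
    return mt, 10 * math.log10(mt ** 2)

DEPTHS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
# Significant at m = 0.4 and above gives mt = 0.3 (about -10.5 dB).
print(threshold_from_flags(DEPTHS, [False, False, True, True, True, True]))
# Significant at m = 0.2 and above gives mt = 0.1 (-20 dB).
print(threshold_from_flags(DEPTHS, [False, True, True, True, True, True]))
```

Returning `None` when every depth is significant reflects the case in which no lower bound exists and a threshold cannot be estimated.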
RESULTS
Experiment 1: call transmission in a breeding chorus
Measured values of ΔV decreased as a function of increasing distance (Fig. 2A). Differences in mean ΔV values across the five recording distances were significant (ANOVA: F4,76=87.1, P<0.01) and consistent with a decreasing trend with increasing distance (linear contrast: F1,19=140.8, P<0.01). At the two shortest recording distances of 1 m and 2 m, most of the mean values of ΔV were close to 1.0 (e.g. ranging between 0.75 and 0.88; Fig. 2B,C). Thus, at these small distances from the source, the pulsatile structure of the signal was still clearly evident in our recordings (Fig. 3). Beyond 2 m, mean values of ΔV decreased steadily, reaching values at the 16-m distance ranging between 0.06 and 0.55 across the four study ponds. Thus, at larger distances from the source, the pulsatile structure of the call was less evident in our recordings (Fig. 3). There were also significant differences in mean ΔV values among the four ponds in which our recordings were made (F3,19=8.9, P<0.01), and there was a significant interaction between recording distance and pond (F12,76=3.0, P<0.01; Fig. 2). As illustrated in Fig. 2, mean ΔV values were lowest at all distances in recordings made in Pond 3. Moreover, the nature of the decrease in mean ΔV values between 2 m and 16 m differed between Pond 3 and the other ponds. In Ponds 1, 2 and 4, ΔV values decreased linearly as a function of distance. Between distances of 2 m and 16 m in these three ponds, the rates of decrease in mean ΔV values ranged between 0.04 m−1 and 0.06 m−1 (i.e. between 4% m−1 and 6% m−1). By contrast, the mean ΔV values for Pond 3 exhibited exponential decay as a function of distance, in which ΔV decreased by similar amounts (0.18 to 0.24) with each doubling of distance between 2 m and 16 m (cf. Fig. 2B,C). Although not quantified as part of this study, we believe the relevant physical features of the habitat (e.g. types and density of vegetation) were quite similar at all four ponds; however, our impression was that the density of calling males and levels of calling activity were considerably higher in Pond 3.
Experiment 2: validation of SAM tones as attractive signals
When given a choice between two synthetic pulsed calls differing in pulse rate (Fig. 4A), 100% of females chose the standard call with a pulse rate near the population average (45.5 pulses s−1) over alternatives with a slower pulse rate (22.7 pulses s−1, P<0.01) and a faster pulse rate (90.1 pulses s−1, P<0.01). In parallel tests conducted with SAM tones (Fig. 4B), 100% of females chose the stimulus with a modulation rate of 45.1 Hz over alternatives modulated at rates of 22.6 Hz (P<0.01) and 90.2 Hz (P<0.01). Eighteen out of 20 females (90%) chose the synthetic call with a pulse rate of 45.5 pulses s−1 over the SAM tone modulated at 45.1 Hz (P<0.01). Across all five tests, the mean response latencies ranged between 65 s and 91 s, and the fastest response latencies ranged between 30 s and 38 s; these latencies are fairly typical for this species (e.g. Bee and Riemersma, 2008). Together, these results suggest the following. First, although females discriminated between pulsed calls and SAM tones, and preferred the former, they nevertheless exhibited robust phonotaxis in response to SAM tones. Second, when call duration was held constant, females were selective for pulse rates close to the average of conspecific calls regardless of whether the stimuli were pulsed calls or SAM tones. From these data, we concluded that SAM tones modulated at rates near 45 Hz are recognized by female gray treefrogs as conspecific mate attraction signals.
Experiment 3: response thresholds as a function of modulation depth
The proportion of subjects that responded to SAM tones modulated at 45.1 Hz increased as a function of increasing modulation depth (Fig. 5). Differences in response probability across the six levels of modulation depth were significant at all three playback levels tested (Cochran's Q tests; 61 dB: Q=19.5, P<0.01; 73 dB: Q=50.4, P<0.01; 85 dB: Q=62.2, P<0.01). At the 85-dB playback level, modulation depths of m=0.4 and higher elicited responses from a proportion of subjects greater than that expected based on the false alarm rate (P-values<0.01; Fig. 5, black bars); unmodulated tones (m=0.0) and tones modulated at a depth of m=0.2 did not elicit a greater proportion of responses than the expected false alarm rate at this playback level (P-values>0.19; Fig. 5). Hence, for SAM tones presented at 85 dB, we estimated a threshold modulation depth of mt=0.3 (−10.5 dB). At playback levels of 73 dB and 61 dB (Fig. 5, gray bars and striped bars, respectively), the proportion of subjects responding significantly exceeded the expected false alarm rate at modulation depths of m=0.2 and higher (P-values<0.01; Fig. 5) but not in the unmodulated (m=0.0) conditions (P-values>0.09; Fig. 5). Therefore, we estimated a threshold modulation depth of mt=0.1 (−20 dB) for these two signal levels.
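For readers unfamiliar with Cochran's Q, the statistic compares response proportions across k related conditions measured on the same subjects. The sketch below computes Q and its degrees of freedom from a subjects-by-conditions table of 0/1 responses; the function name and the toy table are hypothetical and are not data from this study.

```python
def cochrans_q(table):
    """Cochran's Q for a list of rows (one per subject) of 0/1 responses.

    Returns (Q, degrees of freedom). Q is compared with a chi-square
    distribution with k - 1 degrees of freedom.
    """
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n_total = sum(row_totals)
    denom = k * n_total - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined when every subject gives the same "
                         "response in all conditions")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n_total ** 2) / denom
    return q, k - 1

# Toy table (made-up values): 4 subjects x 3 conditions of increasing depth.
toy = [[0, 0, 1],
       [0, 1, 1],
       [0, 1, 1],
       [1, 1, 1]]
print(cochrans_q(toy))
```

Subjects that respond identically in every condition, such as the last row of the toy table, do not change the value of Q.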
At playback levels of 73 dB and 85 dB, females touched the wall at angles that were significantly oriented in the direction of the speaker (0 deg) at modulation depths of m=0.2 and greater (Table 1). Based on these results, we estimated behavioral response thresholds of mt=0.1 (−20 dB) for SAM tones broadcast at 73 dB and 85 dB. In the 61-dB conditions, females exhibited significant orientation toward the speaker at all modulation depths, including in response to the unmodulated tones (m=0.0; Table 1). Hence, a modulation threshold could not be estimated from orientation angles for the 61-dB playback level.
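The orientation analysis can be illustrated with a sketch of the V-test, which asks whether a sample of angles is clustered around a specified direction (here 0 deg, the speaker). The P-value below uses a normal approximation to the test statistic, which is an assumption of this sketch; published analyses typically rely on tabulated critical values. The function name and the example angles are hypothetical.

```python
import math

def v_test(angles_deg, mu0_deg=0.0):
    """V-test for orientation toward mu0_deg.

    Returns (V, u, p), where V is the mean vector length projected onto
    the expected direction, u = V * sqrt(2n), and p is a one-sided
    normal-approximation P-value (approximate for small samples).
    """
    n = len(angles_deg)
    rad = [math.radians(a) for a in angles_deg]
    c = sum(math.cos(a) for a in rad) / n
    s = sum(math.sin(a) for a in rad) / n
    r = math.hypot(c, s)                 # mean vector length
    mean_dir = math.atan2(s, c)          # mean direction
    v = r * math.cos(mean_dir - math.radians(mu0_deg))
    u = v * math.sqrt(2 * n)
    p = 0.5 * math.erfc(u / math.sqrt(2))
    return v, u, p

# Made-up angles clustered near the speaker versus evenly spread angles.
clustered = [-10, -5, 0, 0, 5, 5, 10, 15, -15, 0]
spread = list(range(0, 360, 15))
print(v_test(clustered))
print(v_test(spread))
```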
Subjects in Experiment 3 remained highly motivated to respond across the test sequence, as indicated by the consistency of their responses in the reference conditions. For example, the mean latencies to respond to the three separate presentations of the standard call at the assigned playback levels ranged between 86 s and 96 s and did not differ significantly (ANOVA: F2,114=0.7, P=0.50; following log10 transformations to improve normality). In comparisons of the groups of subjects assigned to the three different playback levels, there were no significant differences in the mean, log10-transformed latencies in response to the standard call presented either at the assigned playback levels (ANOVA: F2,57=1.8, P=0.17) or at 85 dB SPL (ANOVA: F2,57=0.2, P=0.86). In addition, the angles at which females first touched the wall of the test arena were strongly and significantly oriented in the direction of the playback speaker (0 deg) in all reference conditions (−2.9 deg<μ<4.7 deg; 0.9<r<1.0; V-values>0.90, P-values<0.01).
DISCUSSION
Temporal call structure under natural listening conditions
During call transmission, we found that a measure of pulse structure (ΔV) that quantified the amplitude in pulses relative to that in the gaps between them declined with distance as a result of the interpulse intervals becoming ‘filled in’. Two different sources probably contributed to the degradation of temporal call structure reported here. The first was the contribution of abiotic and biotic sources of background noise and, in particular, the signals of calling males [i.e. ‘signal clutter’ (Forrest, 1994)]. Obviously in choruses of treefrogs, especially potent sources of acoustic interference and masking are the vocalizations of conspecifics (Gerhardt and Huber, 2002; Wells and Schwartz, 2007). The significance of this for call recognition by receivers will depend on the density and distribution patterns (in time and space) of calling males and on the distance separating sources and receivers. As source–receiver distance increases, for example, the relative contribution of background noise to the loss of pulse definition also will increase, with the magnitude of the effect depending on the frequency of the background noise and the source's calls (Castellano et al., 2003). This is because the signal strength declines but the average background noise level remains similar. In addition, with increases in chorus density or calling activity, the level and constancy of background noise also increases, resulting in greater loss of temporal call structure. We believe the effects of greater background noise probably account for the faster initial decreases in ΔV that occurred with distance in Pond 3 compared with the other ponds (Fig. 2).
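The argument that signal strength declines with distance while the background level stays roughly constant can be made concrete with an idealized calculation. The sketch below assumes spherical spreading only (a 6 dB loss per doubling of distance), ignores excess attenuation and reverberation, and uses a made-up chorus noise level; none of these values are measurements from this study.

```python
import math

def received_level(source_db_at_1m, distance_m):
    """Received level under spherical spreading: L(d) = L(1 m) - 20*log10(d)."""
    return source_db_at_1m - 20 * math.log10(distance_m)

SOURCE_DB = 85.0   # broadcast level at 1 m, as in Experiment 1
NOISE_DB = 70.0    # hypothetical constant chorus noise level (illustration only)

for d in (1, 2, 4, 8, 16):
    level = received_level(SOURCE_DB, d)
    print(f"{d:>2} m: signal {level:5.1f} dB, "
          f"signal-to-noise ratio {level - NOISE_DB:+5.1f} dB")
```

Under this idealization, the received levels at 4 m and 16 m fall close to 73 dB and 61 dB, respectively, so the signal-to-noise ratio drops by about 24 dB between 1 m and 16 m even though the noise level is unchanged.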
The second source contributing to the degradation of temporal call structure reported here probably involved several well-known factors stemming from physical attributes of the environment (e.g. plants, ground, atmosphere) (Forrest, 1994; Naguib, 2003; Richards and Wiley, 1980). As sound propagates away (spreads spherically) from a source, it can be refracted, reflected and partially absorbed. Such processes can not only induce temporal and spectral changes as they occur but also spawn multiple transmission routes to a receiver and thus multiple times of arrival for formerly well-defined structural signal elements [i.e. ‘habitat clutter’ (Forrest, 1994)]. Sound energy restricted to pulses at the signal source can thus become more diffusely distributed in time at a receiver, an effect that contributes to a ‘filling in’ of the gaps between pulses.
Our general findings from Experiment 1 are qualitatively similar to those obtained in other acoustic signal transmission studies of anurans (e.g. Castellano et al., 2003; Kime et al., 2000; Ryan et al., 1990) and other taxa, such as birds (e.g. Barker et al., 2009; Mathevon et al., 2005) and mammals (e.g. Brown et al., 1995; Daniel and Blumstein, 1998). The just cited anuran studies employed a cross-correlation-based measure of degradation that was influenced by a suite of signal changes. Our results are best compared with those of Ryan and Sullivan (Ryan and Sullivan, 1989), who broadcast calls that had a pulsed structure, focused on temporal degradation and also calculated ΔVs. They observed a decline in this integrity measure between about 0.5% m−1 and 1.0% m−1 for calls of the toads Bufo valliceps and Bufo woodhousii, which is markedly slower than the rate of decline in ΔV reported here (4% m−1 to 6% m−1). The most likely explanation for this difference is that our recordings were made in dense choruses whereas Ryan and Sullivan made their recordings at times when no nearby toads were calling (Ryan and Sullivan, 1989). This difference again highlights the contribution of background noise to the deterioration in received temporal structure. However, differences in habitat (physical structure) as well as call temporal and spectral characteristics of the study species might also account for some of the more rapid degradation with distance in our data set. For example, most of the energy in the gray treefrog calls was above 2 kHz while the dominant frequencies of the toad calls were below 1.5 kHz. All else being equal, anuran calls of higher frequency should be more vulnerable to excess attenuation and under some circumstances to structural deterioration (Kime et al., 2000).
Recognition of degraded temporal call structure
In Experiment 2, we found that SAM tones elicited a robust and selective phonotaxis response. As with pulsed calls, offering alternatives with a modulation rate that was either well elevated or reduced relative to about 45 Hz elicited unanimous discrimination in favor of the conspecific modulation rate. Pulse shape is believed to be of little importance to call recognition in H. chrysoscelis as compared with H. versicolor (Gerhardt, 2005; Schul and Bush, 2002). Nevertheless, when paired against the standard pulsed call, the 45.1 Hz SAM tones proved significantly less attractive. We interpret these results to mean that SAM tones are recognized as conspecific advertisement calls but that the natural form of pulses is preferred under some circumstances (see Gerhardt, 2005), such as when it can be compared with sinusoidally shaped call elements.
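For readers less familiar with SAM stimuli, the following minimal sketch illustrates what a given modulation depth m means for the gaps between 'pulses'. It assumes the conventional definition of a sinusoidally amplitude-modulated tone, in which the envelope is 1 + m·sin(2πf_m t); the carrier frequency, duration and sampling rate are illustrative placeholders rather than values taken from our methods, and the function names are hypothetical.

```python
import math

def sam_tone(m, fm=45.1, fc=2000.0, dur=0.2, fs=44100):
    """Samples of a SAM tone: [1 + m*sin(2*pi*fm*t)] * sin(2*pi*fc*t).

    fc, dur and fs are illustrative assumptions, not values from the study.
    """
    n = int(dur * fs)
    return [
        (1.0 + m * math.sin(2.0 * math.pi * fm * i / fs))
        * math.sin(2.0 * math.pi * fc * i / fs)
        for i in range(n)
    ]

def trough_re_peak_db(m):
    """Envelope minimum relative to envelope maximum, in dB (requires 0 <= m < 1)."""
    return 20.0 * math.log10((1.0 - m) / (1.0 + m))

# Print how far the envelope troughs sit below the peaks for several depths.
for m in (0.4, 0.3, 0.1, 0.0):
    print(f"m = {m:.1f}: trough is {trough_re_peak_db(m):.2f} dB re peak")
```

Under this definition, a depth of m=0.4 leaves the envelope troughs only about 7 dB below the peaks, and at m=0.1 the difference is less than 2 dB, which gives a sense of how shallow the modulation is at the lowest depths considered in Experiment 3.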
The inherently strong response of females to the 45.1 Hz SAM stimulus in Experiment 2 allowed us to estimate in Experiment 3 how the spread of sound energy into gaps could affect call recognition. Females were clearly sensitive to the depth of modulation at playback levels of 61 dB, 73 dB and 85 dB and showed reduced tendencies to approach the signal source as the depth of modulation declined. However, at all playback levels, 70% or more of females showed positive phonotaxis even when modulation depth had been reduced to m=0.4. Our estimates of signal recognition thresholds based on both response probabilities and angular orientation indicated that only very shallow modulation (e.g. 0.1≤mt≤0.3) was required to elicit statistically reliable phonotaxis behavior. These findings suggest that females would actually be quite tolerant of a degraded pulse structure that involved a substantial ‘filling in’ of the gaps between pulses. We conclude from these data that the attractive potential of calls suffering from significant degradation in temporal structure due to noise or energy spread can remain high as long as attending females are close enough to a calling male to detect his calls. For example, most males must be within about two meters for their calls to reach a receiver at the highest SPL we employed, 85 dB SPL (Gerhardt, 1975). At these distances, unless a male is concealed in dense vegetation, calls will exhibit little degradation due to physical interactions with habitat components (Figs 2, 3) (J.J.S., unpublished). Although background noise potentially can reduce modulation depth to a greater extent (e.g. Fig. 2, pond 3), it is extremely unlikely that even the background din of the chorus would significantly compromise the inherent ability of pulsed calls to attract a nearby female. Bee found that although calls of H. chrysoscelis were masked from females at a SNR of −12 dB, about 70% or more of females showed phonotaxis at SNRs of 0 dB and higher (Bee, 2007). Bee and Schwartz showed that the threshold for signal recognition in chorus-like noise occurs at a SNR of about 0 dB (Bee and Schwartz, 2009). Sustained levels of background noise in gray treefrog choruses commonly range between 70 dB and 80 dB SPL (Schwartz et al., 2001; Swanson et al., 2007). So, a male calling at 90 dB at 1 m should be detectable by most females at distances of between about 4 m and 8 m.
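The distance reasoning in this paragraph can be approximated with a simple spreading calculation. The sketch below assumes idealized spherical spreading only (a loss of 20·log10(d) dB relative to the level at 1 m), with no excess attenuation from vegetation, ground or atmosphere; the function names are hypothetical and the model is a simplification, not the basis of the estimates given above.

```python
import math

def level_at_distance(source_db_at_1m, distance_m):
    """Received level (dB SPL) under spherical spreading alone."""
    return source_db_at_1m - 20.0 * math.log10(distance_m)

def distance_for_level(source_db_at_1m, target_db):
    """Distance (m) at which the received level falls to target_db."""
    return 10.0 ** ((source_db_at_1m - target_db) / 20.0)

source = 90.0  # dB SPL at 1 m, as in the text

# Distance at which the call is received at the highest playback level (85 dB).
print(f"85 dB reached at {distance_for_level(source, 85.0):.1f} m")

# Distances at which the call equals the background noise (SNR of 0 dB)
# for the two ends of the reported chorus noise range.
for noise_db in (80.0, 70.0):
    d = distance_for_level(source, noise_db)
    print(f"0 dB SNR against {noise_db:.0f} dB noise at {d:.1f} m")
```

Under these assumptions, a 90 dB call is received at 85 dB at just under 2 m, and reaches a SNR of 0 dB at roughly 3 m (80 dB noise) to 10 m (70 dB noise). This idealized range brackets the approximately 4 m to 8 m given above; factors such as excess attenuation and the exact noise levels assumed would narrow it.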
Tolerant receivers and the acoustic adaptation hypothesis
The acoustic adaptation hypothesis posits that selection on signalers has yielded signals with characteristics well suited to propagation in those habitats in which the animals most often communicate (Boncoraglio and Saino, 2007; Ey and Fischer, 2009; Ryan and Kime, 2003). Although there has been some support for the hypothesis from work with birds and mammals (Boncoraglio and Saino, 2007; Brumm and Naguib, 2009; Ey and Fischer, 2009), relatively few data from anurans have so far been consistent with its predictions (Ey and Fischer, 2009; Wells, 2007). Our study was not designed to test the hypothesis directly but our results are relevant to it. While the transmitted calls of H. chrysoscelis exhibited loss in temporal structure, even at the maximum transmission distance we evaluated (16 m), mean values of ΔV at three of the four study ponds ranged between 0.36 and 0.54. By comparison, females were quite tolerant of degradation in temporal structure, as the highest threshold modulation depth required to elicit phonotaxis was mt=0.3, and most estimates were less than or equal to mt=0.1. In addition, there was a tendency for threshold modulation depths to be lower at the lowest signal level (61 dB) compared with the highest signal level (85 dB). Thresholds at the 73-dB signal level were intermediate and were more similar to those at 61 dB or 85 dB depending on whether they were estimated based on response probabilities or angular orientation, respectively. Importantly, at the 61-dB signal level, no threshold modulation depth could be estimated based on angular orientation, as females at this signal level were significantly oriented in the direction of the speaker even in response to the unmodulated (m=0.0) signal. This finding suggests that at low amplitudes, call features unrelated to amplitude modulation, such as call duration, call rate and spectral content, were sufficient to elicit phonotaxis behavior.
Based on results from phonotaxis tests in combination with our measured levels of ΔV, we believe it is unlikely that degradation of pulse structure would have seriously compromised the ability of calling males to elicit phonotaxis by potential mates, even for those calling at the greatest source–receiver distances in the most dense and active choruses (e.g. Pond 3 at 16 m).
Given the importance of pulse structure to call recognition by female gray treefrogs, a reasonable expectation would have been that transmission-induced temporal degradation would profoundly reduce the efficacy of calls for phonotaxis. Although selectivity for fine temporal parameters might increase at low amplitude under some circumstances (Beckers and Schul, 2004), our results suggest that females may have been under natural selection to reduce their selectivity at low signal amplitudes for those attributes of advertisement signals most vulnerable to degradation when source–receiver separation is great. This could allow them, when relatively far from males, to focus their attention on call features and signaling behaviors less susceptible to deterioration but still incorporating useful information. Thus, females could attend principally to call duration and rate when well separated from potential mates but increase the weight they place on fine-scale temporal features, such as pulse structure, as they approach callers. In gray treefrogs (both H. chrysoscelis and H. versicolor), call duration and rate are criteria that females use to choose among conspecifics (Bee, 2008b; Gerhardt, 2001; Schwartz et al., 2001) and that may also encode information on genetic quality (call duration) (Welch et al., 1998) and physical condition (calling effort) (Schwartz and Rahmeyer, 2006). For H. chrysoscelis, the pulse rate of calls is especially relevant to discrimination of conspecific males from those of H. versicolor in mixed-species choruses of gray treefrogs (Schul and Bush, 2002), a discrimination that could be particularly important at close range just prior to selecting a mate.
The hypothesis that anuran receivers might be tolerant when it comes to temporal degradation of pulse structure is supported by neurophysiological evidence. Under ideal acoustic conditions, the firing of neurons of the anuran auditory periphery reliably encodes the timing pattern of amplitude modulation (AM) or pulses in signals (Gerhardt and Huber, 2002; Rose and Gooler, 2007). In the torus semicircularis of the anuran midbrain, this synchronized response pattern is replaced by a temporally selective rate code. Here the firing rate of subsets of auditory neurons is highest to stimuli with AM rates close to species-specific pulse rates (Diekamp and Gerhardt, 1995; Rose et al., 1985), and such responses can be sensitive to changes in the duration of even a single interpulse interval (Edwards et al., 2002). However, the ability of eighth nerve fibers to encode the inherent periodicity structure of sounds, to which upstream central nervous system neurons ultimately are sensitive, can decline with increases in background noise levels (Simmons et al., 1992), decreases in signal amplitude (Schwartz and Simmons, 1990) (but see Rose and Capranica, 1985) and reductions in depth of modulation (Rose and Capranica, 1985). Thus, under circumstances where the nervous system of females is better able to respond specifically to the pulsatile nature of calls, females utilize this aspect of signal structure in decision making to a greater extent than when such neural responses are weaker or less likely.
Previous studies of green treefrogs (Hyla cinerea) also support the hypothesis that anuran receivers may be tolerant of signal degradation under certain listening conditions, and that tolerance for degraded signals may co-vary with signal amplitude (Gerhardt, 1976; Gerhardt, 1981). In H. cinerea, the high frequency component of the advertisement call is attenuated more with distance than the low frequency component. Two-choice phonotaxis experiments demonstrated that at low signal amplitudes, similar to what females would experience at greater source–receiver distances, females are less sensitive to deficiencies in the relative amplitude of the higher call frequency component than they are at higher signal amplitudes. The neurophysiological basis for this receiver tolerance could be explained by the differential sensitivities of the two sensory papillae in the anuran inner ear. Auditory nerve fibers innervating the basilar papilla (BP), the inner ear organ of anurans ‘tuned’ to higher frequencies, are less sensitive (i.e. require higher signal amplitudes to fire) than are those innervating the amphibian papilla (AP), which are tuned to lower frequencies. Call discrimination based on patterns of AM also occurs in H. cinerea (Gerhardt, 1978a; Gerhardt, 1978b; Oldham and Gerhardt, 1975). Females choose unmodulated advertisement calls over pulsatile aggressive calls. Interestingly, Gerhardt found that discrimination between weakly and more strongly modulated calls only occurs when the difference in depth of AM is greater than ~40% and the alternatives are of sufficiently high amplitude (Gerhardt, 1978b). Our results with H. chrysoscelis and the data from H. cinerea underscore the utility of testing response behavior over a range of signal amplitudes corresponding to those to which individuals may be exposed in nature (Gerhardt, 2008).
Given the levels of tolerance shown by female treefrogs for degraded signals, especially at low amplitudes, it is possible that the strength of selection on signal structure for transmission-enhancing attributes has been relatively weak. Evidence suggests that selection may have acted more strongly on receivers to modify their criteria for signal recognition (or at least those governing positive phonotaxis) in ways that allow them to tolerate signal degradation under the conditions when it is most likely to occur (e.g. noisy conditions, large source–receiver distances). In addition, judicious choice by male frogs of a suitable calling site may be a relatively easy way for males to increase the active space of their signals (Parris, 2002), thereby reducing the need for structural changes in calls that could improve their propagation.
Our study represents one step towards understanding better how receivers make behavioral decisions based on the perception of degraded signals. Because female frogs in natural choruses are exposed simultaneously to calls from multiple males, and thus have an opportunity to compare signals, an important next step will be to conduct discrimination tests. Our prediction is that females would often discriminate in favor of calls exhibiting a deeper relative to a shallower depth of modulation, especially at high signal amplitudes. While the use of SAM tones would provide one rigorous means to test this prediction, it will also be important to reintroduce greater biological realism by using pulsed calls, which are preferred over SAM tones and for which the nervous system may exhibit greater overall temporal selectivity (Diekamp and Gerhardt, 1995). Future work investigating both the tolerance of anuran receivers for degraded signals and the impacts on signal degradation of male call site preferences within the available habitat (e.g. Ptacek, 1992) might provide important evolutionary explanations for the lack of evidence supporting the acoustic adaptation hypothesis in anurans.
We thank the Minnesota Department of Natural Resources and the Three Rivers Park District for generous access to animals and study areas. We also thank J. Cook, D. Heil, J. Henderson, J. Henly, S. Hinrichs, J. Lane, A. Leightner, S. Markegaard, A. Morabu, C. Nguyen, S. Petterson, A. Rapacz-Van Neuren, K. Riemersma, D. Rittenhouse, M. Rodionova, A. Smith, K. Speirs, A. Thompson, J. Walker-Jansen and especially S. Tekmen for their help collecting and testing frogs. This work was approved by the University of Minnesota Institutional Animal Care and Use Committee (IACUC protocol 0809A46721, last approved 22/09/2009) and was supported by grants from the National Institute on Deafness and Other Communication Disorders (DC008396 and DC009582 to M.A.B.), the National Science Foundation (IOS 0342183 to J.J.S.) and the University of Minnesota Undergraduate Research Opportunities Program (to M.C.K.). Deposited in PMC for release after 12 months.