The sender–receiver matching hypothesis predicts that species-specific features of vocalizations will be reflected in species-specific auditory processing. This hypothesis has most often been invoked to explain correlations between vocal frequency ranges and the frequency range of auditory sensitivity; however, it could apply to other structural features, such as the rise time of stimuli. We explored this hypothesis in five songbird species that vary in the rise times and frequency range of their vocalizations. We recorded auditory evoked potentials (AEPs) to onset and sustained portions of stimuli that varied in both frequency and rise time. AEPs are gross potentials generated in the auditory nerve and brainstem and measured from the scalp. We found that species with shorter rise times in their vocalizations had greater amplitude and shorter latency onset AEPs than species with longer rise times. We also found that species with lower frequency and/or more tonal vocalizations had stronger sustained AEPs that follow the sound pressure changes in the stimulus (i.e. frequency following responses) than species with higher frequency and/or less tonal vocalizations. This is the first study in songbirds to show that acoustic features such as rise time and tonality are reflected in peripheral auditory processing.
Songbird vocalizations are a well-studied phenomenon (Catchpole and Slater, 2008), but less is known about songbird auditory processing. Nonetheless, an expectation has emerged that species-specific spectral–temporal features of vocalizations will be reflected in species-specific auditory processing (Dooling et al., 2000; Woolley et al., 2009). We refer to this expectation as the sender–receiver matching hypothesis. The sender–receiver matching hypothesis has been supported for some common features of vocalizations such as the match between the frequency range of vocalizations and the frequency range of best auditory sensitivity (Dooling et al., 1978; Konishi, 1970; Henry and Lucas, 2008) and between harmonic structure and harmonic processing (Lohr and Dooling, 1998; Dooling et al., 2002; Lohr et al., 2006). If the match between sender and receiver is a general principle of communication it should apply to the auditory processing of a wide range of vocal features. For instance, the sender–receiver matching hypothesis should apply to the auditory processing of structural features of vocalizations, such as rise time (the time it takes a vocalization to go from zero to full amplitude), although this expectation has yet to be tested.
Understanding the auditory processing of the acoustic structure of vocalizations is important to our understanding of communication because the structure of acoustic features of vocalizations is closely linked to their function (Bradbury and Vehrencamp, 1998). For instance, the alarm calls of many species of songbirds and small mammals are relatively high-frequency tonal elements with slow rise times (Marler, 1955; Marler, 1959; Leger and Owings, 1978). These features presumably diminish the ability of potential predators to localize the sender by eliminating the use of interaural phase or time of arrival differences (Klump, 2000). Functionally, this allows the sender to alert conspecifics to danger without providing localization cues to the predator. In contrast, mobbing calls of a number of species are broadband with rapid rise times (Ficken and Popp, 1996). The rapid rise of these vocalizations enhances the ability of conspecifics to localize the sender using interaural timing differences, while the broad frequency range can improve the ability of the receiver to localize the sender using interaural intensity differences (Klump, 2000). Functionally, these mobbing calls provide very precise information about the location of low-risk predators.
Alarm and mobbing calls provide extreme examples of structural variation among vocalizations, but songs can also vary in structure, often in species-specific ways. Songs that are primarily designed for long-distance advertisement often contain structural elements that enhance propagation and the ability of individuals to localize the sender, such as short rise times. Short-distance songs (e.g. courtship songs) tend to have features that minimize propagation and diminish the ability of eavesdroppers to localize the sender, such as long rise times (Bradbury and Vehrencamp, 1998). Although there are many examples of the association between structural features and the function of vocalizations, the question of how these features are coded by the peripheral auditory system remains relatively unexplored in non-model organisms.
Here, we tested whether the sender–receiver matching hypothesis applied to two structural features of vocalizations: rise time and frequency range. We tested this hypothesis in five species: the American tree sparrow, Spizella arborea (Wilson 1810); the brown-headed cowbird, Molothrus ater (Boddaert 1783); the dark-eyed junco, Junco hyemalis (Linnaeus 1758); the house finch, Carpodacus mexicanus (Müller 1776); and the white-crowned sparrow, Zonotrichia leucophrys (Forster 1772). First, we analyzed five parameters of each species' song: the rate of frequency modulation and rise time, and the minimum, maximum and dominant frequency. Then, we used auditory evoked potentials (AEPs) to quantify species-specific responses to tonebursts ranging in rise time from 1 to 5 ms and in frequency from 0.5 to 6 kHz. AEPs are gross potentials generated by temporally synchronous discharges of neurons in the auditory nerve and brainstem in response to sound (Hall, 2007). We measured both the amplitude and the latency of the auditory brainstem response (ABR), which is generated by the onset of acoustic stimuli, and the amplitude of the frequency following response (FFR), which is a sustained response that is time locked to sound pressure fluctuations in the stimulus (i.e. it follows the frequency of the stimulus).
We had three general predictions based on the structural elements (i.e. rise time and frequency range) of each species' songs (see Fig. 1 for vocal exemplars). (1) The amplitude and latency of the ABR should be related to the rise time of each species' song. Specifically, species with rapid rise times of vocal elements should have greater onset responses to tones with rapid rise times, but smaller onset responses to tones with slow rise times, than species with slower rise times of vocal elements. (2) Species with more tonal vocalizations should have stronger FFRs than species with less tonal vocalizations. (3) The frequency range of each species' auditory sensitivity will correlate with the frequency range of its song. Specifically, species with higher frequency vocalizations should have greater high-frequency sensitivity, while species with lower frequency vocalizations should be more sensitive to lower frequencies.
MATERIALS AND METHODS
Acoustic signal space
We analyzed 10 song exemplars, acquired from the Cornell Lab of Ornithology Macaulay Library, for each species. All songs were recorded in the field and we preferentially selected exemplars from the midwest and northeast. We analyzed five parameters of each species' song using the Raven Pro ver. 1.4 (Cornell Lab of Ornithology) measurement tool: rise time and the rate of frequency modulation, as well as minimum, maximum and dominant frequency (frequency with the greatest spectral energy).
To measure the rate of frequency modulation we first identified all of the inflection points in the song – the point at which the direction of frequency change switched from ascending to descending, or vice versa. We then measured the amount of frequency change (in Hz) and the duration of the section of song between each pair of the inflection points. The rate of frequency modulation for that section was calculated as the frequency change in Hz divided by the duration of the subsection in seconds. We then calculated a grand average for each of the song exemplars. This measure of frequency modulation therefore reflects the relative tonality (low FM rate) or modulation (high FM rate) in a song. Rise time was measured from the sound pressure waveform of the stimulus. We defined rise time as the time it took for the element to go from zero to peak amplitude for the first element in the song. Minimum and maximum frequency were measured from spectrograms generated from full songs in Raven Pro with 5.8 ms Blackman windows and a dynamic range of 15 dB. Dominant frequency was determined from a power spectrum of the entire song. We present the mean ± s.e.m. for each of the stimulus parameters.
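The frequency modulation measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Raven Pro workflow used here: the function names and the example contour are hypothetical, the contour is assumed to be available as (time, frequency) pairs, and flat stretches of the contour are not treated as inflection points.

```python
from statistics import mean

def inflection_indices(contour):
    """Indices where the frequency contour switches between rising and falling."""
    idx = []
    for i in range(1, len(contour) - 1):
        before = contour[i][1] - contour[i - 1][1]
        after = contour[i + 1][1] - contour[i][1]
        if before * after < 0:  # sign flip means the direction of change reversed
            idx.append(i)
    return idx

def fm_rate_hz_per_s(contour):
    """Mean of |frequency change| / duration over the sections of song that lie
    between consecutive inflection points. Returns None if there are no sections."""
    turns = inflection_indices(contour)
    rates = []
    for a, b in zip(turns, turns[1:]):
        (t0, f0), (t1, f1) = contour[a], contour[b]
        rates.append(abs(f1 - f0) / (t1 - t0))
    return mean(rates) if rates else None

# Made-up contour for illustration: (time in s, frequency in Hz)
example_contour = [(0.00, 3000), (0.01, 4000), (0.02, 3500),
                   (0.03, 3000), (0.04, 5000), (0.05, 4500)]
example_rate = fm_rate_hz_per_s(example_contour)  # about 125 000 Hz/s
```

In this toy contour the two sections between inflection points change by 1000 Hz in 0.02 s and by 2000 Hz in 0.01 s, so the song-level average is the mean of 50,000 and 200,000 Hz s−1.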
Capture and housing
Birds were caught at several private residences in Lafayette, IN, USA, at the Lilly Nature Center in West Lafayette, IN, USA and on Purdue property in West Lafayette, IN, USA. American tree sparrows, dark-eyed juncos, house finches and white-crowned sparrows were caught using baited walk-in traps and mistnets from March to April and October to December of 2010. We avoided trapping from May to September so as not to interrupt breeding attempts. The capture of species was evenly distributed among the trapping months. The brown-headed cowbirds were captured in May and June of 2010. Subjects were transported to and subsequently housed at Purdue University in West Lafayette. Individuals were then tagged with colored leg bands for individual identification. Each subject was housed individually in a 1 m³ steel cage and provided ad libitum with mixed seed, water and grit. The light cycle was set to local conditions. We tested a total of 12 American tree sparrows (mean mass ± s.d., 18.2±1.5 g), 15 brown-headed cowbirds (39.3±6.3 g), 9 dark-eyed juncos (19.0±1.5 g), 8 house finches (20.9±0.9 g) and 6 white-crowned sparrows (26.8±0.2 g). Individuals were released 24–48 h after the completion of auditory testing. All methods were approved under PACUC protocol nos 05-058 and 08-132.
All experiments were conducted in an anechoic sound chamber (1.2×1.2×1.4 m) lined with 7.7 cm Sonex foam (Acoustic Solutions, Richmond, VA, USA). Individuals were anesthetized with a combination of midazolam (4–6 mg kg−1) and ketamine (40–60 mg kg−1) injected into the breast muscle. If myogenic responses (e.g. eye opening, wing fluttering) became large enough to interfere with recording, a supplemental injection of midazolam (2–3 mg kg−1) and ketamine (20–30 mg kg−1) was given. Subjects were positioned at the center of the sound chamber on a microwavable heating pad wrapped in towels. The temperature between the bird and the outermost towel was monitored with a temperature probe connected to a digital read-out in the adjacent recording room. The temperature was maintained at 39±2°C by adding or removing layers of towel between the bird and the heating pad.
Auditory stimuli were created in SigGen32 on a computer with an AP2 sound processing card (Tucker Davis Technologies, Alachua, FL, USA). Stimulus presentation and response recording were coordinated with a TDT II rack-mounted system (TDT, Alachua, FL, USA) and a computer running TDT BioSig32 software. Stimuli were converted from digital to analog signals with a TDT DA1 and equalized across frequencies with a 31 band equalizer (Behringer Ultragraph model FBQ6200, Bothell, WA, USA). Stimuli were then amplified with a Crown D75 amplifier and presented through a magnetically shielded speaker suspended 30 cm above the bird's head (RCA Model 40-5000, RadioShack, Fort Worth, TX, USA; 140–20,000 Hz frequency response). We calibrated the sound levels in a sham experiment with a Bruel & Kjaer model 1613 precision sound level meter and model 4131 2.6 cm condenser microphone (Bruel & Kjaer, Norcross, GA, USA).
Subjects were presented with 20 ms tone bursts at a fixed intensity of 64 dB sound pressure level (SPL). We have pilot data suggesting that stimulus duration (>8 ms) does not affect the first peaks of the ABR in birds (M.D.G., L.E.B. and J.R.L., unpublished data). Note that we chose to present stimuli at a fixed SPL rather than at a fixed level above threshold for the following reason: 64 dB SPL is a behaviorally relevant stimulus level across the frequency range of our stimuli. Stimuli at this intensity level are likely to evoke behavioral responses from individuals in the wild. In natural behavioral situations individuals that are equidistant from a sound source will encounter a signal at some fixed amplitude, which may reflect biologically meaningful differences in the level above thresholds for different species. ABR amplitude is a reflection of the number of neurons responding to a signal onset and the synchrony of those neurons. As such, it is unsurprising that ABR waveforms can be used to derive audiograms, which in turn correlate with the shape of behavioral audiograms (Gall et al., 2011). Thus, conducting this experiment at a given dB above threshold compensates for biologically relevant species differences and would therefore introduce a bias in the results. Care should be taken, therefore, in interpreting the absolute amplitude of the auditory brainstem responses, as they are directly affected by the level of the stimulus above threshold.
Each stimulus was presented with a cos²-gated rise ramp of 1, 2, 3, 4 or 5 ms (‘ramp’). We defined rise time as the duration of time between the onset of the stimulus and the maximum intensity of the stimulus (Fig. 2). Stimuli ranged in frequency from 0.5 to 6 kHz. Two responses were recorded at each frequency × ramp combination. Each response was averaged over either 500 stimulus repetitions presented with a 90 deg phase or 500 repetitions presented with a 270 deg phase. Stimuli were presented at a rate of 25 s−1. Responses were sampled at 40 kHz for 30 ms beginning 1.2 ms prior to the arrival of the stimulus at the ear.
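As an illustration of the onset gating, the following Python sketch builds a tone burst with a cos²-shaped rise ramp and a chosen starting phase. It is not the SigGen32 stimulus definition: the function names are hypothetical, the 40 kHz stimulus sampling rate is an assumption borrowed from the response sampling rate, and the offset of the burst is left ungated because the offset ramp is not described above.

```python
import math

def cos2_rise_gain(t_ms, rise_ms):
    """Gain of a cos^2-shaped onset ramp: 0 at onset, 1 from rise_ms onwards."""
    if t_ms >= rise_ms:
        return 1.0
    # sin^2 rising from 0 to 1 is the same curve as 1 - cos^2
    return math.sin(0.5 * math.pi * t_ms / rise_ms) ** 2

def toneburst(freq_hz, rise_ms, phase_deg, dur_ms=20.0, fs_hz=40000):
    """Onset-gated tone burst; fs_hz is an assumed value, offset gating is omitted."""
    n = int(round(dur_ms * fs_hz / 1000.0))
    phase = math.radians(phase_deg)
    samples = []
    for i in range(n):
        t_s = i / fs_hz
        gain = cos2_rise_gain(1000.0 * t_s, rise_ms)
        samples.append(gain * math.sin(2 * math.pi * freq_hz * t_s + phase))
    return samples

# The two starting phases used for each frequency x ramp combination
burst_90 = toneburst(2000, 3, 90)
burst_270 = toneburst(2000, 3, 270)
```

Because the two starting phases are 180 deg apart, the two waveforms are sample-by-sample inversions of each other, which is the property the offline analyses rely on.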
The responses were conducted from needle electrodes that were placed just below the skin at the vertex of the head (non-inverting), in the mastoid just behind the ear (inverting) and on the nape of the neck (ground). The electrode leads were connected to a TDT headstage (HS4) and then passed through a biological amplifier (TDT DB4) where the responses were bandpass filtered from 0.3 to 10 kHz, notch filtered at 60 Hz and amplified (×200,000). The analog signals were then converted to digital signals (TDT AD2) and conducted to a computer running TDT BioSig32.
Offline response analyses
The ABR and two types of FFRs (FFR1 and FFR2) to each stimulus were measured offline in PRAAT (ver. 5.1.32) (Boersma and Weenink, 2010). FFR1 contains both a cochlear microphonic (CM), which is generated by hair cell potentials and tends to be smaller in birds than in mammals (Dooling et al., 2002), and a neural FFR. The FFR2 is the second harmonic FFR generated when stimuli are presented in alternating polarity and should contain only the neural FFR (see below). To measure the ABR we did a point-to-point addition of the responses to each phase and divided the voltage of the resulting waveform by two (hereafter ‘summed’) to improve visualization of the ABR. ABR amplitude was measured as the voltage difference between the first positive peak and first negative trough of the onset responses. We also recorded the latency of the first positive peak and the latency of the first negative trough for each ABR (Fig. 2). The patterns were very similar for the two latency measurements; therefore, we present only the results for the latency of the first positive peak.
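The summing step and the peak-to-trough measurement can be sketched as follows. This is a simplified illustration rather than the PRAAT procedure used here: the function names are hypothetical, the extremum search assumes a clean, noise-free waveform, and expressing latency relative to stimulus arrival by subtracting the 1.2 ms pre-stimulus interval is an assumption.

```python
def summed_response(resp_90, resp_270):
    """Point-to-point addition of the two phase responses, divided by two."""
    return [(a + b) / 2.0 for a, b in zip(resp_90, resp_270)]

def abr_peak_to_trough(wave, fs_hz, pre_stim_ms=1.2):
    """Return (amplitude, latency of first positive peak in ms).

    Uses the first positive local maximum and the first negative local minimum
    after it. Raises StopIteration if no such peak or trough exists."""
    inner = range(1, len(wave) - 1)
    peak = next(i for i in inner
                if wave[i] > 0 and wave[i - 1] < wave[i] >= wave[i + 1])
    trough = next(i for i in inner
                  if i > peak and wave[i] < 0 and wave[i - 1] > wave[i] <= wave[i + 1])
    amplitude = wave[peak] - wave[trough]
    latency_ms = 1000.0 * peak / fs_hz - pre_stim_ms  # assumed latency reference
    return amplitude, latency_ms

# Toy responses in arbitrary units, sampled at a toy rate of 1 kHz
toy_90 = [0.0, 0.0, 1.0, 3.0, 1.0, -2.0, -1.0, 0.0]
toy_270 = [0.0, 0.0, 1.0, 5.0, 3.0, -4.0, -1.0, 0.0]
toy_sum = summed_response(toy_90, toy_270)
toy_amplitude, toy_latency_ms = abr_peak_to_trough(toy_sum, fs_hz=1000)
```

For real recordings a smoothing step or manual inspection would be needed before picking peaks, since noise produces spurious local extrema.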
FFR1 was measured from responses recorded to stimuli presented in a constant phase (90 or 270 deg). To measure the FFR2 we summed the responses to stimuli presented in phases of 90 and 270 deg, as for the ABR measurement. Summing responses to stimuli that are 180 deg out of phase should eliminate the CM, because the CM maintains the phase of the stimulus. The neural response is not expected to be perfectly out of phase; therefore, some of the neural response is retained when the responses to stimuli presented in alternating phase are summed. The degree to which these responses are out of phase, and therefore the amplitude of the retained response, may depend on the taxon-specific neural generators of the FFR. FFR1 can contain both a CM component and a neural FFR component. The CM component is expected to begin shortly after the arrival of sound at the ear and can be seen before the ABR. The neural FFR component does not begin until after the onset of the ABR (Huis in't Veld et al., 1977; Lucas et al., 2007). We saw little to no periodic amplitude fluctuations before the ABR, suggesting that responses to constant phase stimuli are dominated by neural responses (see Fig. 2 for response exemplars). Therefore, we focus primarily on FFR1 but also present FFR2 responses.
We measured the amplitude of FFR1 and FFR2 to the 10 ms plateau of each stimulus. For each of the stimuli we trimmed the first 6.2 ms and last 13.8 ms from the 30 ms response recording, leaving only the sustained response to the 10 ms stimulus plateau (the portion of the stimulus at full amplitude). The amplitude of the stimulus plateau was identical across frequencies and across ramps. We then generated a power spectrum from the AEP waveform (FFR1, single polarity; FFR2, combined polarity). We extracted the amplitude of the sustained response (dB re. V) at the stimulus frequency (FFR1) or at the second harmonic of the stimulus frequency (FFR2), which is generated by the half-wave rectification that occurs when the signal is transmitted from the cochlea to the auditory nerve (see Lucas et al., 2007). We also extracted and averaged the noise floor in 25 Hz intervals within ±100 Hz of the peak of interest. We discarded any responses where the FFR1 or FFR2 was less than 3 dB above the noise floor. The FFR2 amplitude was smaller than the FFR1 amplitude, which resulted in a greater number of FFR2 amplitude values being dropped from the model compared with FFR1 responses. At 6 kHz we found a DC shift, but not the AC potential associated with FFR1 and FFR2; therefore, we only analyzed FFR1 and FFR2 from 0.5 to 4 kHz.
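The logic of summing responses to opposite-polarity stimuli can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration: the CM is modeled as an exact copy of the stimulus waveform, the neural component as a perfectly half-wave rectified copy, and all names are hypothetical. Real responses are noisier and the neural component need not be perfectly rectified.

```python
import math

FS_HZ = 40000   # response sampling rate given in the text
F_HZ = 1000     # one of the stimulus frequencies
N = 400         # 10 ms of samples

def component(kind, phase_deg):
    """Toy response component to a tone with the given starting phase."""
    phase = math.radians(phase_deg)
    s = [math.sin(2 * math.pi * F_HZ * i / FS_HZ + phase) for i in range(N)]
    if kind == "cm":
        return s                       # CM assumed to copy the stimulus waveform
    return [max(0.0, v) for v in s]    # neural part assumed half-wave rectified

def summed(kind):
    """Add the responses to 90 and 270 deg stimuli and divide by two."""
    a, b = component(kind, 90), component(kind, 270)
    return [(x + y) / 2.0 for x, y in zip(a, b)]

def amplitude_at(wave, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-frequency DFT)."""
    re = sum(v * math.cos(2 * math.pi * freq_hz * i / FS_HZ) for i, v in enumerate(wave))
    im = sum(v * math.sin(2 * math.pi * freq_hz * i / FS_HZ) for i, v in enumerate(wave))
    return 2.0 * math.hypot(re, im) / len(wave)

cm_after_sum = amplitude_at(summed("cm"), F_HZ)               # CM cancels
neural_f_after_sum = amplitude_at(summed("neural"), F_HZ)     # fundamental cancels
neural_2f_after_sum = amplitude_at(summed("neural"), 2 * F_HZ)  # second harmonic survives
neural_f_single = amplitude_at(component("neural", 90), F_HZ)   # single-polarity fundamental
```

Under these assumptions the summed waveform keeps energy at twice the stimulus frequency while the component at the stimulus frequency vanishes, which is why the second harmonic is the quantity measured in the summed responses.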
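The trimming arithmetic and the 3 dB criterion can be written out as a short sketch. The function names are hypothetical, and averaging the noise-floor estimates as an arithmetic mean of dB values is an assumption, since the averaging domain is not specified above. The spectral estimation itself is not reproduced here.

```python
def plateau_slice(n_samples, fs_hz, head_ms=6.2, tail_ms=13.8):
    """Start and stop sample indices of the plateau portion of a recording."""
    start = round(head_ms * fs_hz / 1000.0)
    stop = n_samples - round(tail_ms * fs_hz / 1000.0)
    return start, stop

def passes_noise_criterion(peak_db, noise_db, margin_db=3.0):
    """Keep a response only if its peak is at least margin_db above the mean noise floor.

    noise_db holds the noise estimates (in dB) taken around the peak; the
    arithmetic mean of dB values is an assumed way of averaging them."""
    floor_db = sum(noise_db) / len(noise_db)
    return peak_db - floor_db >= margin_db

# A 30 ms recording at 40 kHz has 1200 samples; the plateau is samples 248 to 647
start, stop = plateau_slice(1200, 40000)
plateau_ms = 1000.0 * (stop - start) / 40000

# Made-up dB values for illustration
kept = passes_noise_criterion(-110.0, [-118.0, -117.0, -119.0, -116.0])
dropped = passes_noise_criterion(-115.0, [-116.0, -117.0, -116.5, -115.5])
```

With these settings the retained window is 400 samples long, matching the 10 ms stimulus plateau.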
The acoustic signal space data were analyzed with t-tests adjusted for multiple comparisons with the Bonferroni method (α=0.005). We analyzed our AEP data with repeated measures ANOVA in Proc MIXED in SAS 9.2 with bird identity included as a subject factor. The Kenward–Roger algorithm was used to calculate the denominator degrees of freedom. A first-order autoregressive covariance structure was chosen as it produced the lowest Akaike's information criterion (AIC) value. However, there was little qualitative difference between the first-order autoregressive model and models with other covariance structures (e.g. compound symmetry, unstructured). Separate models were used to analyze ABR amplitude, ABR latency, FFR1 and FFR2. The independent variables were frequency, ramp, species and their interactions. Non-significant interaction terms were removed from the model in order of decreasing P-value. Significant effects were investigated post hoc with least squares means using the diff procedure and a Tukey–Kramer adjustment for multiple comparisons. All data were checked for normality and homoscedasticity. Latency did not meet normality assumptions and was inverse transformed. Least squares means ± s.e.m. (back-transformed where appropriate) are reported throughout.
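The adjusted significance level for the song comparisons is consistent with a Bonferroni correction over all pairwise comparisons among five species, as the following sketch shows. The familywise level of 0.05 is an assumption (only the adjusted value of 0.005 is stated above), and the variable and function names are hypothetical.

```python
from math import comb

N_SPECIES = 5
SONGS_PER_SPECIES = 10
FAMILYWISE_ALPHA = 0.05   # assumption; only the adjusted 0.005 is stated

n_pairs = comb(N_SPECIES, 2)                  # 10 pairwise species comparisons
alpha_per_test = FAMILYWISE_ALPHA / n_pairs   # 0.005
df_two_sample = 2 * SONGS_PER_SPECIES - 2     # 18 for two groups of 10 songs

def significant(p_value):
    """True if a pairwise comparison passes the Bonferroni-adjusted threshold."""
    return p_value < alpha_per_test
```

Under this threshold a comparison with P=0.006 is treated as non-significant even though it would pass an unadjusted 0.05 criterion.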
RESULTS
Acoustic signal space
We found that the rate of frequency modulation and the rise time of elements varied among species. House finches, white-crowned sparrows and American tree sparrows had the slowest rate of frequency modulation, dark-eyed juncos were intermediate to most species and brown-headed cowbirds had the highest rate of frequency modulation (Fig. 3). The rate of frequency modulation of white-crowned sparrow song was not significantly different from that of American tree sparrows (t18=0.98, P=0.34) or house finches (t18=1.4, P=0.19). The frequency modulation rate of American tree sparrows did not differ significantly from that of the dark-eyed junco (t18=1.3, P=0.20). All other pairs of species were significantly different from one another (t18>3.2, P<0.005). The rise time also varied among species and was similar in pattern to the rate of frequency modulation. White-crowned sparrow, American tree sparrow and house finch song elements had the slowest rise times, dark-eyed junco song elements were intermediate and brown-headed cowbird song elements had the most rapid rise time (Fig. 3). The rise time of American tree sparrow song elements did not differ from the rise time of house finch song elements (t18=1.6, P=0.13) or white-crowned sparrow song elements (t18=1.4, P=0.18). All other pairs of species differed significantly from one another (t18>5.2, P<0.001).
The dominant, minimum, and maximum frequencies also varied among the species. Brown-headed cowbirds had the highest dominant frequency, followed by dark-eyed juncos, American tree sparrows, house finches, and white-crowned sparrows (Fig. 3). The peak frequency of song did not differ significantly between two of the adjacent pairs of species [American tree sparrows and dark-eyed juncos (t18=0.09, P=0.93); house finches and white-crowned sparrows (t18=0.78, P=0.44)], but all other pairs of species were significantly different from one another (t18>4.2, P<0.001).
Brown-headed cowbirds had both the lowest minimum frequency (t18>9.3, P<0.001) and the highest maximum frequency (t18>8.2, P<0.001). House finches had a significantly lower minimum frequency than American tree sparrows (t18=6.0, P<0.0001) and dark-eyed juncos (t18=5.6, P<0.0001), but all other pairs of species did not differ in their minimum frequency (t18<3.1, P>0.006). Similarly, American tree sparrows had a higher maximum frequency than the dark-eyed junco (t18=6.5, P<0.001) and the white-crowned sparrow (t18=6.7, P<0.001), but all other species did not differ in their maximum frequency (t18<2.9, P>0.008; Fig. 3).
The range of frequencies used also differed among most pairs of species. Brown-headed cowbirds had a larger frequency range than all other species (t18>18.8, P<0.001). In contrast, dark-eyed juncos had a frequency range that was narrower than that of all of the species except the white-crowned sparrow (t18=2.7, P=0.14). The frequency range of the house finch did not differ significantly from that of the American tree sparrow (t18=2.1, P=0.05) or the white-crowned sparrow (t18=3.0, P=0.006). All other pairs of species differed in their frequency range (t18>5.0, P<0.001).
Based on these results we would expect dark-eyed juncos to be most sensitive at high frequencies, to have large amplitude and short latency ABRs, and to have relatively poor FFRs (Fig. 3). We would expect American tree sparrows, house finches and white-crowned sparrows to have similar ABR amplitude, ABR latency and FFR amplitudes. However, these effects may be modulated by frequency range, with white-crowned sparrows and house finches having greater low-frequency sensitivity, but lesser high-frequency sensitivity than American tree sparrows. This difference in frequency range should result in slightly greater ABR amplitudes in American tree sparrows and greater FFR amplitudes in house finches and white-crowned sparrows. Brown-headed cowbirds have a mismatch between vocal range and frequency sensitivity; therefore, we expected that they would have frequency sensitivity that was intermediate to that of the other species. Based on the vocalization of brown-headed cowbirds we would predict that cowbirds would have large ABR amplitudes and short ABR latencies, but weak FFR amplitudes.
Dark-eyed juncos generally had the highest ABR amplitude, followed by American tree sparrows, white-crowned sparrows and house finches (species main effect: F4,196=7.6, P<0.001; Fig. 4), as would be predicted based on their vocalizations (Fig. 3). However, contrary to our predictions, brown-headed cowbirds had the lowest ABR amplitudes overall, significantly lower than those of all other species (t150>3.48, P<0.001). Dark-eyed juncos had significantly higher ABR amplitudes than all other species (t150>2.6, P<0.01) except the American tree sparrow (t150=1.9, P=0.06). Finally, American tree sparrows had significantly higher ABR amplitudes than house finches (t150=3.4, P<0.001). As predicted, ABR amplitude also decreased with onset ramp time (ramp main effect: F1,239=128.8, P<0.001) and was greatest at 2–4 kHz (frequency main effect: F5,943=127.0, P<0.001).
These general patterns were complicated by a significant three-way frequency × ramp × species interaction (F20,941=2.0, P=0.006) [two-way interactions: frequency × ramp (F5,936=12.6, P<0.001), frequency × species (F20,950=3.3, P<0.001), ramp × species (F4,233=0.5, P=0.71)]. This primarily resulted from the following patterns. The slope of the amplitude by ramp function was similar across species at 2 and 3 kHz (Fig. 4). In contrast, the white-crowned sparrow differed most markedly from the other species, with shallower slopes, particularly at 0.5, 1, 4 and 6 kHz. At 4 kHz the American tree sparrow also had a shallower slope than brown-headed cowbirds, dark-eyed juncos and house finches. Finally, at low frequencies (e.g. 0.5 kHz) the house finch and American tree sparrow had steeper slopes than the other species. These patterns were generally predicted by features of each species' vocalizations.
In general, there was no significant main effect of species on ABR latency for the first positive ABR peak (F4,286=1.6, P=0.18). As predicted, ABR latency increased with onset ramp time (F1,351=305, P<0.001), and was shortest for all species at best frequency (generally 2–4 kHz; F5,932=9.2, P<0.001). These main effects were complicated by a significant frequency × ramp × species three-way interaction (F20,960=2.6, P<0.001) [two-way interactions: frequency × ramp (F5,936=5.4, P<0.001), frequency × species (F20,965=2.9, P<0.001), ramp × species (F4,338=0.6, P=0.68)]. In general, brown-headed cowbirds had high latencies at our extreme frequencies (0.5, 1 and 6 kHz; Fig. 5), whereas tree sparrows and juncos had relatively short latencies, particularly at the higher frequencies.
FFR1 and FFR2
FFR1 differed significantly across species (F4,184=19.9, P<0.001) and frequencies (F4,1981=582.7, P<0.001), but not ramps (F4,833=1.7, P=0.15, Fig. 6). As predicted, the FFR1 of the brown-headed cowbird tended to be weak – significantly lower than that of American tree sparrows (t404=4.3, P<0.001) and dark-eyed juncos (t404=3.8, P<0.001). However, the relatively strong FFR1 of dark-eyed juncos was not predicted. No other species were significantly different from one another. None of the interactions that included ramp were significant predictors of FFR1 and were dropped from the model.
These patterns were complicated by a significant interaction of frequency × species (F16,1931=22.8, P<0.001). At 0.5 kHz, the FFR1 of house finches was greater than that of all other species except American tree sparrows (t425>2.9, P<0.005). At 1 kHz, house finches had greater FFR1 amplitude than all other species except for American tree sparrows (t350>4.5, P<0.001), while American tree sparrows had greater FFR1 amplitude than white-crowned sparrows (t353=3.8, P<0.001) and brown-headed cowbirds (t348=4.3, P<0.001). There were no other significant differences among species at 1 kHz. At 2 kHz, house finches had greater amplitude than American tree sparrows (t348=3.2, P<0.001) and brown-headed cowbirds (t343=3.8, P<0.001). No other species were significantly different from one another at 2 kHz. There were no significant differences among house finches, American tree sparrows and white-crowned sparrows in FFR1 amplitude at 3 kHz (t350<0.42, P>0.67), but values for all other species were significantly different from one another (t345>3.2, P<0.001). At 4 kHz, values for American tree sparrows were not significantly different from those of dark-eyed juncos (t351=1.1, P=0.29) or white-crowned sparrows (t346=0.03, P=0.97) and values for house finches were not significantly different from those of white-crowned sparrows (t349=0.81, P=0.41). All other species were significantly different from one another at 4 kHz (t350>3.4, P<0.001).
Neither ramp (F4,380=0.36, P=0.84) nor any of the interactions that included ramp were significant predictors of FFR2. There were significant effects of frequency (F4,362=433, P<0.001), species (F4,151=7.3, P<0.001) and frequency × species (F16,342=2.8, P<0.001) on the FFR2. Brown-headed cowbirds had lower FFR2 amplitudes than house finches at 0.5 (t250=4.0, P<0.001) and 2 kHz (t314=4.4, P<0.001). Brown-headed cowbirds also had lower FFR2 amplitudes than dark-eyed juncos at 1 kHz (t170=4.1, P<0.001), 2 kHz (t243=4.9, P<0.001) and 3 kHz (t426=3.5, P<0.001). There were no other significant frequency-specific differences among the species.
DISCUSSION
Passerine vocalizations potentially convey an enormous amount of information encoded in multidimensional signals. Information can be encoded in frequency properties, frequency or amplitude modulation, temporal patterns or higher syntactical organization. The dimensions actually used by a species will be constrained by habitat effects (closed canopy species use tonal signals, whereas grassland species tend to use stronger frequency modulation), phylogenetic or morphological effects (larger billed species cannot sing broadband trills at high rates) or design properties (mating signals are designed to carry less far than mate attraction signals). Our results show that multiple dimensions of vocal signals are facilitated by multiple dimensions of the auditory filtering of those signals.
We tested whether auditory processing of rise time and stimulus plateaus in five species varied according to the structural features in their vocalizations. We found that this hypothesis was largely supported by our data, although we found some exceptions. The species with the second most rapid rise time of vocal elements, the dark-eyed junco, had the largest ABR amplitude across nearly all frequencies. Moreover, the dark-eyed junco tended to have steeper ABR amplitude by rise time functions, suggesting they would be more sensitive to changes in signal degradation that alter rise time. House finches and white-crowned sparrows have relatively tonal low-frequency vocalizations and had smaller ABR amplitudes than the other species. American tree sparrows also had relatively tonal, but higher frequency vocalizations and intermediate ABR amplitudes. Although cowbirds have relatively high-frequency vocalizations and rapid onsets in their vocalizations they have relatively poor ABR amplitudes at high frequencies, which replicates previous findings (Gall et al., 2011) and may be related to their unique breeding strategy of brood parasitism.
The latency of the ABR also appeared to be related to vocal features. Dark-eyed juncos tended to have the shortest latencies, suggesting rapid synaptic integration and strong temporal synchrony to the onset of sounds (Hall, 2007). At high frequencies, the latency by rise time function was shallower in dark-eyed juncos than in other species, suggesting robust onset coding across rise time in dark-eyed juncos. American tree sparrows also had relatively short latencies, despite their slower, more tonal songs. House finches and brown-headed cowbirds had the longest ABR latencies, suggesting poorer neural synchrony to the onset of stimuli.
The frequency following response also appeared to be related to the structure of vocalizations. FFRs generally fall off rapidly above 3–4 kHz; therefore, we expected the tonality and frequency range of the vocalizations to be related to FFRs. House finches, which have low-frequency tonal vocalizations, had the strongest FFRs. American tree sparrows had lower amplitude FFRs (FFR1 and FFR2) than house finches at low frequencies, which could be related to the higher frequency of their relatively tonal vocalizations. Brown-headed cowbirds had the weakest FFRs, as would be predicted by their modulated vocalizations with some very high-frequency elements. Dark-eyed juncos had intermediate FFRs and white-crowned sparrow had poor FFRs, which may be a result of the mixture of low-frequency tonal elements and rapidly modulated elements in their vocalizations.
Although our hypothesis that vocal features and acoustic coding would match was largely supported, we did find a few deviations from our expectations. The deviations we observed could be due to phylogeny. Three of the species we studied belong to the Family Emberizidae: the American tree sparrow (Naugler, 1993), the dark-eyed junco (Nolan et al., 2002) and the white-crowned sparrow (Chilton et al., 1995). One species, the brown-headed cowbird, belongs to the Family Icteridae (Lowther, 1993) and one species, the house finch, is in the Family Fringillidae (Hill, 1993). If phylogenetic factors are primarily responsible for the patterns among the species, we would expect the American tree sparrows, the dark-eyed junco and the white-crowned sparrow to be the most similar, while house finches and brown-headed cowbirds should be the most dissimilar from these species. The support for this expectation was somewhat ambiguous. American tree sparrows and dark-eyed juncos generally had similar ABR and FFR patterns across rise times, despite differences in the tonality of their vocalizations. However, the white-crowned sparrows were often more similar to brown-headed cowbirds or house finches than to American tree sparrows or dark-eyed juncos. This result is particularly interesting because similar differences between dark-eyed juncos and white-crowned sparrows have previously been shown in their auditory filter widths (Henry and Lucas, 2010b), suggesting that differences in acoustic structure can produce large differences in auditory coding. It is also possible that slight differences in the body size of the species could produce the differences in amplitude; however, previous work has suggested that body size is unlikely to play a strong role in ABR amplitude and latency (Henry and Lucas, 2010a; Gall et al., 2011).
Finally, it is possible that different morphological features of the head lead to differences among species in the spatial arrangement of AEP generators relative to the electrode arrangement. If this were the case, we would expect that closely related species would differ less than species that are distantly related. It is likely that a combination of selective pressure to process features of vocalizations and phylogenetic or morphological constraints produced the patterns seen here.
To the best of our knowledge, this is the first study to explore the effects of rise time on the AEPs of songbirds. Generally, we found that as the rise time increased the amplitude of the ABR decreased and the latency increased. This is consistent with findings in mammals (Salt and Thornton, 1984; Burkard, 1991). We also found that the shape of the amplitude by rise time and latency by rise time functions varied across frequencies and across species.
In mammals, the decreased ABR amplitude and increased ABR latency that are observed with increases in rise time can be attributed to two main sources: (1) spectral splatter and (2) effective amplitude/neural synchrony (Hall, 2007). Spectral splatter is created when stimuli have rapid rise times because there is a trade-off between frequency specificity and rate of onset, although sophisticated windowing functions can improve the frequency specificity of stimuli with short rise times (Gorga and Thornton, 1989). Spectral splatter can result in greater ABR amplitude because a larger cochlear partition, and thus a greater number of neurons, responds to the stimulus (Spoendlin, 1972). This spectral splatter effect is likely to contribute to changes in ABR amplitude associated with stimulus rise time in songbirds as well. The exact nature of the effect may vary between species, particularly when the portions of the cochlea and/or neural populations tuned to specific frequencies vary between species, as has previously been demonstrated (Gleich and Manley, 2000; Lucas et al., 2007; Henry and Lucas, 2008; Gall et al., 2011).
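The trade-off between rate of onset and frequency specificity can be illustrated numerically. The following is a minimal sketch, not the stimulus-generation procedure used in this study: the 3 kHz carrier, 40 kHz sampling rate, 10 ms plateau, cos² ramp shape and ±250 Hz band are all assumptions chosen for illustration, and the function names `tone_burst` and `splatter_fraction` are hypothetical.

```python
import numpy as np

def tone_burst(freq_hz, rise_ms, fs=40000, plateau_ms=10.0):
    """Tone burst with cos^2-shaped onset/offset ramps and a flat plateau.

    All parameter defaults are illustrative assumptions.
    """
    n_ramp = int(round(rise_ms * 1e-3 * fs))
    n_plat = int(round(plateau_ms * 1e-3 * fs))
    # Ramp rises smoothly from 0 toward 1 over n_ramp samples.
    ramp = np.sin(0.5 * np.pi * np.arange(n_ramp) / n_ramp) ** 2
    envelope = np.concatenate([ramp, np.ones(n_plat), ramp[::-1]])
    t = np.arange(envelope.size) / fs
    return envelope * np.sin(2 * np.pi * freq_hz * t)

def splatter_fraction(signal, freq_hz, fs=40000, half_band_hz=250.0):
    """Fraction of total energy lying outside freq_hz +/- half_band_hz."""
    n_fft = 1 << 16  # zero-pad for a finely sampled spectrum
    power = np.abs(np.fft.rfft(signal, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    outside = np.abs(freqs - freq_hz) > half_band_hz
    return power[outside].sum() / power.sum()

if __name__ == "__main__":
    for rise_ms in (1.0, 2.0, 5.0):
        burst = tone_burst(3000.0, rise_ms)
        print(f"rise {rise_ms} ms: off-band energy fraction = "
              f"{splatter_fraction(burst, 3000.0):.4f}")
```

Under these assumptions, the shorter the ramp, the larger the share of stimulus energy that falls away from the nominal frequency, which is the sense in which short rise times would be expected to excite a wider cochlear region.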
Spectral splatter can also affect the latency of the ABR. The latency of the ABR in mammals primarily reflects the time lag associated with the traveling wave in the cochlea, with ABR latency increasing continuously from high to low frequencies as a result of the tonotopic organization of the cochlea (Geisler, 1998; Hall, 2007). Stimuli with short rise times are likely to elicit shorter latencies because the spectral splatter activates higher frequency cochlear partitions. In birds it is less clear how spectral splatter affects ABR latency. ABR latency in birds is shortest at the best frequencies (typically 2–4 kHz) and increases above and below these frequencies, and is thought to be predominantly a result of frequency-specific synaptic integration time (Henry and Lucas, 2008). The cochlea of the bird is also arranged tonotopically (Gleich et al., 1994; Gleich and Manley, 2000), so latency shifts are expected to be continuous as the stimulus frequency moves away from the best frequency. Away from best frequency, spectral splatter could result in latency shifts by activating cochlear partitions with short latencies; however, whether that cochlear partition is above or below the stimulus frequency will depend on the location of the stimulus frequency relative to the best excitatory frequency. Importantly, this effect is expected to be minimal at the best frequency and become larger as the stimulus frequency moves away from the best frequency. Our data support this expectation, as the smallest latency shifts caused by changes in rise time were associated with stimuli near best frequency, while the largest shifts were seen when stimuli were at frequencies lower or higher than best frequency.
This spectral splatter effect on latency is expected to be species specific, as the best frequency, and therefore shortest latencies, have been shown to vary among songbird species (Woolley and Rubel, 1999; Brittan-Powell et al., 2002; Henry and Lucas, 2008; Henry and Lucas, 2010a; Caras et al., 2010; Gall et al., 2011).
Effective amplitude and therefore neural synchrony can also be affected by the rise time of the stimulus. Neural synchrony increases with the amplitude of the stimulus because a greater number of receptor cells and/or neurons will respond to a given cycle of the stimulus. Stimuli with rapid rise times are more likely to generate synchronous responses because they have greater effective amplitude than stimuli with longer rise times. Evidence from gated noise bursts suggests that this effective amplitude effect may be the primary generator of ABR amplitude by rise time and ABR latency by rise time patterns in mammals (Burkard, 1991). The effects of rise time on ABR amplitude and latency are similar to the effects of intensity on ABR amplitude and latency in these species (Caras et al., 2010; Gall et al., 2011) (M.D.G., L.E.B. and J.R.L., unpublished data). This suggests that changes in effective amplitude of a stimulus with rise time may contribute strongly to ABR amplitude and latency shifts in songbirds.
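One crude way to picture the effective amplitude argument is to compare how much stimulus energy arrives within a short window after onset for different rise times. The sketch below assumes a linear ramp, a 3 kHz carrier, a 40 kHz sampling rate and a 1 ms window; none of these values come from this study, the window is not a measured neural integration time, and the function name `onset_rms` is hypothetical.

```python
import numpy as np

def onset_rms(rise_ms, window_ms=1.0, freq_hz=3000.0, fs=40000):
    """RMS of a linearly ramped tone over the first window_ms after onset.

    Used here only as an illustrative proxy for 'effective amplitude';
    the ramp shape and window length are assumptions.
    """
    n_win = int(round(window_ms * 1e-3 * fs))
    t = np.arange(n_win) / fs
    # Linear ramp from 0 to full amplitude over rise_ms, then flat at 1.
    envelope = np.minimum(t / (rise_ms * 1e-3), 1.0)
    tone = envelope * np.sin(2 * np.pi * freq_hz * t)
    return np.sqrt(np.mean(tone ** 2))

if __name__ == "__main__":
    for rise_ms in (1.0, 2.0, 5.0):
        print(f"rise {rise_ms} ms: onset RMS = {onset_rms(rise_ms):.4f}")
```

With these assumptions, a stimulus with a longer rise time delivers a smaller amplitude within the onset window, in the same way that a lower-intensity stimulus would, which is consistent with the similarity between rise time and intensity effects described above.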
Although there are many examples of the association between structural features and the function of vocalizations (Marler, 1959; Leger and Owings, 1978; Ficken and Popp, 1996), the question of how these features are coded by the peripheral auditory system has been relatively unexplored in non-model organisms. The nature of this peripheral coding has both perceptual and functional implications. This is of particular importance because our study suggests that different species may have different peripheral responses to stimuli with different rise times and different frequencies (see also Dooling et al., 2000). Therefore, understanding species-level variation in the coding of these vocal structures can improve our understanding of acoustic communication. To the best of our knowledge, this is the first test of the sender–receiver matching hypothesis in the context of rise times.
Overall, our results suggest that species-specific structural features of vocalizations are reflected in the species-specific auditory processing of these structural features. These data also suggest that the sender–receiver matching hypothesis can be generalized to a wide range of vocal features. However, the match between sender and receiver may be constrained by phylogeny or physiology. Additionally, our data address an important aspect of stimulus selection when designing auditory brainstem studies. Although the influence of many stimulus parameters on the ABR is well described in mammals, not all features have been adequately described in songbirds. Although there are many similarities between birds and mammals in ABR responses, they can also vary in important ways. For instance, the ABR latency by frequency patterns in birds differ substantially from those of mammals (Henry and Lucas, 2008; Gall et al., 2011). These differences may arise from taxon-specific cochlear or neural processing. It is important, therefore, to understand how stimulus parameters affect the ABR and FFR in non-mammalian taxa, and to incorporate this information into our interpretation of the data.
We would like to thank members of the Fernandez-Juricic lab and our anonymous reviewers for their helpful feedback on earlier versions of this manuscript.
This work was supported by the National Science Foundation (NSF) [IOS-1121728], an NSF doctoral dissertation improvement grant [IOS-1109677] and an Animal Behavior Society graduate student research award.