Kinematic analyses have demonstrated that the extent to which a songbird’s beak is open when singing correlates with the acoustic frequencies of the sounds produced, suggesting that beak movements function to modulate the acoustic properties of the vocal tract during song production. If motions of the beak are necessary for normal song production, then disrupting the ability of a bird to perform these movements should alter the acoustic properties of its song. We tested this prediction by comparing songs produced normally by white-throated sparrows and swamp sparrows with songs produced when the beak was temporarily immobilized. We also observed how temporarily loading the beak of canaries with extra mass affected vocal tract movements and song production. Disruption of vocal tract movements resulted in the predicted frequency-dependent amplitude changes in the songs of both white-throated sparrows and swamp sparrows. Canaries with mass added to their beak sang with their beak open more widely than normal and produced notes with greater harmonic content than those without weights. Both manipulations resulted in acoustic changes consistent with a model in which beak motions affect vocal tract resonances, thus supporting the hypothesis that dynamic vocal tract motions and post-production modulation of sound are necessary features of normal song production.

Many aspects of song production in songbirds are thought to parallel human speech, including a dependence on learning (Marler, 1970), gradual motor development (Marler and Peters, 1982) and lateralized neural specializations for production and perception (Nottebohm, 1971; Vicario, 1993; Wild, 1993a). Mechanisms of post-production modulation by the vocal tract during song production have, however, traditionally been considered to differ from human speech, in which the vocal source is coordinated with a vocal tract resonance filter (Flanagan, 1972; Lieberman and Blumstein, 1988). In humans, sounds produced by the laryngeal folds are selectively filtered by the vocal tract during speech production. In songbirds, sound is produced by the vibration of membranes in the syrinx (Greenewalt, 1968; Suthers et al., 1994; Goller and Larsen, 1997a; Larsen and Goller, 1999) or labia (Goller and Larsen, 1997b; Larsen and Goller, 1999). Greenewalt (1968) proposed that, in contrast to human speech, the acoustic characteristics of birdsong are based completely on the syringeal source and that the vocal tract plays no role in sound modification. Greenewalt’s (1968) proposition was based primarily on interpretation of acoustic patterns and syringeal anatomy, and his model was generally accepted as the mechanism of song production in songbirds.

By analyzing the acoustic structure of the songs of several species of birds in a helium-enriched atmosphere, Nowicki (1987) later demonstrated that the vocal tract does indeed play a role in birdsong production and, in this way, its function is similar to that of the vocal tract in human speech. The effects of helium on song suggested that the vocal tract plays a role in song production by influencing the relative amplitudes of overtones produced at the syringeal source. Nowicki (1987) hypothesized that birds must actively modify vocal tract resonances during song production to track the fundamental frequency produced by the syrinx (see also Nowicki and Marler, 1988; Fletcher and Tarnopolsky, 1999). Such tracking would be necessary to maintain the pure-tonal quality of songs across a range of fundamental frequencies (Nowicki and Marler, 1988).

According to the hypothesis of Nowicki (1987), dynamic changes in song timing and frequency features require corresponding dynamic changes in the physical configuration of the vocal tract. The resonance properties of the avian vocal tract might be modified in at least four ways during sound production, including (i) changing the length of the trachea using the tracheolateralis muscles, (ii) changing the extent to which the trachea is occluded using the glottis, (iii) changing the effective length of the vocal tract by opening and closing the beak, and (iv) moving the tongue (Nowicki and Marler, 1988; Patterson and Pepperberg, 1994; Fletcher and Tarnopolsky, 1999). In songbirds, the beak is particularly well-positioned to modify the acoustic resonances of the vocal tract, either by changing the actual length of the vocal tract, as in lip-rounding in speech (Lieberman and Blumstein, 1988), or by changing the impedance at the open end, and thus the effective length, of the vocal tract. The vocal tract may, therefore, act as a filter by selectively attenuating (i.e. reducing the amplitude of) all but a single, dominant frequency to produce the pure-tonal sounds characteristic of many birdsongs (Fig. 1A). Alternatively, vocal tract resonances may directly influence the vibration characteristics of the syringeal membranes by suppressing the production of overtones at the source to produce pure tonal notes (Nowicki and Marler, 1988) in a manner similar to that proposed for human soprano singing (Rothenberg, 1987). Both these models predict that a more closed beak will correspond to lower-frequency vocal tract resonances, while a more open beak should correspond to higher-frequency vocal tract resonances (Nowicki and Marler, 1988).

Fig. 1.

Predicted effects of immobilizing the beak during song production. In the hypothetical example depicted in A and B, a normal, unmanipulated bird produces two notes, one with a fundamental frequency, f0, of 1750 Hz (A) and the other with f0=3250 Hz (B). As the bird switches from one note to the next, the resonance frequency, RF, of the vocal tract also shifts, thus ‘tracking’ f0. This shift to higher frequencies corresponds to the beak opening wider. With the beak immobilized as in C and D, RF is fixed at some frequency (2200 Hz in this hypothetical case). The manipulation has no predicted effect on the normal ability of the bird to modulate f0 with its syrinx. Because RF is close to f0 for the lower-frequency note (C), its amplitude is only slightly affected by the manipulation. The higher-frequency note (D) is strongly attenuated because RF cannot shift to higher frequencies. Frequency-modulated notes, such as those produced by swamp sparrows, will be affected similarly, with attenuation decreasing as f0 sweeps towards the center of the fixed RF and increasing as f0 sweeps away from the center of the fixed RF. These amplitude changes are observed as changes in the bandwidth and peak amplitude frequency of averaged power spectra. For clarity, this diagram is drawn as though immobilizing the beak completely fixes tract resonances, but other vocal tract motions may also influence its resonances. Thus, immobilizing the beak is expected only to limit the extent to which resonances can be modified by the bird.
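The prediction in Fig. 1 can be illustrated numerically. The sketch below treats the vocal tract as a single second-order resonance, which is an assumption made only for illustration; the quality factor `q` and the function name `resonance_gain_db` are hypothetical and do not come from the study. Only the three frequencies (1750, 3250 and 2200 Hz) are taken from the figure legend.

```python
import math

def resonance_gain_db(f_hz, rf_hz, q=5.0):
    """Gain in dB of a second-order resonance centred at rf_hz, evaluated
    at f_hz and expressed relative to the gain at f_hz == rf_hz.
    q is an assumed quality factor, not a measured value."""
    r = f_hz / rf_hz
    magnitude = 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)
    return 20.0 * math.log10(magnitude / q)  # magnitude equals q when r == 1

FIXED_RF_HZ = 2200.0  # hypothetical fixed resonance from the Fig. 1 legend

for f0 in (1750.0, 3250.0):
    tracking = resonance_gain_db(f0, rf_hz=f0)        # normal bird: RF follows f0
    fixed = resonance_gain_db(f0, rf_hz=FIXED_RF_HZ)  # beak immobilized: RF fixed
    print(f"f0={f0:.0f} Hz  tracking RF: {tracking:.1f} dB  fixed RF: {fixed:.1f} dB")
```

Under these assumptions the 3250 Hz note loses more amplitude than the 1750 Hz note when RF is fixed, and neither note is attenuated when RF tracks f0. The size of the difference depends entirely on the assumed `q`.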

Kinematic analyses of singing in four species, the white-throated sparrow (Zonotrichia albicollis) (Westneat et al., 1993), the swamp sparrow (Melospiza georgiana) (Westneat et al., 1993), the song sparrow (Melospiza melodia) (Podos et al., 1995) and the Bengalese finch (Lonchura domestica) (Moriyama and Okanoya, 1996), demonstrated that the degree of beak opening during singing is positively correlated with the frequency of the sound being produced. In the northern cardinal (Richmondena cardinalis), gape is correlated with fundamental frequencies below 3.5 kHz, although the correlation is lost at higher frequencies (Suthers and Goller, 1996; Suthers et al., 1996). These results strongly suggest that beak movements during singing are capable of modulating the acoustic properties of the vocal tract, although additional experimental evidence is necessary to demonstrate a direct role for beak movements in modulating vocal tract resonances during normal song production.

If beak motions influence sound production, then disrupting the ability of a bird to perform these movements should alter song production. We tested this hypothesis in two ways. First, we temporarily immobilized the beak and compared songs produced normally with those produced when the beak was immobilized. Second, we temporarily loaded the beaks of birds with extra mass and observed how this loading affects vocal tract movements and song production. We hypothesized that disrupted beak movements should effectively restrict the ability of the bird to modify vocal tract resonances, which should, in turn, result in frequency-dependent changes in the amplitudes of sounds produced or in losses in the pure-tonal quality of song. Our experiments provide direct evidence for a role of beak movements in normal birdsong production and, thereby, further support the contention that source–tract coordination is an important aspect of the motor control of song.

Beak immobilization experiment

We recorded songs from three white-throated sparrows (Zonotrichia albicollis) and three swamp sparrows (Melospiza georgiana) before, during and after their beaks had been immobilized (8–20 songs per condition). To immobilize a bird’s beak, a small paper dowel was inserted between the upper and lower bills. The dowel was held in place by two dental rubber bands, both anchored by the ends of the dowel on either side of the beak (Fig. 2). One band extended over the upper bill and the other beneath the lower bill; a drop of cyanoacrylate glue prevented the rubber bands from sliding towards the narrow tips of the bills. The diameter of the dowel and its position along the length of the beak determined the beak gape (2.59±0.42 mm, N=3, for white-throated sparrows and 2.99±0.30 mm, N=3, for swamp sparrows; means ± S.D.). With the dowel in place, the bird’s beak was fixed in a partially open position. The dowel was left in place for up to 4 h, after which time the glue holding the rubber bands in place was loosened with a drop of acetone and the beak was freed. After 2–4 days of experience with the procedure, birds behaved normally and sang with the dowel in place. Motions of the head and neck were unaffected by the manipulation. Fixed gape distances were relatively small compared with the normal range observed in kinematic analyses, corresponding to the production of relatively low acoustic frequencies (approximately 3.0 kHz for white-throated sparrows and 3.5 kHz for swamp sparrows) (Westneat et al., 1993). The dowel itself may also increase the acoustic impedance of the open end of the vocal tract. For both these reasons, following the acoustic resonance models of sound production (Nowicki and Marler, 1988; Gaunt and Nowicki, 1998), we predicted that lower-frequency song components should be relatively less affected by the experimental manipulation than should higher-frequency song components (Fig. 1); specifically, we predicted that higher-frequency song components would suffer greater attenuation or a greater increase in harmonic content because the acoustic resonances of the vocal tract were constrained to be centered at lower frequencies.

Fig. 2.

Diagram of the immobilized beak. A small paper dowel is attached between the upper and lower bills. The dowel is held in place by two dental rubber bands, both anchored by the ends of the dowel on either side of the beak. A drop of cyanoacrylate glue prevents the rubber bands from sliding towards the narrow tips of the bills.

Beak weighting experiment

The goal of this experiment was to reduce the precision with which birds could perform beak movements by temporarily affixing small weights to their lower beak. The songs of three adult male canaries (Serinus canaria) were recorded with and without their beaks weighted. Small wedges of Plasticine were temporarily attached to the lower beaks, using cyanoacrylate glue, and fishing weights were clipped onto these wedges. Weights were attached for no more than 4 h per day. Over the course of several weeks, the birds habituated to the experimental procedure and sang with weights of up to 1.5–2.5 g (wedge + fishing weight) added to their lower beak. We hypothesized that added mass would alter the ability of the birds to move their vocal tracts with precision, resulting in a timing mismatch between the vocal source and vocal tract resonances. Unlike the beak-immobilization experiment, we could not predict in advance how the added mass might affect beak movements – presumably both the timing and magnitude of beak motions could be affected – although we expected to observe changes in the relative amplitudes of different frequencies and/or changes in the tonal quality of notes.

To characterize the effects of beak weighting on beak movements, birds were videotaped as they sang, both with and without weights, using a Panasonic SVHS AG-450 camera (60 Hz sampling rate, 0.01 s shutter speed) positioned 1–2 m from the bird. Videotaped segments of selected singing events (a subsample of the songs used for acoustic analyses, see below) were displayed on an Amiga 2000 computer as individual video fields, and data points were selected with an on-screen cursor. These data points were stored as x,y-coordinates using VidiTrack System software (Crenshaw, 1992) (for complete descriptions of this technique, see Westneat et al., 1993; Podos et al., 1995). Three points were obtained for each video field: (i) the tip of the upper bill, (ii) the tip of the lower bill and (iii) a stationary reference point. Each point was selected three times and averaged to minimize error associated with the process of point selection. Beak gape was calculated as the distance between the upper and lower bill tips, and was plotted as a function of time (in 16.67 ms intervals), thus providing gape profiles for each sound analyzed. Inspection of gape profiles provided information about the effects of weights on the timing (e.g. cyclical nature) of beak movements and also about the effects of weights on the magnitude of gape achieved during song production.
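The gape calculation described above can be sketched as follows. This is a minimal illustration, not the VidiTrack procedure itself: the function names and the input layout are hypothetical, coordinates are assumed to be in arbitrary screen units, and the stationary reference point is omitted because the distance between the two bill tips does not depend on it.

```python
import math

FIELD_INTERVAL_S = 1.0 / 60.0  # 60 Hz video fields, about 16.67 ms apart

def mean_point(selections):
    """Average repeated (x, y) selections of the same landmark."""
    n = len(selections)
    return (sum(p[0] for p in selections) / n,
            sum(p[1] for p in selections) / n)

def gape_profile(fields):
    """fields: one (upper_tip_selections, lower_tip_selections) pair per
    video field, each a list of repeated (x, y) selections.
    Returns a list of (time_s, gape) pairs."""
    profile = []
    for i, (upper_sel, lower_sel) in enumerate(fields):
        ux, uy = mean_point(upper_sel)
        lx, ly = mean_point(lower_sel)
        gape = math.hypot(ux - lx, uy - ly)  # Euclidean distance between tips
        profile.append((i * FIELD_INTERVAL_S, gape))
    return profile

# Made-up coordinates for two video fields, three selections per point.
example_fields = [
    ([(10.0, 20.0), (10.2, 19.9), (9.8, 20.1)], [(10.0, 16.0), (10.1, 16.1), (9.9, 15.9)]),
    ([(10.0, 21.0), (10.1, 21.1), (9.9, 20.9)], [(10.0, 15.0), (10.0, 15.1), (10.0, 14.9)]),
]
for time_s, gape in gape_profile(example_fields):
    print(f"t={time_s * 1000:.2f} ms  gape={gape:.2f} units")
```

Maximum and minimum gape for a trill can then be read from the resulting profile. Converting screen units to millimetres would need a calibration step that is not described in this section.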

Acoustic recording and analysis

Recordings were made in semi-anechoic rooms using Shure SM-57 or Realistic 33-1070B microphones, a Yamaha MLA7 amplifier, a Lexicon PCM-42 digital delay and Marantz PMD 221 or Sony TC-D5M tape recorders. The frequency response of these systems was 40 Hz to 14 kHz, ±3 dB. Acoustic analyses were performed using SIGNAL digital signal analysis software (Beeman, 1992), with songs digitized at 25×10³ points s⁻¹ for the beak-immobilization experiment and 40×10³ points s⁻¹ for the beak-weighting experiment.
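As a quick check on what these digitization rates imply, the arithmetic below computes the Nyquist frequency for each rate and the bin spacing of a 256-point FFT at the lower rate (the FFT length quoted in the Fig. 3 legend). The helper names are hypothetical, and the sketch assumes that the quoted frequency resolution is simply the sampling rate divided by the FFT length.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2.0

def fft_bin_width_hz(sample_rate_hz, n_points):
    """Spacing between adjacent FFT frequency bins."""
    return sample_rate_hz / n_points

print(nyquist_hz(25_000))             # 12500.0 Hz, beak-immobilization experiment
print(nyquist_hz(40_000))             # 20000.0 Hz, beak-weighting experiment
print(fft_bin_width_hz(25_000, 256))  # 97.65625 Hz, quoted as 98 Hz in Fig. 3
```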

The songs of white-throated sparrows typically consist of a series of pure-tonal notes sung at different constant frequencies. The relative amounts of acoustic energy for different notes are stereotypic and characteristic of the song of an individual (Fig. 3). Relative amplitude differences were obtained by measuring the root mean square (RMS) amplitudes integrated across entire notes after the peak amplitude in the song had been normalized to 1.0 V. Normalizing peak amplitudes within songs before comparing the relative amplitudes of notes alleviated problems associated with the comparison of absolute sound amplitudes.
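The relative-amplitude measure can be sketched in a few lines. This is an illustration under stated assumptions rather than the SIGNAL procedure: the function names are hypothetical, the song is assumed to be a plain list of sample values, and the note boundaries are assumed to be known sample indices.

```python
import math

def normalize_peak(samples, peak=1.0):
    """Scale a song so that its largest absolute sample equals `peak`."""
    biggest = max(abs(s) for s in samples)
    if biggest == 0:
        raise ValueError("cannot normalize a silent recording")
    return [s * peak / biggest for s in samples]

def rms(samples):
    """Root mean square amplitude of a run of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def relative_note_amplitudes(song, note_bounds):
    """note_bounds: (start_index, end_index) per note, end exclusive.
    Returns one RMS value per note, measured after peak normalization."""
    scaled = normalize_peak(song)
    return [rms(scaled[start:end]) for start, end in note_bounds]

# Made-up song: a loud note followed by a note at half the amplitude.
song = [0.0, 2.0, -2.0, 2.0, -2.0, 1.0, -1.0, 1.0, -1.0]
n1, n2 = relative_note_amplitudes(song, [(1, 5), (5, 9)])
print(n1, n2, n1 - n2)
```

Because every song is rescaled to the same peak before the notes are compared, the difference between notes does not depend on recording gain or on the distance between the bird and the microphone.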

Fig. 3.

(A–C) Spectrograms (upper traces) and amplitude profiles (lower traces) of normal songs of three white-throated sparrows. (D–F) Differences in relative amplitude (see Materials and methods) between selected notes from each white-throated sparrow song, as produced during control conditions before immobilization (‘C’), with the beak immobilized (‘I’) and post-immobilization (‘P’). The means, standard errors and sample sizes are shown. Spectrograms were produced at a sampling rate of 25×10³ points s⁻¹, 256-point FFT, 98 Hz frequency resolution. N1–N3, notes 1–3.

The songs of swamp sparrows are composed of notes organized in repeated units of syllables. Notes are short (20–40 ms), and most are rapid, pure-tonal frequency sweeps (Fig. 4). The rapid frequency sweeps characteristic of swamp sparrow song notes prohibited a comparable analysis of amplitude changes between individual notes at different constant frequencies. We compared, instead, the bandwidth averaged across four syllable repetitions between the control and immobilized conditions (Fig. 4). The upper frequency band limit of songs was measured at −24 dB relative to the peak amplitude, as produced before, during and after the beak had been immobilized. Amplitude spectra were generated as single digital Fourier transforms (DFT), calculated across four entire syllables (32×10³ points fast Fourier transform, FFT, frequency data smoothed at 300 Hz resolution).
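The band-limit measurement can be sketched as a search for the highest frequency whose level is still within 24 dB of the spectral peak. The function name and the example spectra are hypothetical, and the sketch assumes a smoothed amplitude spectrum supplied as linear amplitudes, so levels are computed as 20·log10 of the amplitude ratio.

```python
import math

def upper_band_limit_hz(freqs_hz, amplitudes, threshold_db=-24.0):
    """Highest frequency whose amplitude is within threshold_db of the
    spectral peak. Amplitudes are linear values, not dB."""
    peak = max(amplitudes)
    if peak <= 0:
        raise ValueError("spectrum has no positive amplitude")
    limit = None
    for f, a in zip(freqs_hz, amplitudes):
        if a > 0 and 20.0 * math.log10(a / peak) >= threshold_db:
            if limit is None or f > limit:
                limit = f
    return limit

# Made-up spectra: the 'immobilized' one loses energy at high frequencies.
freqs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
control = [0.1, 1.0, 0.5, 0.1, 0.01]
immobilized = [0.1, 1.0, 0.2, 0.02, 0.001]
print(upper_band_limit_hz(freqs, control))
print(upper_band_limit_hz(freqs, immobilized))
```

A lower band limit in the immobilized condition than in the control condition is the pattern that the resonance model predicts.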

Fig. 4.

(A–C) Sound spectrograms of normal songs from three swamp sparrows: (A,D,G) bird 1; (B,E,H) bird 2; (C,F,I) bird 3. Swamp sparrow songs are composed of repeated ‘syllables’ of 2–5 different notes, and these notes are rapid frequency sweeps (only four syllable repetitions are shown for birds 1 and 2, three syllable repetitions for bird 3). (D–F) Averaged relative amplitude spectra (see Materials and methods) of each song as sung normally (solid lines) and with the beak immobilized (dashed lines). (G–I) Upper frequency band limit of songs measured at −24 dB relative to peak amplitude, as produced during control conditions before immobilization (‘C’), with the beak immobilized (‘I’) and post-immobilization (‘P’). The means, standard errors and sample sizes are shown. Spectrograms were produced as in Fig. 3.

The canaries in our experiment produced songs that consisted mainly of trilled repetitions of notes or note groups. The repertoire of ‘trill types’ of each bird was identified from baseline songs by visual analysis of spectrograms using a Kay DSP Sona-Graph model 5500, 300 Hz resolution. Trill types were identified on the basis of their time-varying frequency and amplitude properties. Four trill types from each of the three birds were chosen for detailed analysis (Fig. 5). Nine to twelve renditions of each trill type from the baseline and test weighted conditions (excepting trill type ‘K’, weighted condition, for which N=2) were digitized for analysis of acoustic structure.

Fig. 5.

Sonagrams of the canary trill types analyzed. Trill types A–D are from bird 1, types E–H from bird 2 and types I–L from bird 3. Sonagrams were produced on a Kay Elemetrics Digital Sona-Graph, 0–8 kHz analysis range, 300 Hz frequency resolution.

We calculated the tonal quality for each trill type rendition as follows. For eight trill types (A, D, E, F, G, H, J and K; Fig. 5), amplitude spectra of complete notes were calculated at 32×10³ points DFT, and smoothed to 100 Hz frequency resolution (Beeman, 1992). We measured the relative amplitude (dB) of the fundamental frequency and the second harmonic of each note, using an on-screen cursor, and calculated the difference between them (see Figs 6, 7). The tonal quality of each trill type rendition was calculated as the mean tonal quality of individual notes for up to five notes per trill type rendition. For another three trill types (B, I and L; Fig. 5), we calculated amplitude spectra only for the portion of the note that was the least frequency-modulated. For a twelfth trill type (C; Fig. 5), which had a rapid rate of note repetition within songs (approximately 25 Hz), tonal quality was measured from single amplitude spectra calculated across the entire duration of each trill rendition (similar to the swamp sparrow analysis).
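The tonal-quality measure reduces to a difference of two spectral levels, averaged over notes. The sketch below replaces the manual on-screen cursor measurement with automatic peak picking inside a frequency window, which is a substitution made for illustration only. The function names, the `tolerance_hz` window and the example spectrum are hypothetical; spectra are assumed to be given in dB.

```python
def peak_db_near(freqs_hz, levels_db, target_hz, tolerance_hz=200.0):
    """Largest level (dB) among bins within tolerance_hz of target_hz."""
    candidates = [lvl for f, lvl in zip(freqs_hz, levels_db)
                  if abs(f - target_hz) <= tolerance_hz]
    if not candidates:
        raise ValueError(f"no spectral bins near {target_hz} Hz")
    return max(candidates)

def note_tonal_quality_db(freqs_hz, levels_db, f0_hz):
    """Level of the fundamental minus level of the second harmonic (dB)."""
    fundamental = peak_db_near(freqs_hz, levels_db, f0_hz)
    second_harmonic = peak_db_near(freqs_hz, levels_db, 2.0 * f0_hz)
    return fundamental - second_harmonic

def rendition_tonal_quality_db(note_spectra, max_notes=5):
    """Mean tonal quality over at most max_notes notes of one rendition.
    note_spectra: list of (freqs_hz, levels_db, f0_hz) tuples."""
    values = [note_tonal_quality_db(*note) for note in note_spectra[:max_notes]]
    return sum(values) / len(values)

# Made-up spectrum: fundamental at 3 kHz, second harmonic 45 dB below it.
freqs = [2000.0, 3000.0, 4000.0, 6000.0]
levels = [-60.0, 0.0, -60.0, -45.0]
print(note_tonal_quality_db(freqs, levels, f0_hz=3000.0))
```

Large positive values indicate a nearly pure tone. Values near zero or below indicate that the second harmonic carries as much energy as the fundamental, or more.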

Fig. 6.

Representative spectrograms and amplitude spectra of a canary trill type H (see Fig. 5) sung in both baseline and weighted conditions. In the baseline condition, the acoustic energy of the signal is concentrated primarily in the fundamental frequency components; in the weighted condition, acoustic energy is more evenly distributed among the fundamental frequency and second harmonic overtone. The amplitude difference between the fundamental frequency and the second harmonic in the baseline condition was 45 dB. The amplitude difference between the fundamental frequency and the second harmonic in the weighted condition was 36 dB. Sonagrams were produced on a Kay Elemetrics Digital Sona-Graph, 0–8 kHz analysis range, 300 Hz frequency resolution. Amplitude spectra were calculated at 32×10³ points FFT and smoothed to 100 Hz frequency resolution using SIGNAL digital sound analysis software (Beeman, 1992).

Fig. 7.

Extreme example of how adding weights affects the tonal quality of song in the canary. Spectrograms and amplitude spectra of a trill type K (see Fig. 5) sung in both baseline and weighted conditions. In this example, the baseline condition has acoustic energy concentrated primarily in the fundamental frequency; in the weighted condition, the acoustic energy of the second harmonic is greater than that of the fundamental. The amplitude difference between the fundamental frequency and the second harmonic in the baseline condition was 37 dB. The amplitude difference between the fundamental frequency and the second harmonic in the weighted condition was −8 dB. Further details as in Fig. 6.

Our results demonstrate that disrupting vocal tract movements causes a consistent decrease in the relative amplitude of higher frequencies in the songs of both the white-throated sparrow and the swamp sparrow. In addition, notes expressed decreased tonal quality both in the canary and in higher-frequency notes in the white-throated sparrow.

Immobilization experiment

Immobilizing the beak of singing white-throated sparrows resulted in the attenuation of the relative amplitudes of higher-frequency notes far more than the amplitudes of lower-frequency notes (Fig. 3). This effect was more pronounced at higher absolute frequencies. As shown in Fig. 3 in birds 1 and 2, the first note (‘N1’) was produced at a lower amplitude than the second note (‘N2’) during normal singing; with the beak immobilized, the lower-frequency first note became relatively higher in amplitude compared with the second note (Fig. 3). In bird 3, the first note (‘N1’) and third note (‘N3’) were produced at approximately the same amplitude during normal singing; with the beak immobilized, the lower-frequency third note became relatively higher in amplitude compared with the first note. In all cases, higher-frequency notes (with the fundamental frequency farther from the vocal tract resonance in the immobilized condition) are attenuated more than lower-frequency notes (with fundamental frequency closer to the fixed vocal tract resonance frequency). Songs returned to near normal following removal of the dowel (Fig. 3).

The effect of immobilizing the beaks of swamp sparrows was, in all cases, a loss of energy in the high-frequency portion of the averaged spectrum, including a downward shift in the upper band limit and a lowering of the peak amplitude frequency of the spectrum (Fig. 4). These changes reflect the decrease in amplitude of the higher-frequency portions of the frequency-modulated notes. Values returned to normal immediately after removal of the dowel.

Weighting experiment

Inspection of kinematic gape profiles indicated that the temporal aspects of gape movements (e.g. periodicity of beak opening and closing) were not affected by the addition of weights to the lower bill of canaries. However, the addition of beak weights consistently increased both the maximum gape distance and the minimum gape distance for every syllable analyzed (Table 1). That is, the addition of weights caused birds to open their beak to wider gapes than normal and prevented them from closing their beak as much as they would during normal song production. Maximum and minimum gape achieved during song production, across the 10 trill types analyzed, differed significantly between baseline and weighted conditions (Wilcoxon signed-rank tests: for maximum gape, Z=1.988, d.f.=9, P=0.0469; for minimum gape, Z=2.675, d.f.=9, P=0.0075). The sampling resolution of our video system turned out to be insufficient, relative to canary song trill rates, to allow reliable quantification of additional, more precise kinematic variables such as gape acceleration and velocity. Such variables can be calculated with greater reliability using video systems with higher sampling rates or the Hall-effect sensor that has been successfully used in cardinals (Suthers, 1997).
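For readers who want to see how a paired comparison of this kind is computed, the sketch below implements a Wilcoxon signed-rank test using the large-sample normal approximation. It is a generic illustration and makes no claim about the statistics package or options used in the study: it applies no continuity correction and no tie correction to the variance, the function name is hypothetical, and the gape values are invented for the example because the values from Table 1 are not reproduced here.

```python
import math

def wilcoxon_signed_rank_z(baseline, treated):
    """Return (z, two_sided_p) for paired samples using the normal
    approximation. Zero differences are dropped; tied absolute
    differences receive the average of their ranks."""
    diffs = [t - b for b, t in zip(baseline, treated) if t != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p

# Invented maximum-gape values (mm) for ten trill types, for illustration only.
baseline_gape = [3.1, 2.8, 3.5, 2.9, 3.3, 3.0, 2.7, 3.4, 3.2, 2.6]
weighted_gape = [3.6, 3.0, 3.4, 3.5, 3.9, 3.3, 3.1, 3.3, 3.9, 3.4]
z, p = wilcoxon_signed_rank_z(baseline_gape, weighted_gape)
print(f"Z={z:.3f}, P={p:.4f}")
```

With only ten pairs, an exact test would usually be preferred over the normal approximation; the approximation is used here because it yields a Z statistic of the kind reported above.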

Table 1.

Maximum and minimum gape as a function of experimental condition for ten trill types (labeled as in Fig. 5)

The addition of weights to the beak of canaries altered the tonal quality of the song significantly, with the expression of greater harmonic content than normally observed (Figs 6–8; Wilcoxon signed-ranks test, Z=−2.82, d.f.=11, P=0.0047). The mean difference between the baseline and weighted conditions was 12.01±3.2 dB (mean ± S.E.M., N=2–12). Further, the extent to which tonal quality was altered by the weight manipulation depended upon the fundamental frequency of the note analyzed: lower-frequency notes tended to suffer a greater change in tonal quality (Fig. 9; F(1,11)=6.632, P=0.028).

Fig. 9.

Regression between fundamental frequency and experimental loss of tonal quality in the canary beak-weighting experiment. There is a frequency-dependent effect: notes with lower fundamental frequencies experience a greater loss of tonal quality than notes with higher fundamental frequencies: y=6.58x−37.6, r²=0.4, F=6.632, d.f.=11, P=0.028.

Recent studies of singing behavior in songbirds suggest that song production involves the coordination of the syringeal vocal source with movements of the vocal tract (Westneat et al., 1993; Podos et al., 1995; Suthers et al., 1996). We experimentally tested the functional significance of this coordination by disrupting the ability of birds to achieve normal movements of the vocal tract in two different ways: by immobilizing the beak in white-throated sparrows and swamp sparrows, and by adding mass to the lower beak in canaries. Both manipulations modified the acoustic properties of vocalizations, supporting the hypothesis that changes in vocal tract acoustic properties, mediated by beak motions, are a necessary feature of normal birdsong production.

In our immobilization experiment, the changes in amplitude we observed presumably occurred because the acoustic frequencies of sounds produced at the syrinx fell outside the range of experimentally fixed vocal tract resonances (Fig. 1). It appears that birds ordinarily modify vocal tract resonances to maintain alignment between syrinx motor patterns and vocal tract resonances (Nowicki and Marler, 1988; Gaunt and Nowicki, 1998), at least within certain frequency ranges (Suthers et al., 1996). With the beak immobilized, the birds were no longer able to fully alter vocal tract resonances, and thus the acoustic properties of notes were affected. We expected to observe a more pronounced misalignment between vocal tract resonance frequencies and vocal source frequencies for high-frequency notes because the beaks were held in a relatively closed position and because the presence of the dowel may increase the acoustic impedance of the vocal tract. Indeed, in both the white-throated sparrow and the swamp sparrow, we observed a loss of energy at high frequencies relative to low frequencies, supporting our prediction.

By adding mass to the lower beak of canaries, we disrupted normal beak movements in a different way. While the birds were able to open and close their mouth as rapidly as normal with mass added to their beak, they sang with their beaks generally open more widely with weights than without weights (Table 1). The effect of adding weights on minimum and maximum values of beak gape was highly significant. In effect, birds with weights on their beak experienced a change in the set point of beak gape and were unable to maintain alignment between vocal production and vocal tract resonances. This mismatch between the vocal source and vocal tract resonances resulted in the production of notes with greater harmonic content. Even slight increases in beak gape, caused by the addition of weights, were sufficient to change tonal quality (compare Table 1 with Fig. 8).

Fig. 8.

‘Pure-tonal quality’, calculated as the difference in relative amplitude between the fundamental frequency and the second harmonic, for the baseline condition (B) and the weighted condition (W) for 12 trill types in the canary. Means, standard errors and sample sizes are shown. Labels A–L correspond to trill types in Fig. 5.

Both our experimental manipulations effectively altered the relative amplitudes of sounds produced. In the white-throated sparrow, we compared the amplitudes of notes produced at different frequencies. Immobilizing the beak decreased the relative amplitude of higher-frequency notes. In the swamp sparrow, we compared the frequency bandwidth of syllables produced under normal conditions with those produced with the beak immobilized. When the beak was immobilized, the relative amplitude of higher-frequency sounds decreased. In the canary, we measured the relative amplitude of the fundamental frequency and the second harmonic within a note. Adding weights disrupted beak movements and led to decreased differences in the relative amplitudes of the fundamental and second harmonic (i.e. increased harmonic content).
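The canary measure described above, the amplitude of the fundamental relative to that of the second harmonic within a note, can be illustrated with a minimal sketch. This is not the analysis software used in the study; the function names, the synthetic notes, the sample rate and the amplitude values are all hypothetical and chosen only so that the arithmetic is easy to follow. The amplitude of each component is estimated by projecting the signal onto a sine and cosine at that frequency.

```python
import math


def amplitude_at(signal, sample_rate, freq):
    """Estimate the amplitude of the sinusoidal component at freq (Hz)."""
    n = len(signal)
    step = 2.0 * math.pi * freq / sample_rate
    re = sum(s * math.cos(step * i) for i, s in enumerate(signal))
    im = sum(s * math.sin(step * i) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


def tonal_quality_db(signal, sample_rate, f0):
    """Amplitude of the fundamental relative to the second harmonic, in dB."""
    a1 = amplitude_at(signal, sample_rate, f0)
    a2 = amplitude_at(signal, sample_rate, 2.0 * f0)
    return 20.0 * math.log10(a1 / a2)


def synth_note(f0, a1, a2, sample_rate=44100, n=4410):
    """Synthetic note: fundamental of amplitude a1 plus second harmonic a2."""
    w = 2.0 * math.pi * f0 / sample_rate
    return [a1 * math.sin(w * i) + a2 * math.sin(2.0 * w * i) for i in range(n)]


# Hypothetical values, not data from the study.
baseline = synth_note(3000.0, a1=1.0, a2=0.1)   # weak second harmonic
weighted = synth_note(3000.0, a1=1.0, a2=0.3)   # stronger second harmonic

print(round(tonal_quality_db(baseline, 44100, 3000.0), 2))
print(round(tonal_quality_db(weighted, 44100, 3000.0), 2))
```

In this toy case the note with the stronger second harmonic yields the smaller difference, which is the direction of change described for the weighted condition.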

The experimental effects we observed are consistent with two different models for how dynamic changes in vocal tract acoustics might influence song production. In the first model, which is similar to the ‘source–filter’ model of human speech production (Lieberman and Blumstein, 1988), vocal tract resonances are thought to act as an acoustic filter that selectively attenuates frequencies falling outside the filter’s passband (Nowicki, 1987). In this model, the pure-tonal quality (i.e. lacking harmonic overtones) of much birdsong results from the fact that the vocal tract filter removes most of the energy in all but a single dominant frequency. An alternative model of vocal tract function suggests that vocal tract resonances may be more directly coupled with the syringeal membranes, directly influencing their vibration characteristics and suppressing the production of overtones at the source (Nowicki and Marler, 1988; Gaunt and Nowicki, 1998; Fletcher and Tarnopolsky, 1999). This second model is similar to a mechanism proposed to account for the efficient production of highly pure-tonal sound in human soprano singing, in which overlap between a dominant resonance frequency of the vocal tract and the frequency of the signal produced at the laryngeal source is thought to result in a coupling between source and tract that reinforces the acoustic power of the fundamental frequency and diminishes the power in overtones (Rothenberg, 1987). Although these two models differ in how the acoustic properties of the vocal tract exert their influence on the quality of sound produced, both models require that the resonance frequency of the vocal tract correspond to the fundamental frequency (or perhaps a dominant overtone) produced by the syringeal source; thus, both models predict that vocal tract configuration must be dynamically modified in coordination with changes in the frequencies produced by the syrinx. In this sense, the vocal tract can be thought of as tracking frequency changes in the output of the syringeal source.
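The tracking idea can be made concrete with a toy version of the first (source–filter) model only. The sketch below treats the vocal tract as a single second-order resonance, which is a simplifying assumption and not a model fitted to any bird; the quality factor, the frequencies and the function names are hypothetical, and the source is assumed to emit the fundamental and the second harmonic at equal amplitude. It compares how strongly the second harmonic is suppressed when the resonance sits on the fundamental and when it is displaced upward.

```python
import math


def resonance_gain(freq, resonance, q=5.0):
    """Magnitude response of a simple second-order resonator at freq (Hz)."""
    ratio = freq / resonance
    return 1.0 / math.sqrt((1.0 - ratio ** 2) ** 2 + (ratio / q) ** 2)


def harmonic_suppression_db(f0, resonance, q=5.0):
    """Gain at the fundamental relative to gain at the second harmonic, in dB."""
    g1 = resonance_gain(f0, resonance, q)
    g2 = resonance_gain(2.0 * f0, resonance, q)
    return 20.0 * math.log10(g1 / g2)


f0 = 3000.0  # hypothetical fundamental frequency (Hz)

# Resonance aligned with the fundamental ("tracking").
tracked = harmonic_suppression_db(f0, resonance=f0)

# Resonance displaced above the fundamental (a hypothetical mismatch).
mismatched = harmonic_suppression_db(f0, resonance=1.5 * f0)

print(round(tracked, 1), round(mismatched, 1))
```

With these illustrative numbers, the aligned resonance favours the fundamental over the second harmonic far more than the displaced one does, so a mismatch leaves more harmonic content in the output. The second (source–tract coupling) model is not captured by this sketch.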

The changes we observed in vocal output following our vocal tract manipulations were all frequency-dependent. In the white-throated sparrow, the amplitudes of notes with higher fundamental frequencies were attenuated more than those of notes with lower fundamental frequencies. In the swamp sparrow, energy was lost at the high end of the frequency distribution, while the peak amplitude shifted to a lower frequency. These results can be explained by the fact that the beaks were immobilized at a gape distance that normally corresponds to lower-frequency notes; lower-frequency notes were closer to the vocal tract resonances, while higher-frequency notes fell further away from the vocal tract resonances. Thus, lower-frequency sounds were less affected than higher-frequency sounds.

We also observed a frequency-dependent effect of the beak weight manipulation, although in this case lower-frequency notes suffered greater degradation (Fig. 9). While we did not predict this outcome in advance, it is explained by our kinematic results showing that the added weight caused the bird’s beak to gape more widely overall (Table 1). Thus, the physical effect of the beak weight manipulation was essentially the opposite of the beak immobilization manipulation, in which gapes were fixed in a relatively more closed position (although the dynamics of the beak weight manipulation are clearly more complex given that the beak can still move). In normal song production, a more closed beak gape corresponds to a lower-frequency sound (Westneat et al., 1993; Podos et al., 1995). Because the effect of adding weight to the beak was to prevent the birds from closing their beak as much as they normally do when singing, it is not surprising that we observed a stronger effect on lower-frequency notes. These findings are also consistent with recent theoretical and experimental modeling work by Fletcher and Tarnopolsky (1999), who predict a non-linear effect of beak gape changes on vocal tract resonances, with gape changes at small gapes leading to disproportionately larger changes in vocal tract resonances.

Coordination of vocal tract movements with the vocal source

We demonstrate here that beak movements are an important functional determinant of vocal tract acoustic properties. By confirming the role of the beak in song production, we have also shown that models of song motor control must take into account the mechanisms by which vocal tract movements are coordinated with those of the syrinx and the respiratory system. The neuroanatomical mechanisms for coordinating the vocal source (i.e. syrinx) and the vocal tract filter (i.e. beak movements) have not yet been elucidated. Motoneurons innervating the jaw muscles do not receive input from the nucleus robustus archistriatalis (Wild, 1993b), a primary nucleus in the descending pathway controlling motor output during song production. In contrast, the pathways controlling the respiratory system are highly integrated with the control of the vocal system (Suthers, 1997).

The tonal quality of song emerges gradually during vocal development in song sparrows (Melospiza melodia), with the expression of highly pure-tonal notes increasing over a time frame of several weeks and corresponding closely to the onset of beak movements (Podos et al., 1995). The delay in the onset of beak movements, and the corresponding production of pure-tonal sounds, presumably derives from a need to practice and refine vocal tract movements so that they accurately match the time-varying frequencies produced by the syringeal source as the young bird attempts to copy the models he has learned (Podos et al., 1995). Feedback between the source and vocal tract motor patterns presumably occurs via kinesthetic or auditory inputs, both of which interact with neural song control centers (Vicario, 1994; Wild, 1995). For example, during development, birds may assess the extent to which their song lacks pure-tonal quality and adjust their vocal tract movements accordingly. The present data demonstrate that the species studied here, over the course of our experimental treatment, do not possess the ability to modify their syringeal output to compensate for changes in vocal tract function. Otherwise, we might have observed birds modifying their syringeal output so as to match the resonance frequencies of their reconfigured vocal tracts. This lack of compensation is consistent with other observations indicating that the motor aspects of song production, including the coordination of source and filter components, become much less plastic after the motor development of song (Marler, 1976).

We do not suggest that the beak is the only functional determinant of vocal tract properties; postural changes in other vocal tract structures seem likely to play some role in modifying tract resonances (see also Westneat et al., 1993; Podos et al., 1995; Fletcher and Tarnopolsky, 1999). Nor is it necessarily the case that all kinds of sounds produced by songbirds require the same degree of source–tract coordination. Suthers et al. (1996), for example, found that the correlation between beak motions and song frequency was lost at higher frequencies in cardinal song. Nonetheless, our results demonstrate that vocal tract movements, and those of the beak in particular, play a necessary role in many aspects of song production. This functional link suggests the need to understand better the motor mechanisms responsible for coordinating vocal tract activity with syringeal and respiratory mechanisms, and confirms another intriguing parallel between birdsong and human speech.

We thank M. Hughes, J. H. Long Jr, D. Margoliash, P. Marler, S. Patek, M. Westneat and an anonymous reviewer for valuable suggestions and P. and M. Klopfer for the loan of canaries. K. Davis drew Fig. 2. This work was supported by the NIH, NSF and Sigma Xi.

Beeman, K. (1992). SIGNAL User’s Manual. Belmont, MA: Engineering Design.

Crenshaw, H. C. (1992). VidiTrack User’s Manual. Durham, NC: HeadLight Systems.

Flanagan, J. L. (1972). Speech Analysis, Synthesis and Perception, second edition. Berlin: Springer-Verlag.

Fletcher, N. H. and Tarnopolsky, A. (1999). Acoustics of the avian vocal tract. J. Acoust. Soc. Am. 105, 35–49.

Gaunt, A. S. and Nowicki, S. (1998). Sound production in birds: Acoustics and physiology revisited. In Animal Acoustic Communication: Sound Analysis and Research Methods (ed. S. L. Hopp, M. J. Owren and C. S. Evans), pp. 291–321. New York: Springer-Verlag.

Goller, F. and Larsen, O. N. (1997a). In situ biomechanics of the syrinx and sound generation in pigeons. J. Exp. Biol. 200, 2165–2176.

Goller, F. and Larsen, O. N. (1997b). A new mechanism of sound generation in songbirds. Proc. Natl. Acad. Sci. USA 94, 14787–14791.

Greenewalt, C. H. (1968). Bird Song: Acoustics and Physiology. Washington, DC: Smithsonian Institution Press.

Larsen, O. N. and Goller, F. (1999). Role of syringeal vibrations in bird vocalizations. Proc. R. Soc. Lond. B 266, 1609–1615.

Lieberman, P. and Blumstein, S. E. (1988). Speech Physiology, Speech Perception and Acoustic Phonetics. Cambridge: Cambridge University Press.

Marler, P. (1970). Birdsong and speech development: Could there be parallels? Am. Sci. 58, 669–673.

Marler, P. (1976). Sensory templates in species-specific behavior. In Simpler Networks and Behavior (ed. J. Fentress), pp. 314–329. Sunderland, MA: Sinauer.

Marler, P. and Peters, S. (1982). Long-term storage of learned birdsongs prior to production. Anim. Behav. 30, 479–482.

Moriyama, K. and Okanoya, K. (1996). Effect of beak movement in singing Bengalese finches. Abstracts: Acoustical Society of America and Acoustical Society of Japan, Third Joint Meeting, pp. 129–130. Honolulu, 2–6 December 1996.

Nottebohm, F. (1971). Neural lateralization of vocal control in a passerine bird. I. Song. J. Exp. Zool. 177, 229–262.

Nowicki, S. (1987). Vocal tract resonances in oscine bird sound production: Evidence from birdsongs in a helium atmosphere. Nature 325, 53–55.

Nowicki, S. and Marler, P. (1988). How do birds sing? Music Perception 5, 391–426.

Patterson, D. K. and Pepperberg, I. M. (1994). A comparative study of human and parrot phonation: Acoustic and articulatory correlates of vowels. J. Acoust. Soc. Am. 96, 634–648.

Podos, J. E., Sherer, J. K., Peters, S. and Nowicki, S. (1995). Ontogeny of vocal tract movements during song production in song sparrows. Anim. Behav. 50, 1287–1296.

Rothenberg, M. (1987). Cosi Fan Tuti and what it means, or, nonlinear source–tract interaction in the soprano voice and some implications for the definition of vocal efficiency. In Laryngeal Function in Phonation and Respiration (ed. T. Baer, C. Sasaki and K. Harris), pp. 254–263. Boston, MA: College-Hill.

Suthers, R. A. (1997). Peripheral control and lateralization of birdsong. J. Neurobiol. 33, 632–652.

Suthers, R. A. and Goller, F. (1996). Respiratory and syringeal dynamics of song production in northern cardinals. In Nervous Systems and Behaviour. Proceedings of the Fourth International Congress of Neuroethology (ed. M. Burrows, T. Matheson, P. Newland and H. Schuppe), p. 333. Stuttgart: Georg Thieme Verlag.

Suthers, R. A., Goller, F., Bermejo, R. and Zeigler, H. P. (1996). Relationship of beak gape to the lateralization, acoustics and motor dynamics of song in Cardinals. In Association for Research in Otolaryngology. Abstracts of the Nineteenth Midwinter Research Meeting, p. 158.

Suthers, R. A., Goller, F. and Hartley, R. S. (1994). Motor dynamics of song production by mimic thrushes. J. Neurobiol. 25, 917–936.

Vicario, D. S. (1993). A new brain stem pathway for vocal control in the Zebra Finch song system. Neuroreport 4, 983–986.

Vicario, D. S. (1994). Motor mechanisms relevant to auditory–vocal interactions in songbirds. Brain Behav. Evol. 44, 265–278.

Westneat, M. W., Long, J. H., Jr, Hoese, W. and Nowicki, S. (1993). Kinematics of birdsong: functional correlation of cranial movements and acoustic features in sparrows. J. Exp. Biol. 182, 147–171.

Wild, J. M. (1993a). The avian nucleus retroambigualis: a nucleus for breathing, singing and calling. Brain Res. 606, 119–124.

Wild, J. M. (1993b). Descending projections of the songbird nucleus robustus archistriatalis. J. Comp. Neurol. 338, 225–241.

Wild, J. M. (1995). Convergence of somatosensory and auditory projections in the avian torus semicircularis, including the central auditory nucleus. J. Comp. Neurol. 357, 1–22.