As do many songbirds, zebra finches sing their learned songs while performing a courtship display that includes movements of the body, head and beak. The coordination of these display components was assessed by analyzing video recordings of courting males. All birds changed beak aperture frequently within a single song, and each individual’s pattern of beak movements was consistent from song to song. Birds that copied their father’s songs reproduced many of the changes in beak aperture associated with particular syllables. The acoustic consequences of opening the beak were increases in amplitude and peak frequency, but not in fundamental frequency, of song syllables. The change in peak frequency is consistent with the hypothesis that an open beak results in a shortened vocal tract and thus a higher resonance frequency. Dance movements (hops and changes in body or head position) were less frequent, and the distribution of dance movements within the song was not as strongly patterned as were changes in beak aperture, nor were the peaks in the distribution as strongly marked. However, the correlation between the positioning of dance movements within fathers’ and sons’ songs was striking, suggesting that the choreography of dance patterns is transmitted from tutor to pupil together with the song.

A QuickTime movie of a courtship display used in this study can be found at: http://www.williams.edu/Biology/ZFinch/zfdance.html.

In many avian species, song is part of a courtship display that involves specific postures, plumage erection and ritualized locomotion or flight patterns; such displays can be very elaborate and complex and may even be coordinated with the vocalizations and actions of another individual, as in duetting thrushes (Cichladusa guttata) (Todt and Fiebelkorn, 1980) or cooperatively displaying manakins (Chiroxiphia linearis) (Trainer and McDonald, 1993). These complex displays usually appear to be tightly choreographed, with acoustic events within the song coupled to movements of the ‘dance’. Even in species such as the zebra finch (Taeniopygia guttata), with less complex and elaborate displays, descriptions of courtship emphasize the coordination of the bird’s song and dance:

“The singing, posturing male advances towards the female in a rhythmic, pivoting dance. The exact form of this dance varies and is often obscured by the particular arrangement of branches upon which the birds are situated. It is best understood when it occurs along one long, straight branch. As the male advances towards the female down the branch, it swings its body from side to side, turning first to the left and then to the right, changing the position of its feet as it does so” (Morris, 1954).

What emerges from this description, published nearly 50 years ago, is that male zebra finches sing directed song to females as an integral part of a courtship display. The choreography of the dance presumably conveys or enhances some part of the message that is carried by the individual’s learned song, although the exact importance and function of the dance are not known. Since zebra finches, like other oscine songbirds, learn their songs during development (Price, 1979), any tightly coupled coordination between song and dance must be acquired at some point during the song-learning process. Further, any coordination between song and dance must necessarily involve coordination of the neural systems controlling these aspects of the display. The zebra finch’s neural circuitry for song development and control has been the subject of extensive study (Nottebohm, 1991; Nottebohm, 1996; Bottjer and Johnson, 1997), and this species could provide a valuable model for investigating the integration of different modalities.

For the most part, research on the physiology of birdsong has concentrated upon defining the role of airflow through the vocal organ, the syrinx, in forming the sounds the bird produces and on the role of the syringeal muscles and the neural pathways that control them in modulating those sounds. However, the upper vocal tract, and in particular variations in the gape of the beak, can play an important part in determining the relative amplitude of different frequencies and thus the tonal quality of the songs of song, white-throated and swamp sparrows (Westneat et al., 1993; Hoese et al., 2000). Furthermore, the tight coordination between variation in beak gape and different song elements in song sparrows emerges only very late in song learning (Podos et al., 1995). These results provide important evidence that respiratory and extra-syringeal vocal elements, as well as their neural control, need to be considered when studying the behavior of singing. For species that include a dance as well as song in their courtship display, how these different classes of movement (mediated by separate neural circuits) are coordinated poses a potentially interesting question.

This study uses the analysis of video tapes of the courtship displays and songs of zebra finch males to assess the coupling of beak movements, head motions and hops to the acoustic elements of song.

Subjects

Ten adult zebra finch Taeniopygia guttata males ranging in age from 2 to 9 years were used as subjects. They were housed individually in cages in a room held at 25°C on a 14 h:10 h light:dark cycle and supplied with seed, water and grit ad libitum. To increase the motivation to sing under recording conditions, males were housed at least 1.5 m away from the nearest females.

Recording

Video and audio recordings of the males’ courtship song and dance were obtained as follows: an individual male was placed in a cage with a clear Plexiglas window on one wall and light green posterboard on the opposite wall. This cage was in turn placed inside a larger Plexiglas recording chamber [see (Williams and Mehta, 1999)]. Four of the walls of the recording chamber were lined with acoustic foam, and the remaining walls were clear. Immediately outside the end wall of the chamber, the male could see two females; when the birds approached each other as closely as the chamber and cages allowed, they were within 4 cm of each other. The females were illuminated to make them the salient feature visible outside the chamber. This arrangement was used (i) to restrict the audio recording to the males’ vocalizations and (ii) to maximize the possibility that the male’s courtship dance would be clearly visible and appropriately oriented (presenting a profile, allowing the beak to be seen clearly) from the perspective of the video camera. Nevertheless, the exact position and orientation of the dance, because it was necessarily unconstrained, varied from bird to bird and song to song.

Inside the chamber was a Realistic 1033/73A microphone; the signal from this microphone was fed into the sound jack of a Panasonic AG450 S-VHS video camera. The video camera was placed outside the chamber, 1 m from and directly facing the wall perpendicular to the male–female axis, so that, when the male faced the females directly, the video camera recorded an unobstructed view of his left profile against a light green background. Since the males hop about and change position during the courtship dance, the image size and focus were adjusted to capture the sharpest and largest view possible of the entire dance. Each male was recorded for 1–4 sessions lasting 2 h each, yielding 13–35 song bouts (mean 18.2) and 88–134 song strophes (mean 104.7), using Sossinka and Böhner’s definitions of song units (Sossinka and Böhner, 1980). All songs used in this study were ‘directed’ songs, meaning that the singing male directed his song at a female.

Audio and video analysis

The portions of the video tapes that included recordings of songs were digitized using Strata VideoShop at 30 frames s⁻¹ and 640×480 pixel resolution. The accompanying audio track was simultaneously digitized at 22.255 kHz with eight-bit resolution. The video tape and images acquired in this fashion are of lower quality than those used in the analysis of song sparrow songs, for which a faster shutter speed and twice the frame rate were used to film a relatively stationary singer (Westneat et al., 1993). Because a singing and dancing zebra finch’s head position and angle change frequently during the song, accurate and consistent measurements of beak angles would be difficult to acquire even with a kinematic system. For this reason, the video analysis in this study concentrated on changes in the aperture of the beak and in the position of the head and body of the courting male zebra finch. The digitized video images were viewed frame by frame in Adobe Premiere, and dance movements and changes in beak aperture were scored as occurring in the frame in which a change in aperture or position first appeared (some dance movements continued during subsequent frames).

Although song strophes sung by adult male zebra finches consist of a small and fixed set of syllables repeated in a stereotyped and time-invariant fashion, adult male birds occasionally sing slight variants of the song, such as omitting the final syllables (Williams and Staples, 1992). To account for these variants and for occasional minor changes in tempo, each song that was scored was aligned frame by frame to the ‘canonical song’. For the purposes of this study, a bird’s canonical song was defined by one rendition of a clear and complete song strophe. This canonical song was represented as a sonogram and the matching amplitude waveform, and then divided into 33 ms segments (with each segment corresponding to the length of one video frame). During scoring of beak and body movements, each song was viewed frame by frame; when a change in beak aperture or body position occurred, the corresponding position within the song was located in the audio track using the frame-matching features of Adobe Premiere and the movement was assigned to the matching song segment in the canonical song. Thus, every scored event was assigned to a 33 ms bin, or song segment, that corresponded to a specific position within the bird’s song. The average song length was 1.17 s, equivalent to 35.5 frames or song segments; on average, 25.3 of these frames were occupied by syllables (the remaining 10.2 frames were occupied by silent intervals between syllables). If syllables are defined as continuous periods of sound production, the average syllable spanned 3.24 frames; if the syllables are split further into coherent acoustic units, the average syllable spanned 1.87 frames. Thus, although there was some unavoidable jitter in the alignment process (given that songs did not always start precisely at the beginning of a frame), the resulting error was never greater than one frame, and a beak or body movement was reliably assigned to a specific syllable. 
Each song bout was aligned and scored independently, so that the observer’s scoring of previous songs did not influence the frame to which a movement was assigned.
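The binning described above can be illustrated with a minimal sketch. It assumes that event times and song onsets are available in seconds; the function names and example values are hypothetical and are not taken from the scoring procedure, which was carried out by eye in Adobe Premiere.

```python
FRAME_RATE = 30.0          # video frames per second, as stated in the text
NOMINAL_SEGMENT_MS = 33.0  # nominal duration of one song segment


def segment_index(event_time_s, song_onset_s):
    """Return the 0-based song segment (video frame) containing an event.

    Both arguments are times in seconds on the same clock (an assumption;
    the original scoring matched frames visually rather than by timestamp).
    """
    offset = event_time_s - song_onset_s
    if offset < 0:
        raise ValueError("event precedes song onset")
    return int(offset * FRAME_RATE)


def song_length_in_segments(duration_s, segment_ms=NOMINAL_SEGMENT_MS):
    """Convert a song duration in seconds to a number of song segments."""
    return duration_s * 1000.0 / segment_ms


# A 1.17 s song spans about 35.5 nominal 33 ms segments.
print(round(song_length_in_segments(1.17), 1))
# An event 50 ms after song onset falls in the second segment (index 1).
print(segment_index(0.05, 0.0))
```

With the nominal 33 ms segment, a 1.17 s song works out to about 35.5 segments, which matches the average reported above.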

Beak aperture was scored as opening, closing or no change (for examples of scoring, see Fig. 1). Because of the nature of the dance, the male was turned away from the camera during portions of some songs, and changes in head and beak position were not visible until the male re-oriented himself. Any frames during which the beak was not visible were excluded from statistical analyses of beak movements. Although it was clear during the scoring of the video tapes that beak movements varied in magnitude, this variation was not directly scored or measured except for calibration. In the short sequence shown in Fig. 1, a comparison of frames 3 and 4 shows that, between those two frames, the bird opened his beak (the gape angle increased by 8°). Between frames 7 and 8, the beak was closed (a change in gape angle of 9°). Smaller changes in beak aperture also occurred in this sequence: in frame 2, the bird had a slightly more open beak than in frame 1 (a change of 2°), and in frame 6 the gape angle increased by 5° from the previous frame. These changes in beak aperture were clearly visible in frame-to-frame comparisons of the digitized video tape; each change was scored simply as beak opening or closing and designated as corresponding to the portion of the soundtrack that matched the video frame. After all songs had been scored, the proportion of scored songs that included beak-opening and/or beak-closing movements for each 33 ms song segment (the interval between video frames) was determined. The percentage of beak-closing movements assigned to each song segment was subtracted from the percentage of beak-opening movements to define the net change in beak aperture (given for each frame in Fig. 1). 
This measure of net change in beak aperture reflects the fact that the observer was more likely to score large beak movements than small beak movements: larger changes in gape angle correspond to larger net scores, while smaller changes in gape angle correspond to smaller net scores (see the gape angles and corresponding net scores shown in Fig. 1).
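As an illustration of the net change measure, the following sketch computes, for each song segment, the percentage of songs scored as beak opening minus the percentage scored as beak closing, excluding frames in which the beak was not visible. The label strings and the example scores are hypothetical; only the subtraction itself is taken from the description above.

```python
def net_beak_change(scores):
    """Compute net change in beak aperture for each song segment.

    scores: list of songs; each song is a list with one label per segment:
    "open", "close", "none" (no change) or "hidden" (beak not visible).
    Returns one value per segment: % opening minus % closing among the
    songs in which the beak was visible, or None if it was never visible.
    """
    n_segments = len(scores[0])
    net = []
    for seg in range(n_segments):
        visible = [song[seg] for song in scores if song[seg] != "hidden"]
        if not visible:
            net.append(None)
            continue
        pct_open = 100.0 * visible.count("open") / len(visible)
        pct_close = 100.0 * visible.count("close") / len(visible)
        net.append(pct_open - pct_close)
    return net


# Hypothetical scores for four songs of three segments each.
example = [
    ["open", "none", "close"],
    ["open", "hidden", "close"],
    ["none", "none", "close"],
    ["open", "close", "none"],
]
print(net_beak_change(example))
```

A segment with a large positive value is one in which the beak was opened in most songs; a large negative value marks a segment in which it was usually closed.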

Body (dance) movements were categorized as either hops or head movements and were broken down further according to their orientation (forward, left, right, back, up and down or a combination such as up/right or forward/down), with the direction judged from the perspective of a male directly facing the females’ cage. When the scoring process was completed, the number of categories was large relative to the total number of dance movements, so all categories of dance movements were collapsed into a single measure (total number of dance movements). For each 33 ms song segment, the proportion of songs that included a dance movement was calculated.

The inflation of the gular sac was also visible in some video sequences, particularly when the male’s white throat plumage was prominent and emphasized changes in profile. When the throat movements were clearly visible, they occurred consistently at the same point in the song. However, these movements were not equally apparent in all birds and were clearly visible only when the bird was in certain positions, so they were not scored for analysis.

Included among the subjects were two father/son pairs that sang very similar (but not identical) songs. To determine whether changes in head position and beak aperture were consistent in the song tutor and the pupil, sonograms of the canonical songs were aligned, and this alignment, which was based solely on acoustic criteria, was used to determine which frames within each pair of songs corresponded to each other.

Each bout of zebra finch courtship song is preceded by a series of introductory notes. The number and tempo of introductory notes vary from bout to bout, making the determination of correspondences between introductory notes in different bouts problematic. Except for a few very rare instances (<1 %), the beak was held closed during these introductory notes. Because of the absence of beak movements during introductory notes and the issue of correlating dance movements with sounds in the sequence of introductory notes, only the song itself was considered in the analysis.

To determine whether the changes in beak aperture correlated with acoustic features of song syllables, I compared pairs of syllables (or separate segments of a single syllable) that had a similar overall structure but were delivered with different beak apertures (for examples, see Fig. 2). In total, 28 such syllables (14 pairs) were identified in the songs of eight of the subjects. High-quality digital recordings (22.050 kHz, 16-bit) of the songs of these eight subjects were obtained using the recording chamber described above, but the signal from the microphone was amplified and filtered (high-pass 400 Hz, low-pass 10 kHz) and then digitized with a Macintosh computer using SoundEdit 16 Pro software. At least five songs from each bird were then analyzed using Canary 1.2 (Chris Clark, Cornell Laboratory of Ornithology, Ithaca, NY, USA). For each syllable or syllable segment of interest, the average relative amplitude, fundamental frequency and peak frequency (frequency with the highest energy) were determined. To increase the accuracy of fundamental frequency measurements, the technique described by Williams et al. (Williams et al., 1992) was used (measuring the frequency of one or more harmonics in the 4–5 kHz range and dividing by the order of the corresponding harmonic to obtain a measure of the fundamental frequency).
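The harmonic-based estimate of fundamental frequency can be sketched as follows. Dividing a measured harmonic by its order also divides the measurement error by that order, which is why a high harmonic gives a more precise estimate than the fundamental itself. The function name and the example frequencies are hypothetical, and averaging across harmonics is an assumption about how "one or more harmonics" were combined.

```python
def fundamental_from_harmonics(harmonics):
    """Estimate fundamental frequency from one or more measured harmonics.

    harmonics: list of (measured_frequency_hz, harmonic_order) pairs.
    Each harmonic yields an estimate frequency / order; the estimates are
    averaged (the averaging step is an assumption of this sketch).
    """
    if not harmonics:
        raise ValueError("at least one harmonic is required")
    estimates = [freq / order for freq, order in harmonics]
    return sum(estimates) / len(estimates)


# Hypothetical measurements: the 7th and 8th harmonics in the 4-5 kHz range.
print(fundamental_from_harmonics([(4200.0, 7), (4800.0, 8)]))
```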

Statistical analyses were performed using StatView 5.0 (SAS Institute) for regression and paired analyses, and by using formulae within a standard spreadsheet program to compute χ² values.

Beak movements

Each zebra finch opened and closed its beak at specific points within the song with its own stereotyped pattern. Fig. 2 shows the patterns of these beak movements in the songs of two birds. Male LB92 (Fig. 2A) opened his beak reliably at three points in his song (at or near the onset of syllables D, G and O) and then closed his beak soon thereafter. Less consistently, he opened his beak just before beginning a compound syllable consisting mainly of four linked harmonic stacks (I, J, K and L) and closed it at the end of that four-syllable sequence. As noted above, the scoring of changes in beak aperture was affected both by how often that change occurred and by the magnitude of the change in beak aperture; larger changes in beak aperture were likely to result in larger percentages of frames with opening movements and in larger net scores. The pattern seen in Fig. 2A reflects these factors: the change in beak aperture was smaller for the compound stack syllable and it was not observed, even in exceptionally clear video clips, in some renditions of the song. In contrast to the relatively simple pattern of changes in beak aperture shown by LB92, LB60 had a more complex trajectory, including six prominent beak-opening movements (at syllables B, D, E/F, G, I and K) within a relatively short song. The other birds in the study had patterns similar to the two examples shown in Fig. 2.

The distribution of beak movements within the zebra finches’ songs was strongly non-random. For each bird, a series of χ² analyses (one for each video frame, which corresponded to a 33 ms segment of the song) was used to test whether the proportion of beak movements for that song segment differed from the proportion in the song as a whole. For example, in the song of LB60, 29 of the 34 song segments (85 %) had skewed distributions of beak aperture changes that were significant at the P<0.05 level, and 26 frames (76 %) were significantly different from the overall song averages at the P<0.001 level (see shading of columns in Fig. 2B). For the entire sample of 10 birds, a total of 326 out of 384 song segments (84.9 %) had beak movements that differed from the proportions in the overall song at the P<0.05 level and 261 segments (68.0 %) reached significance at the P<0.001 level. These measures confirm statistically that the patterns of beak-opening and beak-closing movements seen in Fig. 2 are strongly non-random.
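One way such a per-segment test might be set up is sketched below. It assumes a goodness-of-fit formulation in which the counts of opening, closing and no-change scores in one segment are compared with the counts expected from the whole-song proportions; the exact contingency layout used in the spreadsheet is not described, so this layout, the function name and the example counts are all assumptions. With three categories there are two degrees of freedom, for which the χ² tail probability has the closed form exp(−χ²/2).

```python
import math


def chi_square_segment(observed, overall_proportions):
    """Chi-square goodness-of-fit for one song segment (three categories).

    observed: counts of (opening, closing, no change) in this segment.
    overall_proportions: matching proportions for the song as a whole.
    Returns (statistic, p_value); the p-value formula holds only for df = 2.
    """
    if len(observed) != 3 or len(overall_proportions) != 3:
        raise ValueError("this sketch handles exactly three categories")
    total = sum(observed)
    stat = 0.0
    for obs, prop in zip(observed, overall_proportions):
        expected = total * prop
        stat += (obs - expected) ** 2 / expected
    # For 2 degrees of freedom the upper-tail probability is exp(-stat / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical segment: 40 openings, 5 closings, 5 no-change scores,
# against whole-song proportions of 20 %, 20 % and 60 %.
stat, p = chi_square_segment((40, 5, 5), (0.2, 0.2, 0.6))
print(round(stat, 2), p < 0.001)
```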

Do birds with similar songs close and open their beaks at the same points within the songs? Fig. 3 shows the beak movements in two father/son pairs. Although the correspondence is not perfect, there is a striking similarity in the pattern of the largest and most reliable beak movements for related songs. Conversely, portions of the songs in which there were mismatches between father and son or in which syllable correspondences were unclear (as for syllables I, J and K in the songs of Pk61 and DP46; see Fig. 3B) were the most likely to show divergences in the patterns of beak movements. The correlation between net beak movements for matching segments within the songs of fathers and sons was significant (W31/Bk58, r=0.72, N=30, P<0.0001; DP46/Pk61, r=0.61, N=27, P=0.0005), whereas beak movements in the songs of six pairs of unrelated birds were not significantly correlated (0.003<r<0.23; 0.24<P<0.86). Overall, songs copied from an adult song model had patterns of changes in beak aperture that were similar to those of the model and different from the patterns in other songs.
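The father/son comparison amounts to a correlation between two series of net beak scores, one value per matched song segment. A minimal sketch of that calculation follows, assuming the segments have already been aligned on acoustic criteria; the function name and the example scores are hypothetical and do not reproduce the measured values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical net beak scores (% opening minus % closing) for five
# matched segments in a father's song and his son's copy.
father = [60, -40, 10, 0, -55]
son = [50, -30, 20, -5, -60]
print(round(pearson_r(father, son), 2))
```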

The acoustic consequences of opening and closing the beak were investigated by comparing pairs of sounds that were similar in overall acoustic structure but differed in that they were immediately preceded by beak-opening or beak-closing movements (see Fig. 2). On average, opening the beak was associated with a small (mean 12 Hz) increase in fundamental frequency (Fig. 4); this change was only marginally significant when syllables from the same song were compared over the entire data set (paired t=2.18, d.f.=115, P=0.03), with eight of 14 syllable pairs showing an increase in average fundamental frequency for an opened beak. In contrast, the peak frequency (the frequency containing the most energy) increased by an average of 694 Hz when the beak was opened (paired t=6.01, d.f.=114, P<0.0001), with 11 of the 14 syllable pairs increasing in average peak frequency after beak opening. The average amplitude of the sound produced was also greater after beak-opening movements (paired t=8.71, d.f.=115, P<0.0001), with 12 of 14 syllable pairs tested showing this relationship. These overall trends are visible in the sonogram in Fig. 4D, which shows a sequence of two consecutive syllables from the song of DP46 (syllables Y and Z in Fig. 3B). The beak was moved to a more closed position early in the first syllable and to a more open position early in the second syllable. The fundamental frequency of the two syllables was nearly identical, but the second syllable had a greater overall amplitude and also had a higher peak frequency (note that the fifth and sixth harmonics are strongest in the first syllable, whereas the sixth and seventh harmonics are strongest in the second syllable).
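The paired comparisons reported above rest on the difference between matched measurements taken after beak closing and after beak opening. The following sketch shows the standard paired t statistic; it is not the StatView routine itself, and the function name and example values are hypothetical. Under this formulation the degrees of freedom are one less than the number of pairs, so d.f.=115 would correspond to 116 paired measurements.

```python
import math


def paired_t(closed, opened):
    """Paired t statistic for matched measurements.

    closed, opened: equal-length sequences of a measurement (for example
    peak frequency in Hz) for the same syllable after beak closing and
    after beak opening. Returns (t, degrees_of_freedom).
    """
    if len(closed) != len(opened) or len(closed) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [o - c for c, o in zip(closed, opened)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(variance / n)
    return t, n - 1


# Hypothetical values in arbitrary units for four matched measurements.
t, df = paired_t([1, 2, 3, 4], [2, 4, 5, 8])
print(round(t, 2), df)
```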

Dance movements

While singing courtship song, all the subjects performed the ‘dance’, a series of apparently rhythmic head and body movements oriented towards the female. However, the distribution of dance movements (hops and marked changes in head position) did not show strong evidence of stereotypic patterning (Fig. 5). Dance movements were relatively infrequent: the mean rate of dance movements per 33 ms song segment ranged from 5 to 17 % across individual birds (in contrast to beak movements, for which it ranged from 18 to 56 %). χ² analysis of the distribution of dance movements showed that 20.6 % of song segments (total for all birds) had levels of dance movements that differed from chance at the P<0.05 level, while 4.2 % of the frames differed from the average distribution at the P<0.001 level. Although the distribution of some of the dance movements within songs was weakly non-random, these patterns were not as strongly marked or as characteristically stereotyped as for beak movements (compare Fig. 2 and Fig. 5).

However, a different picture emerges when the dances of related males with similar songs are compared (Fig. 3). The pattern of dance movements in the song of W31 and the matching portions of his son’s song are strikingly and significantly similar (r=0.61, N=31, P<0.001); both birds had five peaks of dance activity, falling at syllables A, B/C, D, F and H/I. The matching portions of the songs of DP46 and his son Pk61 were also significantly correlated (r=0.49, N=28, P<0.01); although in this case the son’s song had two peaks of dance activity that were not present in the father’s song, all four of the dance activity peaks in the father’s song were also found, at corresponding locations, in the son’s song (syllables E/F, G/H, K and L). The close correspondence between the dance patterns in these two father/son pairs contrasts with the dissimilarity of patterns for unrelated birds with different songs. Dance movement patterns in six pairs of unrelated songs were only weakly correlated (0.01<r<0.196), never approaching significance (0.31<P<0.95).

For both father/son pairs, the fathers’ dances were less vigorous than those of the sons, with fewer dance movements overall; the fathers were substantially older than the sons (20 months for W31 and 42 months for Pk61). However, LB92 danced vigorously, although at 9 years of age he was quite old for a zebra finch, and age did not show an overall correlation with dance vigor (r=0.078, N=10, P>0.4).

No systematic relationship between dance movements and the acoustic characteristics of song elements was readily apparent. As can be seen from Fig. 2 and Fig. 5, peaks in the distribution of dance movements were not as closely spaced as those of beak movements, dance movement peaks being separated by at least 3–5 frames; this was true for the entire song sample. This interval corresponds to approximately 150 ms between dance movements and suggests that these movements might occur at fixed intervals starting at the onset of the song. However, no songs included dance movements at all the peaks apparent in the overall distribution; for example, in the songs of Bk58 and W31 (Fig. 3A), there were five peaks in the distribution of dance movements, but a single song strophe never included more than three dance movements, and those three dance movements never occurred at positions corresponding to adjacent peaks in the distribution shown in the figure. Bk58 (the son in Fig. 3A) initiated dance movements at peaks 2 and 4 in one song, at peaks 1 and 5 in the following song, and then at peak 5 in the next song, etc. The type of dance movement given at a particular peak in the distribution varied from song to song; there was no systematic pattern of turning or hopping exclusively to the right (or left) at a particular point in the song.

The overall pattern of stereotyped changes in beak aperture during zebra finch song is similar to that described for white-throated, swamp and song sparrows (Westneat et al., 1993; Hoese et al., 2000). Like the sparrows, zebra finches change their beak aperture at specific points within their song, and these patterns of changes in beak position have acoustic consequences. Unlike the sparrows, the beak position of zebra finches does not appear to track the fundamental frequency of the notes being sung, perhaps because zebra finches’ song elements are rich in harmonics but do not have the strong tonal characteristics that correlate well with beak aperture in sparrows. The acoustic correlates of an open beak in zebra finches are higher-amplitude syllables with higher peak frequencies, a phenomenon similar to the shift in formants that results from a change in vocal tract length in the macaque (Fitch, 1997) and dog (Riede and Fitch, 1999). There were a few exceptions to this overall finding, and it is possible that the method used here to choose syllables for comparison (tracking changes in beak aperture rather than directly measuring gape or vocal tract length) accounted for some of the discrepancies; judgments based on beak movements may have resulted in the use of some syllable pairs with only very small differences in beak aperture. It is also likely that change in beak aperture is but one of the factors that contribute to the peak frequency and amplitude of a song element and that other factors also contribute to the acoustic properties that are influenced by changes in beak aperture; some possibilities are laryngeal position (peak frequency) and airflow (amplitude).

Nowicki previously suggested specific acoustic consequences of opening and closing the beak that are consistent with the results reported here for zebra finch song (Nowicki, 1987). A closed beak would mute the sound being produced by the syringeal apparatus, while opening the beak would yield an increase in amplitude. Opening the beak would also shorten the airway, forming a tube with a higher resonance frequency and thus shifting the peak frequency upwards. The ability to shift peak frequency by opening the beak would allow a bird to sing syllables that differ in their spectral envelope but which have the same fundamental frequency. This mechanism may provide a partial explanation for the previous finding that individual zebra finches sing similar syllables with a different ‘timbre’ or shift in harmonic emphasis (Williams et al., 1989). The analogs of timbre in human speech are the shifting bands of higher-amplitude frequencies called formants, which encode meaning and are thought to result from a similar process, the action of the resonances of the vocal tract upon the sounds generated by the larynx (Fant, 1960).

The hops and changes in body orientation and head position that make up the most prominent movements of the zebra finch courtship dance were not given in as stereotyped a fashion as the changes in beak aperture. Some of the changes in beak aperture were scored as occurring in over 80 % of the songs; given the difficulties in tracking these movements in a moving bird and the jitter between video frames that could cause an event to be scored in adjacent frames in different songs, it seems likely that such high-probability events were effectively given with every song. In contrast, even the most tightly stereotyped dance movements were never performed in more than half the songs, and the weaker stereotypy of dance movement patterns is reflected in the finding that the P<0.001 level was reached for only 4 % of the song segments scored for dance movements (compared with 68 % for beak aperture changes). The relatively low frequency of dance movements, which were given at a rate of slightly less than three per song (beak movements occurred at an average of 11 per song), contributed to the failure to see a strong pattern in the relationship between dance and song in individual birds. The overall picture changes dramatically, however, when the patterns of dance movements in related songs are compared: in both cases examined, sons that copied their father’s songs also initiated dance movements at the same points in the songs as did their fathers. Where the son’s song differed from the father’s because of insertion, deletion or rearrangement of syllables, dance movements were still initiated at the same acoustic figure as in the father’s song. Although the sample size is small, these striking correlations strongly suggest that the distribution of dance movements in an individual’s song in fact represents a stereotyped pattern. 
Rather than constituting a systematic pattern of dance movements that occur reliably from song to song, the relatively infrequent dance movements appear to be initiated at a number of specific ‘hot spots’ within the song.

These hot spots for initiating dance movements did not appear to be tightly locked to any particular syllable type or position within the song. The number of peaks in the average song is similar to the number of ‘chunks’ [units that have correlates of song learning and delivery; see (Williams and Staples, 1992)] within the song, so it is possible that the two are related, although there is as yet no evidence to support this suggestion.

This study presents no direct evidence for the learning of beak or dance movements. Fathers and sons with similar songs might develop similar dance patterns without any dance-specific learning. If, for example, motor constraints play a role in defining the points within a song at which a zebra finch is likely to make a dance movement, using the same motor patterns to produce the same song might entrain the same dance movement patterns. If this were the case, one might expect a disruption of the original dance movement pattern when, in the process of copying a song, syllables are inserted into or deleted from the tutor’s version, but this did not occur. Rather, the dance movements in the son’s song remained associated with the same sound to which they were coupled in the father’s song, despite the changes in the song motor pattern. This suggests that the young males may learn the dance pattern as an attribute of the song, with movements linked to specific sounds. If dance movements are learned, birds that copy a song without a chance to watch the dance of the adult model should develop a different dance pattern from that of the model.

Zebra finches can discriminate between syllables that differ only in the distribution of energy among harmonics, or ‘timbre’, an analog of formant dispersion (Cynx et al., 1990), and are thought to copy the timbre of their tutor’s syllables (Williams et al., 1989). Since changes in beak aperture affect peak frequency, and hence timbre, young males either must learn the correct pattern of changes in beak aperture by directly observing their tutor’s changes in gape or must learn to use changes in beak aperture to reproduce acoustic characteristics of the memorized song. Podos et al. found that the coordination between gape and song did not emerge until very late in the development of song sparrows, well after the overall acoustic structure of the song syllables had been established (Podos et al., 1995). This timing favors the hypothesis that the young sparrows use beak movements to refine the acoustic structure of syllables to match their memory of the tutor’s song. The suggestion that acoustic structures and not beak movements are learned would be confirmed if the copied songs are found to be accompanied by changes in beak aperture at the same points as in the tutor’s song in young birds that have never seen the tutor sing.

Motor constraints might also contribute to the coordination of song and dance; for example, the respiratory patterns used to produce passerine song are known to be complex and yet stereotyped for individual birds (Suthers et al., 1999), and these breathing patterns may in turn constrain or facilitate movements of other large muscle groups. In humans, both locomotion and, more surprisingly, finger movements are coordinated with respiration (Bernasconi and Kohl, 1993; Mateika and Gordon, 2000; Rassler et al., 2000). Respiration patterns in birds might have a similar effect on the probability that a dance movement would be initiated.

However it is acquired, the coordination of beak and dance movements with song implies that the well-described neural circuitry for song acquisition and production in zebra finches must in some way be coupled to the motor circuits that are responsible for hopping, changes in head position and movements of the lower mandible. In this context, it is worth noting that directed song (song directed at a female) is accompanied by the courtship dance, whereas males do not dance when singing undirected song (Sossinka and Böhner, 1980). Differences between directed and undirected song within the brain’s song circuitry, both in immediate early gene expression (Jarvis et al., 1998) and in singing-related activity levels (Hessler and Doupe, 1999), may hold the key to understanding the basis of the coordination between song and dance.

Many other avian species have courtship displays that include both song and dance components; in fact, displays such as those given by lyrebirds (Menura novaehollandiae) (Robinson and Frith, 1981) and manakins (Chiroxiphia linearis) (Bostwick, 2000) are much more dramatic, including accentuated visual displays that are tightly coupled to vocal and non-vocal sound production. That the zebra finch, which falls comparatively low on the scale of song and dance virtuosity, nevertheless demonstrates coupling of events in these two forms of courtship display indicates that this coordination is biologically important. Morris (Morris, 1954) and Zann (Zann, 1996) noted that the courtship dance emphasized the visual impact of the male zebra finch’s tail feathers, and we know that female zebra finches’ choices of males are affected by visual (Burley and Coopersmith, 1987) and by auditory (Miller, 1979; Williams et al., 1993) attributes of the courting male. The coordination of the two types of display, of song and dance, may also be under sexual selection, with potential mates assessing the ability of an individual to perform a well-choreographed display.

Fig. 1.

Scoring of beak and dance movements. Eight successive frames (taken at 33 ms intervals) from a video tape of adult male zebra finch Y125 directing his song to a female (not visible, but present immediately beyond the left edge of the frame) are shown. The change in beak angle from the preceding frame and the overall net percentage of change in beak aperture for the associated song segment are given for each frame. The sequence includes one ‘dance’ movement, a shift in head and body angle between frames 3 and 4 (note that this shift also changes the perspective of the beak). Several changes in beak aperture are also apparent, most notably between frames 3 and 4 and frames 7 and 8. See the text for further explanation of the scoring of these dance and beak movements.

Fig. 2.

Changes in beak aperture. The sonograms show the songs of (A) LB92, a 9-year-old male, and (B) LB60, a 3-year-old male. Beneath each song is a graph showing the proportion of songs that had beak-opening (positive-going columns) and beak-closing (negative-going columns) movements during the corresponding song segment in the sonogram of the male’s song. Where the values represented by the columns do not add to 100 %, the remaining songs showed no change in beak aperture during the song segment in question. The shading of the columns denotes whether the distribution of beak movements (opening, closing, no change) was different from the overall average for the song: white columns, no difference from the song average; light gray columns, beak movements differed significantly from the song average at the P<0.05 level; dark gray columns, significant at the P<0.001 level (χ2 analysis, d.f.=2). The filled circles mark the net change in beak position (the percentage of songs associated with beak-opening movements minus the percentage of songs associated with beak-closing movements) for each song segment; the curve showing the overall beak movement trajectory is a cubic spline fitted to the net change in beak position. The filled and open squares at the base of each sonogram denote syllable pairs (or separate segments of a single syllable) that were chosen, on the basis of the sonograms and the beak trajectories, for comparing similar sounds given with beak relatively open (open squares) or closed (filled squares). The letter designations above each sonogram denote different syllables as defined by the criterion that uses change in acoustic structure within a continuous sound to demarcate syllables. Syllables with the same letter but within different songs do not correspond in any way in this figure.
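
The two quantities plotted in this figure, the net change in beak position and the χ2 comparison of a segment with the song average, can be illustrated with a minimal sketch. The counts and song-wide proportions below are hypothetical, and the goodness-of-fit form of the test is an assumption about how the comparison could be set up, not a statement of the exact procedure used in the study.

```python
import math

def net_change(n_open, n_close, n_songs):
    """Net change in beak position: % of songs with opening minus % with closing."""
    return 100.0 * (n_open - n_close) / n_songs

def chi2_vs_song_average(observed, song_proportions):
    """Goodness-of-fit chi-square of one segment against song-wide proportions.

    observed         -- counts of (opening, closing, no change) for one segment
    song_proportions -- song-wide proportions for the same three categories
                        (hypothetical values in this sketch)
    Returns (chi2, p) for d.f. = 2; with 2 degrees of freedom the upper-tail
    probability of the chi-square distribution is exactly exp(-chi2 / 2).
    """
    n = sum(observed)
    chi2 = sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, song_proportions))
    return chi2, math.exp(-chi2 / 2.0)

# Hypothetical segment scored in 20 songs: 16 openings, 2 closings, 2 no change.
segment_counts = (16, 2, 2)
# Hypothetical song-wide proportions of opening, closing and no change.
song_average = (0.25, 0.25, 0.50)

net = net_change(segment_counts[0], segment_counts[1], sum(segment_counts))
chi2, p = chi2_vs_song_average(segment_counts, song_average)
print(f"net change = {net:.0f} %, chi2 = {chi2:.1f}, p = {p:.2g}")
```

With these invented numbers the segment would be drawn as a dark gray column (P<0.001) with a filled circle at +70 %.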

Fig. 3.

Comparisons of beak and dance movements for two father/son pairs. Each panel shows the songs of the father at the top and the son at the bottom. Boxes within the sonogram denote song elements present in the song of only one of the pair; these portions of the songs were omitted when matching frames from a father’s song to frames in the son’s song (the matching process was performed using the sonograms, without reference to data on movements). For this figure, the syllable designations in different songs within the same panel correspond to matching syllables (e.g., the syllables within W31’s song were considered to correspond to the syllables with the same letter designations within Bk58’s song). Syllables that had no match in the father’s or son’s song were designated with the letters Q, X, Y or Z. Syllables I, J and K in Pk61’s song were difficult to match unambiguously to corresponding syllables in his son’s song, and the correspondence involving the fewest rearrangements was chosen. The beak movement trajectories for the matching portions of the father/son song pairs are shown immediately beneath and in register with the father’s song. Filled circles connected by a solid line show the beak movements of the father, and open circles connected by a dashed line show the corresponding changes for the son (all lines are cubic splines fitted to the net beak movement, as in Fig. 2). The dance movements were registered and summarized in a similar fashion. Although the fits for the father’s and son’s beak movements and dance were not perfect, the major peaks correspond (see text for statistical analyses). The greatest divergences from a common pattern appear to fall in the segment of the Pk61/DP46 song that was difficult to match unambiguously (syllables I, J and K).
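
The matching step described in this caption (dropping song elements present in only one bird, then comparing movements in the remaining segments) can be sketched as follows. The percentages, the index pairs and the use of a Pearson correlation are all illustrative assumptions; the statistical analyses actually used are those described in the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def matched_values(father, son, matches):
    """Keep only segments matched on the sonograms.

    matches -- list of (father_index, son_index) pairs; unmatched segments
               (insertions or deletions) simply do not appear in the list.
    """
    return ([father[i] for i, _ in matches],
            [son[j] for _, j in matches])

# Hypothetical % of songs with a dance movement in each song segment.
father = [5, 40, 10, 0, 35, 5]
son = [10, 45, 99, 5, 0, 30, 10]  # index 2: inserted syllable, no match in father

# Hypothetical alignment made from the sonograms alone.
matches = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 5), (5, 6)]

f_vals, s_vals = matched_values(father, son, matches)
r = pearson(f_vals, s_vals)
print(f"r = {r:.2f} over {len(matches)} matched segments")
```

Because the alignment is fixed before any movement data are consulted, a high correlation in such a comparison cannot be an artifact of aligning the songs by their movements.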

Fig. 4.

Acoustic correlates of opening and shutting the beak. (A) Differences in fundamental frequency for 14 syllable pairs (two syllables or portions of syllables from the same song with similar acoustic structure but differing in beak aperture, see Fig. 2 for examples). Error bars (± s.e.m.) for the fundamental frequency were generally so small as to be contained within the symbol, and fundamental frequency did not differ for the two beak positions. (B) Peak frequency (the frequency with the highest energy) was more variable, and showed a significant overall reduction when the beak was closed, although two syllables did increase in peak frequency. (C) The average relative amplitude also decreased significantly overall when the beak was closed (although three syllables increased in amplitude in this condition). (D) These trends are illustrated in two neighboring syllables from DP46’s song (also pictured as syllables Y and Z in Fig. 3B, but note that the beak trajectory for these syllables does not appear in that figure because the syllables were absent from the father’s song). Beak aperture was reduced during the first syllable and increased during the second syllable. The two syllables have nearly identical fundamental frequencies, but the second syllable, when the beak was relatively open, had a higher amplitude (as is apparent in the oscillogram and in the overall ‘darkness’ of the sonogram). The energy in the second syllable is concentrated in higher harmonics than for the first syllable; compare the fifth and sixth harmonics for the two syllables. Note that, because change in beak aperture and not the beak aperture itself was measured, the two beak positions (open and closed) are relative.
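
The pattern in this figure (peak frequency shifts with beak aperture while fundamental frequency does not) is what a simple resonator picture would predict. As a textbook idealization that is not part of this study's analysis, assume the vocal tract behaves as a uniform tube of effective length $L$, closed at the sound source and open at the beak. Its resonances are then

```latex
f_n = \frac{(2n-1)\,c}{4L}, \qquad n = 1, 2, 3, \ldots
\qquad\Longrightarrow\qquad
\frac{\Delta f_n}{f_n} \approx -\frac{\Delta L}{L} \quad \text{for small } \Delta L ,
```

where $c$ is the speed of sound in the tract. Under this assumption, a shorter effective length raises every resonance by the same proportion, so energy shifts toward higher harmonics, while the fundamental frequency, which is set by the sound source rather than by the tube, is unchanged. A real vocal tract is not a uniform tube, so the expression indicates only the direction of the expected effect.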

Fig. 5.

Dance movements. The sonograms show the songs of (A) LB46 and (B) LB47, 3-year-old brothers that developed different songs. Beneath each sonogram is a bar graph showing the percentage of songs that included dance movements in the video frame corresponding to that song segment. The solid line is a cubic spline fitted to the data. The shading of the columns denotes frames that had dance movements that differed from the overall levels for the song (white columns, no difference from the song average; light gray columns, P<0.05; dark gray columns, P<0.001). As for the data shown in Fig. 2, the χ2 values were calculated on the basis of the actual numbers of movements scored and not on the percentages; thus, columns that appear to be similar may have different statistical significance because of differences in sample size (some zebra finches sing partial song strophes at the end of a song bout, so that the final syllables of the song are sung less often).
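
The point that columns of similar height can differ in significance follows directly from computing χ2 on counts. A minimal sketch, using hypothetical counts, a hypothetical song-wide movement rate and an assumed two-category test (movement versus no movement, d.f.=1):

```python
import math

def chi2_frame(n_moves, n_songs, song_rate):
    """Chi-square of one frame's counts against the song-wide movement rate.

    Two categories (movement, no movement), so d.f. = 1 is assumed here.
    For d.f. = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    """
    expected_moves = n_songs * song_rate
    expected_none = n_songs * (1.0 - song_rate)
    n_none = n_songs - n_moves
    chi2 = ((n_moves - expected_moves) ** 2 / expected_moves
            + (n_none - expected_none) ** 2 / expected_none)
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

SONG_RATE = 0.2  # hypothetical proportion of frames with a dance movement

# Two frames with the same column height (40 % of songs) but different sample
# sizes, as happens when the final syllables are sung less often.
chi2_early, p_early = chi2_frame(16, 40, SONG_RATE)  # scored in 40 songs
chi2_late, p_late = chi2_frame(4, 10, SONG_RATE)     # scored in 10 songs

print(f"early frame: chi2 = {chi2_early:.1f}, p = {p_early:.3g}")
print(f"late frame:  chi2 = {chi2_late:.1f}, p = {p_late:.3g}")
```

With these invented counts the frame scored in 40 songs differs from the song average at the P<0.05 level, whereas the frame scored in 10 songs does not, although both columns would be drawn at 40 %.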

I thank Chuck Munyon and Victor Platt for their help with pilot work for this study. Dick Deveaux provided valuable advice on statistical analyses. The work described here was supported by grants from the Essel and Hughes Foundations to Williams College.

Bernasconi, P. and Kohl, J. (1993). Analysis of co-ordination between breathing and exercise rhythms in man. J. Physiol., Lond. 471, 693–706.
Bostwick, K. S. (2000). Display behaviors, mechanical sounds and evolutionary relationships of the club-winged manakin (Machaeropterus deliciosus). Auk 117, 456–478.
Bottjer, S. W. and Johnson, F. (1997). Circuits, hormones and learning: vocal behavior in songbirds. J. Neurobiol. 33, 602–618.
Burley, N. and Coopersmith, C. B. (1987). Bill color preferences of zebra finches. Ethology 76, 133–151.
Cynx, J., Williams, H. and Nottebohm, F. (1990). Timbre discrimination in zebra finch (Taeniopygia guttata) song syllables. J. Comp. Psychol. 104, 303–308.
Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.
Fitch, W. T. (1997). Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J. Acoust. Soc. Am. 102, 1213–1222.
Hessler, N. A. and Doupe, A. J. (1999). Social context modulates singing-related neural activity in the songbird forebrain. Nature Neurosci. 2, 209–211.
Hoese, W. H., Podos, J., Boetticher, N. C. and Nowicki, S. (2000). Vocal tract function in birdsong production: experimental manipulation of beak movements. J. Exp. Biol. 203, 1845–1855.
Jarvis, E. D., Scharff, C., Grossman, M. R., Ramos, J. A. and Nottebohm, F. (1998). For whom the bird sings: context-dependent gene expression. Neuron 21, 775–788.
Mateika, J. H. and Gordon, A. M. (2000). Adaptive and dynamic control of respiratory and motor systems during object manipulation. Brain Res. 864, 327–337.
Miller, D. B. (1979). The acoustic basis of mate recognition by female zebra finches (Taeniopygia guttata). Anim. Behav. 27, 376–380.
Morris, D. (1954). The reproductive behaviour of the zebra finch (Poephila guttata) with special reference to pseudofemale behaviour and displacement activities. Behaviour 7, 1–31.
Nottebohm, F. (1991). Reassessing the mechanisms and origins of vocal learning in birds. Trends Neurosci. 14, 206–211.
Nottebohm, F. (1996). The King Solomon Lectures in Neuroethology. A white canary on Mount Acropolis. J. Comp. Physiol. A 179, 149–156.
Nowicki, S. (1987). Vocal tract resonances in oscine bird sound production: evidence from birdsongs in a helium atmosphere. Nature 325, 53–55.
Podos, J., Sherer, J. K., Peters, S. and Nowicki, S. (1995). Ontogeny of vocal tract movements during song production in song sparrows. Anim. Behav. 50, 1287–1296.
Price, P. (1979). Developmental determinants of structure in zebra finch song. J. Comp. Physiol. Psychol. 93, 260–277.
Rassler, B., Bradl, U. and Scholle, H. (2000). Interactions of breathing with the postural regulation of the fingers. Clin. Neurophysiol. 111, 2180–2187.
Riede, T. and Fitch, T. (1999). Vocal tract length and acoustics of vocalization in the domestic dog (Canis familiaris). J. Exp. Biol. 202, 2859–2867.
Robinson, F. N. and Frith, H. J. (1981). The Superb Lyrebird Menura novaehollandiae at Tidbinbilla, ACT. Emu 81, 145–157.
Scharff, C. and Nottebohm, F. (1991). A comparative study of the behavioral deficits following lesions of various parts of the zebra finch song system: implications for vocal learning. J. Neurosci. 11, 2896–2913.
Sossinka, R. and Böhner, J. (1980). Song types in the zebra finch (Poephila guttata castanotis). Z. Tierpsychol. 53, 123–132.
Suthers, R. A., Goller, F. and Pytte, C. (1999). The neuromuscular control of birdsong. Phil. Trans. R. Soc. Lond. B 354, 927–939.
Todt, D. and Fiebelkorn, A. (1980). Display, timing and function of wing movements accompanying duets of Cichladusa guttata. Behaviour 72, 82–106.
Trainer, J. and McDonald, D. B. (1993). Vocal repertoire of the long-tailed manakin and its relation to male–male cooperation. Condor 95, 769–781.
Westneat, M. W., Long, J. H. J., Hoese, W. and Nowicki, S. (1993). Kinematics of birdsong: functional correlation of cranial movements and acoustic features in sparrows. J. Exp. Biol. 182, 147–171.
Williams, H., Crane, L. A., Hale, T. K., Esposito, M. A. and Nottebohm, F. (1992). Right-side dominance for song control in the zebra finch. J. Neurobiol. 23, 1006–1020.
Williams, H., Cynx, J. and Nottebohm, F. (1989). Timbre control in zebra finch (Taeniopygia guttata) song syllables. J. Comp. Psychol. 103, 366–380.
Williams, H., Kilander, K. and Sotanski, M. L. (1993). Untutored song, reproductive success and song learning. Anim. Behav. 45, 695–705.
Williams, H. and Mehta, N. (1999). Changes in adult zebra finch song require a forebrain nucleus that is not necessary for song production. J. Neurobiol. 39, 14–28.
Williams, H. and Staples, K. (1992). Syllable chunking in zebra finch (Taeniopygia guttata) song. J. Comp. Psychol. 106, 278–286.
Zann, R. A. (1996). The Zebra Finch: A Synthesis of Field and Laboratory Studies. New York: Oxford University Press.
