ABSTRACT
In king penguin colonies, several studies have shown that both parent–chick recognition and mate–pair recognition are achieved by acoustic signals. The call of king penguins consists of strong frequency modulations with added beats of varying amplitude induced by the two-voice generating process. Both the frequency modulation pattern and the two-voice system could play a role in the identification of the calling bird. We investigated the potential role of these features in individual discrimination.
Experiments were conducted by playing back altered or reconstructed parental signals to the corresponding chick. The results proved that the king penguin performs a complex analysis of the call, using both frequency modulation and the two-voice system. Reversed or frequency-modulation-suppressed signals do not elicit any responses. Modifying the shape of the frequency modulation by 30 % also impairs the recognition process. Moreover, we have demonstrated for the first time that birds perform an analysis of the beat amplitude induced by the two-voice system to assess individual identity. These two features, which are well preserved during the propagation of the signal, seem to be a reliable strategy to ensure the accurate transmission of individual information in a noisy colonial environment.
Introduction
Acoustic species-specific recognition in birds has been intensively studied in the past (for a review, see Becker, 1982), and individual acoustic recognition is now increasingly being investigated (Catchpole and Slater, 1995; Dhondt and Lambrechts, 1992; Stoddard, 1996) because it is widespread among birds and plays a major role in kin recognition. In species that breed in colonies, individuals continuously hear the calls of conspecific birds, but most of the time only respond to the call of a particular individual, the mate or the chick (Evans, 1970; White, 1971; Jouventin, 1982). Nevertheless, to our knowledge, few studies have been carried out to assess the importance of the different elements of the call in the individual recognition process.
In penguin species, birds breed in large colonies where nest-sites are often densely packed, providing enormous possibility for confusion. In these species, it has been proved that individual recognition between mates and between parents and their chick is achieved by acoustic signals (Prévost, 1961; Penney, 1968; Derenne et al., 1979; Proffitt and McLean, 1991; Seddon and Van Heezik, 1992). In nearly all species, calls are temporally subdivided into distinct units termed syllables. In the emperor penguin Aptenodytes forsteri and the Adélie penguin Pygoscelis adeliae, the birds must perceive several successive syllables before they can assess the identity of the emitter (Jouventin, 1971; Jouventin and Roux, 1979). In contrast, our previous studies of the king penguin Aptenodytes patagonicus emphasised that the identity of the individual emitting the call is contained in each syllable of the call: a chick recognised its parents and paired mates recognised each other when only one syllable was played back (Jouventin et al., 1999; Lengagne et al., 2000).
Using experimental signals with modified spectral contents, we demonstrated that the relative amplitude of harmonics is not important for individual discrimination, and even a signal in which only the fundamental frequency is maintained is still recognised. In the same way, experimental signals from which the amplitude modulation had been removed allowed us to demonstrate that this acoustic feature is not involved in individual recognition. This indicates that the identification process is based upon other parameters of the signal. The syllable is strongly modulated in frequency, and analysis revealed that this frequency modulation is highly variable among different individuals, albeit somewhat invariant in the call of the same individual (Lengagne, 1999), and can therefore serve as an individual signature. Moreover, the analysis of the frequency content of syllables revealed two close frequency bands with their respective harmonics. The interaction between these two ‘voices’ generates a characteristic beat (Greenwalt, 1968).
In the present study on communication between adult and chick king penguins, we focus our attention on the information about the identity of an individual contained in the syllable, the intra-syllabic signature(s). We hypothesise that birds assess the identity of the emitter by using the frequency modulation pattern. It is also hypothesised that the two voices may contribute to individual identification, together with the frequency modulation pattern. Using different synthetic calls, we tested the effects of making several modifications to the frequency modulation of the natural call. The role of the two-voice system was then investigated.
Materials and methods
Study areas
The recordings and experiments were performed on 17 king penguin chicks (Aptenodytes patagonicus) at La Baie du Marin, Possession Island, Crozet Archipelago (46°25′S, 51°45′E) during November and December 1998. The king penguin colony consisted of approximately 40 000 pairs of birds (C. Guinet, unpublished data). The chicks were selected according to their age, which was between 10 and 12 months. At this stage of their life, the chicks are entirely dependent on their parents for food. To facilitate future identification, the chicks to be tested were banded on a flipper with a temporary plastic band.
Recording and analysis procedure
Both king penguin parents rear the chick. When a parent returns from the sea to the colony to feed its chick, it is silent until it reaches the area of the colony where the chick is usually located (Lengagne, 1999). It then starts an acoustic search for its chick by emitting the display call. This signal was recorded using an omnidirectional Beyer Dynamic M300 TG microphone mounted on a 4 m pole held by a human observer and connected to a Sony TCD5 M tape recorder. The microphone was placed 1 m in front of the beak of the bird. The display calls of 12 parents (male or female) were recorded, and their respective chicks were banded.
Signals were digitised through an OROS AU21 16-bit acquisition card equipped with an anti-aliasing filter (low-pass filter, cut-off frequency 8.4 kHz; −120 dB per octave) at a sampling rate of 20 kHz. Signals were then analysed and modified using MATLAB software and the SYNTANA analytical package (Aubin, 1994).
Playback procedure
The experiments were performed during clear and dry weather conditions. To avoid sound propagation problems due to wind (Eve, 1991; Lengagne et al., 1999c), experiments were conducted when the wind speed was less than 4 m s−1. The broadcast chain consisted of a Sony TCD5 M tape recorder connected to an autonomous EAA amplifier loudspeaker (frequency range 100 Hz to 8 kHz ±2 dB). To prevent habituation, each bird was tested only once a day. To prevent differences in volume affecting the response of the bird, all the signals were broadcast at the same intensity and at the same distance from the tested bird (Evans, 1970). Signals were played back at 95 dB SPL (sound pressure level, reference pressure 2×10−5 Pa), measured 1 m from the loudspeaker, with a Bruël & Kjaer sound level meter type 2235 (linear scale, slow setting). This level is equivalent to that produced by the bird (Robisson, 1993; Aubin and Jouventin, 1998). The loudspeaker was placed at an average distance of 7 m from the bird to be tested, a distance at which penguins are able to discriminate the identity of the emitter from the background noise of the colony (Aubin and Jouventin, 1998; Lengagne et al., 1999a).
The playback procedure was the same as that used previously in our studies on the king penguin (Aubin and Jouventin, 1998; Jouventin et al., 1999; Lengagne et al., 1999a; Lengagne et al., 2000). In each experiment, two renditions of the same experimental signals separated by a 15 s silence were broadcast. The response obtained was compared each time with that induced by a reference signal: two renditions of a natural call from the parent of the tested chick separated by a 15 s silence. The order of the experimental and reference signals was randomised.
Classification of reactions and statistical analysis
Under natural conditions, when the parents are absent, the chick remains silent. When it identifies the call of its parent, it holds up its head, calls in reply and moves, often running towards the emitter parent (Stonehouse, 1960). The behaviour of the chick is the same whether a male or a female parental call is emitted (Jouventin, 1982). None of the other chicks in the flock reacts to the extraneous calls. To evaluate the intensity of the response to playback signals, a five-point ordinal scale was used, ranked as follows: class 0, no reaction; class 1, agitation (head movements, visual inspection of the environment); class 2, agitation, the chick then calls in response to the second broadcast; class 3, agitation, the chick then calls in response to the first broadcast and class 4, agitation, the chick then calls in response to the first broadcast, approaches the loudspeaker and stops less than 3 m away from it.
This behavioural scale is similar to those previously used in studies dealing with the species (Derenne et al., 1979; Robisson, 1990; Jouventin et al., 1999; Lengagne et al., 2000). Responses in classes 2, 3 and 4 were considered positive because they enable the two birds to meet and the chick to be fed by its parent. Responses in classes 0 and 1, which were not followed by feeding, were considered negative.
The responses of the chicks were first rated on the five-point ordinal scale and then converted to negative (ranks 0+1) and positive (ranks 2+3+4) responses. When compared with the reference (unaltered) signal, the responses to modified signals could be measured only as equal or weaker, hence the use of one-tailed tests. The results were assessed using Fisher’s one-sided exact 2×2 test. If multiple comparisons were made with the same reference signal, the significance levels were Bonferroni-corrected.
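The scoring and testing steps described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, the counts in the usage note are illustrative rather than taken from the data, and the exact test is computed directly from the hypergeometric distribution.

```python
from math import comb

def is_positive(response_class):
    """Classes 2, 3 and 4 count as positive; classes 0 and 1 as negative."""
    return response_class >= 2

def fisher_one_sided(ref_pos, ref_neg, exp_pos, exp_neg):
    """One-sided Fisher exact P for a 2x2 table: the probability, with the
    margins fixed, of exp_pos or fewer positive responses to the
    experimental signal (i.e. a response equal to or weaker than observed)."""
    n_ref = ref_pos + ref_neg
    n_exp = exp_pos + exp_neg
    total_pos = ref_pos + exp_pos
    denom = comb(n_ref + n_exp, total_pos)
    lowest = max(0, total_pos - n_ref)  # fewest positives the margins allow
    p = 0.0
    for k in range(lowest, exp_pos + 1):
        p += comb(n_exp, k) * comb(n_ref, total_pos - k) / denom
    return p

def bonferroni(p, n_comparisons):
    """Bonferroni-corrected P for n_comparisons tests against one reference."""
    return min(1.0, p * n_comparisons)
```

For example, with an assumed 12 positive responses out of 12 to the reference signal and 6 out of 12 to an experimental signal, `fisher_one_sided(12, 0, 6, 6)` gives the uncorrected P, and `bonferroni(..., 3)` corrects it for three comparisons sharing that reference.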
Reference and experimental signals
We played back one reference signal and 11 experimental signals to each chick. Seven of these were obtained by acoustic modifications of the reference signal and the other four were built using a ‘starting from scratch’ synthesis method. In individual recognition studies, each parental call was used to test only one bird, the corresponding chick.
The king penguin call (the reference signal, Fig. 1) is composed of units termed syllables (Jouventin, 1982). These are separated by strong amplitude declines which coincide with falls in frequency (Fig. 1A). We know from previous work that all the calls produced by an individual have the same temporal and spectral characteristics (Robisson, 1992a; Lengagne et al., 1997). Thus, calls of the same individual are highly stereotyped. As mentioned above, we also know that the broadcast of one syllable of the call is sufficient to elicit recognition. As a consequence, the present study focuses on the intra-syllabic structure, and we used the first syllable of the call as a reference signal (RS). Its duration was 516±9 ms (mean ± S.E.M.; N=22). This syllable was modulated in frequency, the ascending part of the frequency modulation rising at a mean rate of 1887±36 Hz s−1, the descending part falling at a rate of 568±24 Hz s−1 (means ± S.E.M.) (Lengagne et al., 2000). A detailed spectral analysis revealed the polychromatic nature of the signal, which was composed of two fundamental frequencies corresponding to the two voices (Fig. 1B) and their related (between four and eight) harmonics. The frequency difference between the two voices was not constant over the whole syllable but varied from 11 to 91 Hz. The same variation was observed among individual penguins (10–100 Hz). The interaction between the two acoustic sources generated a series of amplitude beats whose period varied from 11 to 92 ms (the smaller the frequency difference between the two voices, the longer the period of the amplitude beats). To simplify the task of signal synthesis, we kept only the loud part of the syllable (the fundamentals and the first four harmonics; Fig. 1C), which is sufficient to allow the recognition process (Jouventin et al., 1999; Lengagne et al., 2000) and contains at least 70 % of the total energy of the call.
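The inverse relationship between the frequency difference and the beat period follows from the sum of two sinusoids of equal amplitude, sin(2πf1t) + sin(2πf2t) = 2 cos(π(f1 − f2)t) sin(π(f1 + f2)t), whose envelope repeats every 1/|f1 − f2| seconds. A minimal sketch of the conversion, assuming two equal-amplitude pure tones (the function name is hypothetical):

```python
def beat_period_ms(freq_difference_hz):
    """Period (in ms) of the amplitude beat produced by two voices
    whose fundamentals differ by freq_difference_hz."""
    if freq_difference_hz <= 0:
        raise ValueError("the two voices must differ in frequency")
    return 1000.0 / freq_difference_hz

# The two extremes of the frequency difference reported for the syllable.
for df in (11, 91):
    print(df, "Hz ->", round(beat_period_ms(df), 1), "ms")
```

The two extremes of the frequency difference (11 and 91 Hz) give periods of about 91 ms and 11 ms, close to the 11–92 ms range reported above.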
For each chick tested, experimental signals were obtained by modifying either the frequency modulation content or the two voices of the same recording of the reference signal. Unless specified otherwise, each synthesised or altered signal was further rescaled to match the root-mean-squared (RMS) amplitude of the reference signal. This scaling was intended to give both the reference signal and the altered signals the same output levels.
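The RMS matching can be sketched as below. This is an illustration only, assuming the signals are held as plain sequences of samples; the function names are hypothetical and the original processing was done in MATLAB and SYNTANA.

```python
from math import sqrt

def rms(samples):
    """Root-mean-squared amplitude of a sequence of samples."""
    return sqrt(sum(x * x for x in samples) / len(samples))

def match_rms(signal, reference):
    """Rescale signal so that its RMS amplitude equals that of reference."""
    gain = rms(reference) / rms(signal)
    return [x * gain for x in signal]
```

Applying a single gain to the whole signal changes its level but leaves the shape of its envelope, and therefore the relative depth of the beats, unchanged.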
Modifications of the frequency modulation and of the two voices
Experimental signal 1 (ES1) was produced by reversing the reference signal (RS). The new signal therefore had a long ascending part and a shorter descending part. The amplitude beats generated by the two voices were also reversed, but the duration was the same as that of the RS (see Fig. 2).
Using an interpolation method, experimental signals 2–4 (ES2–ES4) were produced by gradually stretching the RS to enable us to determine the maximum degree of frequency modulation modification possible before individual recognition failed. The RS was stretched by 10 %, 20 % and 30 %, respectively, and the ascending and descending slopes of the frequency modulation were consequently modified in the same proportions (Fig. 3).
Using the same method, experimental signals 5–7 (ES5–ES7) were produced by compressing the reference signal by −10 %, −20 % and −30 % respectively (Fig. 3). These signals showed relative modifications of the beats: they were elongated (or shortened) by 10 %, 20 % or 30 %, but their relative duration was maintained (i.e. the first beat was longer than the second but shorter than the third, etc.).
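The interpolation method is not detailed here. As a hedged illustration, the sketch below stretches or compresses a sampled contour (for instance the frequency modulation curve of a syllable) along its time axis by linear interpolation. Both the choice of linear interpolation and the function name are assumptions.

```python
def resample_linear(samples, factor):
    """Return samples stretched (factor > 1) or compressed (factor < 1)
    in time by linear interpolation; the end points are preserved."""
    n_in = len(samples)
    n_out = max(2, round(n_in * factor))
    out = []
    for j in range(n_out):
        pos = j * (n_in - 1) / (n_out - 1)  # fractional index in the input
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, n_in - 1)]
        out.append(samples[i] * (1 - frac) + nxt * frac)
    return out
```

With this scheme a factor of 1.1, 1.2 or 1.3 corresponds to ES2–ES4 and a factor of 0.9, 0.8 or 0.7 to ES5–ES7; the order of events along the contour is kept, while every slope is divided by approximately the same factor.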
Modifications of the two-voice system
In different studies dealing with the acoustic system of individual recognition in king penguins the different parameters of the call have always been modified in some way so that duration, spectral content, amplitude and frequency modulations have all been changed (Derenne et al., 1979; Robisson, 1992a; Jouventin et al., 1999; Lengagne et al., 2000; the first part of this study). To investigate the importance of the two-voice system for individual recognition in the king penguin, it is necessary to remove or modify one of the two voices. Because of the steep slopes of the frequency modulation in the king penguin call, it is impossible to modify or to remove one voice using filtering methods, so we synthesised a new call. To produce experimental signal 8 (ES8, Fig. 4), the reference signal (the parental call) of each tested chick was first precisely analysed to obtain the necessary parameters to build up a synthetic signal. In the second stage, the reference signal was digitally low-pass-filtered by applying optimal filtering with overlapping Fast Fourier Transforms (Mbu-Nyamsi et al., 1994). The window size of the FFT was 2048 points. The strong frequency modulation meant that 3–5 filtration steps were necessary to obtain the fundamental frequencies. Then, in the third stage, we used a Hilbert transform of the signal (Seggie, 1987; Brémond et al., 1990; Mbu-Nyamsi et al., 1994) to obtain the instantaneous frequency. The interaction between the two voices generated amplitude beats (Brémond et al., 1990), showing that the instantaneous frequency curve discontinuities obtained after the Hilbert transform coincided with the instantaneous amplitude variation. Each beat was indicated by a discontinuity and thus gave an accurate estimate of the frequency difference between the two voices. In the fourth stage, knowledge of the precise position of the two voices allowed us to build a synthesised syllable using MATLAB from eight reference points for each voice. 
If φ(t) is the instantaneous fundamental frequency (in Hz) of a given voice at time t, as evaluated from the reference points of this voice by a quadratic interpolating Lagrange polynomial, then the signal to be synthesised for the voice under consideration at time t, S(t), is obtained by:
$$S(t) = \sum_{i=1}^{N(h)} w_i \sin\left( 2\pi i \int_0^t \varphi(u)\,\mathrm{d}u \right)$$
where N(h) is the number of harmonics, i=1 stands for the fundamental frequency and wi is the relative amplitude of harmonic i as determined from the power spectrum of the reference signal (w1=1). We used four harmonics [N(h)=4].
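The synthesis of one voice from its reference points can be sketched as additive synthesis. This is a minimal sketch and not the authors' MATLAB code. It assumes sinusoidal harmonics whose phase is the running integral of φ(t), a quadratic fitted through three consecutive reference points, and rectangle-rule integration at the 20 kHz sampling rate. All function and parameter names are hypothetical.

```python
from math import sin, pi

def lagrange_quadratic(ts, fs, t):
    """Evaluate at time t the quadratic Lagrange polynomial through three
    consecutive reference points (ts[k], fs[k]) .. (ts[k+2], fs[k+2])."""
    k = 0
    while k < len(ts) - 3 and t >= ts[k + 1]:
        k += 1
    total = 0.0
    for a in range(k, k + 3):
        term = fs[a]
        for b in range(k, k + 3):
            if b != a:
                term *= (t - ts[b]) / (ts[a] - ts[b])
        total += term
    return total

def synthesise_voice(ts, fs, weights, duration_s, rate_hz=20000):
    """Sum of harmonics of one voice; weights[0] is the fundamental (w1 = 1)."""
    n = int(duration_s * rate_hz)
    phase = 0.0  # running integral of the instantaneous frequency, in cycles
    out = []
    for j in range(n):
        t = j / rate_hz
        out.append(sum(w * sin(2 * pi * (i + 1) * phase)
                       for i, w in enumerate(weights)))
        phase += lagrange_quadratic(ts, fs, t) / rate_hz
    return out
```

Adding the outputs obtained for the upper and the lower voice sample by sample would give a two-voice syllable in which the beats arise from the interaction of the two sources rather than from an imposed envelope.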
Using the data previously used to synthesise ES8, we built a signal with only one of the voices (the upper voice for six tested chicks, the lower one for six other chicks). We then extracted the envelope from the reference signal using the Hilbert transform (Mbu-Nyamsi et al., 1994). This envelope was low-pass-filtered (bandpass 0–30 Hz) to remove all the beats generated by the two voices and, finally, this was multiplied by the signal with one voice. We thus obtained experimental signal 9 (ES9; Fig. 4) which had one voice and no beats.
To obtain experimental signal 10 (ES10; Fig. 4) we used the same carrier frequency as for ES9 (the signal with only one voice), but the envelope was less filtered (bandpass 0–70 Hz) to keep the beats. Thus, we obtained a signal with one voice but with the natural beats of the reference signal.
Modifications of the frequency modulation
The envelope used to build ES10 was applied to a carrier frequency composed of one fundamental and its four harmonics. The fundamental frequency was not modulated and corresponded to the mean value between the maximum and the minimum of the frequency modulation of the reference signal. As a result, we obtained a signal (ES11; Fig. 4) with one voice and the natural beat series of the reference signal but no frequency modulation.
The main characteristics of the 11 experimental signals described above are summarised in Table 1. All these signals were tested on 12 chicks.
Results
The scores obtained after playing back the experimental signals were compared with the score obtained with the reference signal. The reverse-syllable ES1 was not recognised as a parental call by any of the chicks tested (0 % of positive response, P<0.001). With experimental signals 2–7, we found that both stretched and compressed syllables hampered the recognition process in the same way (Fig. 5). Syllables compressed or stretched by 10 % were recognised by the chick (no significant difference from the reference signal), but a 20 % modification decreased the number of positive responses of the tested chicks (there was a 70 % positive response for the stretched syllable and a 67 % positive response for the compressed syllable, P<0.05). The 30 % modification had a major effect on the recognition process; most of the tested chicks did not recognise this signal and showed no reaction, simply resting or preening themselves (only 8 % showed a positive response, P<0.001).
Synthetic syllable ES8, roughly mimicking the reference signal, was not sufficient to elicit recognition by all the chicks tested. We obtained a positive response for only half the birds tested, giving a significant difference from the reference signal (P<0.05) (Table 2). The signal with only one voice and without beats (ES9) triggered no positive responses. In every case, chicks remained stationary and silent in the colony, showing no response to the broadcast. The difference in the response to this signal and the reference signal was significant (P<0.001), but there was no statistical difference between the responses obtained for a signal with the lower voice or the upper one. Chicks recognised the signal with one voice and with the natural series of beats (ES10) as well as they did the reference signal (92 % positive response, no significant difference from the reference signal) and reacted equally well to signals with the lower and the upper voice (no statistical difference). Experimental results obtained with ES8, ES9 and ES10 are summarised in Table 2.
In spite of the presence of beats, the signal without frequency modulation (ES11) was not recognised as a parental call by any of the chicks tested (0 % positive responses, P<0.001).
Discussion
A signature based upon a double system of identification
Our experiments show that chicks pay attention to the frequency modulation contained in each syllable of the call.
The reversed syllable, which involved strong modifications of both frequency modulation and amplitude beats, was never recognised. The interpolation method used to build ES2–ES7 allowed us to modify the reference signal gradually. The shape of the frequency modulation was modified by changing the slopes and durations of the ascending and descending parts of the frequency modulation. But, in contrast to ES1, the order of the series of beats was maintained. In such conditions, even when a syllable was stretched or compressed by 20 %, it still contained sufficient information since it triggered a positive response in approximately 70 % of tests. The identification of the signal only failed when a syllable was stretched or compressed by 30 %. In this latter case, the duration of the ascending and descending parts and the slope of the frequency modulation were presumably too strongly modified to allow call identification.
Numerous studies of coding/decoding processes have shown that the two-voice system has the potential to be used by birds as an individual signature (Brémond et al., 1990; Robisson, 1992b; Robisson, 1993; Robisson et al., 1993; Mathevon, 1996). In these studies, the authors reported that individual identity may be encoded in the two-voice system since the within-individual variation of beats is less than that between individuals. The next step was to test experimentally whether birds used two acoustic sources to generate features relevant for the recognition processes. An initial study on starlings (Sturnus vulgaris) indicated that the two voices did not have a specific function, at least for decoding the information (Aubin, 1986). A later study conducted on emperor penguins (Aptenodytes forsteri) showed, for the first time, that birds used the two-voice system to recognise each other (Aubin et al., 2000).
To understand the possible role of the two-voice system in king penguin call identification, a study with synthetic syllables constructed from scratch was conducted. We constructed ES8, which can be considered as a first step in the process of synthesis and appears to the chick as a ‘caricature’ of the parental call. This signal corresponds to a sum of simplifications: one syllable, low-pass frequencies, only eight points synthesised and interpolation between these points. It follows a minimal structure with regard to frequency modulation and beat content. It roughly matches the frequency modulation shape and the beats of the parental call, and this probably explains why ES8 was able to elicit only a 50 % level of positive responses.
To assess information about an individual from the characteristics of the two voices, birds could use two different methods. They could analyse either the precise frequency values of the upper and lower voices and their frequency differences (spectral analysis) or the variation of amplitude beats generated by the two sources (temporal analysis). To determine which process is used, we constructed ES9 and ES10, signals with only one voice. The playback of ES10, which contains the beats of the natural syllable, elicits a strong reaction by the chicks, whereas the presentation of the signal without beats (ES9) elicits no response. For both signals, we obtained the same results no matter which voice was used, the upper or the lower one. Thus, it appears that, to identify their parents, chicks pay attention to the beat structure (temporal analysis) of the call and not to the frequency difference between the two voices (spectral analysis).
Nevertheless, our field play-back experiments demonstrated that the discrimination of a signature call requires a specific temporal evolution of the frequencies. Indeed, the broadcast of a signal with the natural beat series and without frequency modulation (ES11) was not recognised as a parental call, suggesting that multiple features may be involved in individual recognition: chicks perform a temporal analysis of both the frequency modulation and the series of beats. It is difficult to determine the relative weighting of the frequency modulation and of the amplitude beats in the recognition process. Indeed, in our experimental signals, the manipulations concern different types of acoustic units: Hz s−1 for frequency modulation; Hz and/or s for beats. We observe that a signal without frequency modulation and with natural beats induces no positive responses by the 12 chicks tested. The result is almost the same for a signal with frequency modulation and no beats (one positive response for 12 chicks tested). We can only conclude that, to induce individual recognition, both parameters must be present in the signal. Most birds use a complex of differentially weighted parameters, rather than any simple features, for signal recognition. This has been demonstrated for songs (Weary, 1990) and for calls (Allen, 1979; Gaoni and Evans, 1986; Dooling et al., 1987). In king penguins, the complex pattern involved in individual recognition associates frequency modulation and beats of the syllable. To our knowledge, this is the first time that the beats generated by the two-voice system have been clearly demonstrated to be important for the identification of an acoustic signal.
A recognition process fitted to a biological problem
Individual recognition by means of vocal signatures in a colonial environment appears to be very difficult. Successful identification requires several conditions to be fulfilled. Three problems in particular have to be solved: the masking effect of the continuous background noise of the colony; the degradation of the sound features of the signal during propagation, because of the obstacles presented by the bodies of the birds; and the requirement for a complex sound pattern allowing a large number of individual signatures in colonies that can contain up to one million birds.
King penguins breed in dense colonies. The adult call is transmitted in a context involving the noise generated by the colony plus the noise generated by the wind, both of which reduce the signal-to-noise ratio (Lengagne et al., 1999c). In this noisy environment, birds cannot predict when and for how long they can be heard without interference. To increase the chance of being identified, the adult must repeat the individual information so as to have the opportunity of finding a window of silence. Consequently, and as predicted by the theory of information, the signal must be redundant (Shannon and Weaver, 1949). This is the case for the king penguin call, which is composed of a number of successive syllables. In previous experiments (Jouventin et al., 1999; Lengagne et al., 2000), we have shown that individual recognition can be achieved with just one syllable, whatever the choice of the syllable. This is possible because each syllable contains the identity code: the frequency modulation and the beats. This intra-syllabic signature enhances the chance of being identified in the noisy environment of the colony.
Measurement of the range of transmission in the colony indicates that the communication system involving individual recognition is performed at short range, in agreement with the assertion of Falls (Falls, 1982). In previous play-back experiments, we demonstrated that the maximum discrimination range of the call in the colony is 12–16 m (Aubin and Jouventin, 1998; Lengagne et al., 1999a). Indeed, the environment of a penguin colony is very constraining for the transmission of individual information. According to the environmental hypothesis (Williams and Slater, 1993), the decoding process is particularly efficient and is based on frequency modulation and the two-voice system, sound features that are best able to survive transmission across the colony. Indeed, experiments on sound transmission have demonstrated that, even at a short distance, the bodies of the penguins, the ground and the wind affect the energy distribution of the frequencies and the strong amplitude modulation corresponding to each syllable (Lengagne et al., 1999b; Lengagne et al., 1999c). In contrast, the slow frequency modulation of the syllable and the beats generated by the two voices are well preserved during propagation, and relying on them seems to be a more reliable strategy to ensure accurate transmission under constraining conditions (Lengagne, 1999).
Frequency modulation associated with amplitude beats generated by the two voices leads to a very complex pattern that is likely to be the source of great variability. To investigate the maximum number of different individual signatures in a call, Beecher (Beecher, 1988) developed a quantitative method for measuring the amount of information needed to identify each member of a population. In the case of the king penguin, frequency modulation allows a huge number of combinations between the temporal and frequency parameters. Moreover, the chance of individual distinctiveness is enhanced by the use of amplitude beats. A king penguin syllable contains on average 15 beats with values spreading between 11 and 92 ms (Lengagne, 1999) and, as a consequence, the two-voice system associated with the frequency modulation parameters allows an almost infinite number of combinations. The exploitation of the two acoustic sources represents a means whereby Aptenodytes spp. can increase the information content of their calls. This is in accordance with the model proposed by Schleidt (Schleidt, 1976) in which the number of features of the call is a component of individual distinctiveness. It is interesting to note that, among penguin species, only those with no fixed nest site, emperor and king penguins, can generate two voices in their calls (Robisson, 1992b; Robisson, 1993). For these two species, the egg and the chick are carried on the feet of the parent. They have to identify their mate or chick in a moving crowd, without the help of visual cues. A further possibility is that the complexity of the call has evolved in parallel with the loss of territoriality in relation to a biological problem of partner identification (Robisson et al., 1993; Lengagne et al., 1997). The extreme circumstances under which vocal recognition occurs have induced in king penguin colonies an acoustic communication system that is accurately fitted to behavioural and environmental constraints.
ACKNOWLEDGEMENTS
We are indebted to P. Jouventin for allowing us to carry out this experimental study in Crozet. Logistical support in the field was provided by the Institut Français pour la Recherche et la Technologie Polaire (I.F.R.T.P.) and in the laboratory by the Centre National de la Recherche Scientifique (C.N.R.S.). We thank the 1998 winter team in Crozet for help in the field and two anonymous referees for helpful criticism of the manuscript.