SUMMARY
In the subantarctic fur seal Arctocephalus tropicalis, mothers leave their pups during the rearing period to make long and frequent feeding trips to sea. When a female returns from the ocean, she has to find her pup among several hundred others. Taking into account both spectral and temporal domains, we investigated the individual vocal signature occurring in the ‘female attraction call’ used by pups to attract their mother. We calculated the intra- and inter-individual variability for each measured acoustic cue to isolate those likely to contain information about individual identity. We then tested these cues in playback experiments. Our results show that a female pays particular attention to the lower part of the signal spectrum, the fundamental frequency accompanied by its first two harmonics being sufficient to elicit reliable recognition. The spectral energy distribution is also important for the recognition process. Of the temporal features, frequency modulation appears to be a key component for individual recognition, whereas amplitude modulation is not implicated in the identification of the pup’s voice by its mother. We discuss these results with respect to the constraints imposed on fur seals by a colonial way of life.
Introduction
In the great majority of mammalian species, females feed only their own offspring and reject any others (Stirling, 1975; Boness, 1990; Riedman, 1990; Georges et al., 1999; Insley, 2001). This behaviour limits maternal energetic expenditure and ensures the fitness of breeders (McArthur, 1982). To prevent any allo-suckling attempts, females must be able to recognize their own pups. Many sensory modalities, such as olfaction, vision and audition, have been shown to be involved in this recognition process. Olfactory and visual cues may support recognition only at short range and are thus often used by the female for a final check of the pup’s identity (Bonner, 1968; Stirling, 1971; Cornet and Jouventin, 1979). Since acoustic cues are efficient over long and short distances, individual vocal recognition between mother and offspring appears to be a key factor for mother–pup differentiation among numerous other individuals (Trivers, 1972; Falls, 1982; Gould, 1983).
To support the individual recognition process, vocalisations have to show a highly individualised vocal signature allowing the mother to distinguish a given pup from many others. Therefore, an acoustic parameter encoding individual identity has to show a strong individual stereotypy, i.e. a weak intra-individual variability combined with a high inter-individual variability (Jouventin et al., 1979; Trillmich, 1981; Jouventin, 1982; Insley, 1992; Robisson et al., 1993; Mathevon, 1996; Lengagne et al., 1998; Phillips and Stirling, 2000). In a number of colonial bird species, the main acoustic parameters providing information about individuality have been experimentally shown to be the spectrum profile and/or the temporal pattern of frequency modulation (Jouventin et al., 1999; Lengagne et al., 2000, 2001; Jouventin and Aubin, 2000; Charrier et al., 2001a,c; Aubin and Jouventin, 2001).
For colonial mammals, some previous studies of signal analysis investigated the acoustic cues that provide information about individual identity, but there are no reports of playback experiments demonstrating the effective use of these parameters for vocal recognition [northern fur seal Callorhinus ursinus and northern elephant seal Mirounga angustirostris (Insley, 1992); southern elephant seal Mirounga leonina (Sanvito and Galimberti, 2000); South American fur seal Arctocephalus australis (Phillips and Stirling, 2000)]. Although this analysis stage is valuable, since it isolates the acoustic cues likely to encode individual identity, it is still necessary to confirm that a parameter found to be individualized by the analysis is actually used in a recognition context. One must therefore perform playback experiments to validate any findings. Indeed, in some phocid species, individually distinctive vocalisations do not imply individual recognition (Job et al., 1995; McCulloch et al., 1999).
In the subantarctic fur seal Arctocephalus tropicalis, during the rearing period of 10 months, mothers alternate foraging trips to sea (for 2–3 weeks) and suckling periods ashore (for 3–4 days) (Georges and Guinet, 2000). When a female returns from the ocean, she has to find her offspring acoustically among several hundred conspecifics, posing a high risk of confusion (Riedman, 1990). The individual recognition system must be accurate and unambiguous (Charrier et al., 2001a,b). Using playback experiments, Roux and Jouventin (1987) demonstrated that subantarctic fur seal mothers are able to discriminate the voice of their own pup among many others, but no experiments dealing with the coding of individual identity have been performed.
The aim of the present study was first to identify, by analysis, the acoustic parameters of a pup’s call that may encode individual identity. To do so, we assessed the intra-individual and inter-individual variability of each parameter and calculated the ratio between the two to define a potential for individual identity coding (PIC). Acoustic cues showing a high PIC value are likely to code for individual identity (Robisson et al., 1993). Second, we tested these identified parameters in playback experiments on fur seal mothers using modified pup calls.
Materials and methods
Study location and animals
This study was carried out on a subantarctic fur seal colony located on Amsterdam Island (37°55′S, 77°30′E), Indian Ocean, from June to August 2000. This colony contained 500–550 adult females. The females had been tagged for several years, and their pups were marked shortly after birth using temporary labels glued onto their fur. At approximately 1 month old, each pup was double-tagged in the web of the fore flippers with an individually numbered plastic tag.
Recordings and signal acquisition
We recorded the ‘female attraction calls’ emitted by pups (Fig. 1), which are known to allow pup recognition by mothers (Paulian, 1964). Recordings were performed with an omnidirectional Revox M 3500 microphone (frequency bandwidth 150 Hz to 18 kHz, ±1 dB) mounted on a boom (2 m long) and connected to a Sony TC-D5M audiotape recorder. Calls were recorded when a pup and its mother were searching for each other, e.g. when a mother returned from a feeding trip or from a short swim. During the recordings, the distance between the emitting pup and the microphone was approximately 0.5 m. Calls were digitised with a 16-bit acquisition card at a sample rate of 22 050 Hz using Cool Edit acquisition software (1996 Version; Syntrillium Software Corporation, Phoenix). Signals were then stored on the hard disk of a PC.
Physical analysis of acoustic parameters
We analysed 47 calls from 12 different 7- to 8-month-old pups (3–6 calls per individual) using the Syntana analytical package (Aubin, 1994) and Cool Edit software. To characterise the acoustic structure of the calls, we measured nine parameters.
The following spectral parameters were measured from the average power spectrum calculated from the total length of the call (Fig. 1B): FundFreq, the value of the fundamental frequency; Fmax1, the frequency of the first peak amplitude; Fmax2, the frequency of the second peak amplitude; Fmax3, the frequency of the third peak amplitude.
To describe the frequency modulation of the call, we first isolated the fundamental frequency by digital filtering. Because calls may differ from one another, the cut-off frequency was variable and was adjusted to the characteristics of the fundamental frequency. We then used the auto-correlation method, which follows the fundamental frequency more accurately than the spectrogram. Five variables were measured from the fundamental frequency (Fig. 1C): the duration of the ascending part (dasc), the duration of the descending part (ddesc), the start frequency (Fstart), the maximal frequency (Fmax) and the end frequency (Fend). These variables were used to calculate the two following parameters: FMasc, the slope of the ascending frequency modulation (Hz s–1) [calculated as (Fmax–Fstart)/dasc], and FMdesc, the slope of the descending frequency modulation (Hz s–1) [calculated as (Fend–Fmax)/ddesc].
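As an illustration of the two slope formulas above, the following minimal sketch computes FMasc and FMdesc from the five measured variables. The function name and the numerical values are hypothetical, and durations are assumed to be expressed in seconds so that the slopes come out in Hz s–1.

```python
def fm_slopes(f_start, f_max, f_end, d_asc, d_desc):
    """Return (FMasc, FMdesc) in Hz/s.

    Frequencies are in Hz; durations are assumed to be in seconds.
    """
    if d_asc <= 0 or d_desc <= 0:
        raise ValueError("durations must be positive")
    fm_asc = (f_max - f_start) / d_asc    # ascending slope, positive
    fm_desc = (f_end - f_max) / d_desc    # descending slope, negative
    return fm_asc, fm_desc


# Hypothetical call: rises from 500 to 650 Hz in 0.5 s, then falls to 600 Hz in 0.25 s.
fm_asc, fm_desc = fm_slopes(f_start=500.0, f_max=650.0, f_end=600.0,
                            d_asc=0.5, d_desc=0.25)
```

With these illustrative inputs the ascending slope is positive and the descending slope negative, as expected for a call that rises and then falls.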
To describe the amplitude change over time, we first measured three variables: RMSaverT, representing the mean intensity of the entire call [the root mean square (RMS) signal level as a standard measure of signal intensity (Beeman, 1996)]; RMSmax, representing the loudest intensity of the call; and tAmax, the duration between the beginning of the call and the time at which the highest amplitude in the call occurs (Fig. 1D). These parameters were measured from the envelope of the signal calculated by the analytical method. The analytical signal method permits the envelope of a signal to be displayed with great precision even when amplitude changes rapidly over time (for details, see Mbu-Nyamsi et al., 1994). Two further parameters were calculated: RMSmax/RMSaverT, the ratio of the maximal RMS value to the mean RMS value of the total call, which should be equal to 1 if there is no amplitude variation in the call; and RelPeakTime, the relative peak time, which represents the relative temporal position within the signal of the highest amplitude peak, calculated as (tAmax/dtot), where dtot corresponds to the total duration of the call (ms) measured from the oscillogram (Fig. 1A).
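The envelope-based parameters can be sketched as follows. This is not the Syntana implementation: the analytic signal is obtained here with a plain FFT-based Hilbert transform, and RMSmax is assumed to be the largest RMS value over consecutive 10 ms windows of the envelope (the window length and all function names are assumptions made for illustration).

```python
import numpy as np


def analytic_envelope(x):
    """Envelope |x + i*H(x)|, with the Hilbert transform H computed via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC component once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist component once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))


def amplitude_features(x, sample_rate, window_s=0.01):
    """Return (RMSmax/RMSaverT, RelPeakTime) measured on the envelope."""
    env = analytic_envelope(x)
    rms_aver_t = np.sqrt(np.mean(env ** 2))
    w = min(env.size, max(1, int(round(window_s * sample_rate))))
    n_win = env.size // w
    frames = env[:n_win * w].reshape(n_win, w)
    rms_max = np.sqrt(np.mean(frames ** 2, axis=1)).max()
    t_amax = np.argmax(env) / sample_rate  # time of the highest amplitude (s)
    d_tot = env.size / sample_rate         # total duration (s)
    return rms_max / rms_aver_t, t_amax / d_tot


# Synthetic check: a 600 Hz tone with and without amplitude modulation.
fs = 22050
t = np.arange(fs) / fs
carrier = np.sin(2 * np.pi * 600.0 * t)
flat_ratio, _ = amplitude_features(carrier, fs)

rel = np.arange(fs) / fs
modulator = np.where(rel < 2 / 3, rel / (2 / 3), (1 - rel) / (1 / 3))
am_ratio, am_rel_peak = amplitude_features(carrier * modulator, fs)
```

For the unmodulated tone the ratio stays at 1, whereas the synthetic amplitude-modulated tone, built to peak at two-thirds of its duration, gives a ratio above 1 and a relative peak time close to 0.67.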
Statistical analysis of acoustic parameters
Statistical analyses were performed with Statgraphics Plus 3.1 software (Statistical Graphics Corporation, 1994 version). To describe the intra- and inter-individual variations of each parameter, we used the coefficient of variation (CV) (Robisson et al., 1993; Lengagne et al., 1998). For each parameter, we calculated CVi (within-individual CV) and CVb (between-individual CV) according to the formula for small samples: CV=100(s.d./Xmean)[1+1/(4n)], where s.d. is the standard deviation, Xmean is the mean of the sample and n is the sample size (Sokal and Rohlf, 1995). To assess the potential of individual coding (PIC) for each parameter, we calculated the ratio CVb/mean CVi (mean CVi being the mean value of the CVi of all individuals) (Robisson et al., 1993; Lengagne et al., 1998). For a given parameter, a PIC value greater than 1 means that this parameter may be used for individual recognition since its intra-individual variability is smaller than its inter-individual variability (Robisson et al., 1993; Lengagne et al., 1998).
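The CV and PIC calculations can be sketched as below. The data are hypothetical, and one point is an assumption rather than something stated in the text: CVb is computed here from the per-individual means, which is only one plausible reading of the between-individual CV.

```python
from statistics import mean, stdev


def cv_small_sample(values):
    """CV = 100 * (s.d. / mean) * [1 + 1/(4n)], the small-sample correction."""
    n = len(values)
    return 100.0 * (stdev(values) / mean(values)) * (1.0 + 1.0 / (4.0 * n))


def pic(calls_by_pup):
    """PIC = CVb / mean CVi for one acoustic parameter.

    calls_by_pup maps a pup identifier to the list of values measured on
    that pup's calls (at least two calls per pup).
    """
    cv_i = [cv_small_sample(v) for v in calls_by_pup.values()]
    # Assumption: between-individual CV taken over the individual means.
    cv_b = cv_small_sample([mean(v) for v in calls_by_pup.values()])
    return cv_b / mean(cv_i)


# Hypothetical fundamental-frequency measurements (Hz) for three pups.
fund_freq = {
    "pup_A": [600.0, 605.0, 610.0],
    "pup_B": [650.0, 655.0, 660.0],
    "pup_C": [700.0, 705.0, 710.0],
}
pic_fund_freq = pic(fund_freq)
```

In this made-up data set each pup varies little around its own mean while the means differ widely, so the PIC is well above 1.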
Playback procedure
Experimental signals were broadcast using a Sony TC-D5M tape recorder connected to an Audax unidirectional loudspeaker via a customised amplifier (10 W; frequency response 1–9 kHz, ±4 dB). The loudspeaker was placed 3–4 m from the mother being tested, and signals were played at a natural sound pressure level (SPL=75±7 dB measured at 1 m using a Brüel & Kjær sound level meter type 2235). Tests were carried out when the pups were far from their mother or by isolating the pup from her. We noticed no difference in the behavioural responses to the playback tests between the two cases. When we had to isolate the pup, we carried it away from its mother to another place in the colony when she was sleeping or when the pup was at some distance from her. We took great care not to disturb the mother. However, in some cases, the mother realised that we were ‘kidnapping’ her pup; she reacted by giving some calls and following us for a distance of several meters. After 1–2 min, she became quiet, as if the pup had left her by itself. Pups were not isolated from their mother for more than 30 min. After each experiment, we returned the pup to its mother and we checked that the mother accepted and suckled it.
As a general rule, for a given female and for a given experimental day, we broadcast an experimental tape containing three experimental series. However, because of field conditions (e.g. the behaviour of the female was disturbed by the approach of a male or another individual), we were sometimes able to broadcast only the first two experimental series.
Each experimental series was composed of a repetition of four identical experimental signals. The order of presentation of the series was randomised for each mother. To avoid habituation (McGregor et al., 1992), each female was tested no more than twice, with a minimum of 2 days between playback sessions. Calls were emitted at natural rates (one call per 3 s) and at natural sound pressure levels. We waited until the mother’s behaviour was calm (motionless and silent) between each experimental series. Playback tests were carried out on 15–20 females for each experimental signal.
Playback experiments
Control experiment: do fur seal mothers respond selectively to their own pup’s voice?
To confirm the ability of subantarctic fur seal females to discriminate their pup among others, we played back to mothers a series of four natural ‘female attraction calls’ from their own pup and a series of four calls from an alien pup (series duration 10–15 s; allowing a minimum of 5 min between the two series). The presentation of the two series was randomized for each mother (15 mothers tested; different seals from those used in the other experiments). To rule out effects of particular individuals, each mother was tested with calls coming from different alien pups. To compare a mother’s response to the calls of her own pup with those from an alien pup, we used the McNemar test for paired samples.
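For paired yes/no responses of this kind, the McNemar test depends only on the discordant pairs. The sketch below uses the exact (binomial) form of the test, which is an assumption since the text does not specify which variant was used, and the counts are purely illustrative.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: mothers responding to their own pup's calls but not to the alien calls.
    c: mothers responding to the alien calls but not to their own pup's calls.
    """
    n = b + c
    if n == 0:
        return 1.0                         # no discordant pairs, no evidence
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Illustrative counts only: 15 discordant pairs, all in the same direction.
p_illustrative = mcnemar_exact(15, 0)
```

Concordant pairs (mothers responding to both series or to neither) do not enter the calculation.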
Experimental signals
Using natural pups’ calls (Fig. 2J), we created experimental signals by modifying the frequency and temporal domains. We were interested in the pup recognition process of the mothers, so each mother was tested with experimental signals prepared from her pup’s calls. Modifications of the natural calls were performed using the Syntana and Goldwave packages (Aubin, 1994; Craig, 1996). For each experimental signal, we compared the female’s response with the response obtained with her natural pup’s call in the control experiment. The females of the control group differed from those tested with experimental signals, so we used Fisher’s exact test for independent samples to make these comparisons.
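A comparison between two independent groups of females can be laid out as a 2×2 table (group × response category) and tested as sketched below. This is a generic two-sided Fisher's exact test written with the standard library only. The function name and the counts in the example are hypothetical and are not the values of Table 3.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: control group, experimental group.
    Columns: positive response, no response.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positive responses in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts: 15/15 positive in the control group, 11/20 with a modified signal.
p_example = fisher_exact_two_sided(15, 0, 11, 9)
```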
Experiment 1: is the whole spectrum necessary?
Two kinds of experimental signals were created: one was high-pass-filtered (>2000 Hz, Fig. 2A) and the other low-pass-filtered (<2000 Hz, Fig. 2B) (digital filtering; FFT window size 4096; precision in frequency 5.4 Hz). RMS values of both experimental signals were adjusted to those of the natural signal. As a general rule, a cut-off frequency of 2000 Hz allows the spectral energy to be divided equally between the two signals. The low-pass signals were composed of the fundamental frequency and its first three or four harmonics.
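The stated frequency precision follows from the sampling rate and the FFT window size (22 050/4096 ≈ 5.4 Hz). The sketch below reproduces that arithmetic and illustrates the filtering and RMS-matching steps with a simple whole-signal FFT mask. The actual filter design used in Syntana is not described here, so the brick-wall approach, the function names and the synthetic test signal are all assumptions.

```python
import numpy as np

SAMPLE_RATE = 22050                       # Hz, from the acquisition settings
FFT_SIZE = 4096
freq_resolution = SAMPLE_RATE / FFT_SIZE  # about 5.38 Hz per FFT bin


def rms(x):
    return float(np.sqrt(np.mean(np.square(x))))


def brickwall_filter(x, sample_rate, cutoff_hz, keep="low"):
    """Zero the FFT bins on one side of cutoff_hz, then match the input RMS."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate)
    mask = freqs < cutoff_hz if keep == "low" else freqs > cutoff_hz
    y = np.fft.irfft(spectrum * mask, n=x.size)
    y_rms = rms(y)
    # Rescale so the filtered signal has the RMS level of the input signal.
    return y * (rms(x) / y_rms) if y_rms > 0 else y


# Synthetic stand-in for a call: one component below and one above 2000 Hz.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
x = np.sin(2 * np.pi * 600.0 * t) + 0.5 * np.sin(2 * np.pi * 3000.0 * t)
low = brickwall_filter(x, SAMPLE_RATE, 2000.0, keep="low")
high = brickwall_filter(x, SAMPLE_RATE, 2000.0, keep="high")
```

Each output keeps only one side of the 2000 Hz cut-off but is played back at the same overall RMS level as the unfiltered signal.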
Experiment 2: how many harmonics are required?
We constructed three experimental signals using digital filtering (FFT window size 4096; precision in frequency 5.4 Hz). The first signal was composed of the fundamental frequency and its first two harmonics (FundFreq+H1+H2, Fig. 2C). The second was composed of the fundamental frequency and only the first harmonic (FundFreq+H1, Fig. 2D). The third consisted of the fundamental frequency only (FundFreq, Fig. 2E).
Experiment 3: is the harmonic relationship necessary?
Experiment 4: do mothers rely on the frequency modulation of the call?
We prepared an experimental signal in which the temporal frequency pattern was time-reversed while all other parameters remained unchanged (Fig. 2H).
Experiment 5: is amplitude pattern an important cue?
We prepared an experimental signal with no amplitude modulation but with a natural frequency modulation (Fig. 2I). To build this signal, we used the analytical signal concept, which allows demodulation of amplitude using Hilbert transformation (Seggie, 1987).
Criteria of response
Under natural conditions, a pup’s calls elicited the following stereotypical response from its mother: call emission, searching head movements (looking all around her) and approach. Prior to the broadcasting of an experimental series, we observed the mother for 2 min. During the emission of the series, we noted any change in her behaviour. To characterise the response of tested females to playback signals, we used a five-point ethological scale: 0, no reaction; 1, searching head movements after the third signal of the experimental series, but no call; 2, searching head movements before the third signal of the experimental series, but no call; 3, searching head movements before the third signal of the experimental series and calls after the third signal; 4, searching head movements and calls before the third signal of the experimental series.
We placed responses of classes 0 and 1 into a ‘no-response’ category and those of classes 2, 3 and 4 into a ‘positive-response’ category. This no-response/positive-response approach is an appropriate strategy for our study since we only needed to know whether or not the mother would respond. On the basis of these two response categories, we compared the ratio of no-responses/positive-responses in the control group, in which females were tested with their own pup’s signal (control experiment, described above), with that in the experimental (modified signal) group.
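The reduction of the five-point scale to two categories can be written down explicitly, as in the short sketch below. The function names and the list of scores are hypothetical.

```python
def is_positive_response(score):
    """Classes 0-1 count as no response, classes 2-4 as a positive response."""
    if score not in (0, 1, 2, 3, 4):
        raise ValueError("score must be an integer between 0 and 4")
    return score >= 2


def tally(scores):
    """Return (number of no-responses, number of positive responses)."""
    positives = sum(is_positive_response(s) for s in scores)
    return len(scores) - positives, positives


# Hypothetical scores for six tested females.
no_response, positive_response = tally([0, 1, 2, 3, 4, 4])
```

The two counts obtained for a given experimental signal form one row of the 2×2 table compared with the control group.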
Results
Description of pup calls
The spectrum of the ‘female attraction call’ is composed of a fundamental frequency (mean 607.6 Hz) (Table 1) and its harmonics (4–10 harmonics). The frequency band ranges between 350 and 6500 Hz. Most of the call energy is concentrated over the first harmonic (Table 2). In 92 % of the calls analysed, the frequency of the first peak amplitude (Fmax1) was either the fundamental frequency (FundFreq) or the frequency of the first harmonic (H1). The frequency of the second peak amplitude (Fmax2) and of the third peak amplitude (Fmax3) was either the fundamental frequency (FundFreq) or the frequency of one of the first three harmonics (H1, H2 or H3) in, respectively, 85 and 74 % of the calls.
Call duration (dtot) ranged between 300 and 1200 ms, with a mean of 820.3 ms (Table 1). The standard deviation of dtot is high, showing considerable variability among the calls emitted by a given individual. The main part of the call shows an ascending frequency modulation (FMasc in Table 1) (see also Fig. 1C), while the last part of the call shows a descending one (FMdesc in Table 1). The call is amplitude-modulated: RMSmax/RMSaverT differs from 1. The highest peak of amplitude occurs during the second half of the call, with a mean value at two-thirds of the duration of the call.
Potential for individual coding
As summarised in Table 1, the coefficients of variation within individuals are smaller than those among individuals except for call duration (dtot).
The PIC values of fundamental frequency (FundFreq) and the frequency of the first peak amplitude (Fmax1) are greater than 2, which means that these parameters are highly individualised. The frequencies of the second and third peak amplitude (Fmax2 and Fmax3) show a higher intra-individual variability, although their PIC is also greater than unity. Only those temporal parameters related to frequency modulations (FMasc and FMdesc) gave PIC values greater than 2. Both these cues show high variability among individuals. Examining the amplitude pattern, RMSmax/RMSaverT and RelPeakTime gave PIC values close to unity and these parameters are, therefore, less individualised.
Playback experiments
The results of the playback tests are reported in Table 3.
Control experiment: mothers respond specifically to their own pup’s calls
None of the 15 mothers responded to alien pups’ calls. This experiment confirms that fur seal females are able to discriminate the calls of their young and always respond specifically to them.
Experiment 1: a truncated spectrum still supports recognition
Low-pass-filtered signals elicited positive responses in 100 % of the tested females. In contrast, only 67 % of the mothers identified the high-pass-filtered signals from which the lower part of the spectrum was absent.
Experiment 2: the fundamental frequency alone is not sufficient to allow reliable recognition, a minimum of two associated harmonics is required
When only the fundamental frequency was played back, only 55 % of the mothers reacted. Adding one harmonic made 70 % of the females react. The fundamental frequency with the first two harmonics elicited nearly 90 % of positive responses.
Experiment 3: the distribution of energy within the spectrum is an important feature for individual recognition
Signals in which one out of two harmonics had been filtered out elicited pup recognition in only 62 % of the mothers. In contrast, when only one out of three harmonics was missing, 81 % of the mothers recognized their pup’s voice.
Experiment 4: mothers rely on frequency modulation pattern to identify their pup
Calls with a reversed temporal frequency pattern were never recognized by the mother.
Experiment 5: amplitude pattern is not implicated in the individual recognition process
The absence of the amplitude pattern does not impair the recognition process: every mother tested was able to recognize her pup’s call in spite of this modification.
Discussion
Acoustic parameters likely to be used for voice recognition
Our present analysis of the calls of subantarctic fur seal pups reveals that some acoustic parameters are unlikely to be used for individual identity coding. Indeed, call duration is a highly variable feature both within and among individual vocalisations. It is impossible, therefore, for such a parameter to encode any information concerning the identity of the sender.
In contrast, information about individual identity is likely to be encoded mainly by both spectral and frequency temporal patterns. It is not surprising that the fundamental frequency is a highly individualised parameter since the characteristics of this acoustic cue are linked to the anatomical structure of the vocal tract (Kelemen, 1963). All the other spectral parameters are also likely to carry some information about the identity of the emitter, but Fmax1 is the most individualised. The analysis of the fur seal pups’ calls shows that Fmax1 is represented, in most cases, by either the fundamental frequency or its first harmonic. Moreover, the frequencies Fmax2 and Fmax3 occur in the lower part of the spectrum, ranging mainly between the fundamental frequency and its first three harmonics (Table 2). As a consequence, the lower part of the spectrum and the distribution of energy within the spectrum are likely to code some information about individual identity. Moreover, frequency modulation (FMasc and FMdesc) could also encode individual identity. This is not surprising since frequency modulation has been shown to be a widely used acoustic parameter for encoding information in birds (Aubin, 1989; Jouventin et al., 1999; Lengagne et al., 2000; Mathevon and Aubin, 2001; Charrier et al., 2001c) and mammals (Moody et al., 1986).
The call amplitude pattern may also supply some information about individual identity, even though the amplitude parameters (RMSmax/RMSaverT and RelPeakTime) are not highly individualised, with PIC values close to unity.
We hypothesise, then, that the acoustic parameters used by females to identify their young may be (i) the lower part of the frequency spectrum, i.e. the fundamental frequency either alone or associated with a reduced number of harmonics (between one and three), (ii) the spectral energy, (iii) the frequency modulation and, to a lesser extent, (iv) the amplitude pattern.
Acoustic parameters used by the mother to recognize her pup
Following the analysis stage, the playback experiments allow us to identify, among the parameters transmitting information about individual identity, those used effectively by mothers to recognize the voice of their pup. In accordance with the hypothesis stated above, there is experimental evidence that females pay particular attention to the lower part of the frequency spectrum. The recognition process is impaired when this lower part is absent, although it is still functional in two-thirds of mothers. This result shows that, in the absence of this part of the spectrum, mothers are partially able to compensate for the lack of information by using the remaining high-pitched harmonics. The higher part of the spectrum therefore supports a redundancy of information.
Experiments examining the spectral energy composition show that, in spite of the fact that the number of harmonics remains high in the experimental signals, the disruption of the energy distribution impairs the recognition process, in particular if one out of two harmonics has been suppressed.
In accordance with our hypothesis, the frequency modulation pattern of a pup’s call is a key factor in the recognition process. Although the whole spectrum is present and the mean values of the fundamental frequency and of its associated harmonics remain unchanged, mothers were unable to recognize their pup’s call if the frequency modulation had been modified. In contrast, the absence of amplitude modulation did not impair the recognition process. Although this parameter shows potential for coding individual identity, it is not used in the biological context of fur seal pup recognition by the mother.
It appears, then, that the recognition of pup calls is based on two main acoustic features of the call: mothers rely on some spectral characteristics and also on the temporal frequency pattern of their pup’s vocalisation.
A signature adapted to a colonial environment
In fur seal colonies, the level of background noise generated by the vocalisations emitted by the numerous individuals is high, and this may mask the vocalisations emitted during mother–pup encounters (Aubin and Jouventin, 1998). This acoustic jamming constraint is compounded by the fact that there is a high risk of visual confusion: when coming back to the shore, a female must relocate her own pup among a number of similar-looking pups in the rookery (Riedman, 1990). To be efficient in this context, the pup’s vocalisation supporting the recognition process must contain highly individualised features that must be resistant to propagation through a noisy channel. The ‘female attraction call’ emitted by young fur seals fulfils both these conditions. It presents a set of individualized acoustic features, characterising the vocal signature of each fur seal pup, and mothers use essentially two of these parameters, timbre and frequency modulation, to recognize their young. The recognition process is then completed by the use of further parameters. The cues used by females are likely to be adapted to a noisy environment. Indeed, we have shown that the amplitude modulation of the call, even if it represents an individualised acoustic feature of a pup’s call, is not used in the individual recognition process.
Previous experiments into sound propagation have shown that amplitude modulation undergoes degradation and distortion during transmission through a noisy environment (Wiley and Richards, 1978). High-pitched frequencies are also susceptible to degradation (Wiley and Richards, 1978). Our experiments show that females do not need the higher part of the frequency spectrum to recognize their pup’s call. In contrast, the low frequencies, consisting of the fundamental frequency and a few related harmonics, are sufficient to allow reliable recognition. However, the spectral characteristics of the call are not sufficient to allow pup identification; mothers also rely on the temporal frequency pattern of the call. Frequency modulation is a reliable cue to support recognition in a noisy context. Indeed, the use of temporal frequency pattern analysis corresponds to the matched filter model, one of the two models allowing an acoustic signal to be received and extracted in a noisy background (Hopkins, 1983). In this matched filter model, the output of the filter is the cross-correlation between the received signal and an expected signal. This method is known to be the most effective for detecting a signal in a noisy situation (Lee, 1960; Okanoya and Dooling, 1991; Klump, 1996).
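The matched filter idea can be illustrated numerically: the expected signal is cross-correlated with the received signal, and the position of the correlation peak indicates where the expected signal occurs. The sketch below is a signal-processing illustration only, not a model of the seals' auditory system. The chirp, the noise level and all names are hypothetical choices.

```python
import numpy as np


def matched_filter_peak(received, template):
    """Return (lag, peak value) of the cross-correlation with the template."""
    received = np.asarray(received, dtype=float)
    template = np.asarray(template, dtype=float)
    corr = np.correlate(received, template, mode="valid")
    lag = int(np.argmax(corr))
    return lag, float(corr[lag])


fs = 22050
n = int(0.2 * fs)                              # 0.2 s frequency-modulated template
t = np.arange(n) / fs
sweep_rate = (650.0 - 500.0) / 0.2             # Hz per second, rising 500 -> 650 Hz
template = np.sin(2 * np.pi * (500.0 * t + 0.5 * sweep_rate * t ** 2))

# Received signal: background noise with the template hidden at sample 3000.
rng = np.random.default_rng(0)
received = rng.normal(0.0, 0.5, size=fs // 2)
true_offset = 3000
received[true_offset:true_offset + n] += template

lag, peak = matched_filter_peak(received, template)
# A time-reversed template (falling sweep) no longer matches the hidden signal.
_, reversed_peak = matched_filter_peak(received, template[::-1])
```

In this synthetic case the correlation peak for the correct template falls at the hidden offset, and the peak obtained with the time-reversed template is much lower.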
The redundancy of the information is also important. In the pup’s call, redundancy is supported by the presence of numerous harmonics that produce a highly reliable recognition process: we have shown that the mothers need only 2–3 harmonics to recognize their pup’s voice, whereas the pup’s call is composed of more than three harmonics. Therefore, if some harmonics were masked by the environmental noise, especially the higher harmonics, the remaining harmonics would suffice to allow the female to recognize her pup. This kind of strategy for harmonic structure discrimination has been demonstrated in birds (Uno et al., 1997), but is likely also to be present in mammals. Moreover, in the natural situation, pups tend to repeat their call (one call per 3 s). This redundancy is likely to enhance signal detection by their mothers. Indeed, temporal fluctuations in background noise can be exploited by the auditory system to detect a signal masked by other signals (Langemann and Klump, 2001; Nieder and Klump, 2001).
Acknowledgements
We are grateful to Thierry Aubin and two anonymous referees for their kind advice. Mary-Anne Lea improved the English. We thank the members of the fiftieth and the fifty-first scientific missions on Amsterdam Island for their help in the field, particularly Gwenaël Beauplet, Murielle Ghestem, Rémy Andrada, Catherine Baur, Yann François, Arnaud Jeulin, Florence Patural, Sébastien Ricaud and Vincent Rouvreau. This research was supported in the field by the Institut Français pour la Recherche et Technologie Polaires (IFRTP). I.C. was supported financially by the Ministère de l’Education Nationale, de Recherche et de la Technologie (MENRT).