During the transmission of acoustic signals, the spectral and temporal properties of the original signal are degraded, and with increasing distance more and more echo patterns are imposed. It is well known that these physical alterations provide useful cues to assess the distance of a sound source. Previous studies have shown that birds employ the degree of degradation of a signal to estimate the distance of another singing male (referred to as ranging). Little is known about how acoustic masking by background noise interferes with ranging, and whether the number of song elements and stimulus familiarity affect the ability to discriminate between degraded and undegraded signals. In this study we trained great tits (Parus major L.) to discriminate between signal variants in two background types, a silent condition and a condition consisting of a natural dawn chorus. We manipulated great tit song types to simulate patterns of reverberation and degradation equivalent to transmission distances of between 5 and 160 m. The birds' responses were significantly affected by the differences between the signal variants and by background type. In contrast, neither stimulus familiarity nor the number of song elements had a significant effect on signal discrimination. Although background type was a significant main effect with respect to the response latencies, the great tits' overall performance in the noisy dawn chorus was similar to the performance in silence.
Acoustic signals and acoustic communication are especially useful at long distances. Territorial songbirds employ acoustic signals for both mate attraction and for defending a territory (Collins, 2004). Degradation of the signals during transmission will provide the recipients with cues revealing the distance of the sender, e.g. cues that reveal the position of a rival male relative to the recipient's territory boundary. Assessing the distance of a sound source by its physical properties is often referred to as ranging (Morton, 1986). The major cues for ranging are a change in overall amplitude of the signal, modifications of the signal envelope with distance (e.g. by reverberations; Wiley and Richards, 1982), and a change of the signal's frequency spectrum (e.g. by frequency-dependent attenuation; Marten and Marler, 1977). All of these cues have been shown to be useful to birds, although to a different extent, and for the evaluation of some of these cues prior knowledge of the signal has been suggested to play a role (Naguib et al., 2000; Holland et al., 2001).
Most evidence for distance perception and ranging in birds comes from field experiments. Commonly, conspecific song is played back from a loudspeaker to a territorial male and the behaviour of the bird in response to these test signals is recorded. The signals are manipulated to simulate different distances of a potential intruder. The experimental bird will usually defend its territory and will approach the simulated intruder. From the flight distance relative to the degree of degradation applied to the signal, the experimenter can then infer the location of the sound source perceived by the bird (Naguib and Wiley, 2001). In field experiments, the effects of a subject's propensity to respond and of its perceptual ability are difficult to separate. Laboratory experiments can not only help to solve this dilemma by carefully controlling the motivation of the subjects and the response contingency, but can also complement the knowledge obtained from the field. Laboratory studies on ranging in birds investigated cues important for distance discrimination together with species identification (Phillmore et al., 1998; Radziwon et al., 2011) or as a function of previous experience (Phillmore et al., 2003). So far, however, little is known about how much the ubiquitous background noise affects the perception of ranging cues in the natural environment (Brumm and Naguib, 2009).
Here we used trained wild birds to evaluate sets of signals representing various transmission distances in order to compare the birds' sensitivity for ranging cues in two different background types. The presence of background noise in the natural environment is well known to impair signal detection and thus communication between animals of different kinds, which may ultimately impose fitness costs (Brumm, 2010; Laiolo, 2010; Read et al., 2014). Commonly, many birds, frogs or insects sing at the same time, and therefore mutually mask their songs. Background noise produced by conspecifics and other vocalizing animals will operate as an energetic masker if the masking background noise matches the frequency spectrum of the signals. Conspecific vocalizations are especially potent maskers as they match the spectro-temporal structure of a species' communication signals. Other substantial masking effects are produced by wind moving the vegetation and by anthropogenic noise, such as traffic noise (Brumm and Slabbekoorn, 2005). For these types of background noise, it is mainly the signal-to-noise ratio at lower frequencies that interferes with long-distance communication (Langemann and Klump, 2005). Thus we decided to test the discrimination ability of our experimental birds for distance cues, in both a silent condition and in the masking background noise of a natural dawn chorus. We also sought to understand how previous experience affects the birds' ability to analyse such cues, and whether the number of signal elements affects the birds' assessment of ranging cues (as is known, for example, for detection sensitivity; Swets et al., 1959).
Our study species was the great tit (Parus major L.), a common European songbird in which males defend territories, and which has been shown to respond readily in ranging experiments, both in the field (McGregor and Krebs, 1984) and in the laboratory (Langemann and Klump, 2005). In the present study, great tits obtained from the field were trained in the laboratory to discriminate between great tit song elements that had been modified to simulate different transmission distances. The test signals consisted of phrases (repeated units) that are naturally found in great tit song and make up the different song types present in natural populations. Sets of phrases from different song types were parametrically manipulated to show patterns of reverberation and degradation equivalent to transmission distances of between 5 and 160 m (here called virtual distances). These ‘echo patterns’ were entirely computer generated, and were not obtained by simple re-recordings from songs broadcast in the field (which commonly creates unwanted acoustic by-products). The method of signal generation we used in the present study has been successfully applied in a field study (Naguib et al., 2000) in which the approach behaviour of territorial chaffinches (Fringilla coelebs L.) was related to the degree of degradation of the playback signals, demonstrating that the birds perceived differences in degradation as differences in distance of a sound source.
Many species of songbirds sing different song types and it has been argued that experience and thus stimulus familiarity might affect a bird's ability to assess the distance of a sound source (Morton, 1998; Naguib, 1998; Wiley, 1998). Although the motivational context is very different when observing a bird in its natural environment or in the laboratory, both approaches aim to estimate whether birds make use of specific signal features. For example, for assessing distance cues of a specific song type, a male need not produce this song type itself; it is sufficient if he hears it from a neighbouring male (McGregor and Avery, 1986). Commonly, differences in the behavioural response to degraded and non-degraded playback songs were found only if they were familiar to the focal male or if they were very similar to the bird's own song, and little or no difference in response to playback signals was found for unfamiliar signals (McGregor et al., 1983; Shy and Morton, 1986). A positive effect of stimulus familiarity on auditory processing has also been demonstrated in a laboratory study (Seeba and Klump, 2009). European starlings (Sturnus vulgaris L.) were better at perceptually restoring the ‘missing’ parts of song signals when they had prior experience with the signals than when the stimuli were unfamiliar.
Patterns of reverberations together with frequency-dependent attenuation are among the cues that can be used for ranging. In the present study, we trained great tits to discriminate between signals with echo patterns representing different virtual distances. We measured the response latencies of the birds to estimate their discrimination ability. We predicted that birds would be better able to process distance cues for (1) large differences versus small differences in virtual distances between signals, (2) signals in the silent condition versus the dawn chorus condition, (3) signals with three versus two elements, and (4) signals from familiar versus unfamiliar song types.
In total, the six great tits performed 1075 experimental sessions. Of these, 74 sessions were not valid due to the false alarm rate exceeding the limit, in 129 sessions the rate of correct responding was too low, and in a few cases the subject did not finish (eight sessions) or technical dysfunction halted the session (five sessions). When only valid sessions were taken into account, the average false alarm rate was 6.2% and the average rate of correctly discriminating echo variants from the reference was 52.8%. To estimate the birds' discrimination ability, we used the individuals' response latencies from renditions of any possible reference-test combinations of the echo variants (see Materials and methods for details). The number of valid averages per bird was between 240 and 360 (1890 altogether).
Neither total element duration nor the pause duration of different song types was associated with the subjects' response latencies in a multiple regression analysis (R2=0.013, β=−0.117, P=0.27 and β=0.116, P=0.27 for element duration and pause duration, respectively). We thus included all song types in the analysis, irrespective of element and pause duration. The results of the generalized linear mixed model (GLMM) ANOVA (Table 1) showed that the great tits' response latencies were significantly affected by the background type in which the discrimination task was performed. On average, response latencies were significantly longer in the dawn chorus condition (1483±308 ms; mean±s.d., here and throughout) compared with the silent condition (1339±326 ms). The birds' response latency was also significantly affected by the differences between the virtual distances. Generally, large differences between echo patterns led to short response latencies, while small differences led to long response latencies (Fig. 1). In contrast to the first two main effects, the order of presentation had only a minor effect on echo discrimination. Response latencies to songs presented first in the silent condition and afterwards in the dawn chorus condition were slightly longer (1420±315 ms) compared with songs first presented in the dawn chorus and then in the silent condition (1402±335 ms). Neither the familiarity of the song types nor their element number had a significant effect on response latencies (Table 1). Bird identity had no effect either.
The GLMM ANOVA also revealed three significant interactions (Table 1). The strongest interaction was found between background type and the order in which the test songs were presented (Fig. 2). Reaction times to test songs that were first presented in the silent condition were on average rather similar. In contrast, reaction times to test songs first presented in the dawn chorus condition were shorter in the silent condition than in the dawn chorus condition. The next interaction was between stimulus familiarity and order of presentation (Fig. 3). Mean response latencies for discriminating familiar neighbouring songs were on average shorter when they were presented first in the dawn chorus condition compared with when they were first presented in the silent condition. In the case of the birds' own song this difference was reduced, and it was reversed in the case of unfamiliar songs. The least significant interaction was the one between background type and the differences between the virtual distances.
The one-dimensional solutions of any of the PROXSCAL analyses explained more than 80% of the dispersion's variance in each of the two background types. The perceptual space coordinates of the different experimental classes as a function of the virtual distance are shown in Fig. 4. Similar perceptual space coordinates indicate that echo patterns from the corresponding virtual distances have been perceived as being similar while larger differences between coordinates indicate that the differences between these echo patterns were perceived as being more salient. The perceptual distance values (i.e. space coordinates) varied significantly with virtual distance in both background types (silent condition, F=155.55, P<0.001; dawn chorus condition, F=406.60, P<0.001). Post hoc Tukey’s tests showed that, within each of the two background types, all comparisons between virtual distances were significantly different (all P<0.01), except for the comparisons between the two shortest (5 and 10 m) and the two longest (80 and 160 m) virtual distances. Moreover, the perceptual space coordinates determined in the silent condition and in the dawn chorus condition were highly correlated for each of the experimental classes (R2 values ranged from 0.74 to 0.99), indicating similar relationships between virtual distance and perceptual space coordinates in both conditions.
In this study we investigated how background noise interfered with the perception of distance cues. Working under controlled laboratory conditions, we presented song signals in a realistic masking situation to trained great tits by employing a natural dawn chorus recording. We used variants of song signals with an increasing degree of degradation to test how stimulus familiarity or the number of song elements affected the birds' ability to discriminate between degraded and undegraded signals. Our results demonstrate that echo patterns simulating the degradation of song signals for different transmission distances can be discriminated by the birds. In the Introduction we made four predictions regarding the discrimination of echo patterns, which we discuss below.
Echoes indicate transmission distance
Previous field experiments have proven the ranging ability of different bird species by evoking territorial behaviour in response to conspecific playback song. A male will approach the sound source in an attempt to localize its presumed rival, and the distance covered and its direction indicate the bird's ranging ability (Nelson and Stoddard, 1998; Naguib et al., 2000; Holland et al., 2001; Morton et al., 2006). In addition, laboratory experiments allow quantification of which of the different physical signal cues are behaviourally relevant for a bird, and which can be used at all. In the present study we used sets of signals differing in the pattern of reverberation and degradation to simulate transmission distances of between 5 and 160 m. In accordance with our first prediction, the great tits indeed perceived echo patterns from similar distances as being similar, while large differences in virtual distance were more salient and therefore easier to discriminate. The outcome of the present study was comparable to previous results from great tits (Langemann and Klump, 2005), but those experiments were performed in the absence of any background noise. The interaction term between ‘virtual distance’ and ‘background type’ has a very low F-value and seems rather unimportant. Still, it may indicate that distance cues are more readily available in the silent condition compared with the dawn chorus condition, as seen in Fig. 1: with increasing difference of virtual distance, reaction times drop slightly faster in the silent condition than in the dawn chorus condition.
For discriminating the different echo patterns, our great tits could rely on distance cues based on reverberation patterns and frequency-dependent attenuation. As we adjusted all echo variants to the same root mean square (RMS) amplitude, signal amplitude per se was not available as a cue. Differences in overall amplitude have indeed been shown to be a possible cue for distance assessment, both in laboratory studies (Phillmore et al., 1998; Radziwon et al., 2011) and in the field (Naguib, 1997a; Nelson, 2000). Overall amplitude, however, is not a reliable distance cue. Acoustic signals can be produced with different amplitude at the source, and movements of the singer's head will have an additional effect on signal amplitude (Larsen and Dabelsteen, 1990; Nelson, 2000). It has been suggested that prior knowledge of the signal's original spectrum at the sound source is required for employing the typical high-frequency attenuation of signals as a ranging cue (Naguib and Wiley, 2001). Such a cue may thus be especially useful for signals that are familiar to the subject, as is the case, for example, for songs used in the interaction between territorial neighbours. Reverberations added to a signal during transmission should be a reliable distance cue, as the reverberation pattern will inevitably change with distance. Most of the differences in the perceptual space coordinates we see in Fig. 4 resemble the gradual signal change in relation to increasing virtual distance.
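Equalizing the RMS amplitude of the echo variants, as described above, can be illustrated with a minimal sketch. The function names, the sample values and the target level are hypothetical and serve only to show the principle; they do not reproduce the software or the levels used in the experiments.

```python
import math

def rms(samples):
    """Root mean square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def scale_to_rms(samples, target_rms):
    """Return a rescaled copy of samples whose RMS equals target_rms."""
    current = rms(samples)
    if current == 0:
        raise ValueError("cannot rescale a silent signal")
    gain = target_rms / current
    return [s * gain for s in samples]

# Two made-up echo variants with different raw levels (illustrative only).
near_variant = [0.8, -0.6, 0.7, -0.9]
far_variant = [0.2, -0.1, 0.15, -0.25]

target = 0.5  # arbitrary common RMS, an assumption for this example
near_equalized = scale_to_rms(near_variant, target)
far_equalized = scale_to_rms(far_variant, target)
```

After rescaling, both variants share the same RMS, so any remaining difference between them lies in the envelope and the spectrum rather than in overall level.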
So far, only a few studies have tested the behavioural response of territorial birds for more than two different degrees of degradation (Nelson and Stoddard, 1998; Naguib et al., 2000). Chaffinches, for example (Naguib et al., 2000), showed a categorical response to playback of degraded songs corresponding to transmission distances of between 0 and 120 m, indicating that the birds distinguished ‘short’ (0, 20 and 40 m) from ‘long’ distances (80 and 120 m). In the context of territorial defence it might indeed be adaptive to initially differentiate between only two categories, thus localizing potential threats either being ‘inside’ or ‘outside’ the territory. Moreover, a bird would most likely include visual information to narrow down the location of another male. In perceptual terms, however, the present paper clearly shows that great tits are well able to distinguish between acoustic signals coming from several different distances.
Echo discrimination in background noise
In the wild, songbirds have to localize conspecifics in the ever-present acoustic background noise of their environment (Brumm and Slabbekoorn, 2005; Brumm and Naguib, 2009), with the dawn chorus probably being one of the most acoustically challenging conditions. Therefore we had our great tits perform the discrimination task in two conditions, i.e. with and without background noise, but with the amplitude of the test signals fixed at the same value. We predicted that echo discrimination in the dawn chorus condition should be impaired compared with the silent condition. This was indeed the case. The response latencies of the great tits were significantly longer in the dawn chorus condition compared with the silent condition (Fig. 1). These results are in line with previous studies showing that signal discrimination in background noise deteriorates with decreasing signal-to-noise ratio (Lohr et al., 2003; Pohl et al., 2012). As the sound-pressure level of the test signals in our study was set well above the great tits’ masked auditory thresholds (Pohl et al., 2009), we can conclude that energetic masking per se was not the main source for the difference in performance. Still, the noisy background will interfere with the auditory input to some degree, such that soft parts of the signals or the reverberation tails added to the signals may be affected. Thus the longer response latencies in noise indicated that the physical differences between echo variants were less salient, and that the task might have been more demanding in noise compared with the silent condition (Luce, 1986). The interaction term between ‘background type’ and ‘order of presentation’ might at first seem inconsistent with this pattern: great tits first performing the task in silence had no advantage when performing the task later on in the dawn chorus (indicating no effect of background type). However, when they were first challenged to work in the dawn chorus, their performance for the same test songs was much better in silence, indicating that performance in silence is less demanding for birds that have experienced the more difficult task first. Apart from the difference in response latency, the scaling analysis revealed hardly any difference in the discrimination performance between the silent and the dawn chorus conditions (Fig. 4). This indicates that great tits are extremely well adapted to coping with natural ambient noise. A possible mechanism to outweigh the detrimental effects imposed by the background noise would be ‘investing’ more time in neuronal computation for making the decision (for effects of computational load and attention in humans, see Muller-Gass and Schröger, 2007). Such mechanisms may also play a role in field playback experiments and for perception in real-world conditions.
Do more elements provide for better echo discrimination?
A study by Holland et al. (1998) showed that the degree of degradation varied considerably between the different element types in the song of the wren (Troglodytes troglodytes), resulting in an element-specific pattern of degradation. In that case, more types of song elements probably offered several independent cues on the degree of degradation and, thus, together could provide better distance cues. Contrary to our prediction, the great tits did not benefit from an additional song element and discrimination performance was similar for signals composed of two or three elements. This is surprising, as more song elements will at least support signal detection (Swets et al., 1959) and we expected that more elements would also increase the probability of detecting the relevant distance cues. In comparison with the wrens (Holland et al., 1998; Holland et al., 2001), which commonly sing many different repeated elements, great tits use only a few element types. They most often sing two-element and three-element song types. While the two notes of the great tit two-element song types always differ in their temporal and spectral properties, the three-element song types will frequently include a repeat of one of the two notes. Following Swets and colleagues (Swets et al., 1959), any repeat should improve the auditory system's sensitivity by the square root of the number of independent observations. Contrary to that expectation, however, we did not find element number to improve echo discrimination.
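The size of the benefit expected under the square-root rule can be worked out with a short sketch. It assumes that each song element counts as one independent observation, which need not hold for three-element song types containing a repeated note; the function name is hypothetical.

```python
import math

def relative_sensitivity_gain(n_before, n_after):
    """Expected ratio of sensitivities when the number of independent
    observations rises from n_before to n_after (square-root rule)."""
    return math.sqrt(n_after) / math.sqrt(n_before)

# Going from two-element to three-element phrases.
gain = relative_sensitivity_gain(2, 3)
print(f"expected gain: {gain:.3f}")  # sqrt(3/2), about 1.225
```

Under this idealized assumption, a third element would raise sensitivity by roughly 22%, a modest advantage that could be difficult to detect in response latencies.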
Echo discrimination as a matter of familiarity
A number of field studies have demonstrated that familiarity with a specific song type will affect a male's ability to discriminate between degraded and undegraded playback songs, and the ability to assess the distance of a sound source (McGregor et al., 1983; Shy and Morton, 1986; Naguib, 1998; Morton et al., 2006). However, there are also field studies that did not find enhanced ranging ability for familiar song types (Wiley and Godard, 1996), and even unfamiliar sounds can be effectively ranged (Naguib, 1997b). Similarly, black-capped chickadees (Poecile atricapillus) reared in the laboratory without having experienced adult vocalizations could discriminate between undegraded and degraded songs and calls just as well as birds taken from the wild into the laboratory (Phillmore et al., 2003). In summary, some studies provide evidence for improved distance cue discrimination with the familiarity of the signals, whereas others do not. The data that we obtained under controlled laboratory conditions might thus indicate that stimulus familiarity is not a reliable factor for assessing distances at all.
One possible reason why we did not find an effect of familiarity on echo discrimination might relate to our experimental design, in which the great tits were ‘learning’ the unfamiliar song types, so that ‘unfamiliar’ became ‘familiar’ in the course of the experiments. Using an experimental procedure similar to the present study, Seeba and Klump (2009) demonstrated that stimulus familiarity affected the ability of European starlings to perceptually restore parts of song signals that were experimentally replaced by noise. In these experiments a rather restricted set of previously unfamiliar stimuli was presented so many times that the starlings could have learned every single stimulus, yet the effect of stimulus familiarity remained, suggesting that such learning effects are not an important issue for our present experiments. The significant interaction between ‘stimulus familiarity’ and ‘order of presentation’ may also relate to the learning issue discussed above, i.e. in the demanding dawn chorus condition the birds appear to acquire the capability for improving their analysis in the silent condition. This seems to have the largest effect for the songs of previous neighbours that may still be familiar to the birds. The mechanism underlying the transfer, however, is unknown.
MATERIALS AND METHODS
Six adult male great tits (Parus major L.) were the subjects in our behavioural experiments. One of these birds had previous experience in detecting tonal or noisy signal elements, but the other five birds were naive. The birds were captured by mist net prior to or after the breeding season (as indicated by the construction of a nest) from a woodland population near Oldenburg, Germany, in 2006 (one individual), 2007 (four individuals) and 2009 (one individual). They were housed in individual cages measuring 80×40×40 cm in a common bird room with at least 14 h of light per day. In the home cages the birds had unrestricted access to water, and were fed a diet mainly consisting of sunflower seeds, rolled oats and dried insects. Before the start of an experimental session, the subjects were deprived of food for about 1–4 h, so that they were motivated to earn food during the experiments. Food rewards during experimental sessions consisted of pieces of mealworms, which are favourite food items. Each bird was tested once or twice a day, 5 days per week. The care and treatment of the birds were approved by the Landesamt für Verbraucherschutz und Lebensmittelsicherheit, Lower Saxony, Germany. Catching permits were issued by Landkreis Ammerland and by Vogelwarte Helgoland/Wilhelmshaven, Lower Saxony, Germany. After about a year of experimental testing, the birds were released into the woods at the initial capture site.
Great tit males from the study population were marked with individual combinations of coloured plastic and an aluminium ring. We specifically recorded the song repertoire of identified males and the repertoire of their neighbours. We also recorded singing activity from non-ringed great tits to sample the song type repertoire of the field site. Recordings were made between 07:00 h and 14:00 h (Central European Time) from February until April in 2006, 2007 and 2009. To obtain song types unknown to our study population, we recorded great tit males from woodland and urban populations at least 7 km away from our field site. Songs were recorded with a sampling rate of 22.05 kHz using Sennheiser ME88/K3N (Wedemark, Lower Saxony, Germany) or Sennheiser ME67 unidirectional microphones with foam windshields and a Marantz PMD670 digital recorder (Longford, Middlesex, UK).
Great tits typically group a small number of song elements into phrases that are repeated several times per song (Lambrechts, 1996; Slabbekoorn and den Boer-Visser, 2006). Different song types are distinguished by characteristic temporal and spectral features of their phrases. The song types found in our great tit population mostly had two or three elements per phrase. We ignored song types with more than three elements per phrase as these were rarely sung and recorded. We obtained 108, 324 and 354 two-element song types in 2006, 2007 and 2009, respectively, and we had 19, 60 and 72 three-element song types for analysis in 2006, 2007 and 2009, respectively.
We defined three levels of familiarity with respect to a tested male: (1) song types derived from the bird's own song, which were certainly ‘familiar’ to the bird; (2) ‘familiar’ song types of neighbouring birds that were dissimilar from the bird's own song; and (3) ‘unfamiliar’ song types that were not performed in the study population and therefore were dissimilar to both own and neighbouring song types.
Signal features were analysed using Avisoft SASLab Pro software (version 4.52; Avisoft Bioacoustics, Glienicke, Germany; analysis done by N.U.P.). Each year of recording was analysed separately. Generally, 10 different phrases were measured for each song type; however, in 27% of the cases fewer than 10 phrases could be analysed. These measures were also used to evaluate the dissimilarity of song types described below. Phrases were selected from different positions of a song bout, excluding the first phrase of any song that often shows shorter element durations or slightly deviant features compared with the following phrases (Lambrechts and Dhondt, 1987). Phrases suitable for measurements were chosen based on the sonogram representation (Fourier transformation, 11.6 ms Hamming window, 256 samples at 22.05 kHz sampling rate, temporal overlap between adjacent spectra: 93.75%). Duration measurements were taken from the waveforms. Frequencies and associated relative signal amplitudes were measured from the logarithmic power spectra of the song elements (see Table 2 and Fig. 5 for all measures taken). As some song elements include sinusoidal frequency or amplitude modulations, low- and high-frequency side bands from song elements were inspected to identify those elements.
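The sonogram settings given above are related by simple arithmetic, which the following minimal sketch makes explicit. The function name and the keys of the returned dictionary are illustrative and are not part of Avisoft SASLab Pro.

```python
def stft_parameters(sample_rate_hz, window_samples, overlap_fraction):
    """Derive time and frequency steps of a short-time Fourier analysis."""
    # Number of samples by which successive windows are shifted.
    hop_samples = round(window_samples * (1 - overlap_fraction))
    return {
        "window_ms": 1000 * window_samples / sample_rate_hz,
        "hop_samples": hop_samples,
        "hop_ms": 1000 * hop_samples / sample_rate_hz,
        "bin_spacing_hz": sample_rate_hz / window_samples,
    }

# Settings reported for the measurements: 256 samples, 22.05 kHz, 93.75% overlap.
params = stft_parameters(22050, 256, 0.9375)
print(params)
```

With these settings the window spans about 11.6 ms, successive spectra are 16 samples (about 0.73 ms) apart, and adjacent frequency bins are spaced about 86 Hz apart.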
To evaluate the dissimilarity of song types we analysed the signal features extracted from the different song types with a discriminant function (method I) and hierarchical cluster analysis (method II). As a basic statistical assumption, song types from different field recordings were treated as different song types (and only statistics would show whether song types were indeed different or similar to each other). We verified the output of these analyses by a common method of visual classification (method III). In summary, two song types were defined as being dissimilar if all three methods of analysis came to a congruent conclusion of dissimilarity.
A stepwise discriminant function analysis (inclusion based on Wilks' lambda with F for inclusion of 3.84 and F for removal of 2.71) was applied to identify groups of song types by means of the discriminant functions obtained from the measures of temporal and spectral features of each song type (Garson, 2012a). We used the first two discriminant functions and the cross-validated classification tables to distinguish between song types that were similar or dissimilar to each other. Regarding the two-element song types, the first two discriminant functions accounted for 78.0, 77.2 and 70.5% of the variance in 2006, 2007 and 2009, respectively. Regarding the three-element song types, the first two discriminant functions accounted for 93.8, 71.6 and 67.8% of the variance in 2006, 2007 and 2009, respectively. Variables that were included by the discriminant analysis were interpreted as being of high importance for classifying the song types. Those parameters that were included in the discriminant function analysis in each of the three years are listed in Tables 3 and 4. Song types were classified as being dissimilar if there was no overlap between the data points of the scatter plot produced on the basis of the first two discriminant functions.
The hierarchical cluster analysis estimated dissimilarity between objects (song types) by distance measures (Garson, 2012b) obtained using the temporal and spectral features listed in Table 2. After computing squared Euclidean distance measures based on the Z-transformed variables, clusters were constructed based on the average linkage. We defined all phrases that were linked in the first step of the clustering process as belonging to the same song type. Song types not linked in this step were defined as being dissimilar.
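The clustering steps described above (Z-transformation, squared Euclidean distances, average linkage) can be sketched in plain Python as follows. This is a minimal illustration and not the SPSS routine used in the study: the feature values are invented, the Z-transformation is assumed to use the sample standard deviation, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def z_transform(rows):
    """Standardize each feature (column) to zero mean and unit sample SD."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]  # assumes no feature is constant
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(rows):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    z = z_transform(rows)
    clusters = [(i,) for i in range(len(z))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average linkage: mean distance over all between-cluster pairs.
            d = mean(squared_euclidean(z[i], z[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Invented example: four phrases, two features each
# (element duration in ms, peak frequency in kHz).
phrases = [[100.0, 4.0], [102.0, 4.1], [150.0, 6.0], [151.0, 6.1]]
for a, b, d in average_linkage(phrases):
    print(a, b, round(d, 4))
```

In this toy example the two pairs of similar phrases are joined first, and the two resulting clusters are joined only in the last step at a much larger distance.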
As visual sonogram analyses are known to be quite robust (but see Jones et al., 2001), we compared the groups of song types obtained with the statistical methods with a classification by sonograms. Sonograms were created using a Fourier transformation (parameters as stated above, temporal overlap: 87.5%). For the visual classification we used (1) the order of high- and low-frequency elements within a phrase, (2) the peak frequency of elements, (3) the frequency range and possible frequency modulation of song elements, and (4) the duration of song elements and inter-element pauses (McGregor and Krebs, 1982). Song types that appeared clearly different with respect to one of these features were classified as being dissimilar.
We selected song types that would allow us to test whether discriminating between different echo patterns was affected by the familiarity of a song type and by the number of its elements. When selecting the experimental stimuli, song types were chosen based on the classification in the song analysis described above and with respect to the subjects’ former territorial neighbours in the wild. First, the bird's own songs were inspected; then song types from its neighbours were selected such that they were most different from the bird's own song. Thereafter, unfamiliar song types were chosen to be as dissimilar as possible from all song types of the study population sung in the year the experimental bird was removed from the woods. Generally two song types were selected for each level of familiarity (Table 5), both for two- and three-element song types.
Test signals consisted of a single phrase of a specific song type with a specific echo pattern. Different echo variants were synthesized as follows: for each song type, six to 10 phrases from recordings with a good signal-to-noise ratio were selected and the frequency and amplitude contours of each song element were sampled every 1.451 ms (using Avisoft-SASLab Pro, Avisoft Bioacoustics). The frequency and amplitude contours as well as the element and pause durations of all phrases measured from a specific song type were then averaged to form a ‘standard’ of this song type. These standards were run through a computer-simulated virtual forest (programmed by G. Klump, MATLAB, The MathWorks Inc., Natick, MA, USA) in order to impose reverberation on the stimuli, equivalent to sound transmission distances of 5, 10, 20, 40, 80 and 160 m (we call these distances ‘virtual distances’ throughout the paper). These distances corresponded well to the territory size of many great tits at our study site. Details of the procedure can be found elsewhere (Naguib et al., 2000). Briefly, the program simulated a two-dimensional forest of 500×600 m with 12,000 tree trunks that, on average, were spaced 5 m apart. ‘Loudspeakers’ and ‘microphones’ were virtually placed at random positions within the forest to simulate the denoted distances. Broadcast signals were reflected once from each tree, attenuating the sound by 10 dB in order to simulate loss of sound energy by absorption and scattering of the sound wave. In addition, sound was attenuated according to the 6 dB spherical loss rule (6 dB per doubling of distance), as well as by 10 dB/100 m excess attenuation (Morton, 1975; Marten and Marler, 1977). To simulate effects of frequency-dependent attenuation, a 128-point finite impulse response (FIR) filter was used to represent the excess attenuation found in deciduous forests (Marten and Marler, 1977).
The different echo variants obtained for all test signals were adjusted to the same RMS amplitude and were presented in the behavioural experiments with a sound pressure level (SPL) of 58.5 dBC.
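The level bookkeeping of the virtual forest can be illustrated with a short Python sketch. It is not the MATLAB program used in the study: it only combines the three broadband terms named above (spherical spreading, 10 dB/100 m excess attenuation and a 10 dB loss per reflection), it omits the frequency-dependent FIR filter, and the 1 m reference distance as well as all function names are assumptions made for illustration.

```python
import math

REF_DISTANCE_M = 1.0             # assumption: reference distance is not stated in the text
EXCESS_DB_PER_M = 10.0 / 100.0   # 10 dB per 100 m excess attenuation
REFLECTION_LOSS_DB = 10.0        # loss applied to a path reflected from a tree


def path_attenuation_db(path_m, reflected=False):
    """Broadband attenuation (dB) relative to the level at REF_DISTANCE_M."""
    spherical = 20.0 * math.log10(path_m / REF_DISTANCE_M)  # 6 dB per doubling
    excess = EXCESS_DB_PER_M * (path_m - REF_DISTANCE_M)
    return spherical + excess + (REFLECTION_LOSS_DB if reflected else 0.0)


def echo_path_m(source, tree, receiver):
    """Length of a single-reflection path source -> tree -> receiver (2-D points)."""
    return math.dist(source, tree) + math.dist(tree, receiver)


def mean_tree_spacing_m(width_m, length_m, n_trees):
    """Average spacing implied by the trunk density of the simulated forest."""
    return math.sqrt(width_m * length_m / n_trees)


print(mean_tree_spacing_m(500, 600, 12000))  # spacing implied by 12,000 trunks
for d in (5, 10, 20, 40, 80, 160):
    print(d, round(path_attenuation_db(d), 1))
```

The trunk density of the simulated forest (12,000 trunks on 500×600 m, i.e. 25 m² per trunk) reproduces the stated average spacing of 5 m. Because the excess attenuation grows linearly with path length, its contribution per doubling of distance increases at the larger virtual distances, whereas the spherical term stays at about 6 dB per doubling.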
Dawn chorus masker
In order to study how echo discrimination was affected by background noise, test signals were presented both in a ‘silent condition’ and in a ‘dawn chorus condition’ consisting of a recording of a natural dawn chorus. The dawn chorus was a sample of 4.6 min recorded in a deciduous forest in the UK (Treswell Wood, Nottinghamshire; sample rate 44.1 kHz; Sony DAT recorder TCD-D8, Sony Europe Ltd, Weybridge, UK; Sennheiser ME20 microphone, Wedemark, Germany). We chose this recording as the masker because it was free of any great tit vocalizations and of anthropogenic noise. Hanning ramps (10 ms) were imposed at the start and the end of the sound file to obtain a loop file without sudden level changes. In the experiment, the file was played as a continuous masker at a natural sound pressure level of 58.5 dBC SPL (equivalent continuous sound pressure level, Leq). Fig. 6A depicts an arbitrary 10 s excerpt of the dawn chorus waveform. Fig. 6B shows the power spectral density of the complete 4.6 min masker file, i.e. the median, first and third quartiles, and the minimum and maximum amplitude values occurring in the analysis frames. The frequency spectra were calculated using a 100 ms frame size without overlap and without a weighting window. Because of the irregular pattern of the singing birds on the recording, the spectral characteristics of the dawn chorus file and the signal-to-noise ratio in the discrimination task changed constantly during an experimental session. In the experiments, the birds triggered the onset of the test stimulus playback themselves (see ‘Procedure of operant testing’ below), thus providing a unique masking situation for any replicate signal exposure.
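The onset and offset ramps applied to the masker file can be sketched as follows. This is an illustrative Python version under the assumption that each ramp is the rising (or falling) half of a Hanning window; the function name is hypothetical, and a list of floats stands in for the audio samples.

```python
import math


def apply_hann_ramps(samples, sample_rate_hz, ramp_s=0.010):
    """Fade a signal in and out with raised-cosine (half-Hanning) ramps."""
    n_ramp = round(ramp_s * sample_rate_hz)  # 441 samples for 10 ms at 44.1 kHz
    if 2 * n_ramp > len(samples):
        raise ValueError("signal is shorter than the two ramps")
    out = list(samples)
    for n in range(n_ramp):
        # Gain rises smoothly from 0 towards 1 over the ramp duration.
        gain = 0.5 * (1.0 - math.cos(math.pi * n / n_ramp))
        out[n] *= gain        # onset ramp
        out[-1 - n] *= gain   # offset ramp (mirror image)
    return out


# One second of constant amplitude as a stand-in for the masker recording.
ramped = apply_hann_ramps([1.0] * 44100, 44100)
print(ramped[0], round(ramped[220], 3), ramped[441], ramped[-1])
```

Because the first and last samples are scaled to zero, the looped file joins end to start without a step in level.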
The great tits were moved from their home cages to the experimental cage using a small transfer cage. The experimental cage (26×22×30 cm) was located within a sound-attenuating echo-reduced chamber (sound-absorbing foam by Illbruck GmbH, Leverkusen, Germany; cut-off frequency 500 Hz, α>0.99; total attenuation: 48 dB at 500 Hz, >57 dB for frequencies ≥1 kHz). At the front of the cage two response keys (observation key, report key) with light-emitting diodes (LEDs) were attached. Below the response keys an automatic rotary food dispenser was placed. Test signals and dawn chorus masker were played from two separate channels of the computer sound card (Sound Blaster PCI 512 16-bit, 44.1 kHz sampling rate). They were independently adjusted in level by computer-controlled attenuators (TDT PA4; Tucker-Davis Technologies, Alachua, FL, USA). Both channels were added in the input stage of the amplifier (Yamaha A-520, Nippon Gakki, Japan) driving the speaker (Canton Twin 700, 200−9000 Hz, ±2.5 dB; Canton Elektronik GmbH & Co. KG, Weilrod, Germany) that was mounted above the experimental cage. All behavioural protocols were controlled by a Linux-operated microcomputer. The behaviour of the birds was video monitored. Sound levels were calibrated at least once per day (Brüel & Kjær 2238 Mediator, Nærum, Denmark) by placing a microphone (Brüel & Kjær 4188 microphone) at the bird's usual head position.
Procedure of operant testing
The great tits were trained in a go/no go procedure to discriminate the test signals from a repeated reference signal. The reference signal was one of the six echo variants of a test signal and was repeated every 1.3 s. The remaining five echo variants served as the test signals. In one experimental session (of about 40 min) the bird had to complete a series of trials. Each trial started with a peck by the bird at the observation key. After a random time interval of between 2 and 10 s, the next peck at the observation key led to the replacement of the repeated reference signal by a test signal. The random presentation scheme is a suitable method to prevent an animal from ‘predicting’ time periods with a high probability of test signals. If the bird pecked the report key within 2000 ms after the onset of the test signal (go response), this was scored as a ‘hit’, and a food reward was given with a probability of between 70 and 80%. This reinforcement mode ensures high motivation and constant rates of responding. A feeder light was always presented as a secondary reinforcement. If the subject did not report a test signal within the given response time (no go response), this was scored as a ‘miss’. To obtain a measure of spontaneous responding (the false alarm rate), we employed ‘catch trials’ during which the reference signal was continued and no test signal was played. No go behaviour in a catch trial was scored as a ‘correct rejection’. A go response during a catch trial or during the random time interval resulted in a black-out period of 5−30 s. In a go/no go procedure, a proportion of 50% correct responses is significantly higher than the random performance estimated by the false alarm rate in our study. To prevent any training effect, the sequence of presentation of the song types was randomized.
Moreover, half of the song types (one of the two from each level of familiarity, Table 5) were first presented in the silent condition and thereafter in the dawn chorus condition; for the other half it was vice versa.
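The scoring rules of the go/no go procedure described above can be summarized in a few lines of Python. This is a sketch of the logic only, not the control software used in the experiments; the function name and the convention of passing `None` for a trial without a peck at the report key are assumptions.

```python
RESPONSE_WINDOW_MS = 2000  # response time allowed after test-signal onset


def score_trial(is_catch_trial, latency_ms):
    """Classify one trial; latency_ms is None if the report key was not pecked."""
    responded = latency_ms is not None and latency_ms <= RESPONSE_WINDOW_MS
    if is_catch_trial:
        # No test signal was played: a go response is spontaneous responding.
        return "false alarm" if responded else "correct rejection"
    return "hit" if responded else "miss"


print(score_trial(False, 850))    # test signal reported in time
print(score_trial(False, None))   # test signal not reported
print(score_trial(True, 1200))    # response during a catch trial
print(score_trial(True, None))    # no response during a catch trial
```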
Measuring the discrimination ability
To measure the birds' discrimination ability, we used principles of multidimensional scaling procedures (Arabie et al., 1987). We recorded the birds' response latencies comparing all possible reference-test combinations of the echo variants of a specific song type. Short response latencies indicated salient differences, whereas long response latencies indicated that signals were perceived as being similar (Dooling and Okanoya, 1995). Any possible combination was presented 10 times, and the individuals' averaged response latencies from these 10 renditions were the unit of analysis. As each song type was available with six echo variants and each of them had to serve as the reference signal once, the birds had to complete six sessions per song type. In one session (60 trials) test signals were presented in randomized order and each test signal was compared 10 times against the reference signal selected for that session, resulting in a matrix of averaged response latencies. Response matrices were obtained in a factorial design (2 background type×2 element number×2 order of presentation×3 familiarity of song type×15 virtual distance). For ‘bird's own song’, three subjects had only one three-element song type, and one bird had no three-element song type, resulting in different numbers of valid averages for the different birds. Because of time limitation, one subject was not tested with its own song. If the subject failed to respond to the test signal, the response latency was set to the maximum response time (2000 ms). Sessions with a false alarm rate of more than 20% or with a total response rate to deviating test signals of less than 33.3% were discarded and repeated at the end of the experiments.
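The handling of missed trials and the session exclusion criteria stated above can be sketched as follows. The code is illustrative: the function names are hypothetical, the false alarm rate is assumed to be computed over catch trials only, and the split of a 60-trial session into 50 test trials and 10 catch trials in the example is our inference from five test signals presented 10 times each.

```python
MAX_LATENCY_MS = 2000          # latency assigned when the bird did not respond
MAX_FALSE_ALARM_RATE = 0.20    # sessions above this rate were discarded
MIN_RESPONSE_RATE = 0.333      # sessions below this rate were discarded


def mean_latency_ms(latencies):
    """Average over the renditions of one reference-test pair; None marks a miss."""
    filled = [MAX_LATENCY_MS if x is None else x for x in latencies]
    return sum(filled) / len(filled)


def session_is_valid(n_false_alarms, n_catch_trials, n_hits, n_test_trials):
    """Apply the two exclusion criteria to the counts of one session."""
    false_alarm_rate = n_false_alarms / n_catch_trials
    response_rate = n_hits / n_test_trials
    return (false_alarm_rate <= MAX_FALSE_ALARM_RATE
            and response_rate >= MIN_RESPONSE_RATE)


# Invented latencies (ms) for 10 renditions of one pair, two of them misses.
renditions = [640, 710, None, 580, 900, 1220, None, 760, 830, 690]
print(mean_latency_ms(renditions))
print(session_is_valid(n_false_alarms=1, n_catch_trials=10,
                       n_hits=42, n_test_trials=50))
```

Assigning the maximum latency to misses keeps every reference-test pair in the response matrix, at the cost of compressing differences among pairs that the bird rarely detected.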
As the durations of elements and pauses differed considerably between song types, we considered possible effects of element or pause duration on response latencies. We applied a multiple regression analysis to investigate the association between the average response latencies of the subjects and the total element duration (all elements of a phrase) and total pause duration (all pauses of a phrase) for each of the test songs.
We then explored the birds' ability to discriminate between echo patterns by means of a GLMM ANOVA. The dependent variable consisted of the birds' mean response latencies. Independent variables were the background type (silent condition, dawn chorus condition), the level of familiarity of the song types (bird's own song, songs of neighbouring birds, unfamiliar songs), the element number of the song types (two elements per phrase, three elements per phrase), the order of presentation (first in silence, first in dawn chorus), and the differences between all virtual distances (i.e. 5, 10, 15, 20, 30, 35, 40, 60, 70, 75, 80, 120, 140, 150 and 155 m). Bird identity was included as a random variable to test for potential differences between individuals. In Table 1 we provide all main effects, and from the two-way interactions we present only those that were significant. We do not provide interactions higher than two-way, as higher-order interactions are generally difficult to interpret.
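The 15 levels of the distance-difference factor follow directly from the six virtual distances: each level is the difference between one unordered pair of echo variants. A short Python check (variable names are illustrative):

```python
from itertools import combinations

VIRTUAL_DISTANCES_M = [5, 10, 20, 40, 80, 160]

# Difference in virtual distance for every unordered pair of echo variants.
differences = sorted(far - near
                     for near, far in combinations(VIRTUAL_DISTANCES_M, 2))
print(len(differences), differences)
```

Because the virtual distances double from one step to the next, all 15 pairwise differences are distinct, so each level of the factor identifies exactly one pair of echo variants.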
Furthermore, the response matrices of the birds describing the response latencies were analysed using the PROXSCAL algorithm (Commandeur and Heiser, 1993). This produced one-dimensional object spaces and provided a measure of perceived similarity between the echo patterns. Generally, response latencies decrease when stimulus differences become more salient. The proximity between the coordinates obtained within the perceptual space was then inspected for significant differences by a one-way ANOVA for each of the two background types, with virtual distance (i.e. 5, 10, 20, 40, 80 and 160 m) being the independent variable. To compare the representation of virtual distances between background types, we correlated the perceptual distance values (i.e. the space coordinates) determined in the silent condition and in the dawn chorus condition for each of the song stimuli defined by the experimental classes. The experimental stimulus classes are listed in Table 5; they are based on a combination of the level of familiarity with the song type, the number of elements in the song type and the order of presentation in the experiments. All statistical analyses were performed using the software package SPSS 18 or 21 (SPSS Inc, Chicago, IL, USA).
Susanne Groß and Annika Horn participated in data collection. Rainer Beutelmann supported our data analysis. The numerous comments of our anonymous reviewers were extremely helpful to improve our paper – thanks to all of you!
Experiments were designed by G.M.K.; N.U.P. conducted or supervised data collection. Data analysis and manuscript preparation were performed by all three authors.
This study was supported by the Deutsche Forschungsgemeinschaft (SFB TRR 31).
The authors declare no competing or financial interests.