SUMMARY
Many social birds produce food-associated calls. In galliforms, these vocalizations are typically accompanied by a distinctive visual display, creating a multimodal signal known as tidbitting. This system is ideal for experimental analysis of the way in which signal components interact to determine overall efficacy. We used high-definition video playback to explore perception of male tidbitting by female fowl, Gallus gallus. Hens experienced four treatments consisting of multimodal tidbitting, visual tidbitting without sound, audible tidbitting without a male present, and a silent empty cage control. Hens took longer to begin food search when the display was silent, but the overall rate of this response did not differ among the multimodal, visual only or audio only playback treatments. These results suggest that the visual and vocal components of tidbitting are redundant, but they also highlight the importance of a temporal dimension for any categorization scheme. Visual displays also evoked inspection behavior, characterized by close binocular fixation on the head of the playback male, which is known to facilitate individual recognition. This may also allow hens to assess male quality. Such social responses reveal that tidbitting probably has multiple functions and provide a new insight into the selective factors responsible for the evolution of this complex multimodal signal.
INTRODUCTION
Multimodal communication is exhibited by a wide array of animals. Primates (McGurk and MacDonald, 1976), birds (Partan et al., 2005), frogs (Narins et al., 2005; Grafe and Wagner, 2007), spiders (Hebets and Uetz, 1999), butterflies (Papke et al., 2007) and fishes (McLennan, 2003) all produce signals that engage more than one of the intended receiver's sensory systems. Such coordinated signals are thought to enhance either efficacy or information content (Guilford and Dawkins, 1991; Candolin, 2003; Partan and Marler, 1999; Partan and Marler, 2005; Hebets and Papaj, 2005). In redundant, or 'backup' signals (Johnstone, 1996), the additional modality increases the likelihood of detection in a noisy environment, while in nonredundant or 'multiple message' signals (Møller and Pomiankowski, 1993; Johnstone, 1996) it increases the rate at which information can be transmitted or the number of possible messages. An additional complexity is that performance of the two components can either be obligatory [fixed (Smith, 1977)] or flexibly combined [fluid or free (Smith, 1977)] depending on intrinsic factors, such as motivational state, and extrinsic ones, such as social context.
Among birds, the most common pairing of sensory modalities is auditory and visual (Hebets and Papaj, 2005). For example, many songbirds produce elaborate physical displays, such as bows, jumps and flights, in conjunction with their vocalizations during courtship (Balsby and Dabelsteen, 2002; Chandler and Rose, 1998; Bostwick and Prum, 2003; Partan et al., 2005). This combination of vocal and visual displays can also be found in the stereotyped courtship feeding displays, known as tidbitting, of many galliforms (Stokes and Williams, 1972).
Food calls (Marler et al., 1986a; Collias and Joos, 1953; Kruijt, 1964) are the distinctive, pulsatile sounds that characteristically accompany tidbitting movements (Davis and Domm, 1943). Previous studies have shown that food calls are functionally referential (Evans and Marler, 1994; Evans and Evans, 1999). More recent work has revealed that this is one of the few examples of a representational signal [i.e. the behavior of hens is mediated specifically by information about food (Evans and Evans, 2007)]. The pulsatile nature of food calls (Stokes, 1971; Stokes and Williams, 1972) should also make them easy to localize, in the same way as other sounds in the fowl vocal repertoire with similar spectral and temporal structure (Wood et al., 2000).
The visual component of the tidbitting display involves a repeated, rhythmic motion of the head and neck, including picking up and dropping a food item (Davis and Domm, 1943; Stokes and Williams, 1972; Evans and Marler, 1994). Hens often approach the tidbitting male and search for food near him or take the food directly from his mandibles (Stokes and Williams, 1972; Marler et al., 1986a; Marler et al., 1986b; Gyger and Marler, 1988). There is clearly the potential for the visual information in this putative multimodal signal to act synergistically with that encoded in the acoustic modality to affect female behavior (Evans, 1997).
It is revealing that the design of galliform visual ornaments and the structure of their calls have been explored separately, such that both types of signal are now well understood (Petrie and Halliday, 1993; Evans and Marler, 1994; Evans and Evans, 1999), but very little is known about the way in which females integrate information across these sensory modalities. To understand the design of such complex signals it is necessary to determine the way in which modalities interact.
One approach for classifying a multimodal signal is to present the components both separately and in combination, and to compare the type and intensity of the responses evoked (Partan and Marler, 1999; Partan and Marler, 2005). A range of methods has been employed for such tests, including audiovisual playbacks (Evans and Marler, 1991; Uetz and Roberts, 2002; Partan et al., 2005) and 3D models (Narins et al., 2005; Balsby and Dabelsteen, 2002).
Previous work with fowl has shown that they can recognize the feeding movements of a conspecific on video and discriminate these from other types of motor activity (McQuoid and Galef, 1993). Furthermore, preferences acquired from video sequences transfer to their real equivalents (McQuoid and Galef, 1993). The consistent effectiveness of video playback for evoking natural responses in fowl (Evans and Marler, 1991; Evans et al., 1993a; Evans et al., 1993b), together with recent advances that have substantially improved image realism (Ikebuchi and Okanoya, 1999; Ophir and Galef, 2003), encourages the use of this technique for exploring the perception of multimodal signals.
In the present study, we used high-definition audiovisual playbacks to determine whether tidbitting by male fowl is a redundant signal, with each modality acting as a backup to enhance signal transmission. This is the first study to test systematically the perceptual integration of a display that is ubiquitous in galliforms (Stokes and Williams, 1972).
MATERIALS AND METHODS
Subjects
Twenty-four golden Sebright bantam Gallus gallus (Linnaeus 1758) females participated in this study. Pairs of hens were housed together in 1.0×1.0×0.6 m cages in a climate-controlled room maintained at 22°C on a 12:12 h day:night cycle, and given access to food (Gordon Specialty Feeds laying ration, Sydney, Australia) and water ad libitum. The behavior of Sebrights closely resembles that of the ancestral form, the red junglefowl, Gallus gallus (Collias and Joos, 1953; Collias, 1987; Andersson et al., 2001; Schütz and Jensen, 2001), from which all domesticated strains have been derived (Fumihito et al., 1994; Fumihito et al., 1996). In particular, Sebrights have not been subjected to artificial selection for rapid growth or egg production.
Video and audio recording
We used a high-definition 3-CCD video camcorder (Sony HDR-FX1) and a Sennheiser MKH-40 microphone to make video and audio recordings of 12 male fowl food calling and tidbitting. The new HDV video standard captures substantially more detail (1920 pixels×1080 lines) than the VHS camcorders available for pioneering video playback studies [240 lines maximum resolution (Clark and Uetz, 1990; Evans and Marler, 1991)], and achieves an approximately fourfold resolution increment over the DV format adopted more recently [576 lines (Ord et al., 2001; Ord and Evans, 2003; Carlile et al., 2006)]. The frequency response of the soundtrack was flat (±1 dB) over the full range audible to birds.
Males were confined in a 0.60 m×0.45 m×0.86 m wire cage, 0.8 m from the camera, within a sound-attenuating chamber (Ampisilence S.p.a, Rovassomero, Italy; 2.38 m×2.38 m×2.15 m). We used the same plasma display later employed for playbacks to monitor the video signal and adjusted the camera zoom to ensure that the image of the male was precisely life-sized. An unfamiliar audience hen was present in a separate cage approximately 30 cm from that of the male. After a 10 min acclimatization period, four mealworms were delivered from a remote-controlled food hopper mounted above the male's cage. This usually evoked tidbitting and food calling from the male.
Design of playback stimuli
We wished to compare the response of hens to playback of the full multimodal signal to that evoked by each of the two modalities in isolation. Planned comparisons were hence designed to test whether removing either modality caused a significant decrement in signal efficacy. In addition, we wished to assess responses in the three signaling treatments relative to spontaneous behavior, thereby establishing whether each of the modalities evoked a significant response in absolute terms.
Treatments consisted of Multimodal, Audio only, Visual only, and a silent Empty cage control. The Multimodal tidbitting stimulus was the coordinated vocal and visual display of an adult male (see Movie 1 in supplementary material). The Audio only treatment consisted of food calls at the same amplitude, accompanied by a video of an empty cage. The Visual only treatment consisted of a male performing the tidbitting display, accompanied by ambient sound chamber noise to control for the background sound present in food call playbacks. The Empty cage video was identical to this in every respect, except that the male was absent. All four stimulus types had mealworms on screen from the moment at which these had been delivered to the male, thus controlling for the sudden appearance of a preferred food item.
We reviewed raw videos from each of the 12 males and arbitrarily chose four of the eight that had tidbitted continuously for >1 min without eating any of the mealworms, and spent most of this time in a lateral orientation to the camera. The first criterion controlled for the amount of food present in all playbacks. The second was developed from observations of natural interactions in aviaries, which showed that approaching hens most often have a side view of a displaying male (C.L.S and C.S.E., unpublished data). We used Final Cut Pro 5.1 (Apple Computers) to isolate a 60 s sequence of multimodal tidbitting from each of the four males and then manipulated this to create the corresponding audio only and visual only stimuli. Completed playback sequences were 15 min in duration. The first 10 min consisted of silent empty cage, to allow the hen to settle down. This was immediately followed by one of the four types of 60 s test stimulus and then by a further 4 min of empty cage sequence. The test stimuli began and ended with a 0.5 s fade transition to avoid a startle response. The first 10 min and the last 4 min of every trial were hence identical in every playback; only the 60 s stimulus differed according to experimental treatment. We selected this stimulus duration based upon observations of males tidbitting in the presence of a hen that was unable to retrieve the food item (N=12, mean=1 min 49 s). This allowed sufficient time for the hen to respond, while avoiding habituation to the playback stimuli, or extinction of unreinforced responses.
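The structure of a completed playback sequence can be summarized in a few lines of code. The following is a minimal sketch rather than anything used in the study: the function name `build_timeline` and the segment labels are hypothetical, and only the durations (10 min, 60 s, 4 min) are taken from the description above.

```python
# Illustrative sketch of the 15 min playback sequence described in the text.
# Names and labels are hypothetical; only the durations come from the text.
STIMULI = ("Multimodal", "Audio only", "Visual only", "Empty cage")

def build_timeline(stimulus):
    """Return (label, start_s, end_s) tuples for one playback sequence."""
    if stimulus not in STIMULI:
        raise ValueError(f"unknown stimulus: {stimulus}")
    segments = [
        ("pre-stimulus empty cage", 10 * 60),   # settling period
        (f"test stimulus: {stimulus}", 60),     # the only part that varies
        ("post-stimulus empty cage", 4 * 60),
    ]
    timeline, start = [], 0
    for label, duration in segments:
        timeline.append((label, start, start + duration))
        start += duration
    return timeline

timeline = build_timeline("Audio only")
total_s = timeline[-1][2]
for label, start, end in timeline:
    print(f"{start:4d}-{end:4d} s  {label}")
print("total:", total_s / 60, "min")
```

Laying the sequence out this way makes explicit that the four treatments differ only in the 60 s window between 600 s and 660 s.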
Test apparatus and procedure
Tests were conducted in the same sound-attenuating chamber as recordings. The playback apparatus consisted of a 106 cm Sony high-definition flat-panel plasma display, together with a Nagra Kudelski DSM speaker located at the base of the screen. Hens were confined within a 1.2 m×0.30 m×0.5 m (length × width × height) cage, which had a remote-controlled wire door 0.4 m from one end. Hens were initially held in the compartment behind this door to standardize distance from the display at the beginning of each playback.
Decisions about the overall layout of the test setup were informed by well-described properties of the fowl visual system. Hens are myopic in the frontal field and thus unable to determine the identity of a conspecific from distances greater than 30 cm (Dawkins, 1995; Dawkins, 1996). Recognition depends upon close binocular inspection and is principally dependent upon attributes of the other bird's head and neck region (Guhl and Ortman, 1953). We positioned the hen's cage with the long axis perpendicular to the plasma display and the remote-controlled door at the more distant end. The closer end of the cage was 30 cm from the plasma display, a distance chosen to be at the outer limit of the hens' ability to recognize the male by sight, but close enough to entice her to attempt to fixate on the screen. Note that this spatial separation was also sufficient to prevent hens from resolving individual pixels, which would have reduced verisimilitude.
To ensure that hens would move freely about the test cage and not be startled by operation of the remote-controlled door, we acclimated them to the test environment for 15 min at the same time of day for four consecutive days. The empty cage background video was displayed, accompanied by playback of ambient chamber noise. In addition, the remote-controlled door was released once during each acclimatization period. By the fourth acclimatization session, all hens readily emerged and walked the length of the cage after the door opened and none exhibited signs of disturbance such as wing-flapping or crouching.
We used a within-subjects design in which each hen experienced all four experimental treatments, from one of the male exemplars, in a unique random sequence. To further control potential order effects, we also counter-balanced by re-testing hens with the same stimuli in reversed order. Hens had no social contact with the real male depicted in their video sequences for at least six months prior to the experiment and experienced all of their playbacks at the same time of day to minimize diel variation in behavior. The inter-trial interval was 48 h.
We began each test by placing the hen behind the closed wire door in the section of the cage farthest from the plasma display. The empty cage background video then played for 10 min. After this, the wire door opened, allowing the hen to approach the plasma display, and one of the four stimulus sequences began.
We used a CCD camera (Panasonic WV-CL320) connected to a color monitor (Sony PVM-1450QM) to observe tests. The analogue video signal was converted into MPEG-2 format using a Miglia-EvolutionTV and saved for later analysis. Behavior during the 60 s stimulus period was scored using JWatcher Video 1.0 (Blumstein et al., 2006), which reads the time-code of the video file to permit single-frame resolution (40 ms in the PAL standard). We measured the duration of food search, which is characterized by distinctive close fixation of the substrate (Evans and Evans, 1999; Evans and Evans, 2007), and latency to begin food searching. In addition, we measured visual attention directed toward the plasma display; this sometimes included intense inspection in which the hen stretched her neck towards the screen at the height of the male's head (see Movie 2 in supplementary material), exactly as hens scrutinize other flock members (Guhl and Ortman, 1953).
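To illustrate how frame-level scoring translates into the two food search measures, here is a hedged sketch. It does not reproduce JWatcher's file format or API; the function names and the example bout list are invented for illustration. The only facts taken from the text are the PAL frame period of 40 ms (25 frames per second) and the definitions of duration and latency.

```python
# Hypothetical sketch: converting frame-scored bouts into seconds.
# Not JWatcher's API; bout data below are made up for illustration.
PAL_FPS = 25                   # PAL frame rate
FRAME_MS = 1000 // PAL_FPS     # 40 ms per frame

def frames_to_seconds(n_frames):
    """Convert a frame count to seconds at PAL resolution."""
    return n_frames * FRAME_MS / 1000

def summarize_bouts(bouts):
    """bouts: (start_frame, end_frame) pairs relative to stimulus onset,
    with end_frame exclusive. Returns (latency_s, total_duration_s);
    latency is None when no food search was scored."""
    if not bouts:
        return None, 0.0
    latency = frames_to_seconds(min(start for start, _ in bouts))
    total = sum(frames_to_seconds(end - start) for start, end in bouts)
    return latency, total

example_bouts = [(150, 250), (500, 625)]   # illustrative values only
latency_s, duration_s = summarize_bouts(example_bouts)
print(latency_s, duration_s)
```

Working in integer frame counts until the final conversion avoids accumulating rounding error across bouts.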
Tests for an overall treatment effect were conducted with repeated measures ANOVAs (SPSS 15.0.6 for Windows) using male exemplar as a blocking factor. Exemplar was never significant, so all data were pooled before further analysis. When significant differences occurred in the pooled data, Tukey's honestly significant difference (HSD) test was used to conduct multiple pairwise comparisons, while maintaining the overall alpha level at the nominated value of 0.05.
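For readers unfamiliar with the repeated measures design, the F ratio for a single within-subjects factor can be computed by partitioning the total sum of squares into treatment, subject and residual components. The sketch below is a textbook one-way version written in plain Python. It is not the SPSS procedure used in the study: it omits the exemplar blocking factor, sphericity corrections and post-hoc tests, and the scores are synthetic. With 24 hens and four treatments it yields degrees of freedom of 3 and 69.

```python
# Textbook one-way repeated measures ANOVA (illustrative only).
# Omits blocking factors, sphericity corrections and post-hoc tests.
def rm_anova(data):
    """data: one row per subject, each holding k treatment scores.
    Returns (F, df_treatment, df_error)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    treat_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_treat = n * sum((m - grand) ** 2 for m in treat_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_treat - ss_subj   # treatment x subject residual
    df_treat, df_error = k - 1, (k - 1) * (n - 1)
    f_ratio = (ss_treat / df_treat) / (ss_error / df_error)
    return f_ratio, df_treat, df_error

# Synthetic scores for 24 subjects x 4 treatments (not data from the study).
scores = [[5 + (i % 4) + 2 * j + ((i * j) % 3) for j in range(4)]
          for i in range(24)]
f_ratio, df_treat, df_error = rm_anova(scores)
print(f"F({df_treat},{df_error}) = {f_ratio:.3f}")
```

Removing between-subject variance from the error term is what gives the within-subjects design its power relative to an independent-groups comparison.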
RESULTS
Food search duration
Analysis of the total food search duration over the course of the 60 s playback treatments revealed that the Multimodal and Audio only treatments were not significantly different from each other. In addition, Multimodal and Visual only were not significantly different; however, food search duration during the Audio only playback was significantly higher than during the Visual only playback. The three signaling treatments all evoked significantly greater food searching than the Empty cage control (F3,69=21.731, P<0.0001: Tukey's HSD, P<0.05; Table 1).
| Measure | Empty cage | Visual only | Audio only | Multimodal |
|---|---|---|---|---|
| Food search duration (s) | 5.10±4.42 (c) | 10.20±7.87 (b) | 13.56±6.67 (a) | 10.38±7.63 (a,b) |
| Latency to food search (s) | 30.07±22.02 (c) | 23.10±20.10 (b,c) | 9.20±7.6 (a) | 16.07±9.97 (a,b) |
| Inspection (s) | 6.02±4.01 (a) | 15.01±7.11 (b) | 7.53±4.76 (a) | 14.78±5.64 (b) |
Values are means ± s.d.
Different letters within a row indicate significant differences (Tukey's HSD, P<0.05).
Further analyses considered responses during successive 15 s time intervals throughout the playbacks. These revealed that Multimodal and Audio only elicited significantly higher levels of food searching than Visual only or Empty cage during the first 15 s of the trial. In the 15 s to 30 s trial interval, food searching during Visual only increased to a level similar to those evoked by the other two signaling playbacks and all three were significantly higher than the Empty cage control. This pattern continued for the remainder of the trial (F3,69=21.459, P<0.0001, adjusted using Tukey's HSD, P<0.05; Fig. 1).
We also measured each hen's latency to begin food searching. Mauchly's test of sphericity was significant (P<0.05), so we applied a Huynh–Feldt correction (ε=0.83). The overall treatment effect was highly significant (F3,57=14.488, P<0.001). Post-hoc tests revealed a stepwise pattern of increasing latency: Audio only playbacks evoked the most rapid response, followed by Multimodal, then Visual only. Empty cage had the longest latency. Hens took significantly longer to begin food searching in Visual only than in Audio only (Tukey's HSD, P<0.05); however, neither of these treatments was significantly different from Multimodal (Table 1).
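The interval analysis amounts to splitting each scored bout across 15 s bins and summing the time that falls in each bin. A minimal sketch follows, assuming bouts are available as start and end times in seconds from stimulus onset; the function name `bin_bouts` and the example bouts are hypothetical and do not come from the study's data.

```python
# Hypothetical sketch: allocate food search time to successive 15 s bins.
def bin_bouts(bouts, bin_s=15.0, total_s=60.0):
    """bouts: (start_s, end_s) pairs relative to stimulus onset.
    Returns the seconds of behavior falling in each bin."""
    n_bins = int(total_s // bin_s)
    totals = [0.0] * n_bins
    for start, end in bouts:
        for b in range(n_bins):
            lo, hi = b * bin_s, (b + 1) * bin_s
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:          # bout overlaps this bin
                totals[b] += overlap
    return totals

example_bouts = [(10.0, 20.0), (28.0, 47.0)]   # illustrative values only
per_bin = bin_bouts(example_bouts)
print(per_bin)
```

A bout that straddles a bin boundary contributes to both bins, so the per-bin totals always sum to the overall duration.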
To compare asymptotic response magnitude, we corrected food search duration for latency by expressing it as a rate, beginning with the first response. The overall treatment effect was significant (F3,45=8.066, P<0.05). Post-hoc pairwise comparisons, adjusted using Tukey's HSD (P<0.05), revealed no differences among the three types of signal playback (Multimodal, Visual only, and Audio only), all of which evoked significantly higher responses than the Empty cage control (Fig. 2).
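One plausible reading of this correction is that food search duration is divided by the time remaining in the 60 s stimulus once the first response has occurred. The text does not give the exact formula, so the sketch below is an assumption, as is the choice to return no value for a hen that never responded. The function name and example numbers are hypothetical.

```python
# Assumed formula (not stated in the text):
#   rate = food search duration / (stimulus duration - latency)
STIMULUS_S = 60.0

def food_search_rate(duration_s, latency_s, stimulus_s=STIMULUS_S):
    """Proportion of the post-latency period spent food searching.
    Returns None when there was no response within the stimulus."""
    if latency_s is None or latency_s >= stimulus_s:
        return None
    return duration_s / (stimulus_s - latency_s)

# Illustrative values only: 12 s of searching, first response at 20 s.
rate = food_search_rate(12.0, 20.0)
print(rate)
```

Under this reading, two hens that search with equal intensity once they begin receive the same rate even if one starts much later than the other, which is the property needed to compare asymptotic response magnitude.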
Inspection and visual orientation
Hens spent significantly longer inspecting the image on the plasma display during Multimodal and Visual only playbacks than during Audio only and Empty cage playbacks (Fig. 3). Responses during Multimodal and Visual only playbacks were not significantly different from each other, nor were those during Audio only and Empty cage playbacks (F3,60=23.130, P<0.001, Tukey's HSD, P<0.05; Table 1).
Analyses at the level of 15 s time intervals reveal that hens spent significantly longer inspecting the screen in the Multimodal and Visual only treatments than in Audio only or Empty cage treatments from the beginning of playback, and that this pattern was maintained throughout the three periods that followed (F3,69=24.545, P<0.0001, Tukey's HSD, P<0.05; Fig. 3). This difference between playbacks with a visual component and those without was not attributable to variation in latency to orient toward the plasma display, which did not differ across the three signaling treatments (Multimodal, Audio only, and Visual only; F2,46=1.266, P=0.292).
DISCUSSION
To understand the design of complex multimodal signals it is necessary to determine the relative importance of each modality and the way in which these interact (Candolin, 2003; Partan and Marler, 1999; Partan and Marler, 2005; Hebets and Papaj, 2005). Our approach for classifying the multimodal tidbitting signal of male fowl, Gallus gallus, was to present the audio and visual components both separately and in combination, and to compare the type and intensity of responses evoked. We used high-definition audiovisual playbacks to determine whether tidbitting should be classified as a redundant signal, with each modality acting as a backup to enhance signal transmission.
Considering the average response over the course of the 60 s playback, the multimodal display generated food search duration similar to the individual modalities presented in isolation (Table 1). This suggests that the two modalities are redundant. However, food calls elicited significantly higher total food search than tidbitting movements (Table 1), a difference not anticipated by any of the current classification heuristics. Further analysis reveals that this effect was caused by the hens responding more quickly to the Audio only playbacks than to the Visual only ones; the silent tidbitting display hence had lower initial signal efficacy. This initial lower response does not appear to be caused by a deficiency in the playback video (see Movie 2 in supplementary material) since food search during Visual only playbacks increased rapidly to the same level as that evoked by Multimodal and Audio only playbacks. After the first 15 s, all three treatments were equivalent (Fig. 1). In addition, when we compensated for differences in latency by calculating food search rates, we found that the two unimodal playbacks and the multimodal playback all evoked statistically indistinguishable responses (Fig. 2). Taken as a whole, these data suggest that the acoustic and visual components of the tidbitting display are redundant.
Our results also reveal that Multimodal and Visual only tidbitting playbacks elicited significantly higher levels of binocular fixation on the plasma screen, indicative of social inspection, than Audio only or Empty cage playbacks (Fig. 3). This is a previously unreported response to the visual display and one that suggests that tidbitting may have multiple functions.
Lastly, differences in the latency to food search and in inspection behavior were not attributable to differences in conspicuousness of the signal, at least at the relatively close test distance employed. We found no difference in the latency to orient towards the multimodal, visual, or audio playback stimuli.
Classification of multimodal signals
Food search rate adjusted for latency (Fig. 2) presents a pattern traditionally classified as redundancy with equivalent effects (sensu Partan and Marler, 1999; Partan and Marler, 2005). However, none of the current models for multimodal signaling includes an explicit temporal dimension. This is a necessary simplification for any classification scheme that seeks to encompass the full diversity of animal signaling behavior; it would plainly be impractical to include time courses for all possible response types in a general heuristic. Nevertheless, our fine-grained analysis of tidbitting reveals that classification is sensitive to the period over which the signal response is integrated and the time at which the measurement is taken. No current category accurately reflects the initial pattern of responses observed (Fig. 1), unlike asymptotic food search responses (Fig. 2), which are straightforward to accommodate within the Partan and Marler (Partan and Marler, 1999) scheme.
Nature of signal efficacy
For a signal to be effective, it must engage the attention of intended receivers (Dawkins and Guilford, 1991; Guilford and Dawkins, 1991). This often involves a trade-off because increased conspicuousness can also attract predators (Ryan et al., 1982; Roberts et al., 2007), parasites (Bernal et al., 2006) or interference from conspecific competitors (Stokes, 1971). It is well established that animals can flexibly adjust signal structure to optimize the balance of such costs and benefits (Ryan and Rand, 1990; Endler, 1992). In fowl, the cost of a subordinate male tidbitting can be loss of the food item to the dominant male or more overt aggression (Stokes, 1971). It is hence intriguing that subordinate males sometimes tidbit without perceptible food calling, and hens often respond by approaching and food searching during these apparently unimodal displays (C.L.S., unpublished). We speculate that subordinate male fowl may have a facultative strategy that sacrifices initial signal efficacy for a lower likelihood of social cost. This possibility is currently being investigated with long-term observational studies under naturalistic conditions.
Multiple functions
Thus far, we have focused on tidbitting as a food-associated signal, comparing responses with those evoked by food calls, which have been the subject of several previous studies (Marler et al., 1986a; Marler et al., 1986b; Evans and Marler, 1994; Evans and Evans, 1999; Evans and Evans, 2007). However, our results suggest that tidbitting probably has multiple functions, one of which is to attract hens in a mate assessment context. Hens responded to visual only and multimodal playbacks with obvious inspection, consisting of binocular fixation on the male's head. They spent approximately one-fourth of the playback engaged in this behavior (Fig. 3), an amount comparable to that spent in food search. Close visual inspection is used for individual recognition (Dawkins, 1995; Dawkins, 1996); it seems likely that it also facilitates assessment of sexually selected ornaments, such as the comb and wattles (Zuk et al., 1992).
Stokes (Stokes, 1971) suggested that repeated tidbitting helps to maintain a strong bond between the male and female. Results of the current study are consistent with this idea. Episodes of inspection evoked by the visual component of the display may facilitate development of a link between male quality, in terms of physical ornaments and ability to provide food, and individual identity.
Many species of galliform tidbit while making only rudimentary motions or by simply freezing over the food item while calling (Stokes and Williams, 1972). By contrast, male fowl perform elaborate movements of the head and neck and frequently manipulate the food item. In our experimental playbacks, vocalizations alone were sufficient to elicit a high level of food searching, with a similar latency to that evoked by the multimodal display. The principal benefit associated with production of a complex multimodal signal may hence be increased inspection behavior by the hen. Additional studies of tidbitting under more naturalistic conditions will be required to test for additional putative benefits, such as increased efficacy of the multimodal signal over longer ranges [i.e. greater active space (Peters and Evans, 2007)], enhanced robustness in the presence of background noise (e.g. wind, or visual obstructions), and greater ability to attract hens when there is competition from other displaying males. These data will help to identify the full gamut of factors that have influenced signal design. It will also be important to determine whether the characteristic movements that make up the visual component of tidbitting have effects specific enough to be categorized as a dynamic visual signal, as current observations suggest, rather than as a contextual cue (Leger, 1993) that acts synergistically with food calls (Evans and Marler, 1994).
Acknowledgements
We thank R. Miller and C. Jude for bird care, and R. Marshall for veterinary support. We also thank A. Taylor for assistance with the statistical analysis. This research was supported by a grant to C.S.E. from the Australian Research Council.